==============================================================================
from: http://www.ibm.com/developerworks/linux/library/l-debug/
Mastering Linux debugging techniques
Key strategies to locate and stomp bugs on Linux
Level: Intermediate
Steve Best (sbest@us.ibm.com), JFS core team member, IBM
01 Aug 2002
There are various ways to watch a running user-space program: you can run a debugger on it and step through the program, add print statements, or add a tool to analyze the program. This article describes methods you can use to debug programs that run on Linux. We review four scenarios for debugging problems, including segmentation faults, memory overruns and leaks, and hangs. (This article appears in the August 2002 issue of the IBM developerWorks journal.)
This article presents four scenarios for debugging Linux programs. For Scenario 1, we use two sample programs with memory allocation problems that we debug using the MEMWATCH and Yet Another Malloc Debugger (YAMD) tools. Scenario 2 uses the strace utility in Linux that enables the tracing of system calls and signals to identify where a program is failing. Scenario 3 uses the Oops functionality of the Linux kernel to solve a segmentation fault problem and shows you how to set up the kernel source level debugger (kgdb) to solve the same problem using the GNU debugger (gdb); the kgdb program is the Linux kernel remote gdb via serial connection. Scenario 4 displays information for the component that is causing a hang by using a magic key sequence available on Linux.
General debugging strategies
When your program contains a bug, it is likely that somewhere in the code, a condition that you believe to be true is actually false. Finding your bug is a process of confirming what you believe is true until you find something that is false.
The following are examples of the types of things you may believe to be true:
* At a certain point in the source code, a variable has a certain value.
* At a given point, a structure has been set up correctly.
* At a given if-then-else statement, the if part is the path that was executed.
* When the subroutine is called, the routine receives its parameters correctly.
Finding the bug involves confirming all of these things. If you believe that a certain variable should have a specific value when a subroutine is called, check it. If you believe that an if construct is executed, check it. Usually you will confirm your assumptions, but eventually you will find a case where your belief is wrong. As a result, you will know the location of the bug.
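One way to make this checking routine is to write each belief down as an explicit test in the code. The sketch below is only an illustration under stated assumptions: struct request, check_request(), and request_size() are hypothetical names invented here and do not come from any program discussed in this article.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical record type used only for this illustration. */
struct request {
    const char *name;
    int         size;
};

/*
 * Returns the number of beliefs about `req` that turned out to be false,
 * reporting each one on stderr with the source file and line of the check.
 */
int check_request(const struct request *req)
{
    int failed = 0;

    /* Belief: the routine receives its parameter correctly. */
    if (req == NULL) {
        fprintf(stderr, "%s:%d: req is NULL\n", __FILE__, __LINE__);
        return 1;
    }
    /* Belief: the structure has been set up correctly. */
    if (req->name == NULL) {
        fprintf(stderr, "%s:%d: req->name is NULL\n", __FILE__, __LINE__);
        failed++;
    }
    /* Belief: the variable has a value in the expected range. */
    if (req->size <= 0) {
        fprintf(stderr, "%s:%d: req->size is %d\n",
                __FILE__, __LINE__, req->size);
        failed++;
    }
    return failed;
}

/* Where a belief must always hold, assert() stops the program at once. */
int request_size(const struct request *req)
{
    assert(req != NULL && req->size > 0);
    return req->size;
}
```

Calling check_request() at the points where you are unsure narrows down where a belief first becomes false. The assert() form is better suited to conditions that should never fail, because it aborts at the exact line.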
Debugging is something that you cannot avoid. There are many ways to go about debugging, such as printing out messages to the screen, using a debugger, or just thinking about the program execution and making an educated guess about the problem.
Before you can fix a bug, you must locate its source. For example, with segmentation faults, you need to know on which line of code the segmentation fault occurred. Once you find the line of code in question, determine the value of the variables in that function, how the function was called, and specifically why the error occurred. Using a debugger makes finding all of this information simple. If a debugger is not available, there are other tools to use. (Note that a debugger may not be available in a production environment, and the Linux kernel does not have a debugger built in.)
Useful memory and kernel debugging tools
There are various ways to track down user-space and kernel problems using debug tools on Linux. Build and debug your source code with these tools and techniques:
User-space tools:
* Memory tools: MEMWATCH and YAMD
* strace
* GNU debugger (gdb)
* Magic key sequence
Kernel tools:
* Kernel source level debugger (kgdb)
* Built-in kernel debugger (kdb)
* Oops
This article looks at a class of problems that can be difficult to find by visually inspecting code, and these problems may occur only under rare circumstances. Often, a memory error occurs only in a combination of circumstances, and sometimes you can discover memory bugs only after you deploy your program.
Scenario 1: Memory debugging tools
As the standard programming language on Linux systems, the C language gives you a great deal of control over dynamic memory allocation. This freedom, however, can lead to significant memory management problems, and these problems can cause programs to crash or degrade over time.
Memory leaks (in which malloc() memory is never released with corresponding free() calls) and buffer overruns (writing past memory that has been allocated for an array, for example) are some of the common problems and can be difficult to detect. This section looks at a few debugging tools that greatly simplify detecting and isolating memory problems.
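To make the idea of overrun detection concrete, here is a minimal sketch of one common technique: surrounding each allocation with guard bytes and checking later whether they are still intact. This is an illustration only, not the implementation of any tool named in this article. The names guarded_malloc(), guarded_check(), and guarded_free(), and the constants, are assumptions chosen for this example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define GUARD_SIZE  8      /* guard bytes on each side (arbitrary) */
#define HEADER_SIZE 16     /* size field plus front guard */
#define GUARD_BYTE  0xFE   /* fill pattern (arbitrary) */

/* Layout: [size | front guard][user data ...][rear guard] */
void *guarded_malloc(size_t size)
{
    unsigned char *raw = malloc(HEADER_SIZE + size + GUARD_SIZE);

    if (raw == NULL)
        return NULL;
    memset(raw, 0, HEADER_SIZE);
    memcpy(raw, &size, sizeof size);          /* remember the user size */
    memset(raw + HEADER_SIZE - GUARD_SIZE, GUARD_BYTE, GUARD_SIZE);
    memset(raw + HEADER_SIZE + size, GUARD_BYTE, GUARD_SIZE);
    return raw + HEADER_SIZE;
}

/*
 * Returns 0 if both guards are intact, -1 if the front guard was
 * overwritten (underflow), and 1 if the rear guard was (overflow).
 */
int guarded_check(const void *ptr)
{
    const unsigned char *user = ptr;
    const unsigned char *front;
    size_t size;
    size_t i;

    assert(ptr != NULL);
    front = user - GUARD_SIZE;
    for (i = 0; i < GUARD_SIZE; i++) {
        if (front[i] != GUARD_BYTE)
            return -1;
    }
    /* The stored size sits before the front guard, which is intact. */
    memcpy(&size, user - HEADER_SIZE, sizeof size);
    for (i = 0; i < GUARD_SIZE; i++) {
        if (user[size + i] != GUARD_BYTE)
            return 1;
    }
    return 0;
}

/* Checks the guards, releases the block, and returns the check result. */
int guarded_free(void *ptr)
{
    int status;

    if (ptr == NULL)
        return 0;
    status = guarded_check(ptr);
    free((unsigned char *)ptr - HEADER_SIZE);
    return status;
}
```

A write one byte past the end of a 16-byte guarded block lands in the rear guard, so guarded_check() returns 1 instead of the program silently corrupting a neighboring allocation. Guard bytes only reveal the damage when the check runs, which is one reason dedicated tools are more convenient.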
MEMWATCH
MEMWATCH, written by Johan Lindh, is an open source memory error detection tool for C that you can download (see Resources later in this article). By simply adding a header file to your code and defining MEMWATCH on your gcc command line, you can track memory leaks and corruptions in your program. MEMWATCH supports ANSI C, provides a log of the results, and detects double-frees, erroneous frees, unfreed memory, overflow and underflow, and so on.
Listing 1. Memory sample (test1.c)
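#include <stdlib.h>
#include <stdio.h>
#include "memwatch.h"

int main(void)
{
    char *ptr1;
    char *ptr2;

    ptr1 = malloc(512);
    ptr2 = malloc(512);    /* line 11: this block is never freed */

    ptr2 = ptr1;           /* the address of the second block is lost */
    free(ptr2);            /* line 14: frees the first block */
    free(ptr1);            /* line 15: frees the first block again */
}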
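The code in Listing 1 allocates two 512-byte blocks of memory, and then the pointer to the second block is overwritten with the address of the first block. As a result, the address of the second block is lost, and there is a memory leak. Both free() calls then release the first block, so the second call is a double free.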
Now compile memwatch.c with Listing 1. The following is an example makefile:
test1:
	gcc -DMEMWATCH -DMW_STDIO test1.c memwatch.c -o test1
When you run the test1 program, it produces a report of leaked memory. Listing 2 shows the example memwatch.log output file.
Listing 2. test1 memwatch.log file
MEMWATCH 2.67 Copyright (C) 1992-1999 Johan Lindh
...
double-free: test1.c(15), 0x80517b4 was freed from test1.c(14)
...
unfreed: test1.c(11), 512 bytes at 0x80519e4
{FE FE FE FE FE FE FE FE FE FE FE FE ..............}
Memory usage statistics (global):
N)umber of allocations made: 2
L)argest memory usage : 1024
T)otal of all alloc() calls: 1024
U)nfreed bytes totals : 512
MEMWATCH gives you the actual line that has the problem. If you free an already freed pointer, it tells you. The same goes for unfreed memory. The section at the end of the log displays statistics, including how much memory was leaked, how much was used, and the total amount allocated.
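The following sketch suggests how a header-based tool can know the file and line of each allocation: macros pass __FILE__ and __LINE__ to wrapper functions that record every block in a table. It is not MEMWATCH's actual implementation. The names track_malloc(), track_free(), track_report(), TRACK_MALLOC, and TRACK_FREE, and the fixed table size, are assumptions made for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_BLOCKS 64   /* fixed-size table keeps the sketch short */

struct block {
    uintptr_t   addr;   /* address returned by malloc() */
    size_t      size;
    const char *file;   /* where the block was allocated */
    int         line;
    int         live;   /* 1 until the block is freed */
};

static struct block table[MAX_BLOCKS];
static int n_blocks;

void *track_malloc(size_t size, const char *file, int line)
{
    void *p = malloc(size);

    assert(n_blocks < MAX_BLOCKS);   /* sketch limitation */
    if (p != NULL) {
        struct block b = { (uintptr_t)p, size, file, line, 1 };
        table[n_blocks++] = b;
    }
    return p;
}

/* Returns 0 on success, -1 for a double or erroneous free. */
int track_free(void *ptr, const char *file, int line)
{
    int i;

    /* Search newest first: malloc() may hand out an address again. */
    for (i = n_blocks - 1; i >= 0; i--) {
        if (table[i].addr != (uintptr_t)ptr)
            continue;
        if (!table[i].live) {
            fprintf(stderr, "double-free: %s(%d), block from %s(%d)\n",
                    file, line, table[i].file, table[i].line);
            return -1;               /* do not free it a second time */
        }
        table[i].live = 0;
        free(ptr);
        return 0;
    }
    fprintf(stderr, "erroneous free: %s(%d)\n", file, line);
    return -1;
}

/* Prints every block that is still live; returns the unfreed byte total. */
size_t track_report(void)
{
    size_t unfreed = 0;
    int i;

    for (i = 0; i < n_blocks; i++) {
        if (table[i].live) {
            fprintf(stderr, "unfreed: %s(%d), %lu bytes\n", table[i].file,
                    table[i].line, (unsigned long)table[i].size);
            unfreed += table[i].size;
        }
    }
    return unfreed;
}

/* The macros capture the caller's file and line automatically. */
#define TRACK_MALLOC(n) track_malloc((n), __FILE__, __LINE__)
#define TRACK_FREE(p)   track_free((p), __FILE__, __LINE__)
```

Repeating the pattern from Listing 1 with these macros (two 512-byte allocations, the second pointer overwritten by the first, then two frees) makes the second TRACK_FREE() return -1 and track_report() return 512, which mirrors the double-free and unfreed entries in the log above.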
YAMD
Written by Nate Eldredge, the YAMD package finds problems related to dynamic memory allocation in C and C++. The latest version of YAMD at the time of writing this article was 0.32. Download yamd-0.32.tar.gz (see Resources). Execute a make command to build the program; then execute a make install command to install the program and set up the tool.
Once you have downloaded YAMD, use it on test1.c. Remove the #include memwatch.h and make a small change to the makefile, as shown below:
test1 with YAMD
gcc -g test1.c -o test1
Listing 3 shows the output from YAMD on test1.
Listing 3. test1 output with YAMD
YAMD version 0.32
Executable: /usr/src/test/yamd-0.32/test1
...
INFO: Normal allocation of this block
Address 0x40025e00, size 512
...
INFO: Normal allocation of this block
Address 0x40028e00, size 512
...
INFO: Normal deallocation of this block
Address 0x40025e00, size 512
...
ERROR: Multiple freeing At
free of pointer already freed
Address 0x40025e00, size 512
...
WARNING: Memory leak
Address 0x40028e00, size 512
WARNING: Total memory leaks:
1 unfreed allocations totaling 512 bytes
*** Finished at Tue ... 10:07:15 2002
Allocated a grand total of 1024 bytes 2 allocations
Average of 512 bytes per allocation
Max bytes allocated at one time: 1024
24 K alloced internally / 12 K mapped now / 8 K max
Virtual program size is 1416 K
End.
YAMD shows that we have already freed the memory, and there is a memory leak. Let's try YAMD on another sample program in Listing 4.
Listing 4. Memory code (test2.c)
#include <stdlib.h>
#include <stdio.h>

int main(void)
{
    char *ptr1;
    char *ptr2;
    char *chptr;
    int i = 1;

    ptr1 = malloc(512);
    ptr2 = malloc(512);
    chptr = (char *)malloc(512);
    for (i; i
[...]
... 15:59:37 sfb1 kernel: [>EIP; c01588fc
Code; c01588fc
00000000 :
Code; c01588fc
6: 55 push %ebp
Next, you need to determine which line in jfs_mount is causing the problem. The Oops message tells us that the problem is caused by the instruction at offset 3c. One way to find the corresponding source line is to run the objdump utility on the jfs_mount.o file and look at offset 3c. The objdump utility disassembles a module function so that you can see which assembler instructions your C source code generates. Listing 11 shows what you would see from objdump. Comparing it with the C code for jfs_mount shows that the null pointer dereference was caused by line 109.
Listing 11. Assembler listing of jfs_mount
109 printk("%d\n",*ptr);
objdump -d jfs_mount.o
jfs_mount.o: file format elf32-i386
Disassembly of section .text:
00000000 <jfs_mount>:
0: 55 push %ebp
...
2c: e8 cf 03 00 00 call 400
31: 89 c3 mov %eax,%ebx
33: 58 pop %eax
34: 85 db test %ebx,%ebx
36: 0f 85 55 02 00 00 jne 291
3c: 8b 2d 00 00 00 00 mov 0x0,%ebp
[...]
followed by [...]. The magic keystrokes will give a stack trace of the currently running processes and all processes, respectively.
3. Look in your /var/log/messages. If you have everything set up correctly, the system should have converted the symbolic kernel addresses for you. The back trace will be written to the /var/log/messages file.
Conclusion
There are many different tools available to help debug programs on Linux. The tools in this article can help you solve many coding problems. Tools that show the location of memory leaks, overruns, and the like can solve memory management problems, and I find MEMWATCH and YAMD helpful.
Using a Linux kernel patch to allow gdb to work on the Linux kernel helped in solving problems on the filesystem that I work on in Linux. In addition, the strace utility helped determine where a filesystem utility had a failure during a system call. Next time you are faced with squashing a bug in Linux, try one of these tools.
Resources
* Download MEMWATCH.
* Read the article "Linux software debugging with GDB". (developerWorks, February 2001)
* Visit the IBM Linux Technology Center.
* Find more Linux articles in the developerWorks Linux zone.
About the author
Steve Best works in the Linux Technology Center of IBM in Austin, Texas. He currently is working on the Journaled File System (JFS) for Linux project. Steve has done extensive work in operating system development with a focus in the areas of file systems, internationalization, and security.
==============================================================================
This article comes from the ChinaUnix blog. To view the original, see: http://blog.chinaunix.net/u1/47395/showart_1659520.html