Using the command line interface of debug

Session 9: what can I do without -g?

After fixing all of the known bugs in sget you can recompile it without -g:
```
   $ cc -O -o sget sget.c
```
Use it to copy a file into your directory:
```
   $ sget
```
The program is waiting for input. Type in the following:
```
   src/testfile
```

Press <Ctrl-D>. The screen displays the following:

```
   testfile: 0 characters, 0 words, 2 lines
             average characters per line: 0
             average words per line: 0
```

How can the file have two lines but zero characters?

Check to see if there is something in testfile:

```
   $ wc testfile
     2      16      58	testfile
   $ cat testfile
     this is a test case - line 1
     this is a test case - line 2
```
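
For reference, the counts that wc reports can be reproduced with a few lines of C. This is only a sketch of wc-style counting, not sget's code: the names struct counts and count_text are made up for illustration, and the text is assumed to be the two lines shown above with no leading blanks.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical result type: the three totals that wc prints. */
struct counts {
    long lines;
    long words;
    long chars;
};

/* Count newlines, blank-separated words, and characters in a string. */
struct counts count_text(const char *text)
{
    struct counts c = {0, 0, 0};
    int in_word = 0;

    assert(text != NULL);
    for (; *text != '\0'; text++) {
        c.chars++;
        if (*text == '\n')
            c.lines++;
        if (isspace((unsigned char)*text)) {
            in_word = 0;            /* white space ends the current word */
        } else if (!in_word) {
            in_word = 1;            /* first character of a new word */
            c.words++;
        }
    }
    return c;
}
```

Applied to the two-line test text, count_text gives 2 lines, 16 words, and 58 characters, which matches the wc output. The zero counts reported by sget therefore cannot be right.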

Run debug on sget as is, even though it was not compiled with -g. debug warns you that sget contains no debugging information:
```
   $ debug sget
```
The debugger displays the following:
```
   Warning: No debugging information in sget
   New program sget (process p1) created
   HALTED p1 [main]
           0x804895d (main+5:)               pushl    $0x8048920
```
Without debugging information, you can't set breakpoints on statements (debug doesn't have the information telling it where the code making up a particular line starts and ends), but you can still set breakpoints on functions.
NOTE: See ``What -g really means'' for more information. Stripping your a.out [either with strip(1) or with the -s option to cc(1)] removes even this basic level of symbolic information.
Set breakpoints on copy (the function that reads the file) and scan_line (the function that counts the characters in the file):
```
   debug> stop copy
```
The debugger displays the following:
```
   Error: No entry "copy" exists
```
debug didn't find copy since it is a static function. You have to tell debug where to look:
```
   debug> stop sget.c@copy
```
The debugger displays the following:
```
   EVENT [1] assigned
   debug> stop sget.c@scan_line
   EVENT [2] assigned
```
NOTE: See ``Global vs. static functions'' for more information.
When you let the program run, it will wait for input. Type in the name of the file you are trying to copy:
```
   debug> run
   src/testfile
```
When it reaches the breakpoint, debug prints the assembly language instruction that is being executed instead of the source line:
```
   STOP EVENT TRIGGERED: sget.c@copy  in p1 [copy in sget.c]
           0x80487e9 (copy+5:)                   pushl  $0x8048adc
```

Stop before and after reading the file to see if sget is reading it correctly. Set a breakpoint on fgets:

```
   debug> stop fgets
   EVENT [3] assigned
   debug> run
```
   debug> run

The debugger displays the following:

```
   STOP EVENT TRIGGERED: fgets  in p1 [fgets]
           0xbffb7190 (fgets+0:)                  subl    $0x20,%esp
```

Do a stack trace:

```
   debug> stack
```

The debugger displays the following:

```
Stack Trace for p1, Program sget
*[0] fgets(0x80473d4, 0x400, 0x804a574) [0xbffb7190]
 [1] copy(0x8049c84, 0x8049c84) [0x804887c]
 [2] main(0x1, 0x804780c, 0x8047814)    [0x8048a47]
 [3] _start()   [0x8048690]
```

Write down the first argument to fgets; that is the address of the buffer where the data is stored. You will need to look at it later to see if the data is correct. Next, stop after the return from the read system call to see whether the correct number of bytes was read:
```
   debug> syscall -x read {print -f"bytes read = %d\n" %eax}
```
The debugger displays the following:
```
   EVENT [4] assigned
```
NOTE: See ``System call events'' for more information.
The event you just defined will make debug print the return value right after the system call. The return value from read(2) is the number of bytes read:
```
   debug> run
```
The debugger displays the following:
```
   SYSTEM CALL EXIT 3 (read) in p1 [_read]
           0xbffc9509 (_read+12:)    jae      +0x9 <bffc9514> [ _read ]
   bytes read = 58
```
NOTE: See ``Getting down to the machine level'' for more information.
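
The value printed by this event is what the program itself gets back from read(2). The sketch below shows the same number from the C side. It is an illustration only: read_return_value is a hypothetical helper, and it uses a pipe in place of testfile so that it needs no input file.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Write len bytes into a pipe, then fetch them with a single read(2).
 * Returns whatever read returned (the number of bytes read), or -1 if
 * the pipe could not be set up.
 */
ssize_t read_return_value(const char *data, size_t len)
{
    int fds[2];
    char buf[1024];
    ssize_t n;

    assert(data != NULL);
    if (len > sizeof buf || pipe(fds) == -1)
        return -1;
    if (write(fds[1], data, len) != (ssize_t)len) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    close(fds[1]);                      /* no more data will arrive */
    n = read(fds[0], buf, sizeof buf);  /* asks for 1024, gets what is there */
    close(fds[0]);
    return n;
}
```

Passing the 58 bytes of the test text returns 58, even though the call asks for up to 1024 bytes. That is the figure to compare against the wc count.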

It read the correct number of bytes. Let it run until it gets to scan_line (and out of fgets). Look at the buffer to see if it is correct:

```
   debug> run
```

The debugger displays the following:

```
STOP EVENT TRIGGERED: sget.c@scan_line in p1 [scan_line in sget.c]
        0x80486b2 (scan_line+2:)       xorl     %edi, %edi
debug> print -f"%s" 0x8047a5c
this is a test case - line 1
```

The buffer is correct. The program is now at the beginning of scan_line, where it should be counting the number of characters and words in each line. Indeed, if you disassemble scan_line, you see that it includes code that increments the static variable char_cnt:

```
   debug> dis
```

The debugger displays the following:

```
Disassembly for p1, Program sget
0x80486b2 (scan_line+2:)       xorl     %edi, %edi
0x80486b4 (scan_line+4:)       movl     $0x804a094, %esi
0x80486b9 (scan_line+9:)       jmp      +0x23 <80486de> [ scan_line ]
0x80486bb (scan_line+11:)      incl     0x804a090 [ char_cnt ]
0x80486c1 (scan_line+17:)      cmpb     $0x20, %bl
0x80486c4 (scan_line+20:)      je       +0x5 <80486cb> [ scan_line ]
0x80486c6 (scan_line+22:)      cmpb     $0x9, %bl
0x80486c9 (scan_line+25:)      jne      +0x4 <80486cf> [ scan_line ]
0x80486cb (scan_line+27:)      xorl     %edi, %edi
0x80486cd (scan_line+29:)      jmp      +0xf <80486de> [ scan_line ]
```

Step through scan_line and see what happens:

```
   debug> set %verbose = source
   debug> step -i -c 10
```

The debugger displays the following:

```
0x80486b4 (scan_line+4:)      movl     $0x804a094, %esi
0x80486b9 (scan_line+9:)      jmp      +0x23 <80486de> [ scan_line ]
0x80486de (scan_line+46:)     movb     (%esi), %bl
0x80486e0 (scan_line+48:)     incl     %esi
0x80486e1 (scan_line+49:)     testb    %bl, %bl
0x80486e3 (scan_line+51:)     jne      +0xffffffd6 <80486bb> [ scan_line ]
0x80486e5 (scan_line+53:)     incl     0x804a088 [ line_cnt ]
0x80486eb (scan_line+59:)     popl     %ebx
0x80486ec (scan_line+60:)     popl     %esi
0x80486ed (scan_line+61:)     popl     %edi
```
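
The branch lines in this output can be checked by hand. On this processor a relative branch displacement is added to the address of the instruction that follows the branch, and the sum wraps at 32 bits. The sketch below assumes that rule. branch_target is a hypothetical helper, and the 2-byte instruction length is read off the listing (the instruction after the jne at 0x80486e3 starts at 0x80486e5).

```c
#include <assert.h>
#include <stdint.h>

/*
 * Compute the destination of a relative branch.
 *   insn_addr  address of the branch instruction itself
 *   insn_len   its length in bytes (next address minus insn_addr)
 *   disp       displacement exactly as the debugger prints it
 * Unsigned 32-bit addition wraps, so 0xffffffd6 acts as -0x2a.
 */
uint32_t branch_target(uint32_t insn_addr, uint32_t insn_len, uint32_t disp)
{
    assert(insn_len > 0);
    return insn_addr + insn_len + disp;
}
```

With the values from the listing, branch_target(0x80486e3, 2, 0xffffffd6) gives 0x80486bb, the instruction that increments char_cnt, and branch_target(0x80486b9, 2, 0x23) gives 0x80486de. Because the jne was not taken, execution fell through to 0x80486e5 and char_cnt was never touched.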

It reached the end of scan_line without executing the instruction to increment char_cnt. It doesn't appear that scan_line even looked at the buffer passed to fgets. Instead, it read from the address loaded into %esi at scan_line+4 (0x804a094). Check to see what's at that address:
```
   debug> dump -c40 0x804a094
```
The debugger displays the following:
```
Raw Dump for p1, Program sget
0x804a094:   ........ 0x00000000 0x00000000 0x00000000  ................
0x804a0a0: 0x00000000 0x00000000 0x00000000 0x00000000  ................
0x804a0b0: 0x00000000 0x00000000 0x00000000             ............
```
NOTE: See ``Examining data with no type information'' for more information.
There's nothing at that address. That appears to be the problem; copy is passing the address of a different buffer to fgets than the one scan_line is reading from.
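
One way such a mismatch can arise is a local array that hides a file-static array of the same name. The sketch below is a guess at the pattern, not the actual sget.c source: scan_static_buffer, fill_local, and fill_static are hypothetical stand-ins, and snprintf takes the place of fgets so that no file is needed.

```c
#include <assert.h>
#include <stdio.h>

#define BUFSZ 1024

static char buffer[BUFSZ];  /* file-static: all zero bytes until written to */
static int char_cnt;        /* names borrowed from the session; layout assumed */
static int line_cnt;

/* Count the characters in the file-static buffer; one line per call.
   Returns the number of characters counted by this call. */
int scan_static_buffer(void)
{
    const char *p;
    int n = 0;

    for (p = buffer; *p != '\0'; p++)
        n++;
    char_cnt += n;
    line_cnt++;
    return n;
}

/* Buggy pattern: the local array hides the static one, so the data is
   stored where scan_static_buffer never looks. */
int fill_local(const char *line)
{
    char buffer[BUFSZ];     /* shadows the file-static buffer */

    assert(line != NULL);
    snprintf(buffer, sizeof buffer, "%s", line);
    return scan_static_buffer();
}

/* Fixed pattern: the writer and the reader use the same buffer. */
int fill_static(const char *line)
{
    assert(line != NULL);
    snprintf(buffer, sizeof buffer, "%s", line);
    return scan_static_buffer();
}
```

Called with the first line of testfile, fill_local counts 0 characters while still adding a line, which is the symptom sget showed. fill_static counts all 29.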

What -g really means

If you compile without -g, your object file still has to have a minimal level of symbolic information for the link editor's use. Link editing is the step that combines object files to produce an executable (for example, cc -o macros main.o macro.o). This basic information is limited to the name and address of global and static functions and variables. It doesn't include the types of the symbols, but it does provide enough information to determine if a symbol is a data symbol (variable) or a text symbol (function). Compiling with -g doesn't change any of the basic information but it adds:

type information
names and locations of local variables
line number tables
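
To make the split concrete, here is a small C fragment with comments marking which pieces of information belong to the basic level and which need -g. The file and every name in it (total_calls, char_total, count_chars, tally) are invented for illustration, and the comments simply restate the rules above.

```c
#include <assert.h>
#include <stddef.h>

int total_calls;            /* global variable: name and address kept without -g */
static int char_total;      /* file-static variable: name and address kept,
                               but not the fact that it is an int */

/* Static function: name and address kept without -g. */
static int count_chars(const char *line)
{
    int n = 0;              /* local variable: name and location need -g */

    assert(line != NULL);
    while (line[n] != '\0') /* statement boundaries come from the line
                               number tables, which only -g adds */
        n++;
    return n;
}

/* Global function: name and address kept without -g; the types of its
   argument and return value are not. */
int tally(const char *line)
{
    total_calls++;
    char_total += count_chars(line);
    return char_total;
}
```

Without -g you could still stop on tally or count_chars and print total_calls or char_total, but n and the individual source lines would be out of reach.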

If you debug a file with just the minimal level of symbol information, most of the debugger's commands will still work, but they may not be as helpful as you expected. The differences are:

symbols -fg will print file static and global symbols, but symbols (with no options) will not print anything, since debug doesn't know about the local variables.
You can't use local variables in expressions such as print, set, or stop expressions. You can still use global and static variables, but the first time you reference a variable, debug will warn you that it doesn't know the type. Without the type, it will assume the variable is an int.
Function arguments are printed simply as hexadecimal numbers in stack traces.
list with no arguments won't work because there will be no current line number or file (%line is null). You can still use list if you give it the file and line number.
You can't use line numbers as arguments to the commands that require locations (stop, jump, run -u location, dis) but you can still give them function names and addresses.
When the program stops, debug prints the current assembly language instruction instead of the current source level statement.
step runs the program until it gets to the next statement. Since there is no information about the boundaries of statements, the program will run until another event triggers or until the program exits. To step through the machine level instructions, use step -i (or the alias si).
Everything else including event commands, create, grab, and process lists, is unchanged.

You can combine object files compiled with and without -g. For example, none of the library functions -- like malloc -- are compiled with -g. What you will be able to do or see at any time depends on which compilation unit the program is in when it stops. You will be able to tell immediately if the compilation unit was compiled with -g or not from the event notification (the source line is displayed with -g -- you will see the assembly instruction otherwise). The warning that debug printed (No debugging information) is only displayed if none of the files in the executable were compiled with -g.

NOTE: cc will not let you compile with both -g and -O. Optimization can move code around and move variables into and out of registers in ways that are very hard for the debugger to track. If you have to debug optimized code -- if, for example, optimizing the program reveals a latent bug that doesn't show up without optimization -- it will generally be less confusing to debug at the assembly language level.

Stopping on a function vs. stopping on an address

When you stop the process at a function, you may notice that the process stopped several instructions into the function. The instructions skipped constitute the function prologue. The nature and existence of the function prologue is processor specific, but it is usually needed to set up the stack frame and registers for the new function.

NOTE: See the Function Calling Sequence section of the appropriate Processor Supplement to the System V Application Binary Interface (ABI) for further information.

The important point to note for debugging purposes is that if the program is stopped in the function prologue, the values in the registers and the arguments in the stack trace may not be consistent. To minimize confusion, debug normally skips the function prologue (with or without -g).

If you don't want to skip the function prologue, set the breakpoint on the address (a hex number) rather than the name of the function:

   debug> print func
   0x8048924
   debug> stop 0x8048924
   EVENT [1] assigned
   debug> run
   STOP EVENT TRIGGERED: 0x8048924  in p2 [func]
   0x8048924 (func+0:)  jmp  +0x0000012d <8048a56> [ func ]

Global vs. static functions

There is a subtle difference in the scoping information available with and without -g. In both cases, static functions and variables are associated with the name of the compilation unit where they are defined, but for global symbols the name of the compilation unit is available only if the file was compiled with -g. If the program is stopped in a static function, whether or not you used -g, or if it is stopped in a global function that was compiled with -g, the current file (%file) will be set to the name of the file, and you can access static variables and other static functions defined in the same compilation unit without having to use qualified names (for example, file@func). On the other hand, if the program is stopped in a global function compiled without -g, %file will be null, and you will have to use a qualified name to access any static function or variable, even though it was defined in the same compilation unit.

For example, even though main and copy are both defined in sget.c, when the program is stopped in main you have to type stop sget.c@copy to set the breakpoint. symbols -f can't show you any of the static variables, either:

   debug> print %func, %file
   "main" 0
   debug> symbols -f
   Symbols for p2, Program sget
   Name            Location    Line
   Warning: No current source file

When the program is stopped in copy, you can see the static variables, and you can set a breakpoint on another static function without qualification:

   debug> print %func, %file
   "copy" "sget.c"
   debug> symbols -f
   Symbols for p2, Program sget
   Name            Location    Line
   base_name       sget.c          
   buffer          sget.c          
   char_cnt        sget.c          
   copy            sget.c          
   exit_code       sget.c          
   get             sget.c          
   getstats        sget.c          
   handler         sget.c          
   line_cnt        sget.c          
   path            sget.c          
   scan_line       sget.c          
   word_cnt        sget.c          
   debug> stop scan_line
   EVENT [3] assigned

System call events

The syscall (or sys) command creates an event that triggers when the program uses the specified system call. If no options are specified, debug stops the program when it enters the system call. The -e option (for enter) has the same effect. The -x option (for exit) stops the program when it leaves the system call. Specifying both options (-ex) makes it stop at both places. As with stop, you can give syscall a count (-c count) and an optional command list. You can also specify more than one system call in one syscall command:

   debug> syscall -x chown mknod chroot {stack}

The help command will show you the set of system call names that syscall recognizes:

   debug> help sysnames
   Valid system call names:
   1 exit           2 fork         3 read         4 write
   . . .

System call events work even on completely stripped object files.

Getting down to the machine level

The return value of a function returning a scalar object is stored in a special register whose designation depends on the machine you are using.

NOTE: For details about finding return values on your machine or for other useful processor registers, see the appropriate Processor Supplement of the System V Application Binary Interface.

On any machine, you will be able to print the individual registers with print %rn, or you can see all of them at once with the regs command:

   debug> regs
   Register Contents for p1, program sget
   %r1       r1_value     %r2      r2_value    %r3      r3_value
   . . .

where %rn is the register name and rn_value is its value.

You may also look at the individual machine instructions with the dis command. Like the shell-level command dis(1), dis interprets the machine instructions in the program and prints them in a human-readable format similar to the input to the assembler; therefore, this is called a ``dis-assembly''. The disassembly is static like a source listing, and shows the instructions as they are laid out in the object file, not the order in which they are executed.

Instruction stepping, on the other hand, shows you the actual path of execution as each instruction is executed. You may use step -i (-i for instructions) anywhere; it works the same with or without debugging information. Note that you must use step -i to step at the instruction level: step (without the -i) does not fall back to instruction stepping when there is no line number information. Without any line number information, typing step is equivalent to typing run.

Examining data with no type information

Without -g, debug doesn't know the type of any object in the program, so it assumes variables are all ints unless you tell it otherwise. You can print a value as any of the basic types (char, long, double, for example) with a cast or the appropriate format string:

   debug> print (char *)0x8047114
   "this is a test case - line 1\n"

You cannot cast a value to a user-defined type such as a structure or union, because the debugger does not have the information describing those types. You can, however, dump the contents of a structure (or anything else) as a series of bytes and interpret the bytes yourself (assuming you know the layout of the structure). The dump command displays an area of memory both as hex bytes and, where printable, as characters. dump prints 16 bytes (4 words) at a time. If the location given isn't on a 16-byte boundary, it will print dots (..) up to the first byte that you asked for:

     debug> dump 0x80473cc
     Raw Dump for p1, Program sget
     0x80473cc:   ........   ........   ........ 0x73696874  ............this
     0x80473d0: 0x20736920 0x65742061 0x63207473 0x20657361   is a test case
     0x80473e0: 0x696c202d 0x3120656e 0x0000000a 0x00000000  - line 1........
     . . .

dump normally tries to organize its output into words, of a size appropriate to the target architecture. For little-endian architectures, this means that the hexadecimal values for each byte will appear in a different order than the values actually appear in memory. You can suppress this with the -b option, which will make dump put out each byte individually, as it is laid out in memory.

     debug> dump -b 0x80473cc
     Raw Dump for p1, Program sget
     0x80473cc: .. .. .. .. .. .. .. .. .. .. .. .. 74 68 69 73  ............this
     0x80473d0: 20 69 73 20 61 20 74 65 73 74 20 63 61 73 65 20   is a test case
     0x80473e0: 2d 20 6c 69 6e 65 20 31 0a 00 00 00 00 00 00 00  - line 1........
     0x80473f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
     . . .
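
The relationship between the two dumps can be worked out in C. The sketch below assembles a 32-bit word from four bytes in little-endian order, the way the word-oriented dump shows them. le_word is a hypothetical helper written for this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/*
 * Build the 32-bit value that a little-endian machine sees when it
 * loads a word from four consecutive bytes: the byte at the lowest
 * address becomes the least significant byte of the word.
 */
uint32_t le_word(const unsigned char *bytes)
{
    assert(bytes != NULL);
    return (uint32_t)bytes[0]
         | (uint32_t)bytes[1] << 8
         | (uint32_t)bytes[2] << 16
         | (uint32_t)bytes[3] << 24;
}
```

The bytes 74 68 69 73 ("this") from the -b dump combine into 0x73696874, the word shown in the first dump, and 20 69 73 20 (" is ") combine into 0x20736920.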

Normally dump displays 256 bytes, but you can control the number of bytes dumped with the -c option or by setting the debugger variable %num_bytes.


© 2004 The SCO Group, Inc. All rights reserved.
UnixWare 7 Release 7.1.4 - 27 April 2004