Documentation/BUG-HUNTING: convert to ReST markup
- Add a document title and remove its own index; - use monotonic fonts for paths; - use quote blocks where needed; - adjust/use spaces to properly format paragraphs; - add it to the user book. Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
This commit is contained in:
parent
8e7fbec662
commit
953ab835a9
@ -1,18 +1,8 @@
|
||||
Table of contents
|
||||
=================
|
||||
Bug hunting
|
||||
+++++++++++
|
||||
|
||||
Last updated: 20 December 2005
|
||||
|
||||
Contents
|
||||
========
|
||||
|
||||
- Introduction
|
||||
- Devices not appearing
|
||||
- Finding patch that caused a bug
|
||||
-- Finding using git-bisect
|
||||
-- Finding it the old way
|
||||
- Fixing the bug
|
||||
|
||||
Introduction
|
||||
============
|
||||
|
||||
@ -24,7 +14,8 @@ Finding bugs is not always easy. Have a go though. If you can't find it don't
|
||||
give up. Report as much as you have found to the relevant maintainer. See
|
||||
MAINTAINERS for who that is for the subsystem you have worked on.
|
||||
|
||||
Before you submit a bug report read REPORTING-BUGS.
|
||||
Before you submit a bug report read
|
||||
:ref:`Documentation/REPORTING-BUGS <reportingbugs>`.
|
||||
|
||||
Devices not appearing
|
||||
=====================
|
||||
@ -37,15 +28,16 @@ Finding patch that caused a bug
|
||||
|
||||
|
||||
|
||||
Finding using git-bisect
|
||||
------------------------
|
||||
Finding using ``git-bisect``
|
||||
----------------------------
|
||||
|
||||
Using the provided tools with git makes finding bugs easy provided the bug is
|
||||
reproducible.
|
||||
Using the provided tools with ``git`` makes finding bugs easy provided the bug
|
||||
is reproducible.
|
||||
|
||||
Steps to do it:
|
||||
|
||||
- start using git for the kernel source
|
||||
- read the man page for git-bisect
|
||||
- read the man page for ``git-bisect``
|
||||
- have fun
|
||||
|
||||
Finding it the old way
|
||||
@ -58,22 +50,22 @@ It's a brute force approach but it works pretty well.
|
||||
|
||||
You need:
|
||||
|
||||
. A reproducible bug - it has to happen predictably (sorry)
|
||||
. All the kernel tar files from a revision that worked to the
|
||||
- A reproducible bug - it has to happen predictably (sorry)
|
||||
- All the kernel tar files from a revision that worked to the
|
||||
revision that doesn't
|
||||
|
||||
You will then do:
|
||||
|
||||
. Rebuild a revision that you believe works, install, and verify that.
|
||||
. Do a binary search over the kernels to figure out which one
|
||||
- Rebuild a revision that you believe works, install, and verify that.
|
||||
- Do a binary search over the kernels to figure out which one
|
||||
introduced the bug. I.e., suppose 1.3.28 didn't have the bug, but
|
||||
you know that 1.3.69 does. Pick a kernel in the middle and build
|
||||
that, like 1.3.50. Build & test; if it works, pick the mid point
|
||||
between .50 and .69, else the mid point between .28 and .50.
|
||||
. You'll narrow it down to the kernel that introduced the bug. You
|
||||
- You'll narrow it down to the kernel that introduced the bug. You
|
||||
can probably do better than this but it gets tricky.
|
||||
|
||||
. Narrow it down to a subdirectory
|
||||
- Narrow it down to a subdirectory
|
||||
|
||||
- Copy kernel that works into "test". Let's say that 3.62 works,
|
||||
but 3.63 doesn't. So you diff -r those two kernels and come
|
||||
@ -83,7 +75,7 @@ You will then do:
|
||||
Copy the non-working directory next to the working directory
|
||||
as "dir.63".
|
||||
One directory at time, try moving the working directory to
|
||||
"dir.62" and mv dir.63 dir"time, try
|
||||
"dir.62" and mv dir.63 dir"time, try::
|
||||
|
||||
mv dir dir.62
|
||||
mv dir.63 dir
|
||||
@ -97,15 +89,15 @@ You will then do:
|
||||
found in my case that they were self explanatory - you may
|
||||
or may not want to give up when that happens.
|
||||
|
||||
. Narrow it down to a file
|
||||
- Narrow it down to a file
|
||||
|
||||
- You can apply the same technique to each file in the directory,
|
||||
hoping that the changes in that file are self contained.
|
||||
|
||||
. Narrow it down to a routine
|
||||
- Narrow it down to a routine
|
||||
|
||||
- You can take the old file and the new file and manually create
|
||||
a merged file that has
|
||||
a merged file that has::
|
||||
|
||||
#ifdef VER62
|
||||
routine()
|
||||
@ -120,7 +112,7 @@ You will then do:
|
||||
#endif
|
||||
|
||||
And then walk through that file, one routine at a time and
|
||||
prefix it with
|
||||
prefix it with::
|
||||
|
||||
#define VER62
|
||||
/* both routines here */
|
||||
@ -153,94 +145,104 @@ To debug a kernel, use objdump and look for the hex offset from the crash
|
||||
output to find the valid line of code/assembler. Without debug symbols, you
|
||||
will see the assembler code for the routine shown, but if your kernel has
|
||||
debug symbols the C code will also be available. (Debug symbols can be enabled
|
||||
in the kernel hacking menu of the menu configuration.) For example:
|
||||
in the kernel hacking menu of the menu configuration.) For example::
|
||||
|
||||
objdump -r -S -l --disassemble net/dccp/ipv4.o
|
||||
|
||||
NB.: you need to be at the top level of the kernel tree for this to pick up
|
||||
your C files.
|
||||
.. note::
|
||||
|
||||
You need to be at the top level of the kernel tree for this to pick up
|
||||
your C files.
|
||||
|
||||
If you don't have access to the code you can also debug on some crash dumps
|
||||
e.g. crash dump output as shown by Dave Miller.
|
||||
e.g. crash dump output as shown by Dave Miller::
|
||||
|
||||
> EIP is at ip_queue_xmit+0x14/0x4c0
|
||||
> ...
|
||||
> Code: 44 24 04 e8 6f 05 00 00 e9 e8 fe ff ff 8d 76 00 8d bc 27 00 00
|
||||
> 00 00 55 57 56 53 81 ec bc 00 00 00 8b ac 24 d0 00 00 00 8b 5d 08
|
||||
> <8b> 83 3c 01 00 00 89 44 24 14 8b 45 28 85 c0 89 44 24 18 0f 85
|
||||
>
|
||||
> Put the bytes into a "foo.s" file like this:
|
||||
>
|
||||
> .text
|
||||
> .globl foo
|
||||
> foo:
|
||||
> .byte .... /* bytes from Code: part of OOPS dump */
|
||||
>
|
||||
> Compile it with "gcc -c -o foo.o foo.s" then look at the output of
|
||||
> "objdump --disassemble foo.o".
|
||||
>
|
||||
> Output:
|
||||
>
|
||||
> ip_queue_xmit:
|
||||
> push %ebp
|
||||
> push %edi
|
||||
> push %esi
|
||||
> push %ebx
|
||||
> sub $0xbc, %esp
|
||||
> mov 0xd0(%esp), %ebp ! %ebp = arg0 (skb)
|
||||
> mov 0x8(%ebp), %ebx ! %ebx = skb->sk
|
||||
> mov 0x13c(%ebx), %eax ! %eax = inet_sk(sk)->opt
|
||||
EIP is at ip_queue_xmit+0x14/0x4c0
|
||||
...
|
||||
Code: 44 24 04 e8 6f 05 00 00 e9 e8 fe ff ff 8d 76 00 8d bc 27 00 00
|
||||
00 00 55 57 56 53 81 ec bc 00 00 00 8b ac 24 d0 00 00 00 8b 5d 08
|
||||
<8b> 83 3c 01 00 00 89 44 24 14 8b 45 28 85 c0 89 44 24 18 0f 85
|
||||
|
||||
Put the bytes into a "foo.s" file like this:
|
||||
|
||||
.text
|
||||
.globl foo
|
||||
foo:
|
||||
.byte .... /* bytes from Code: part of OOPS dump */
|
||||
|
||||
Compile it with "gcc -c -o foo.o foo.s" then look at the output of
|
||||
"objdump --disassemble foo.o".
|
||||
|
||||
Output:
|
||||
|
||||
ip_queue_xmit:
|
||||
push %ebp
|
||||
push %edi
|
||||
push %esi
|
||||
push %ebx
|
||||
sub $0xbc, %esp
|
||||
mov 0xd0(%esp), %ebp ! %ebp = arg0 (skb)
|
||||
mov 0x8(%ebp), %ebx ! %ebx = skb->sk
|
||||
mov 0x13c(%ebx), %eax ! %eax = inet_sk(sk)->opt
|
||||
|
||||
In addition, you can use GDB to figure out the exact file and line
|
||||
number of the OOPS from the vmlinux file. If you have
|
||||
CONFIG_DEBUG_INFO enabled, you can simply copy the EIP value from the
|
||||
OOPS:
|
||||
number of the OOPS from the ``vmlinux`` file. If you have
|
||||
``CONFIG_DEBUG_INFO`` enabled, you can simply copy the EIP value from the
|
||||
OOPS::
|
||||
|
||||
EIP: 0060:[<c021e50e>] Not tainted VLI
|
||||
|
||||
And use GDB to translate that to human-readable form:
|
||||
And use GDB to translate that to human-readable form::
|
||||
|
||||
gdb vmlinux
|
||||
(gdb) l *0xc021e50e
|
||||
|
||||
If you don't have CONFIG_DEBUG_INFO enabled, you use the function
|
||||
offset from the OOPS:
|
||||
If you don't have ``CONFIG_DEBUG_INFO`` enabled, you use the function
|
||||
offset from the OOPS::
|
||||
|
||||
EIP is at vt_ioctl+0xda8/0x1482
|
||||
|
||||
And recompile the kernel with CONFIG_DEBUG_INFO enabled:
|
||||
And recompile the kernel with ``CONFIG_DEBUG_INFO`` enabled::
|
||||
|
||||
make vmlinux
|
||||
gdb vmlinux
|
||||
(gdb) p vt_ioctl
|
||||
(gdb) l *(0x<address of vt_ioctl> + 0xda8)
|
||||
or, as one command
|
||||
|
||||
or, as one command::
|
||||
|
||||
(gdb) l *(vt_ioctl + 0xda8)
|
||||
|
||||
If you have a call trace, such as :-
|
||||
>Call Trace:
|
||||
> [<ffffffff8802c8e9>] :jbd:log_wait_commit+0xa3/0xf5
|
||||
> [<ffffffff810482d9>] autoremove_wake_function+0x0/0x2e
|
||||
> [<ffffffff8802770b>] :jbd:journal_stop+0x1be/0x1ee
|
||||
> ...
|
||||
If you have a call trace, such as::
|
||||
|
||||
Call Trace:
|
||||
[<ffffffff8802c8e9>] :jbd:log_wait_commit+0xa3/0xf5
|
||||
[<ffffffff810482d9>] autoremove_wake_function+0x0/0x2e
|
||||
[<ffffffff8802770b>] :jbd:journal_stop+0x1be/0x1ee
|
||||
...
|
||||
|
||||
this shows the problem in the :jbd: module. You can load that module in gdb
|
||||
and list the relevant code.
|
||||
and list the relevant code::
|
||||
|
||||
gdb fs/jbd/jbd.ko
|
||||
(gdb) p log_wait_commit
|
||||
(gdb) l *(0x<address> + 0xa3)
|
||||
or
|
||||
|
||||
or::
|
||||
|
||||
(gdb) l *(log_wait_commit + 0xa3)
|
||||
|
||||
|
||||
Another very useful option of the Kernel Hacking section in menuconfig is
|
||||
Debug memory allocations. This will help you see whether data has been
|
||||
initialised and not set before use etc. To see the values that get assigned
|
||||
with this look at mm/slab.c and search for POISON_INUSE. When using this an
|
||||
Oops will often show the poisoned data instead of zero which is the default.
|
||||
with this look at ``mm/slab.c`` and search for ``POISON_INUSE``. When using
|
||||
this an Oops will often show the poisoned data instead of zero which is the
|
||||
default.
|
||||
|
||||
Once you have worked out a fix please submit it upstream. After all open
|
||||
source is about sharing what you do and don't you want to be recognised for
|
||||
your genius?
|
||||
|
||||
Please do read Documentation/SubmittingPatches though to help your code get
|
||||
accepted.
|
||||
Please do read :ref:`Documentation/SubmittingPatches <submittingpatches>`
|
||||
though to help your code get accepted.
|
||||
|
Loading…
Reference in New Issue
Block a user