TA #: 105840
Date Created: 11/17/1997 05:10 PM
Date Updated: 06/03/2008 05:13 AM
Techniques to help identify the failing function of a kernel panic.
Keywords
panic crash dump function process perpetrator parachute analysis cocktail openserver osr5 v5 5.0.0 5.0.2 5.0.4 odt open server 3.0 internet faststart failure getblkh alad 505 5.0.5 506 5.0.6 507 5.0.7
Release
          SCO OpenServer Enterprise System Release 5.0.0, 5.0.2, 5.0.4, 5.0.5, 5.0.6, 5.0.7
          SCO OpenServer Host System Release 5.0.0, 5.0.2, 5.0.4, 5.0.5, 5.0.6, 5.0.7
          SCO OpenServer Desktop System Release 5.0.0, 5.0.2, 5.0.4, 5.0.5, 5.0.6, 5.0.7
          SCO OpenServer Internet FastStart Release 5.0.4 
          SCO Open Server Enterprise System Release 3.0 
Problem
          How can I use the crash(ADM) utility to analyze a dump image from
          a panic and identify the failing function along with the calling
          process?

NOTE:
          The process for locating where that function originates is more
          involved and beyond the intended scope of this article.


Solution
          This procedure assumes the following:

          A copy of the dump image has been saved in the file:

                   /tmp/dump.0705    (i.e., <PATH>/dump.<DATE>)

          Enter the following:

          ('#' is the root prompt, '>' is the crash(ADM) prompt):

                # crash -d /tmp/dump.0705 -n /stand/unix -w /tmp/crash.0705
                > panic
                > trace
                > user
                > proc
                > u -f
                > quit

          This writes the output of the crash(ADM) functions 'panic',
          'trace', 'user', 'proc', and 'u -f' to the file
          /tmp/crash.0705.
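
          If the same commands have to be run against several dumps, the
          session can be scripted.  The following is a minimal sketch,
          assuming crash(ADM) reads its commands from standard input when
          it is not run interactively; verify this on your release before
          relying on it.  The function name save_crash_report and the
          CRASH and NAMELIST variables are hypothetical and exist only for
          this example.  The flags are the ones used in the session above.

```shell
# save_crash_report DUMPFILE OUTFILE
# Run the crash(ADM) commands from this article against DUMPFILE and
# write their output to OUTFILE.
# Assumption: crash(ADM) reads its commands from standard input.
# CRASH and NAMELIST are hypothetical overrides used only in this sketch.
save_crash_report() {
    "${CRASH:-crash}" -d "$1" -n "${NAMELIST:-/stand/unix}" -w "$2" <<EOF
panic
trace
user
proc
u -f
quit
EOF
}

# Example (as root):
#   save_crash_report /tmp/dump.0705 /tmp/crash.0705
```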

          Below is an example of crash output, with analysis and comments
          bracketed by /*...*/.


dumpfile = /tmp/dump.0705, namelist = /stand/unix, outfile = /tmp/crash.0705

> panic
System Messages:

mem: total = 32380k, kernel = 6988k, user = 25392k
swapdev = 1/41, swplo = 0, nswap = 256000, swapmem = 128000k
Autoboot from rootdev = 1/42, pipedev = 1/42, dumpdev = 1/41
kernel: Hz = 100, i/o bufs = 3072k

%Stp-0    -             -       -       Vendor=EXABYTE Product=EXB-82058VQANXR1
Unexpected trap in kernel mode:
cr0 0x8001001B     cr2  0x00000020     cr3 0x00002000     tlb  0x00000000
ss  0x0000B0E4     uesp 0x0000000B     efl 0x00010217     ipl  0x00000006
cs  0x00000158     eip  0xF011ABD7     err 0x00000000     trap 0x0000000E
eax 0x00000003     ecx  0xC1E4ED4C     edx 0x000001AA     ebx  0x00000000
esp 0xE0000CC4     ebp  0xE0000CF0     esi 0x000BA0FE     edi  0x0000012B
ds  0x00000160     es   0x00000160     fs  0x00000000     gs   0x00000000
cpu 0x00000001

PANIC: k_trap - Kernel mode trap type 0x0000000E
Trying to dump 8095 pages to dumpdev hd (1/41), 102 pages per '.'
.....................................

Panic String: k_trap - Kernel mode trap type 0x%x

Kernel Trap.  Kernel Registers saved at 0xe0000c94
ERR=0, TRAPNO=14
cs:eip=0158:f011abd7 Flags=10217
ds = 0160   es = 0160   fs = 0000   gs = 0000
esi= 000ba0fe   edi= 0000012b   ebp= e0000cf0   esp= e0000cc4
eax= 00000003   ebx= 00000000   ecx= c1e4ed4c   edx= 000001aa

Kernel Stack before Trap:
STKADDR   FRAMEPTR  FUNCTION   POSSIBLE ARGUMENTS
e0000cc4  e0000cf0  getblkh    (0x12b,0xba0fe,0x2,0x1)
e0000cf8  e0000d44  breadn     (0x12b,0x5d064,0x400,inode+0xb1d0)
e0000d4c  e0000d8c  htreadi    (inode+0xb1d0,inode+0xb1d0,0x8,0)
e0000d94  e0000df4  rdwr       (0x1)
e0000dfc  e0000e00  read       (0x2800,0x2800,0x8047240,proc+0x3468)
e0000e08  e0000e28  systrap    (u+0xe34)

   /* The Kernel Stack before Trap shows the failing function as 'getblkh' */
   /* This is the top function on the stack, so it was the last function   */
   /* executing before the panic.  The trap occurred in this function,     */
   /* although it is not necessarily the root cause.                       */
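
The trap type in the panic message can be decoded by hand.  On Intel x86
processors, trap 0x0E (14) is a page fault, and cr2 holds the address whose
access faulted (0x00000020 in this dump).  The helper below is a sketch only:
the name trap_name is hypothetical, and the table lists just a few common x86
exception numbers, which come from the processor architecture rather than
from this article.

```shell
# trap_name NUMBER
# Print a short name for an x86 trap (exception) number.
# NUMBER may be decimal (14) or hexadecimal (0x0000000E).
trap_name() {
    n=$(printf '%d' "$1") || return 1
    case $n in
        0)  echo "divide error" ;;
        6)  echo "invalid opcode" ;;
        8)  echo "double fault" ;;
        12) echo "stack fault" ;;
        13) echo "general protection fault" ;;
        14) echo "page fault" ;;
        *)  echo "trap $n (see the processor manual)" ;;
    esac
}

# Example, using the value from the 'PANIC: k_trap' line:
#   trap_name 0x0000000E
```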

> user
PER PROCESS USER AREA FOR PROCESS 39
USER IDs:      uid: 0, gid: 3, real uid: 0, real gid: 3
        supplementary gids: 3 0 1
PROCESS TIMES:  user: 152, sys: 1540, child user: 1, child sys: 3
PROCESS MISC:
        command: bkup-tar, psargs: bkup-tar -MVRL8 15 .
        proc: P#39, cntrl tty: maj(??) min(??)
        start: Wed Sep  3 23:02:44 1997
        mem: 0x2bbc2, type: exec
        proc/text lock: none
        current directory: I#42
OPEN FILES AND POFILE FLAGS:
        [ 0]: F#163        [ 1]: F#180        [ 2]: F#27     w
        [ 3]: F#89     w  [ 4]: F#197 c   w  [ 5]: F#181 c
        [ 6]: F#204 c      [ 7]: F#199 c      [ 8]: F#158   r
FILE I/O:
        u_base: 0x80b25f4, file offset: 466944, bytes: 1024
        segment: data, cmask: 0000, ulimit: 2097151
        file mode(s): read
SIGNAL DISPOSITION:
        sig#      signal oldmask sigmask
            1: ignore        -
            2: ignore        -
            3: ignore        -
           12:  0x8050690    -


        /* The User area also gives some valuable information that can  */
        /* further explain what was happening.  In this case, the       */
        /* command 'bkup-tar -MVRL8 15 .' (command plus arguments) is   */
        /* what was executing when the function was called.             */
        /* The User area also indicates which process in the process    */
        /* table was executing (in this case, process 39).              */


> trace
KERNEL STACK TRACE FOR PROCESS 39:
  STKADDR   FRAMEPTR  FUNCTION   POSSIBLE ARGUMENTS
  e0000bb8  e0000c34  prf_task_s (0x4,0,0,0xe)
  e0000c3c  e0000c54  cmn_err    (0x3,dmsize+0x210,0xe,u+0xc94)
  e0000c5c  e0000c88  k_trap     (u+0xc94)
            e0000c94  kern_trap  from 0xf011abd7 in getblkh
    ax:       3 cx:c1e4ed4c dx:     1aa bx:       0 fl:    10217 ds: 160 fs:   0
    sp:e0000cc4 bp:e0000cf0 si:   ba0fe di:     12b err:       0 es: 160 gs:   0
  e0000c9c  e0000cf0  getblkh    (0x12b,0xba0fe,0x2,0x1)
  e0000cf8  e0000d44  breadn     (0x12b,0x5d064,0x400,inode+0xb1d0)
  e0000d4c  e0000d8c  htreadi    (inode+0xb1d0,inode+0xb1d0,0x8,0)
  e0000d94  e0000df4  rdwr       (0x1)
  e0000dfc  e0000e00  read       (0x2800,0x2800,0x8047240,proc+0x3468)
  e0000e08  e0000e28  systrap    (u+0xe34)
            e0000e34  scall_noke from  0x8060dbb
    ax:       3 cx: 80b01f4 dx:  754400 bx:    2800 fl:      246 ds:  1f fs:   0
    sp:e0000e64 bp: 8047014 si:    2800 di: 8047240 err:       3 es:  1f gs:   0

        /* The trace area also shows the panic routines that were       */
        /* called after the trap.  These are present in every valid     */
        /* dump.  The two routines here are cmn_err and prf_task_s.     */
        /* There may be more listed but these are always present.       */
        /* Because of the nature of some dumps, the information in the  */
        /* 'panic' and 'user' areas may be substantially different      */
        /* from what is shown here, or absent altogether.  The trace    */
        /* area is very useful in such cases.  Just after the           */
        /* 'kern_trap' line is a register dump.  The failing function   */
        /* is just below the dump.  It is also named on the             */
        /* 'kern_trap' line: 'from 0xf011abd7 in getblkh'               */
> proc
PROC TABLE SIZE = 71
SLOT ST PID   PPID  PGRP   UID PRI CPU EVENT            NAME           FLAGS
   0 s     0     0     0     0  95   0 runout           sched          load sys
lock nwak
   1 s     1     0     0     0  66   0 u                init           load
   2 s     2     0     0     0  95   0 kspt1+0x128908   vhand          load sys
lock nwak nxec
   3 s     3     0     0     0  95   0 kspt1+0x118348   bdflush        load sys
lock nwak nxec
   4 s     4     0     0     0  95   0 kmd_id           kmdaemon       load sys
lock nwak nxec
   5 s     5     1     0     0  95   0 0xc016b150       htepi_daemon   load sys
lock nwak
   6 s     6     0     0     0  95   0 pbintrpool       strd           load sys
lock nwak nxec
   7 s  1635     1  1635     0  75   0 cn_tty           getty          load
   8 s    42     1    42     0  73   0 proc+0xac0       ifor_pmd       load nxec
   9 s    43    42    42     0  76   0 selwait          ifor_pmd       load nxec
  10 r    38     1    37     0  76   0                  syslogd        load nxec
  11 s    35     1     0     0  95   0 0xc0170150       htepi_daemon   load sys
lock nwak
  12 s    69     1    58     0  75   0 0xfd4433f0       strerr         load
  13 s   396     1   396     0  75   0 cn_tty+0x68      getty          load
  14 s    49    43    49     0  76   0 selwait          sco_cpd        load
  15 s    51    43    42     0  76   0 selwait          ifor_sld       load
  16 s   397     1   397     0  75   0 cn_tty+0xd0      getty          load
  17 s   386     1   386     0  76   0 0xfc3b27de       caldaemon      load
  18 s   247     1   247     0  73   0 proc+0x1830      cron           load nxec
  19 s   371     1   371     0  76   0 0xfc3b0d46       calserver      load
  20 s   232     1     0     0  76   0 selwait          dlpid          load nxec
  21 s   398     1   398     0  75   0 cn_tty+0x138     getty          load
  22 s   177     1     0     0  95   0 0xc0174150       htepi_daemon   load sys
lock nwak
  23 s   260     1   260     0  76   0 0xfc3ade90       lpsched        load nxec
ntrc
  24 s   350     1   350    17  66   0 u                deliver        load omsk
nxec
  25 s   373   371   371     0  76   0 0xfc3b055e       calserver      load nxec
  26 s   399     1   399     0  75   0 cn_tty+0x1a0     getty          load
  27 s   400     1   400     0  75   0 cn_tty+0x208     getty          load
  28 s   401     1   401     0  75   0 cn_tty+0x270     getty          load
  29 s   297     1   297     0  76   0 selwait          inetd          load nxec
  30 s   402     1   402     0  75   0 cn_tty+0x2d8     getty          load
  31 s   403     1   403     0  75   0 cn_tty+0x340     getty          load
  32 s   306     1   306     0  76   0 selwait          routed         load nxec
  33 s   404     1   404     0  75   0 cn_tty+0x3a8     getty          load
  34 s   330     1   330     0  76   0 selwait          scohttpd       load nxec
  35 s   309     1   309     0  76   0 selwait          lpd            load nxec
  36 s   405     1   405     0  75   0 cn_tty+0x410     getty          load
  37 s  1027     1  1027     0  75   0 sio_tty+0xc      getty          load
  38 s  1807   247   247     0  73   0 proc+0x3310      sh             load
  39 p  1851  1807   247     0  46  12                  bkup-tar       load
  40 s   347     1   347     0  76   0 selwait          snmpd          load nxec
  46 s  1573   297   297     0  75   0 iknt+0x20        telnetd        load
  47 s  1574  1573  1574     0  75   0 spt_tty+0x68     login          load
  49 s   417     1   417     0  75   0 cn_tty+0x478     getty          load
  51 s   419     1   419     0  76   0 0xfc3b3d70       sdd            load


        /* Process 39 in the process table shows the bkup-tar process   */
        /* with nothing in the EVENT column (this is frequently seen).  */
        /* Other processes that are missing information in the EVENT    */
        /* column may bear relevance to the panic and should be noted.  */
        /* This also identifies the parent process (sh: 1807) and its   */
        /* parent: (cron: 247).  From this it's evident that the panic  */
        /* occurred during a cron job running bkup-tar (a backup        */
        /* program).  bkup-tar was executing the function getblkh,      */
        /* which later proved to be part of the driver for the SCSI     */
        /* adapter.                                                     */
        /*                                                              */
        /* The following section, 'u -f', is a more detailed expansion  */
        /* of the 'user' function in crash(ADM).  This is useful for    */
        /* some types of panics.  For instance, in one panic, the       */
        /* variables: pr_base, pr_size, pr_off, and pr_scale had        */
        /* values other than 0 which indicated that profiling was       */
        /* enabled.  This is a tool programmers use to identify how     */
        /* many times a line of code is executed when a program is run. */
        /* This caused a panic when the program was compiled on 3.2v4.2 */
        /* and run on 3.2v5.0.0 without recompiling.                    */
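
The parent chain described in the comments above (bkup-tar, then sh, then
cron) can be traced from the saved 'proc' output instead of by eye.  The
following is a sketch only, assuming the column layout shown in this table:
the position of the NAME column is taken from the header line because the
EVENT column may be empty.  The function name proc_ancestry is hypothetical.

```shell
# proc_ancestry FILE PID
# Print PID and command name for PID and each of its ancestors, using the
# 'proc' table in saved crash(ADM) output.
proc_ancestry() {
    awk -v start="$2" '
        # The header line gives the column at which NAME begins.
        /^SLOT/ { namecol = index($0, "NAME"); next }
        # Data lines start with a numeric slot; wrapped FLAGS lines do not.
        namecol && $1 ~ /^[0-9]+$/ && NF >= 8 {
            split(substr($0, namecol), word, " ")
            ppid[$3] = $4
            name[$3] = word[1]
        }
        END {
            pid = start
            # Walk up until the PID is unknown or is its own parent (PID 0).
            while ((pid in name) && hops++ < 100) {
                print pid, name[pid]
                if (ppid[pid] == pid) break
                pid = ppid[pid]
            }
        }
    ' "$1"
}

# Example:
#   proc_ancestry /tmp/crash.0705 1851
```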


> u -f
PER PROCESS USER AREA FOR PROCESS 39
USER IDs:       uid: 0, gid: 3, real uid: 0, real gid: 3
        supplementary gids: 3 0 1
PROCESS TIMES:  user: 152, sys: 1540, child user: 1, child sys: 3
PROCESS MISC:
        command: bkup-tar, psargs: bkup-tar -MVRL8 15 .
        proc: P#39, cntrl tty: maj(??) min(??)
        start: Wed Sep  3 23:02:44 1997
        mem: 0x2bbc2, type: exec
        proc/text lock: none
        current directory: I#42
OPEN FILES AND POFILE FLAGS:
        [ 0]: F#163        [ 1]: F#180        [ 2]: F#27     w
        [ 3]: F#89     w  [ 4]: F#197 c   w  [ 5]: F#181 c
        [ 6]: F#204 c      [ 7]: F#199 c      [ 8]: F#158   r
FILE I/O:
        u_base: 0x80b25f4, file offset: 466944, bytes: 1024
        segment: data, cmask: 0000, ulimit: 2097151
        file mode(s): read
SIGNAL DISPOSITION:
        sig#      signal oldmask sigmask
             1: ignore        -
             2: ignore        -
             3: ignore        -
             4: default       -
             5: default       -
             6: default       -
             7: default       -
             8: default       -
             9: default       -
            10: default       -
            11: default       -
            12:  0x8050690    -
            13: default       -
            14: default       -
            15: default       -
            16: default       -
            17: default       -
            18: default       -
            19: default       -
            20: default       -
            21: default       -
            22: default       -
            23: default       -
            24: default       -
            25: default       -
            26: default       -
            27: default       -
            28: default       -
            29: default       -
            30: default       -
            31: default       -
        ux_uid: 0, ux_gid: 0, ux_mode:
        comp: 0xe0000d13, nextcp: 0xe0000d1e
        bsize: 1024, pgproc: 0, qsav: 0xf0147f82, error: 0
        ap: 0xe0001148, u_r: 0, pbsize: 1024
        pboff: 0, pbdev:   1,43 , rablock: 0x5d065, errcnt: 2
        dirp: 0x8, dent.d_ino: 0 dent.d_name: coinsmas.db, pdir:   -
        ttyip:   - , tsize: 0x20, dsize: 0x4d, ssize: 0x2
        arg[0]:        0x8, arg[1]:  0x80b01f4, arg[2]:     0x2800
        arg[3]:  0x807259c, arg[4]:  0x8047da4, arg[5]:          0
        syscall: 0x3, ar0: 0xe0000e34, ttyp: 0, ticks: 0x549e9f
        pr_base: 0, pr_size: 0, pr_off: 0, pr_scale: 0
        ior: 0x28eb, iow: 0x103, iosw: 0, ioch: 0x9eb770d
        sysabort: 0, systrap: 0
        callgatep: 0, callgate[0]: 0, callgate[1]: 0
        debugpend: 0, debugon: 0
        dr[0]:          0,  dr[1]:          0,  dr[2]:          0
        dr[3]:          0,  dr[4]:          0,  dr[5]:          0
        dr[6]:          0,  dr[7]:          0
        entrymask: 00000000 00000000 00000000 00000000
        exitmask:  00000000 00000000 00000000 00000000
EXDATA:
        ip: I#572, tsize: 0x1f714, dsize: 0x8b22, bsize: 0x3e91a, lsize: 0
        magic#: 011064, toffset: 0x94, doffset: 0x1f7a8, loffset: 0
        txtorg: 0x8048094, datorg: 0x80687a8, entloc: 0x80480a0, nshlibs: 0
        execsz: 0x6c, ldtmodified: 0, ldtlimit: 256
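
The profiling fields mentioned in the comments before this output (pr_base,
pr_size, pr_off, and pr_scale) are all 0 here, so profiling was not enabled
for this process.  The check can be scripted against a saved output file.
The following is a sketch only, assuming the 'pr_base: ..., pr_size: ...'
line layout shown above; the function name profiling_enabled is
hypothetical.

```shell
# profiling_enabled FILE
# Exit status 0 if any of pr_base, pr_size, pr_off or pr_scale in saved
# 'u -f' output is something other than 0; exit status 1 otherwise.
profiling_enabled() {
    awk '
        /pr_base:/ {
            gsub(/,/, "")               # drop the commas between fields
            for (i = 1; i < NF; i++)
                if ($i ~ /^pr_(base|size|off|scale):$/ && $(i + 1) != "0")
                    found = 1
        }
        END { exit !found }
    ' "$1"
}

# Example:
#   if profiling_enabled /tmp/crash.0705; then echo "profiling was on"; fi
```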

> quit


After locating the failing function (and using the command portion of 'user'
to help identify the invoking process), the next step is to search through
the appropriate directories to find the 'home' of that function.

Two utilities that are helpful for searching binary files or the kernel are
strings(C) and nm(CP).

strings(C) can be used to locate error messages within binary files such as
           Driver.o or the unix kernel.  It will not help identify functions.

nm(CP)  is useful for locating functions, but only comes with the Development
        System and is frequently not available.


There is also a script that will search text files from the current directory
through subdirectories for the string indicated.  It is included below.

Note:   This is for text files only.  This will not locate anything in
        binary files.

Place it in the path (we recommend /usr/bin) and set permissions for execute:
-rwxr-xr-x   1 root     group        634 Mar 12  1997 /usr/bin/findstring

------------------------------------<findstring>------------------------------

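:
# findstring    05/17/96 dond
#
# Search text files under the current directory for a string.
#
# to use:
# cd <dir to search>
# nohup nice -10 findstring <search string (spaces ok)> &
#
# output in $HOME:
# findstring.out - files and line#s where strings are found
# findstringDONE - empty file flagging completion of findstring
#
trap 'rm -f core; exit' 3       # on SIGQUIT, remove any core file and exit
findfile=$HOME/findstring.out
donefile=$HOME/findstringDONE
exec 2>/dev/null                # discard error messages (e.g., unreadable files)

# Start a fresh report headed by the search string, directory, and date.
rm -f "$findfile" "$donefile"
echo "$*" > "$findfile"
pwd >> "$findfile"
date >> "$findfile"
echo "" >> "$findfile"

# grep -n prints line numbers; -y ignores case (older synonym for -i).
# /dev/null is added so grep always prints the file name, even when
# xargs passes it only one file.
find . -type f -follow ! -name findstring.out -print   \
 | xargs grep -ny "$*" /dev/null >> "$findfile"

echo "" >> "$findfile"
echo '*** DONE ***' >> "$findfile"
touch "$donefile"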

--------------------------------<done>--------------------------------