This procedure assumes the following:
A copy of the dump image has been saved in the file:
/tmp/dump.0705 (i.e., <PATH>/dump.<DATE>)
Enter the following:
('#' is the root prompt, '>' is the crash(ADM) prompt):
# crash -d /tmp/dump.0705 -n /stand/unix -w /tmp/crash.0705
> panic
> trace
> user
> proc
> u -f
> quit
This places the output of the functions 'panic', 'trace', 'user',
'proc', and 'u -f' of the crash(ADM) utility in a file called
crash.0705.
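As a rough sketch (not part of the original procedure), the file names and the sequence of crash(ADM) functions above can be generated from a date tag with a small Bourne shell script. The names tag, dir, and crash_cmds are hypothetical, and the script only prints the session to be typed; it does not run crash itself.

```shell
#!/bin/sh
# Sketch only: build the file names from a date tag and print the
# crash(ADM) session described above. It does not invoke crash.
tag=${1:-0705}                 # <DATE> part of dump.<DATE>
dir=${2:-/tmp}                 # <PATH> holding the dump image
dump=$dir/dump.$tag
out=$dir/crash.$tag

# The crash(ADM) functions to enter, in order, ending with quit.
crash_cmds() {
    for c in panic trace user proc 'u -f' quit
    do
        echo "$c"
    done
}

# Print the root-prompt command line, then the '>' prompt entries.
echo "# crash -d $dump -n /stand/unix -w $out"
crash_cmds | sed 's/^/> /'
```

Run with no arguments, the sketch uses the 0705 tag and /tmp directory from the example above.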
Below is an example of a crash output with the analysis/comments
bracketed by /*...*/.
dumpfile = /tmp/dump.0705, namelist = /stand/unix, outfile = /tmp/crash.0705
> panic
System Messages:
mem: total = 32380k, kernel = 6988k, user = 25392k
swapdev = 1/41, swplo = 0, nswap = 256000, swapmem = 128000k
Autoboot from rootdev = 1/42, pipedev = 1/42, dumpdev = 1/41
kernel: Hz = 100, i/o bufs = 3072k
%Stp-0 - - - Vendor=EXABYTE Product=EXB-82058VQANXR1
Unexpected trap in kernel mode:
cr0 0x8001001B cr2 0x00000020 cr3 0x00002000 tlb 0x00000000
ss 0x0000B0E4 uesp 0x0000000B efl 0x00010217 ipl 0x00000006
cs 0x00000158 eip 0xF011ABD7 err 0x00000000 trap 0x0000000E
eax 0x00000003 ecx 0xC1E4ED4C edx 0x000001AA ebx 0x00000000
esp 0xE0000CC4 ebp 0xE0000CF0 esi 0x000BA0FE edi 0x0000012B
ds 0x00000160 es 0x00000160 fs 0x00000000 gs 0x00000000
cpu 0x00000001
PANIC: k_trap - Kernel mode trap type 0x0000000E
Trying to dump 8095 pages to dumpdev hd (1/41), 102 pages per '.'
.....................................
Panic String: k_trap - Kernel mode trap type 0x%x
Kernel Trap. Kernel Registers saved at 0xe0000c94
ERR=0, TRAPNO=14
cs:eip=0158:f011abd7 Flags=10217
ds = 0160 es = 0160 fs = 0000 gs = 0000
esi= 000ba0fe edi= 0000012b ebp= e0000cf0 esp= e0000cc4
eax= 00000003 ebx= 00000000 ecx= c1e4ed4c edx= 000001aa
Kernel Stack before Trap:
STKADDR FRAMEPTR FUNCTION POSSIBLE ARGUMENTS
e0000cc4 e0000cf0 getblkh (0x12b,0xba0fe,0x2,0x1)
e0000cf8 e0000d44 breadn (0x12b,0x5d064,0x400,inode+0xb1d0)
e0000d4c e0000d8c htreadi (inode+0xb1d0,inode+0xb1d0,0x8,0)
e0000d94 e0000df4 rdwr (0x1)
e0000dfc e0000e00 read (0x2800,0x2800,0x8047240,proc+0x3468)
e0000e08 e0000e28 systrap (u+0xe34)
/* The Kernel Stack before Trap shows the failing function as 'getblkh'. */
/* It is the top function on the stack, which means it was the last */
/* function executing prior to the panic. It is therefore the function */
/* in which the trap occurred, though not necessarily the root cause. */
> user
PER PROCESS USER AREA FOR PROCESS 39
USER IDs: uid: 0, gid: 3, real uid: 0, real gid: 3
supplementary gids: 3 0 1
PROCESS TIMES: user: 152, sys: 1540, child user: 1, child sys: 3
PROCESS MISC:
command: bkup-tar, psargs: bkup-tar -MVRL8 15 .
proc: P#39, cntrl tty: maj(??) min(??)
start: Wed Sep 3 23:02:44 1997
mem: 0x2bbc2, type: exec
proc/text lock: none
current directory: I#42
OPEN FILES AND POFILE FLAGS:
[ 0]: F#163 [ 1]: F#180 [ 2]: F#27 w
[ 3]: F#89 w [ 4]: F#197 c w [ 5]: F#181 c
[ 6]: F#204 c [ 7]: F#199 c [ 8]: F#158 r
FILE I/O:
u_base: 0x80b25f4, file offset: 466944, bytes: 1024
segment: data, cmask: 0000, ulimit: 2097151
file mode(s): read
SIGNAL DISPOSITION:
sig# signal oldmask sigmask
1: ignore -
2: ignore -
3: ignore -
12: 0x8050690 -
/* The User area also gives some valuable information that can */
/* further explain what was happening. In this case, the */
/* command 'bkup-tar -MVRL8 15' (command plus arguments) is */
/* what was executing when the function was called. */
/* The User area also indicates which process in the process */
/* table was executing (in this case, process 39). */
> trace
KERNEL STACK TRACE FOR PROCESS 39:
STKADDR FRAMEPTR FUNCTION POSSIBLE ARGUMENTS
e0000bb8 e0000c34 prf_task_s (0x4,0,0,0xe)
e0000c3c e0000c54 cmn_err (0x3,dmsize+0x210,0xe,u+0xc94)
e0000c5c e0000c88 k_trap (u+0xc94)
e0000c94 kern_trap from 0xf011abd7 in getblkh
ax: 3 cx:c1e4ed4c dx: 1aa bx: 0 fl: 10217 ds: 160 fs: 0
sp:e0000cc4 bp:e0000cf0 si: ba0fe di: 12b err: 0 es: 160 gs: 0
e0000c9c e0000cf0 getblkh (0x12b,0xba0fe,0x2,0x1)
e0000cf8 e0000d44 breadn (0x12b,0x5d064,0x400,inode+0xb1d0)
e0000d4c e0000d8c htreadi (inode+0xb1d0,inode+0xb1d0,0x8,0)
e0000d94 e0000df4 rdwr (0x1)
e0000dfc e0000e00 read (0x2800,0x2800,0x8047240,proc+0x3468)
e0000e08 e0000e28 systrap (u+0xe34)
e0000e34 scall_noke from 0x8060dbb
ax: 3 cx: 80b01f4 dx: 754400 bx: 2800 fl: 246 ds: 1f fs: 0
sp:e0000e64 bp: 8047014 si: 2800 di: 8047240 err: 3 es: 1f gs: 0
/* The trace area shows the panic routines following the stack. */
/* These are present in every valid dump. The two routines */
/* here are cmn_err and prf_task_s. */
/* There may be more listed but these are always present. */
/* Because of the nature of some dumps, the actual information */
/* contained in the 'panic' and 'user' areas may be */
/* substantially different or absent from what is shown here. */
/* The trace area is very useful in such cases. Just after */
/* the 'kern_trap' line is a register dump. The failing */
/* function is just below the dump. It is also referenced in */
/* the 'kern_trap' line: 'from 0xf011abd7 in getblkh' */
> proc
PROC TABLE SIZE = 71
SLOT ST PID PPID PGRP UID PRI CPU EVENT NAME FLAGS
0 s 0 0 0 0 95 0 runout sched load sys
lock nwak
1 s 1 0 0 0 66 0 u init load
2 s 2 0 0 0 95 0 kspt1+0x128908 vhand load sys
lock nwak nxec
3 s 3 0 0 0 95 0 kspt1+0x118348 bdflush load sys
lock nwak nxec
4 s 4 0 0 0 95 0 kmd_id kmdaemon load sys
lock nwak nxec
5 s 5 1 0 0 95 0 0xc016b150 htepi_daemon load sys
lock nwak
6 s 6 0 0 0 95 0 pbintrpool strd load sys
lock nwak nxec
7 s 1635 1 1635 0 75 0 cn_tty getty load
8 s 42 1 42 0 73 0 proc+0xac0 ifor_pmd load nxec
9 s 43 42 42 0 76 0 selwait ifor_pmd load nxec
10 r 38 1 37 0 76 0 syslogd load nxec
11 s 35 1 0 0 95 0 0xc0170150 htepi_daemon load sys
lock nwak
12 s 69 1 58 0 75 0 0xfd4433f0 strerr load
13 s 396 1 396 0 75 0 cn_tty+0x68 getty load
14 s 49 43 49 0 76 0 selwait sco_cpd load
15 s 51 43 42 0 76 0 selwait ifor_sld load
16 s 397 1 397 0 75 0 cn_tty+0xd0 getty load
17 s 386 1 386 0 76 0 0xfc3b27de caldaemon load
18 s 247 1 247 0 73 0 proc+0x1830 cron load nxec
19 s 371 1 371 0 76 0 0xfc3b0d46 calserver load
20 s 232 1 0 0 76 0 selwait dlpid load nxec
21 s 398 1 398 0 75 0 cn_tty+0x138 getty load
22 s 177 1 0 0 95 0 0xc0174150 htepi_daemon load sys
lock nwak
23 s 260 1 260 0 76 0 0xfc3ade90 lpsched load nxec
ntrc
24 s 350 1 350 17 66 0 u deliver load omsk
nxec
25 s 373 371 371 0 76 0 0xfc3b055e calserver load nxec
26 s 399 1 399 0 75 0 cn_tty+0x1a0 getty load
27 s 400 1 400 0 75 0 cn_tty+0x208 getty load
28 s 401 1 401 0 75 0 cn_tty+0x270 getty load
29 s 297 1 297 0 76 0 selwait inetd load nxec
30 s 402 1 402 0 75 0 cn_tty+0x2d8 getty load
31 s 403 1 403 0 75 0 cn_tty+0x340 getty load
32 s 306 1 306 0 76 0 selwait routed load nxec
33 s 404 1 404 0 75 0 cn_tty+0x3a8 getty load
34 s 330 1 330 0 76 0 selwait scohttpd load nxec
35 s 309 1 309 0 76 0 selwait lpd load nxec
36 s 405 1 405 0 75 0 cn_tty+0x410 getty load
37 s 1027 1 1027 0 75 0 sio_tty+0xc getty load
38 s 1807 247 247 0 73 0 proc+0x3310 sh load
39 p 1851 1807 247 0 46 12 bkup-tar load
40 s 347 1 347 0 76 0 selwait snmpd load nxec
46 s 1573 297 297 0 75 0 iknt+0x20 telnetd load
47 s 1574 1573 1574 0 75 0 spt_tty+0x68 login load
49 s 417 1 417 0 75 0 cn_tty+0x478 getty load
51 s 419 1 419 0 76 0 0xfc3b3d70 sdd load
/* Process 39 in the process table shows the bkup-tar process */
/* with nothing in the EVENT column (this is frequently seen). */
/* Other processes that are missing information in the EVENT */
/* column may bear relevance to the panic and should be noted. */
/* The table also identifies the parent process (sh: 1807) and */
/* its parent (cron: 247). From this it's evident that the */
/* panic occurred during a cron job running bkup-tar (a backup */
/* program). The kernel function executing on behalf of */
/* bkup-tar was getblkh, which later proved to be part of the */
/* driver for the SCSI adapter. */
/* */
/* The following section, 'u -f', is a more detailed expansion */
/* of the 'user' function in crash(ADM). This is useful for */
/* some types of panics. For instance, in one panic, the */
/* variables pr_base, pr_size, pr_off, and pr_scale had */
/* values other than 0, which indicated that profiling was */
/* enabled. Profiling is a tool programmers use to identify */
/* how many times a line of code is executed when a program is */
/* run. In that case the panic occurred when a program compiled */
/* on 3.2v4.2 was run on 3.2v5.0.0 without recompiling. */
> u -f
PER PROCESS USER AREA FOR PROCESS 39
USER IDs: uid: 0, gid: 3, real uid: 0, real gid: 3
supplementary gids: 3 0 1
PROCESS TIMES: user: 152, sys: 1540, child user: 1, child sys: 3
PROCESS MISC:
command: bkup-tar, psargs: bkup-tar -MVRL8 15 .
proc: P#39, cntrl tty: maj(??) min(??)
start: Wed Sep 3 23:02:44 1997
mem: 0x2bbc2, type: exec
proc/text lock: none
current directory: I#42
OPEN FILES AND POFILE FLAGS:
[ 0]: F#163 [ 1]: F#180 [ 2]: F#27 w
[ 3]: F#89 w [ 4]: F#197 c w [ 5]: F#181 c
[ 6]: F#204 c [ 7]: F#199 c [ 8]: F#158 r
FILE I/O:
u_base: 0x80b25f4, file offset: 466944, bytes: 1024
segment: data, cmask: 0000, ulimit: 2097151
file mode(s): read
SIGNAL DISPOSITION:
sig# signal oldmask sigmask
1: ignore -
2: ignore -
3: ignore -
4: default -
5: default -
6: default -
7: default -
8: default -
9: default -
10: default -
11: default -
12: 0x8050690 -
13: default -
14: default -
15: default -
16: default -
17: default -
18: default -
19: default -
20: default -
21: default -
22: default -
23: default -
24: default -
25: default -
26: default -
27: default -
28: default -
29: default -
30: default -
31: default -
ux_uid: 0, ux_gid: 0, ux_mode:
comp: 0xe0000d13, nextcp: 0xe0000d1e
bsize: 1024, pgproc: 0, qsav: 0xf0147f82, error: 0
ap: 0xe0001148, u_r: 0, pbsize: 1024
pboff: 0, pbdev: 1,43 , rablock: 0x5d065, errcnt: 2
dirp: 0x8, dent.d_ino: 0 dent.d_name: coinsmas.db, pdir: -
ttyip: - , tsize: 0x20, dsize: 0x4d, ssize: 0x2
arg[0]: 0x8, arg[1]: 0x80b01f4, arg[2]: 0x2800
arg[3]: 0x807259c, arg[4]: 0x8047da4, arg[5]: 0
syscall: 0x3, ar0: 0xe0000e34, ttyp: 0, ticks: 0x549e9f
pr_base: 0, pr_size: 0, pr_off: 0, pr_scale: 0
ior: 0x28eb, iow: 0x103, iosw: 0, ioch: 0x9eb770d
sysabort: 0, systrap: 0
callgatep: 0, callgate[0]: 0, callgate[1]: 0
debugpend: 0, debugon: 0
dr[0]: 0, dr[1]: 0, dr[2]: 0
dr[3]: 0, dr[4]: 0, dr[5]: 0
dr[6]: 0, dr[7]: 0
entrymask: 00000000 00000000 00000000 00000000
exitmask: 00000000 00000000 00000000 00000000
EXDATA:
ip: I#572, tsize: 0x1f714, dsize: 0x8b22, bsize: 0x3e91a, lsize: 0
magic#: 011064, toffset: 0x94, doffset: 0x1f7a8, loffset: 0
txtorg: 0x8048094, datorg: 0x80687a8, entloc: 0x80480a0, nshlibs: 0
execsz: 0x6c, ldtmodified: 0, ldtlimit: 256
> quit
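The parent chain noted in the 'proc' comments (bkup-tar, then sh, then cron) can also be traced mechanically from the saved output file. The following is a minimal sketch, assuming the table layout shown above, where a row starts with a numeric slot and carries the PID in the third field and the PPID in the fourth. The function name parent_chain is hypothetical.

```shell
#!/bin/sh
# Sketch: follow PPID links through the 'proc' table of a saved
# crash(ADM) output file, starting from a given PID.
parent_chain() {
    # $1 = file holding the 'proc' output, $2 = starting PID
    awk -v pid="$2" '
        # Table rows: numeric slot in field 1, PID in field 3,
        # PPID in field 4. Other lines of the file are skipped.
        $1 ~ /^[0-9]+$/ && $3 ~ /^[0-9]+$/ && $4 ~ /^[0-9]+$/ {
            ppid[$3] = $4
            row[$3] = $0
        }
        END {
            # Print each row in the chain; stop at PID 0, at a PID
            # already visited, or when a PID is not in the table.
            while (pid in row) {
                print row[pid]
                if (pid == 0 || seen[pid]++) break
                pid = ppid[pid]
            }
        }' "$1"
}

# Usage: script <crash output file> <pid>
if [ $# -ge 2 ]
then
    parent_chain "$1" "$2"
fi
```

For the example above, running it against /tmp/crash.0705 with PID 1851 should list the rows for bkup-tar, sh, cron, init, and sched, in that order.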
After locating the failing function (and using the command portion of 'user'
to help identify the invoking process), the next step is to search
through the appropriate directories to find the 'home' of that function.
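The failing function and the command can be pulled out of the saved output file with sed. This is a minimal sketch, assuming the 'kern_trap from ... in ...' and 'psargs:' lines appear as in the example output, with fields separated by spaces. The function names failing_function and panic_command are hypothetical.

```shell
#!/bin/sh
# Sketch: extract the failing function and the command line from a
# saved crash(ADM) output file such as /tmp/crash.0705.

# Print the function named on the first 'kern_trap from ... in ...' line.
failing_function() {
    sed -n 's/.*kern_trap  *from  *0x[0-9a-fA-F]*  *in  *\([^ ]*\).*/\1/p' "$1" |
        sed 1q
}

# Print the command plus arguments from the first 'psargs:' line.
# 'user' and 'u -f' both print this line, so only the first is kept.
panic_command() {
    sed -n 's/.*psargs: *\(.*\)$/\1/p' "$1" | sed 1q
}

# Usage: script <crash output file>
if [ $# -ge 1 ]
then
    echo "function: `failing_function "$1"`"
    echo "command:  `panic_command "$1"`"
fi
```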
Two utilities that are helpful for searching binary files or the kernel are
strings(C) and nm(CP).
strings(C) can be used to locate error messages within binary files such as
Driver.o or the unix kernel. It will not help identify functions.
nm(CP) is useful for locating functions, but only comes with the Development
System and is frequently not available.
There is also a script that searches text files in the current directory
and its subdirectories for a given string. It is included below.
Note: This is for text files only. It will not locate anything in
binary files.
Place it in the path (we recommend /usr/bin) and set permissions for execute:
-rwxr-xr-x 1 root group 634 Mar 12 1997 /usr/bin/findstring
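One way to carry out the installation step is sketched here. Mode 755 corresponds to the -rwxr-xr-x permissions in the listing. The function name install_findstring and its optional destination argument are hypothetical, and writing to /usr/bin normally requires root.

```shell
#!/bin/sh
# Sketch: copy the findstring script into a directory in the path
# and make it executable (755 = -rwxr-xr-x).
install_findstring() {
    src=$1                     # saved copy of the script
    dest=${2:-/usr/bin}        # target directory, /usr/bin by default
    cp "$src" "$dest/findstring" && chmod 755 "$dest/findstring"
}

# Usage: script <path to saved findstring> [target directory]
if [ $# -ge 1 ]
then
    install_findstring "$@"
fi
```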
------------------------------------<findstring>------------------------------
:
# findstring 05/17/96 dond
#
# to use:
# cd <dir to search>
# nohup nice -10 findstring <search string (spaces ok)> &
#
# output in $HOME:
# findstring.out - files and line#s where strings are found
# findstringDONE - empty file flagging completion of findstring
#
trap 'rm core; exit' 3
findfile=$HOME/findstring.out
donefile=$HOME/findstringDONE
exec 2>/dev/null
rm $findfile $donefile
echo "$*" > $findfile
pwd >> $findfile
date >> $findfile
echo "" >> $findfile
find . -type f -follow ! -name findstring.out -print \
| xargs grep -ny "$*" >> $findfile
echo "" >> $findfile
echo '*** DONE ***' >> $findfile
touch $donefile
--------------------------------<done>--------------------------------
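Because findstring is meant to run in the background, a small helper can wait for the completion flag and then show the results. This is a minimal sketch using the two file names from the script header. The function name wait_for_findstring and the 5 second polling interval are assumptions.

```shell
#!/bin/sh
# Sketch: wait until findstring has created its completion flag,
# then print the results file.
wait_for_findstring() {
    donefile=${1:-$HOME/findstringDONE}
    outfile=${2:-$HOME/findstring.out}
    # Poll for the empty flag file that findstring touches when done.
    while [ ! -f "$donefile" ]
    do
        sleep 5
    done
    cat "$outfile"
}
```

Call wait_for_findstring only after the background job has started, since findstring removes any old findstringDONE file when it begins.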