These steps are listed in the SUMMARY section below for quick
reference.
1. It is important to copy the exact register dump information from
the system console prior to rebooting or powering off the computer.
This is sometimes not saved in the dump and is extremely helpful in
determining whether the panic is caused by software or hardware.
If, however, it is not practical to copy the entire contents of the
screen, try to get the cs: register, the eip: register and the kernel
trap type.
If the eip and cs values are consistent from one panic to the next,
this points to a likely software/driver fault.
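The consistency check can be sketched as a small shell helper. This
is only an illustration: it assumes the hand-copied values are kept in
a text file with one panic per line in the form "cs eip trap". That
file layout and the name registers_consistent are hypothetical and are
not part of any SCO utility.

```shell
# Hypothetical helper: succeeds when every recorded panic shows the
# same cs and eip values (fields 1 and 2 of each line in file $1).
registers_consistent() {
    # Keep only the cs and eip fields, then count the distinct pairs.
    distinct=`awk 'NF >= 2 { print $1, $2 }' "$1" | sort -u | wc -l`
    # Exactly one distinct pair means the registers are consistent.
    test $distinct -eq 1
}
```

For example, "registers_consistent /tmp/panic.log && echo consistent"
prints "consistent" only when all recorded panics share one cs:eip
pair.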
2. During reboot, you may receive prompts about checking the
filesystem or saving the dump image. Answer them in the following
manner:
a. At the message about filesystem repair by fsck, answer no. This
allows running fsck with extended flags after entering single-user
(maintenance) mode.
NOTE:
It is also acceptable to answer yes and run the "quick" fsck
option. However, in SCO UNIX Version 4.2 and earlier, the "quick"
version of fsck may not run with the -s flag, which is critical for
establishing a clean filesystem.
b. At the message about saving the swap image, answer yes. This
places an image of the dump on tape. (This step can be deferred
by answering no and using the "SAVEDUMP" procedure described below
before entering multiuser mode.)
c. At the message about deleting the swap image, answer no. This
preserves the image in the swap area for immediate analysis using
the crash(ADM) utility. This image will remain in the swap area
until swapping is done by the kernel. It is safe while in
single-user (Maintenance) mode.
See: http://osr507doc.sco.com/en/man/html.ADM/dumpsave.ADM.html
dumpsave(ADM) is called by "bcheckrc" from the /etc/inittab file
during the boot process.
3. At the Control-D prompt, enter the root password. This enters
single-user (Maintenance) mode.
IMPORTANT NOTE:
Do NOT enter <CONTROL-D>. Entering multiuser mode
at this time would endanger the image in swap if the kernel does any
swapping. This would overwrite portions of the dump in swap causing
corruption. Additionally, the root filesystem has not yet been
cleaned and repaired by the following steps, potentially causing
more problems.
4. In single-user mode:
For UNIX 3.2v4.2 and earlier, enter:
/etc/fsck -y -b -s -D /dev/root
For OpenServer 5.0.X, enter:
/etc/fsck -ofull /dev/root
NOTE:
If the computer reboots, re-enter Maintenance mode.
5. Run crash(ADM) on the dump and save the output to a file for
analysis using the following syntax:
/etc/crash -d <dump> -w <outputfile>
This redirects stdout directly to the file so no output is displayed
on the screen other than the '>' prompt for crash(ADM). For example:
/etc/crash -d /dev/swap -w /tmp/crashout.<date>
(for example, /tmp/crashout.0925)
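The dated output name can be generated rather than typed. The sketch
below assumes the "0925" suffix in the example means month and day
(MMDD); the helper name crash_outfile is hypothetical.

```shell
# Hypothetical helper: prints an output path such as /tmp/crashout.0925,
# built from the current month and day.
crash_outfile() {
    echo "/tmp/crashout.`date +%m%d`"
}

# Usage (left as a comment so the sketch does not start crash itself):
#   /etc/crash -d /dev/swap -w `crash_outfile`
```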
6. Run the following commands at the crash(ADM) prompt: (Note that
the prompt for crash(ADM) is '>'.)
> panic
> trace
> stack
> user
> proc
> u -f
> quit
This will produce an output file which can be viewed and analyzed.
The output file from the crash(ADM) utility, as well as an accurate
copy of the register dump, are important items for crash analysis.
However, it is frequently necessary to compare more than one register
dump and crash(ADM) output file to accurately determine the cause.
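The command sequence in step 6 can also be kept in a small shell
function so that the same commands are used after every panic. This is
a sketch only: the function name crash_commands is hypothetical, and
feeding the commands to crash(ADM) on standard input is an assumption
that should be checked on your release before relying on it.

```shell
# Hypothetical helper: prints the crash(ADM) commands from step 6,
# one per line, in the order they are to be run.
crash_commands() {
    cat <<'EOF'
panic
trace
stack
user
proc
u -f
quit
EOF
}

# Assumed usage (commented out; requires a dump image in /dev/swap):
#   crash_commands | /etc/crash -d /dev/swap -w /tmp/crashout.0925
```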
PROCEDURE "SAVEDUMP":
Procedure for saving the image in /dev/swap to either a tape or a
file. This must be done BEFORE entering multi-user mode to guarantee
the integrity of the image in swap. Each step is commented. The '#'
is the root prompt.
# memsize /dev/swap
# dd if=/dev/swap of=<filename> bs=1024k count=<#MB>
/* memsize /dev/swap gives the size of the dump (in bytes) */
/* <filename> is either <tapedevice> or <outputfilename> */
/* <#MB> is the amount determined by memsize in (megabytes+1) */
/* i.e. if memsize=33026048 (32MB), count=33 */
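The count arithmetic above can be sketched in shell. The helper name
dd_count is hypothetical. It assumes that "megabytes" means the byte
count rounded up to whole units of 1048576 bytes, which reproduces the
count=33 example for a memsize of 33026048.

```shell
# Hypothetical helper: turns the byte count reported by memsize into
# the count= value for "dd bs=1024k" (megabytes, rounded up, plus 1).
dd_count() {
    # Round the byte count up to whole megabytes (1048576 bytes each).
    mb=`expr \( "$1" + 1048575 \) / 1048576`
    # Add the one extra block called for by the procedure.
    expr "$mb" + 1
}
```

For example, "dd_count 33026048" prints 33, matching the worked
example in the procedure.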
7. After completing the crash(ADM) analysis, go through the steps
outlined in the filesystem repair "cocktail" scripts for your version
of the operating system:
Technical Article 105410, "Filesystem Repair Utilities for SCO UNIX Release 3.2
Version 4.2."
Technical Article 105411, "Filesystem Repair Utilities for SCO OpenServer 5.0.0,
5.0.2 and 5.0.4."
A full "cocktail" may not be necessary after each panic. However,
doing so helps maintain filesystem integrity. It is strongly
advised that the "cocktail" be run prior to backing up, to
minimize the danger of reintroducing corruption during a restore.
8. After the system has been returned to operational status, place
the necessary information on a tape to save for future analysis:
Save the dumpfile and kernel to a tape using either cpio or tar.
For example:
tar cvf /dev/<tapedev> <dumpfile> <kernel>
For OpenServer 5.0.x:
tar cvf /dev/<tape> <dumpfile> /stand/unix
For UNIX 3.2v4.2 and earlier:
tar cvf /dev/<tape> <dumpfile> /unix
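The choice of kernel path can be sketched as a lookup on the release
number. The helper name kernel_for_release and the idea of passing the
release as an argument are hypothetical; the two paths are the ones
given above.

```shell
# Hypothetical helper: prints the kernel file to archive with the dump,
# given a release string such as "5.0.4" or "3.2v4.2".
kernel_for_release() {
    case "$1" in
        5.0.*) echo /stand/unix ;;   # OpenServer 5.0.x
        *)     echo /unix ;;         # UNIX 3.2v4.2 and earlier
    esac
}

# Usage (commented out; <tape> and <dumpfile> are placeholders):
#   tar cvf /dev/<tape> <dumpfile> `kernel_for_release 5.0.4`
```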
--------------------------------------------------------------------------------
SUMMARY: Panic Parachute
The basic steps to follow after a panic are outlined here for quick
reference.
- Copy the register dump information from the console screen
before reboot.
- Reboot the system and defer running fsck(ADM) until after
entering System Maintenance mode.
- Defer removing the image from swap until after running the
crash(ADM) commands panic, trace, stack, user, proc, 'u -f' and
quit, and verifying that the image on tape is good.
- Using the "SAVEDUMP" procedure, transfer the image in /dev/swap
to a tape or file.
- Run filesystem repair utilities "cocktail".
/etc/fsck -ofull /dev/root (OpenServer 5.0.X only)
/etc/fsck -y -b -s -D /dev/root (UNIX 3.2v4.2 and earlier)
/tcb/bin/integrity -em > /tmp/integ.rpt
/etc/tcbck <enter>
/tcb/bin/authck -a -v <enter>
/etc/fixmog -v <enter>
(OpenServer 5.0.X only:)
Software Manager ==> Software ==>
Verify System ==> Normal System State (thorough)
- Transfer dumpfile and kernel to tape using tar.
--------------------------------------------------------------------------------
SEE ALSO:
Technical Article 106181, "Why did my unix kernel panic?"
Technical Article 104166, "How do I determine which function in the kernel PANICked?"
Technical Article 106009, "What are Traps, Interrupts and Exceptions?"
Technical Article 105411, "Filesystem Repair Utilities for SCO OpenServer 5.0.0,
5.0.2 and 5.0.4."
Technical Article 105410, "Filesystem Repair Utilities for SCO UNIX Release 3.2
Version 4.2."