Smallfoot Toolkit Engine Internals
An engine is encapsulated in a directory under engines/. It must have
certain executables in there that are methods implementing steps of
the Smallfoot process. Currently there is only one such mandatory
executable, "build", which implements the build stage. Otherwise an
engine is free to store whatever else it wants in its directory.
The executables can be of whatever format, binary or scripts, though
currently they all are ksh scripts. They may also use the libraries
and binaries from lib/ and vendor.lib/ (for us that would be sco.lib).
When the engine method executable is started by the toolkit, it gets a
set of arguments defining the configuration it must work on. These
arguments are expected to change as we get more experience with the
engines.
For the "build" method they are:
- location of the setuid-root shell binary that the engine may use to
run the commands as superuser
- engine base directory (absolute path of the directory under engines/),
the engine may use it to access its directory's contents
- temporary directory name (pre-created and pre-cleared for build, unless
-l was specified)
- result directory name (pre-created and pre-cleared, unless -l was specified)
- placeholder for result directory name for RAMDISK image (pre-created
and pre-cleared); the idea here is that an engine may need to build
two related images: one to be installed onto the machine and another
for the ISL-time RAM-disk boot, but how this will be done is not clear yet
- possibly more location arguments added in the future
- literal "--" denoting the start of config files
- toolkit configuration file name (absolute) - such
as /opt/smallfoot/sfcfg/smallfoot.cfg
- project configuration file name (absolute) - such
as /opt/smallfoot/cfg/project/projcfg
- run options configuration file name - a temporary file where sflh
writes the information about the options with which it was called
(see more below)
- possibly more config files added in the future
- engine configuration file name (absolute, will be always last) - such
as /opt/smallfoot/cfg/project/engine.ecfg
The engine's config file is guaranteed to be last, so the engine
can always tell it apart. All the config files except the engine config
file are guaranteed to be in a simple shell-based format. The engine
config file may be in whatever format the engine prefers, though
currently all the engines use the same simple shell-based format as well.
The simple shell-based format may contain:
- comments start with a pound sign (#)
- empty lines are permitted
- settings start from the first column (no empty space in front) and are
in the form
variable=value without spaces around the equal sign
- value may be quoted in single or double quotes, with normal shell escapes
- value may span multiple lines
- no other shell commands or expressions are permitted
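The rules above can be illustrated with a small sketch. The values and the temporary file name are made up for illustration, and importing the file with the "." (dot) command is an assumption about how such settings might be read, not a statement of how the toolkit does it.

```shell
# Write a hypothetical config file in the simple shell-based format.
cfg=${TMPDIR:-/tmp}/sf-example.$$.cfg
cat > "$cfg" <<'EOF'
# comments start with a pound sign

SFSYSIMG=/some/system/image
SFPROJDESCR="Example project,
second line of the description"
EOF

# The format is a subset of shell syntax, so the file can be sourced.
. "$cfg"
rm -f "$cfg"

echo "image: $SFSYSIMG"
echo "description: $SFPROJDESCR"
```

Each setting starts in the first column with no spaces around the equal sign, and the quoted value of SFPROJDESCR spans two lines.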
The variable names used by different config files differ from each
other by conventional prefixes:
SF* (such as SFSYSIMG and SFPROJDESCR) - toolkit and project settings
SFO_* (such as SFO_LAZY) - sflh options
e_* (such as e_DRIVERS) - engine config settings
Inside the build scripts, all the current engines also save the
directory and file names from the arguments in the following variables,
most of which start with E* (this convention is not mandatory):
- SFSUIDSH
- location of the setuid-root shell binary that the engine may
use to run the commands as superuser (this variable name is expected
by the library function runsu)
- EBASEDIR
- engine base directory (absolute path of the directory
under engines/), the engine may use it to access its directory's contents
- ETMPDIR
- temporary directory name (pre-created and pre-cleared for
build, unless -l was specified)
- ERESDIR
- result directory name (pre-created and pre-cleared, unless -l
was specified)
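A minimal sketch of how a "build" method might pick up its arguments, assuming the argument order described earlier. The function name parse_build_args and the variables ECFGFILES and EENGCFG are hypothetical and are not part of the toolkit; the space-separated ECFGFILES list also assumes that config file names contain no whitespace.

```shell
# Hypothetical helper: split the "build" arguments into location
# variables and config file names.
parse_build_args() {
    [ $# -ge 4 ] || { echo "too few arguments" >&2; return 1; }
    SFSUIDSH=$1    # setuid-root shell binary (name expected by runsu)
    EBASEDIR=$2    # engine base directory
    ETMPDIR=$3     # temporary directory
    ERESDIR=$4     # result directory
    shift 4

    # Skip the RAMDISK placeholder and any location arguments that
    # may be added in the future, up to the literal "--".
    while [ $# -gt 0 ] && [ "$1" != "--" ]; do
        shift
    done
    [ $# -gt 0 ] || { echo "missing -- separator" >&2; return 1; }
    shift          # drop the "--" itself

    # Everything that is left is a config file name; the engine's own
    # config file is always the last one.
    ECFGFILES=
    while [ $# -gt 1 ]; do
        ECFGFILES="${ECFGFILES:+$ECFGFILES }$1"
        shift
    done
    [ $# -eq 1 ] || { echo "no config files given" >&2; return 1; }
    EENGCFG=$1
}
```

A build script could call this as parse_build_args "$@" near its top and then import each file listed in ECFGFILES. Stopping at "--" rather than counting positions keeps the sketch working if more location arguments are added later.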
The engines must check the result of every command they run and
fail if a command fails in a non-recoverable way. The customary
idiom to do that is:
command || die "message"
"die" will print the message on stderr, prefixed by INTERNAL ERROR
and engine name, and exit with code 1. It takes the engine name from
the variable "progname" that should be set inside the build script.
The "die" function and others come from $SFLIBDIR/baselib.sh.
The message says "INTERNAL ERROR" because normally by the time
the lower half of the toolkit runs, all the possible conflicts
must be already detected and resolved, and if something goes wrong,
this means a problem with the toolkit.
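For orientation, here is a sketch of what a function like "die" could look like. The real one lives in $SFLIBDIR/baselib.sh and may differ; the exact message layout and the engine name used here are assumptions.

```shell
# Sketch only: not the actual baselib.sh implementation.
progname=example-engine    # hypothetical engine name, set by the build script

die() {
    # Report on stderr, prefixed by INTERNAL ERROR and the engine name.
    echo "INTERNAL ERROR: $progname: $*" >&2
    exit 1
}

# The customary idiom: stop as soon as a command fails.
test -d "${TMPDIR:-/tmp}" || die "temporary directory is missing"
```

Because "die" calls exit, the idiom ends the whole build script unless it runs inside a subshell.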
The options are saved by sflh as settings in a shell config file and
imported by the engines. They are:
- SFO_FORCE
- value "Y" means "-f" was specified
- SFO_DEBUG
- value "Y" means "-d" was specified
- SFO_LAZY
- value "Y" means "-l" was specified
- SFO_VERBOSE
- value "-v" means that "-v" was specified, otherwise will be empty.
This seemingly strange choice of values allows the value to be used
directly in commands as in:
cpio $SFO_VERBOSE ...
More options may be supported in the future.
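A short sketch of how an engine might act on these settings. The assignments at the top stand in for a run options file written for "-l -v"; whether sflh writes unset options as empty values or leaves them out is an assumption here, and the cpio command is only echoed, not run.

```shell
# Stand-in for the imported run options file (hypothetical content).
SFO_FORCE=
SFO_DEBUG=
SFO_LAZY=Y
SFO_VERBOSE=-v

# Flag-style options are compared against "Y".
if [ "$SFO_LAZY" = "Y" ]; then
    echo "lazy run: directories were not pre-cleared"
fi

# SFO_VERBOSE is deliberately left unquoted: when it is empty it
# expands to nothing instead of an empty argument.
set -- cpio -o $SFO_VERBOSE
echo "would run: $*"
```

With SFO_VERBOSE empty, the same line would produce a two-word command, which is why "-v" or an empty value is more convenient here than "Y".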
Packers have a structure very similar to that of the engines. They use the same
kind of config files, with the only difference being in the first
half of arguments (before "--") due to their different purpose.
The settings in the packer config files still start with e_*,
and the internal location variables still normally start with E*.
The result directories of the packers are located under
res/ in the project instead of under build/.
The packer arguments are:
- location of the setuid-root shell, same as for engines (SFSUIDSH)
- packer base directory (absolute path of packer's directory under packers/,
EBASEDIR)
- temporary directory name (pre-created and pre-cleared for build, ETMPDIR)
- result directory name (pre-created and pre-cleared, ERESDIR)
- combined image directory name, which is the source of data for the
packer (ESRCDIR)
- placeholder for the combined RAMDISK image directory name
- packer save directory (normally tmp/pk-save, ESAVEDIR)
- packer undo log file (normally tmp/pk-undo-log, ESAVELOG)
- possibly more args added in the future
- literal "--" denoting the start of config files
- toolkit configuration file name
- project configuration file name
- run options configuration file name
- possibly more config files added in the future
- packer configuration file name (will be always last)
Sometimes a packer needs to make temporary changes to the combined image.
For example, the files in /stand (such as kernel and resmgr) are not
included into the memfs image for a floppy or network-bootable image.
So /stand needs to be moved aside, then memfs created from the rest
of the combined image, then this memfs image and the contents of /stand
combined into a smallfs image. The problem is, what happens if the
packer hits an error in between? Then the combined image will be left
invalid, with /stand missing from it.
To provide for automatic recovery after such situations, packers must
follow a specific procedure and keep an undo log when doing such
changes. When a file or directory needs to be modified or temporarily
removed, the packer must do:
- Write "move $name" into the undo log file ($ESAVELOG)
- Move the file or directory into the save directory ($ESAVEDIR),
to the same subdirectory name as it was under $ESRCDIR
- If necessary, copy the files from the save copy back and modify them
When a packer wants to create a new file, it must write
"new $name" into the undo log file. So this stuff works
essentially as a roll-back transaction log.
The changed files/directories MUST NOT OVERLAP! Otherwise
it would be impossible to restore them reliably with the present
structure of the data.
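The procedure can be sketched as a few shell functions. The names save_aside, note_new and undo_all are hypothetical. The sketch assumes that $name is a path relative to $ESRCDIR and that entries are appended to the log one per line. The toolkit's actual recovery code is not described here, so undo_all is only a guess at what replaying the log could look like.

```shell
# ESRCDIR, ESAVEDIR and ESAVELOG are expected to be set from the
# packer arguments. $1 is a path relative to $ESRCDIR.

save_aside() {
    # Log first, so that an interrupted move is still recorded.
    echo "move $1" >> "$ESAVELOG" || return 1
    # Keep the same subdirectory name under the save directory.
    mkdir -p "$ESAVEDIR/$(dirname "$1")" || return 1
    mv "$ESRCDIR/$1" "$ESAVEDIR/$1"
}

note_new() {
    # Call this before creating the new file under $ESRCDIR.
    echo "new $1" >> "$ESAVELOG"
}

undo_all() {
    [ -f "$ESAVELOG" ] || return 0
    while read -r op name; do
        [ -n "$name" ] || continue
        case $op in
        move)
            # Restore only if the move really happened.
            if [ -e "$ESAVEDIR/$name" ]; then
                rm -rf "$ESRCDIR/$name"
                mv "$ESAVEDIR/$name" "$ESRCDIR/$name" || return 1
            fi
            ;;
        new)
            rm -rf "$ESRCDIR/$name"
            ;;
        esac
    done < "$ESAVELOG"
    : > "$ESAVELOG"    # the log has been replayed; start empty again
}
```

In the /stand example, the packer would call save_aside stand before building the memfs image. The sketch also shows why overlapping entries are a problem: if both "stand" and "stand/unix" were logged, restoring one could overwrite or miss the other.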