Introduction to Shell Programming by Michael Paoli
a.k.a.:
- over 80% of everything you ever needed or wanted to know about shell programming in under 80 minutes
- the top stuff you need and want to know about shell programming
- learning shell programming in 6 pages or less
Why start here? (Bourne Shell, 1979 sh(1) man page, etc.)
- Why not start with other scripting languages, such as ...
- C-shell (csh) or tcsh or the like?
- There are entire FAQs on why not, including issues/problems such as:
- limited I/O redirection, no general file descriptor manipulation
(can't, for example, redirect stderr separately from stdout, echo a message to stderr, copy or close file descriptors, etc.)
- exceedingly limited signal handling
(interrupt can optionally be caught or ignored, no other possibilities or other signal handling)
- quoting doesn't work very well in the C-shell - much of the quoting is awkward to impossible; you also can't nest command substitution
- inconsistent implementations - many implementations are significantly different - what works on one vendor's or distribution's C-shell on one system may fail on many others
- and much more
- Why not Perl?
- Why sh? Why start with the Bourne Shell (sh) of UNIX Seventh Edition, circa 1979?
- most of what one needs to use for scripting/programming purposes
exists in ye olde (UNIX Seventh Edition) Bourne Shell,
and still functions essentially the same way
- Least Common Denominator / portability - code for Bourne shell
will generally be quite usable under any Bourne-like shell (Bourne, Korn, POSIX, Bash, ash, dash, etc.)
- "modern day" equivalent/successor shells
(Bourne, Korn, POSIX, Bash, etc.) -
typically 25 to 91 pages for the man pages
- Very concise reference: UNIX Seventh Edition sh(1) only 6 pages!
- More on concise terse references (man pages):
- "Within the area it surveys,
this volume attempts to be timely, complete and concise.
Where the latter two objectives conflict,
the obvious is often left unsaid in favor of brevity.
It is intended that each program be described as it is,
not as it should be."
"You will want to study sh(1) long and hard."
- UNIX PROGRAMMER'S MANUAL,
Seventh Edition, January, 1979, Volume 1,
INTRODUCTION TO VOLUME 1
sh(1) (UNIX Seventh Edition), etc.
Stepping through the man page and such (intended to be used with man page reference(s) and presentation, not as a complete reference by itself):
- keywords:
sh, for, case, if, while, :, ., break, continue, cd, eval, exec,
exit, export, login, newgrp, read, readonly, set, shift, times,
trap, umask, wait
- sh - what it is - much more than just a command interpreter - it's a rather full featured, high level, flexible, powerful programming language
- it won't do everything - it's not a full featured general purpose programming language
- it will, however, handle the great majority of tasks one typically wants to accomplish on a UNIX or UNIX-like (LINUX, BSD, etc.) system
- it fits well with UNIX design/tool strategy - does what it does very well, connects well to other UNIX programs and such
- Commands:
- simple-command -
sequence of non blank words
separated by "blanks", command name is passed as argument 0 (see exec(2))
- pipeline - command(s) separated by |
- list - sequence of one or more pipelines separated by ;, &, &&, or || and optionally terminated by ; or &
- command - is simple command or ...
- 200+status - the value of a simple-command that terminates abnormally; that's 200 in octal! (128 decimal)
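To make the octal point concrete, here is a small sketch. It assumes a POSIX-style sh rather than the Seventh Edition shell itself; many current shells (bash and dash among them) report 128 plus the signal number, but the exact value is shell-dependent, so treat the number printed for the killed child as something to observe rather than rely on.

```sh
#!/bin/sh
# printf reads a leading 0 as octal, so this shows 0200 in decimal.
printf '%d\n' 0200        # prints 128

# The child shell sends itself SIGTERM, so it terminates abnormally.
sh -c 'kill -TERM $$'
status=$?

# Commonly 128 + 15 on current shells; the exact value varies by shell.
echo "status reported by the parent: $status"
```

A status above 128 is a reasonable hint that a command died from a signal rather than exiting on its own.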
- The following words are only recognized as the first word of a command and when not quoted.
if then else elif fi case in esac for while until do done { }
- Command substitution
- ``
- trailing newlines are stripped
- may be used as all or part of a word
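A minimal sketch of the two points above, assuming any Bourne-like sh with printf and basename available. The path /tmp/report.txt is made up for illustration; basename does not need the file to exist.

```sh
#!/bin/sh
# printf emits "hello" followed by three newlines;
# command substitution strips all the trailing ones.
greeting=`printf 'hello\n\n\n'`
echo "[$greeting]"        # prints [hello]

# Command substitution as only part of a word: the output of basename
# lands between "copy-of-" and ".bak".
backup=copy-of-`basename /tmp/report.txt .txt`.bak
echo "$backup"            # prints copy-of-report.bak
```

Only trailing newlines are stripped; newlines inside the output survive, subject to the usual blank interpretation unless the substitution is quoted.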
- Parameter substitution - introduced by $
- Positional parameters may be assigned values by set.
- Variables may be set by writing
name=value [ name=value ] ...
${parameter}
A parameter is a sequence of letters, digits or underscores
(a name), a digit, or any of the characters * @ # ? - $ !.
The value, if any, of the parameter is substituted. The
braces are required only when parameter is followed by a
letter, digit, or underscore that is not to be interpreted
as part of its name. If parameter is a digit then it is a
positional parameter. If parameter is * or @ then all the
positional parameters, starting with $1, are substituted
separated by spaces. $0 is set from argument zero when the
shell is invoked.
${parameter-word}
${parameter=word}
${parameter?word}
${parameter+word}
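The four forms above are easier to keep apart with a quick experiment. This is a sketch for a POSIX-style sh (unset is not in the Seventh Edition shell); the names editor, count, mode and missing are made up for illustration.

```sh
#!/bin/sh
unset editor count mode missing

# ${parameter-word}: the value if set, otherwise word; parameter is left alone.
echo "editor: ${editor-ed}"            # prints editor: ed
echo "still unset: [${editor+set}]"    # prints still unset: []

# ${parameter=word}: if unset, assign word to it, then substitute.
: ${count=0}
echo "count: $count"                   # prints count: 0

# ${parameter+word}: word if set, otherwise nothing.
mode=debug
echo "flag: ${mode+-x}"                # prints flag: -x

# ${parameter?word}: the value if set; otherwise print word and exit the shell.
# Run in a subshell so that only the subshell exits.
( : ${missing?is not set} ) 2>/dev/null || echo "subshell exited non-zero"
```

The : command does nothing itself; it is there only so that its arguments get evaluated.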
- The following parameters are automatically set by the shell.
- The following parameters are used but not set by the shell.
HOME
PATH
MAIL
PS1
PS2
IFS
- Blank interpretation
- File name generation
- Quoting
- I/O redirection and file descriptor manipulation
- [digit]>[>]word
- [digit]<word
- [digit]<<word
- I/O redirection is processed left to right:
- These will not redirect stderr to /dev/null (unless stdout was going to /dev/null before this was invoked):
2>&1 >/dev/null command
2>&1 command >/dev/null
command 2>&1 >/dev/null
- These will redirect stderr to /dev/null:
>/dev/null 2>&1 command
command >/dev/null 2>&1
>/dev/null command 2>&1
- & and stdin (/dev/null or not /dev/null)
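The left-to-right rule can be checked by capturing whatever still reaches stdout. A sketch, assuming a POSIX-style sh (shell functions are not in the Seventh Edition shell); both is a made-up helper that writes one line to each stream.

```sh
#!/bin/sh
both() {
    echo "to stdout"
    echo "to stderr" >&2
}

# 2>&1 is processed first: stderr becomes a copy of the current stdout
# (here, the pipe feeding the command substitution). Only then is
# stdout pointed at /dev/null, so the stderr line is still captured.
wrong=`both 2>&1 >/dev/null`
echo "2>&1 >/dev/null captured: [$wrong]"    # prints [to stderr]

# stdout goes to /dev/null first, then stderr is made a copy of it:
# both streams are discarded.
right=`both >/dev/null 2>&1`
echo ">/dev/null 2>&1 captured: [$right]"    # prints []
```

Reading 2>&1 as "make descriptor 2 a copy of whatever descriptor 1 is right now" makes the ordering easy to remember.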
- Environment
- Signals
- Execution
- PATH
- colon (:) separated list of directories
- null elements are interpreted as . (current directory)
- Special commands
:
. file
break [ n ]
continue [ n ]
cd [ arg ]
eval [ arg ]
exec [ arg ]
exit [ n ]
export [ name ... ]
login [ arg ]
newgrp [ arg ]
read name ...
readonly [ name ... ]
set [ -eknptuvx [ arg ... ] ]
flags: -e, -k, -n, -t, -u, -v, -x, -
$-
- Remaining arguments are positional parameters and are assigned, in order, to $1, $2, etc.
If no arguments are given then the values of all names are printed.
shift
times
trap [ arg ] [ n ] ...
umask [ nnn ]
wait [ n ]
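A few of the special commands in action: set, shift and trap. This is a sketch for any Bourne-like sh; the words alpha, beta and gamma are arbitrary, and the trap is shown in a child sh so that its exit action fires at a predictable point.

```sh
#!/bin/sh
# set with arguments assigns them, in order, to $1, $2, ...
set alpha beta gamma
echo "count=$# first=$1"    # prints count=3 first=alpha

# shift discards $1 and renumbers the rest.
shift
echo "count=$# first=$1"    # prints count=2 first=beta

# trap with n of 0 runs its command when the shell exits.
trace=`sh -c 'trap "echo cleaning up" 0; echo working'`
echo "$trace"               # prints working, then cleaning up
```

The same trap form, with signal numbers added after the 0, is the usual way to remove temporary files however a script ends.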
- Invocation
- FILES
- SEE ALSO
- DIAGNOSTICS
- BUGS
Good Programming Practices
This is a huge area in itself, but for starters:
- check return/exit values
- do something reasonable even when things fail unexpectedly
- For at least Bourne and compatible shells, pay proper attention and take due care with interpretation, appropriate quoting, environmental considerations, race conditions, etc.
- Principle of Least Surprise
- try to behave in the manner least likely to cause surprise or unexpected behavior
- this especially applies when given unexpected or bad data or conditions, or adverse circumstances
- handle unexpected input/conditions reasonably - e.g. programs shouldn't catastrophically fail in such circumstances
- "Be conservative in what you do, be liberal in what you accept from others" - Jon Postel
- principle (or rule) of least astonishment (or surprise) (from Wikipedia)
- comments and good comment style
- comments which precisely state what the code is quite obviously doing are redundant clutter
- comments more usefully reflect why something is being done, or higher level explanation of what is being done, or why
- beware - comments don't always tell the truth
- least privilege principle - don't use more privilege than is necessary to accomplish the task
- avoid pitfalls and mistakes, learn from the mistakes of others
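Several of the practices above fit in one small sketch: check exit values, quote expansions, report failures on stderr, and fail in an unsurprising way. It assumes a POSIX-style sh and that mktemp -d is available (common, but not required by POSIX); backup_file and the file names are made up for illustration.

```sh
#!/bin/sh
# Hypothetical helper: copy a file, checking each step.
backup_file() {
    src=$1
    dest=$2
    if [ ! -r "$src" ]; then
        echo "backup_file: cannot read $src" >&2
        return 1
    fi
    # Quoted so that names containing blanks stay one argument each.
    cp "$src" "$dest" || {
        echo "backup_file: copy of $src to $dest failed" >&2
        return 1
    }
}

# Private scratch directory, rather than a guessable name in /tmp.
workdir=`mktemp -d` || exit 1
echo "some data" > "$workdir/my file.txt"

backup_file "$workdir/my file.txt" "$workdir/my file.bak"
ok_status=$?

backup_file "$workdir/no such file" "$workdir/unused" 2>/dev/null
bad_status=$?

echo "good copy: $ok_status, missing source: $bad_status"
rm -rf "$workdir"
```

The caller decides what to do with a non-zero status; the helper only promises to say what went wrong and not to carry on as if the copy had worked.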
Examples
References/resources:
- sh(1) man pages and related:
- Books, etc.:
- Various USENET news groups, e.g. comp.unix.shell, and stuff I've posted in comp.unix.shell (typically pointing out or answering stuff in response to other posts). Also, a somewhat more complete set of (mostly) UNIX/LINUX/shell/C/programming/security and related stuff I've posted on USENET
- Useful UNIX utilities for working with Bourne and Bourne-like shells
(note that some of these also exist as built-ins for several contemporary shells):
- (1): ar, at, awk, basename, batch, bc, cal, cat, chmod, chown, cmp, col, comm, cp, crontab, date, dd, df, dirname, du, echo, ed, expr, file, find, grep, iostat, join, kill, ln, ls, m4, mail, make, mkdir, mv, nice, od, ps, pwd, quot, rm, sed, sh, sleep, sort, split, stty, sum, tail, tar, tee, test, time, touch, tr, true, tty, uniq, units, wait, wall, wc, who, write