Evolution of shells in Linux
From Bourne to Bash and beyond
Shells are like editors: Everyone has a favorite and vehemently defends that choice (and tells you why you should switch). True, shells can offer different capabilities, but they all implement core ideas that were developed decades ago.
My first experience with a modern shell came in the 1980s, when I was developing software on SunOS. Once I learned the capability to apply output from one program as input to another (even doing this multiple times in a chain), I had a simple and efficient way to create filters and transformations. The core idea provided a way to build simple tools that were flexible enough to be applied with other tools in useful combinations. In this way, shells provided not only a way to interact with the kernel and devices but also integrated services (such as pipes and filters) that are now common design patterns in software development.
Let's begin with a short history of modern shells, and then explore some of the useful and exotic shells available for Linux today.
A history of shells
Shells—or command-line interpreters—have
a long history, but this discussion begins with the first UNIX® shell.
Ken Thompson (of Bell Labs) developed the first shell for UNIX called the
V6 shell in 1971. Similar to its predecessor in Multics, this shell
(/bin/sh) was an independent user program that executed outside of the kernel.
Concepts like globbing (pattern matching for parameter expansion, such as *.txt)
were implemented in a separate utility called glob, as was the
if
command to evaluate conditional expressions. This
separation kept the shell small, at under 900 lines of C
source (see Related topics for a link to the original source).
The shell introduced a compact syntax for redirection (< >
and >>
) and piping (|
or ^
) that has survived into modern shells. You can also
find support for invoking sequential commands (with ;
)
and asynchronous commands (with &
).
What the Thompson shell lacked was the ability to script. Its sole purpose was as an interactive shell (command interpreter) to invoke commands and view results.
UNIX shells since 1977
Beyond the Thompson shell, we begin our look at modern shells in 1977, when the
Bourne shell was introduced. The Bourne shell, created by Stephen Bourne at
AT&T Bell Labs for V7 UNIX, remains a useful shell today (in some cases, as
the default root shell). The author developed the Bourne shell after working on an
ALGOL68 compiler, so you'll find its grammar more similar to the Algorithmic
Language (ALGOL) than other shells. The source code itself, although developed in
C
, even made use of macros to give it an ALGOL68 flavor.
The Bourne shell had two primary goals: serve as a command interpreter to interactively execute commands for the operating system and for scripting (writing reusable scripts that could be invoked through the shell). In addition to replacing the Thompson shell, the Bourne shell offered several advantages over its predecessors. Bourne introduced control flows, loops, and variables into scripts, providing a more functional language to interact with the operating system (both interactively and noninteractively). The shell also permitted you to use shell scripts as filters, providing integrated support for handling signals, but lacked the ability to define functions. Finally, it incorporated a number of features we use today, including command substitution (using back quotes) and HERE documents to embed preserved string literals within a script.
The Bourne shell was not only an important step forward but also the anchor for
numerous derivative shells, many of which are used today in typical Linux systems.
Figure 1 illustrates the lineage of important
shells. The Bourne shell led to the development of the Korn shell (ksh),
Almquist shell (ash), and the popular Bourne Again Shell (or Bash). The
C
shell (csh) was under development at the time the
Bourne shell was being released. Figure 1 shows the primary
lineage but not all influences; there was significant
contribution across shells that isn't depicted.
Figure 1. Linux shells since 1977
We'll explore some of these shells later and see examples of the language and features that contributed to their advancement.
Basic shell architecture
The fundamental architecture of a hypothetical shell is simple (as evidenced by Bourne's shell). As you can see in Figure 2, the basic architecture looks similar to a pipeline, where input is analyzed and parsed, symbols are expanded (using a variety of methods such as brace, tilde, variable and parameter expansion and substitution, and file name generation), and finally commands are executed (using shell built-in commands, or external commands).
Figure 2. Simple architecture of a hypothetical shell
In the Related topics section, you can find links to learn about the architecture of the open source Bash shell.
Exploring Linux shells
Let's now explore a few of these shells to review their contribution and examine an
example script in each. This review includes the C
shell,
the Korn shell, and Bash.
The Tenex C shell
The C
shell was developed for Berkeley Software
Distribution (BSD) UNIX systems by Bill Joy while he was a graduate student
at the University of California, Berkeley, in 1978. Five years later, the shell
introduced functionality from the Tenex system (popular on DEC PDP systems).
Tenex introduced file name and command completion in addition to command-line
editing features. The Tenex C
shell (tcsh) remains
backward-compatible with csh but improved its overall interactive features. The
tcsh was developed by Ken Greer at Carnegie Mellon University.
One of the key design objectives for the C
shell was to
create a scripting language that looked similar to the C
language. This was a useful goal, given that C
was
the primary language in use (in addition to the operating system being developed
predominantly in C
).
A useful feature introduced by Bill Joy in the C
shell
was command history. This feature maintained a history of the previously
executed commands and allowed the user to review and easily select previous
commands to execute. For example, typing the command history
would show the previously executed commands. The up and down arrow keys
could be used to select a command, or the previous command could be executed
using !!
. It's also possible to refer to arguments of
the prior command; for example, !*
refers to all
arguments of the prior command, where !$
refers
to the last argument of the prior command.
Take a look at a short example of a tcsh script (Listing 1). This script takes a single argument (a directory name) and emits all executable files in that directory along with the number of files found. I reuse this script design in each example to illustrate differences.
The tcsh script is divided into three basic sections. First, note that I use the shebang, or hashbang symbol, to declare this file as interpretable by the defined shell executable (in this case, the tcsh binary). This allows me to execute the file as a regular executable rather than precede it with the interpreter binary. It maintains a count of the executable files found, so I initialize this count with zero.
Listing 1. File all executable files script in tcsh
#!/bin/tcsh # find all executables set count=0 # Test arguments if ($#argv != 1) then echo "Usage is $0 <dir>" exit 1 endif # Ensure argument is a directory if (! -d $1) then echo "$1 is not a directory." exit 1 endif # Iterate the directory, emit executable files foreach filename ($1/*) if (-x $filename) then echo $filename @ count = $count + 1 endif end echo echo "$count executable files found." exit 0
The first section tests the arguments passed by the user. The #argv
variable represents the number of arguments passed in (excluding the command
name itself). You can access these arguments by specifying their index: For
example, #1
refers to the first argument (which is
shorthand for argv[1]
). The script is expecting one
argument; if it doesn't find it, it emits an error message, using
$0
to indicate the command name that was typed at
the console (argv[0]
).
The second section ensures that the argument passed in was a directory. The
-d
operator returns True if the argument is a
directory. But note that I specify a !
symbol first,
which means negate. This way, the expression says that if the argument
is not a directory, emit an error message.
The final section iterates the files in the directory to test whether they're executable.
I use the convenient foreach
iterator, which loops
through each entry in the parentheses (in this case, the directory), and then tests
each as part of the loop. This step uses the -x
operator
to test whether the file is an executable; if it is, the file is emitted and the count
increased. I end the script by emitting the count of executables.
Korn shell
The Korn shell (ksh), designed by David Korn, was introduced around the same
time as the Tenex C
shell. One of the most
interesting features of the Korn shell was its use as a scripting language in
addition to being backward-compatible with the original Bourne shell.
The Korn shell was proprietary software until the year 2000, when it was released as open source (under the Common Public License). In addition to providing strong backward-compatibility with the Bourne shell, the Korn shell includes features from other shells (such as history from csh). The shell also provides several more advanced features found in modern scripting languages like Ruby and Python—for example, associative arrays and floating point arithmetic. The Korn shell is available in a number of operating systems, including IBM® AIX® and HP-UX, and strives to support the Portable Operating System Interface for UNIX (POSIX) shell language standard.
The Korn shell is a derivative of the Bourne shell and looks more similar to it and
Bash than to the C
shell. Let's look at an example
of the Korn shell for finding executables (Listing 2).
Listing 2. Find all executable files script in ksh
#!/usr/bin/ksh # find all executables count=0 # Test arguments if [ $# -ne 1 ] ; then echo "Usage is $0 <dir>" exit 1 fi # Ensure argument is a directory if [ ! -d "$1" ] ; then echo "$1 is not a directory." exit 1 fi # Iterate the directory, emit executable files for filename in "$1"/* do if [ -x "$filename" ] ; then echo $filename count=$((count+1)) fi done echo echo "$count executable files found." exit 0
The first thing you'll notice in Listing 2 is its similarity to Listing 1.
Structurally, the script is almost identical, but key differences
are evident in the way conditionals, expressions, and iteration are
performed. Instead of
adopting C
-like test operators, ksh adopts the
typical Bourne-style operators (-eq
,
-ne
, -lt
, and so on).
The Korn shell also has some differences related to iteration. In the Korn shell, the
for in
structure is used, with command substitution
to represent the list of files created from standard output for the command
ls '$1/*
representing the contents of the named
subdirectory.
In addition to the other features defined above, Korn supports the alias feature (to replace a word with a user-defined string). Korn has many other features that are disabled by default (such as file name completion) but can be enabled by the user.
The Bourne-Again Shell
The Bourne-Again Shell, or Bash, is an open source GNU project intended to replace the Bourne shell. Bash was developed by Brian Fox and has become one of the most ubiquitous shells available (appearing in Linux, Darwin, Windows®, Cygwin, Novell, Haiku, and more). As its name implies, Bash is a superset of the Bourne shell, and most Bourne scripts can be executed unchanged.
In addition to supporting backward-compatibility for scripting, Bash has
incorporated features from the Korn and C
shells.
You'll find command history, command-line editing, a directory stack
(pushd
and popd
), many
useful environment variables, command completion, and more.
Bash has continued to evolve, with new features, support for regular expressions (similar to Perl), and associative arrays. Although some of these features may not be present in other scripting languages, it's possible to write scripts that are compatible with other languages. To this point, the sample script shown in Listing 3 is identical to the Korn shell script (from Listing 2) except for the shebang difference (/bin/bash).
Listing 3. Find all executable files script in Bash
#!/bin/bash # find all executables count=0 # Test arguments if [ $# -ne 1 ] ; then echo "Usage is $0 <dir>" exit 1 fi # Ensure argument is a directory if [ ! -d "$1" ] ; then echo "$1 is not a directory." exit 1 fi # Iterate the directory, emit executable files for filename in "$1"/* do if [ -x "$filename" ] ; then echo $filename count=$((count+1)) fi done echo echo "$count executable files found." exit 0
One key difference among these shells is the licenses under which they are released. Bash, as you would expect, having been developed by the GNU project, is released under the GPL, but csh, tcsh, zsh, ash, and scsh are all released under the BSD or a BSD-like license. The Korn shell is available under the Common Public License.
Exotic shells
For the adventurous, alternative shells can be used based on your needs or taste. The Scheme shell (scsh) offers a scripting environment using Scheme (a derivative of the Lisp language). The Pyshell is an attempt to create a similar script that uses the Python language. Finally, for embedded systems, there's BusyBox, which incorporates a shell and all commands into a single binary to simplify its distribution and management.
Listing 4 provides a look at the find-all-executables script within the
Scheme shell (scsh). This script may appear foreign, but it implements similar
functionality to the scripts provided thus far. This script includes three functions
and directly executable code (at the end) to test the argument count. The real
meat of the script is within the showfiles
function,
which iterates a list (constructed after with-cwd
),
calling write-ln
after each element of the list. This
list is generated by iterating the named directory and filtering it for files that
are executable.
Listing 4. File all executable files script in scsh
#!/usr/bin/scsh -s !# (define argc (length command-line-arguments)) (define (write-ln x) (display x) (newline)) (define (showfiles dir) (for-each write-ln (with-cwd dir (filter file-executable? (directory-files "." #t))))) (if (not (= argc 1)) (write-ln "Usage is fae.scsh dir") (showfiles (argv 1)))
Conclusion
Many of the ideas and much of the interface of the early shells remain the same almost 35 years later—a tremendous testament to the original authors of the early shells. In an industry that continually reinvents itself, the shell has been improved upon but not substantially changed. Although there have been attempts to create specialized shells, the Bourne shell derivatives continue to be the primary shells in use.
Downloadable resources
Related topics
- The V6 Thompson Shell Port (osh), developed and
maintained by J.A. Neitzel, is a great resource for the osh source as well as the
external shell utilities that it relies on (such as
if
andgoto
). You can also find an archive of utilities written in the Thompson shell in addition to the original source code itself. - Goosh is the unofficial Google shell, which implements a shell interface over the commonly used Google search interface. Goosh is an interesting example of how shells can be applied to nontraditional interfaces.
-
The Bourne shell is the anchor from which our current shells were derived. The
source
files have a certain ALGOL68 flavor that was accomplished through the use
of
C
macros. -
The Bourne-Again Shell is the most
commonly used shell in Linux, combining features of the Bourne shell, Korn shell,
and
C
shell. For a great read, learn about the structure and internals of Bash in the third chapter of "The Architecture of Open Source Applications." - Check out additional developerWorks articles on shell scripting, such as Daniel Robbins' "Bash by example" Part 1 (March 2000), Part 2 (April 2000), and Part 3 (May 2000). You can also learn about Korn shell scripting (June 2008) and Tcsh shell variables (August 2008).
- At the kornshell site, get the latest news on the Korn shell, including documentation and other resources.
- Wikipedia includes a great comparison of shells, including general characteristics, interactive features, programming features, syntax, data types, and IPC mechanisms.
- Tim's article "BusyBox simplifies embedded Linux systems" (developerWorks, August 2006) explores the BusyBox application and how to add new commands to this static shell architecture.
- In the developerWorks Linux zone, find hundreds of how-to articles and tutorials, as well as downloads, discussion forums, and a wealth of other resources for Linux developers and administrators.
- Evaluate IBM products in the way that suits you best: Download a product trial, try a product online, use a product in a cloud environment.
- Follow Tim on Twitter. You can also follow developerWorks on Twitter, or subscribe to a feed of Linux tweets on developerWorks.