Due: Monday, 11 April, 2016 by 11:59:59 PM.
Programming assignments in CS 485G are individual work. You may discuss approaches with other students, but may not share code or pseudocode for the assignment. If you do get ideas from somebody, or use snippets of code from elsewhere, you must cite the source in your documentation.
Correction: There was an error in the expected output of test 01 (tests/expected/test.01.empty-input.out). The correct output should be an empty file (0 bytes), but the original version contained a newline (1 byte). You may either download a new copy of the provided code, or replace the file yourself with:
: > tests/expected/test.01.empty-input.out
(The colon is part of the command.)
Late submissions: The penalty for the first day late has been reduced to 5% (from the usual 10%). Late penalties for subsequent days are still 10%. That is, if you turn in the program by Tuesday night, your score will be reduced by 5 points (out of 100); by Wednesday night, 15 points; and so on.
A shell is an application program that runs other programs on behalf of the user (see chapter 8.4.6 of the textbook). A shell reads in command lines from the user or from a script file, executing the program specified on each line. Most commands run by a shell are external executables to be executed in separate processes, but a few are built-in commands, which are implemented in the shell itself.
More advanced shells also provide programming capabilities: variables, conditional statements, loops, and so on, but you will not be going that far in this assignment: that is more of a matter for CS 441G (compilers) or CS 450G (programming languages).
Some popular shells:
sh, the "Bourne shell", the traditional shell provided by Unix systems since the 1970s.
csh, an interactive shell originally provided with the BSD variant of Unix.
bash, the "Bourne again shell", the standard interactive shell on GNU/Linux, OS X, and many other modern Unix systems. As the name might imply, bash is a more advanced, but backwards-compatible, reimplementation of the Bourne shell.
zsh, a mostly Bourne-compatible shell with many added features, even more than bash.
cmd.exe, the standard shell on Windows, based on the earlier DOS shell command.com. Scripts in cmd are called batch files.
PowerShell, a more advanced shell for Windows systems.
Your task in this assignment is to implement a simple shell that supports executing external commands (programs), performing simple redirections, and evaluating a few built-in commands.
The provided code includes the definition of the command structure and prototypes for the provided functions parse_command and free_command, as well as the implementation of parse_command and free_command.
Your shell should support being called in one of two ways:
./shell: When called with no arguments, your program should read commands from stdin until it encounters an end-of-file (or a command that causes it to exit). When reading from stdin, your shell should print the string "shell> " (with no newline) to stderr as a prompt before every user input.
./shell scriptfile: When called with an argument, your program should open that file and read commands from it until end-of-file (or a command that causes the shell to exit). When reading from a script file, your shell should not print a prompt. If there was an error opening or reading from the script (and not simply end-of-file), your program should print a message and exit with a nonzero status.
Whichever way it is called, your shell should do the following in a loop:
Read a line of input from stdin or the script. If there was an error, or you encountered an end of file, leave the loop.
Call parse_command (in parser.c) to parse the line into a structure containing an array of arguments, an optional input redirection, and an optional output redirection.
Execute the command, which is either a built-in command or an external command.
Call free_command to free the memory returned by parse_command.
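The reading part of this loop might be sketched as follows. This is only an illustration under stated assumptions: the names read_commands, handle_line, lines_seen, and MAX_LINE are hypothetical, the fixed-size buffer means longer lines would be split, and handle_line is a stand-in for the real parse, execute, and free steps.

```c
#include <stdio.h>

#define MAX_LINE 1024   /* assumed limit on the length of one command line */

int lines_seen = 0;     /* only here so the sketch has something to observe */

/* Stand-in for the real work: parse_command, execute, free_command. */
void handle_line(const char *line)
{
    (void)line;
    lines_seen++;
}

/*
 * Read lines from `in` until end-of-file or a read error.
 * If `interactive` is nonzero, print the prompt to stderr before each read.
 * Returns 0 on end-of-file and -1 on a read error.
 */
int read_commands(FILE *in, int interactive)
{
    char line[MAX_LINE];

    for (;;) {
        if (interactive) {
            fputs("shell> ", stderr);
            fflush(stderr);
        }
        if (fgets(line, sizeof line, in) == NULL) {
            if (ferror(in)) {
                perror("Error reading input");
                return -1;
            }
            return 0;   /* plain end-of-file */
        }
        handle_line(line);
    }
}
```

A caller would pass stdin with interactive set to 1 when there is no script argument, or the opened script file with interactive set to 0.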
You do not have to parse commands yourself, but can rely on the provided parse_command function. This function takes a string argument (the command line to be parsed) and returns a pointer to a struct command containing three members:
args is an array of pointers to C strings, terminated by a NULL pointer after the last argument. args[0] is the command name, and the subsequent members are the command's arguments. This array is suitable for passing as the second argument to execvp().
in_redir is the name of the file to redirect standard input from, or NULL if stdin should not be redirected.
out_redir is the name of the file to redirect standard output to, or NULL if stdout should not be redirected.
Note that, if the command line was blank, parse_command returns a command struct with no arguments or command name (args[0] == NULL). Your program should handle such empty commands (by ignoring them).
You must call free_command on the pointer returned by parse_command, after executing the command and before reading the next one.
Your shell should support the following built-in commands. You may ignore any redirections specified with these commands. If a built-in command encounters an error, it should not terminate the entire shell, but rather report that error and return to the main loop.
cd dir
The cd command, when given one or more arguments, should use the chdir system call to change the shell's current working directory as specified by the dir argument.
cd
If cd is called with no arguments, it should change to the user's home directory. You can find the home directory with getenv("HOME"), but beware: that function might return NULL if the user has removed that environment variable. If that is the case, report an error message (but do not terminate the program).
setenv variable value
The setenv command sets an environment variable. These variables control various aspects of the standard library and the programs you execute: the user's home directory, the path in which to search for executables, etc. For example, the LD_PRELOAD we saw in Lab 3 is an environment variable. Your implementation should call the setenv() library function, passing a nonzero value as the overwrite argument. You should report any error reported by setenv. If the user did not supply any arguments, you should report an error and not call setenv.
setenv variable
When called with only one argument, setenv should unset the named environment variable by calling the unsetenv() library function.
exit
The exit command, with any number of arguments, causes the shell to exit with exit status 0 (indicating success).
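As an illustration of the cd and setenv built-ins described above, here is a minimal sketch. The function names builtin_cd and builtin_setenv, their return convention (0 on success, -1 on error), and the exact wording of the messages are assumptions, not part of the provided code. Each function takes the args array from the parsed command.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Built-in cd. args[0] is "cd". Returns 0 on success, -1 on error. */
int builtin_cd(char **args)
{
    const char *dir = args[1];

    if (dir == NULL) {
        /* No argument: fall back to the home directory, if known. */
        dir = getenv("HOME");
        if (dir == NULL) {
            fprintf(stderr, "cd: HOME is not set\n");
            return -1;
        }
    }
    if (chdir(dir) != 0) {
        fprintf(stderr, "Could not change directory to %s: %s\n",
                dir, strerror(errno));
        return -1;   /* report the error, but keep the shell running */
    }
    return 0;
}

/* Built-in setenv. args[0] is "setenv". Returns 0 on success, -1 on error. */
int builtin_setenv(char **args)
{
    if (args[1] == NULL) {
        fprintf(stderr, "setenv: missing variable name\n");
        return -1;
    }
    if (args[2] == NULL) {
        /* One argument: unset the variable. */
        if (unsetenv(args[1]) != 0) {
            fprintf(stderr, "Could not unset %s: %s\n",
                    args[1], strerror(errno));
            return -1;
        }
        return 0;
    }
    /* Two arguments: set the variable, overwriting any old value. */
    if (setenv(args[1], args[2], 1) != 0) {
        fprintf(stderr, "Could not set %s: %s\n", args[1], strerror(errno));
        return -1;
    }
    return 0;
}
```

Neither function calls exit, so an error in a built-in returns control to the main loop.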
If the input did not contain a command name (for example, if the user entered only spaces), your shell should ignore the input.
Any other value of the 0th argument indicates an external command. The syntax of an external command is:
command arg1 arg2 ... [< infile] [> outfile]
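The choice between ignoring an empty command, running a built-in, and running an external command could be sketched like this. The enum command_kind and the function classify_command are hypothetical names used only for illustration; the function takes the args array from the parsed command.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical classification of a parsed command. */
enum command_kind {
    CMD_EMPTY,      /* args[0] == NULL: nothing to do */
    CMD_CD,
    CMD_SETENV,
    CMD_EXIT,
    CMD_EXTERNAL    /* anything else is an external program */
};

/* Decide how to handle a command based on its 0th argument. */
enum command_kind classify_command(char *const *args)
{
    if (args[0] == NULL)
        return CMD_EMPTY;
    if (strcmp(args[0], "cd") == 0)
        return CMD_CD;
    if (strcmp(args[0], "setenv") == 0)
        return CMD_SETENV;
    if (strcmp(args[0], "exit") == 0)
        return CMD_EXIT;
    return CMD_EXTERNAL;
}
```

The main loop could then switch on the result. Whichever branch runs, the parsed command still has to be released with free_command afterwards (for exit, before the shell terminates).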
To execute an external command, you will need to:
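Call fork to create a child process.
In the child process, perform any requested redirections with open and dup2(). If there was an error, exit the child process. Input files should be opened with O_RDONLY. Output files should be opened with O_WRONLY, along with appropriate flags to create new files or truncate existing files. See man 2 open for more details. Close the original file descriptor after dup2() succeeds, to avoid taking up file descriptor slots even after executing the external program. This will not affect the file descriptor (0 or 1) that you duplicated to.
In the child process, call execvp() to execute the command.
In the parent process, call waitpid() to wait for the child process to terminate.
Print a message to stderr based on the child process's exit status (see man waitpid for more information). If the process exited with a nonzero status, print a message such as "Command returned 3". If the process was killed by a signal, print a message such as "Command killed: Interrupt". You can use the strsignal() function to convert a signal number returned by WTERMSIG into a string such as "Interrupt".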
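The steps above might fit together roughly as in the following sketch. The names run_external and redirect are hypothetical, the function takes the three members of the parsed command as separate parameters, and returning the raw wait status (or -1 on failure) is just one possible design. The sketch uses _exit() in the child so that the child never runs the parent's exit-time cleanup.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Child only: open `path` and install it on descriptor `target`. */
static void redirect(const char *path, int flags, int target)
{
    int fd = open(path, flags, 0666);

    if (fd < 0) {
        fprintf(stderr, "Could not open %s: %s\n", path, strerror(errno));
        _exit(EXIT_FAILURE);
    }
    if (dup2(fd, target) < 0) {
        fprintf(stderr, "Could not redirect to %s: %s\n",
                path, strerror(errno));
        _exit(EXIT_FAILURE);
    }
    if (fd != target)
        close(fd);   /* the duplicate on `target` stays open */
}

/* Run an external command. Returns the wait status, or -1 on failure. */
int run_external(char **args, const char *in_redir, const char *out_redir)
{
    int status;
    pid_t pid = fork();

    if (pid < 0) {
        perror("Error creating child process");
        return -1;
    }
    if (pid == 0) {
        /* Child: set up redirections, then replace the process image. */
        if (in_redir != NULL)
            redirect(in_redir, O_RDONLY, STDIN_FILENO);
        if (out_redir != NULL)
            redirect(out_redir, O_WRONLY | O_CREAT | O_TRUNC, STDOUT_FILENO);
        execvp(args[0], args);
        /* execvp only returns if it failed. */
        fprintf(stderr, "Error executing %s: %s\n", args[0], strerror(errno));
        _exit(EXIT_FAILURE);
    }
    /* Parent: wait for the child, retrying if a signal interrupts the wait. */
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR) {
            perror("Error waiting for child process");
            return -1;
        }
    }
    if (WIFEXITED(status) && WEXITSTATUS(status) != 0)
        fprintf(stderr, "Command returned %d\n", WEXITSTATUS(status));
    else if (WIFSIGNALED(status))
        fprintf(stderr, "Command killed: %s\n", strsignal(WTERMSIG(status)));
    return status;
}
```

This sketch leaves out the handling of Ctrl-C in the parent, which is a separate concern.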
There is one more trick to executing external commands. If you follow the above algorithm, and the user tries to kill the command with Ctrl-C, that will also kill your shell. You should not allow that to happen. Inside the parent process, when you are about to wait for the child, you can use the signal() function to ignore the SIGINT (keyboard interrupt) signal. After the child process has finished, use signal() again to reset the handler to the default, so that Ctrl-C at the shell prompt does still kill your shell.
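A minimal sketch of that trick, assuming a hypothetical helper named wait_ignoring_sigint that the parent calls after fork has already returned the child's PID. Because SIGINT is only ignored after the fork, the child keeps the default behavior and can still be interrupted.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Wait for child `pid` while the calling process ignores SIGINT.
 * Stores the wait status in *status. Returns 0 on success, -1 on error.
 */
int wait_ignoring_sigint(pid_t pid, int *status)
{
    int result = 0;

    if (signal(SIGINT, SIG_IGN) == SIG_ERR) {
        perror("Could not ignore SIGINT");
        return -1;
    }
    while (waitpid(pid, status, 0) < 0) {
        if (errno != EINTR) {
            perror("Error waiting for child process");
            result = -1;
            break;
        }
    }
    /* Restore the default so Ctrl-C at the prompt still ends the shell. */
    if (signal(SIGINT, SIG_DFL) == SIG_ERR) {
        perror("Could not restore SIGINT handling");
        result = -1;
    }
    return result;
}
```

The handler is restored on the error path as well, so a failed wait does not leave the shell immune to Ctrl-C.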
To select a language standard, add one of the following options to CFLAGS:
-std=c89: old-school ANSI C (gcc default).
-std=c99: the most commonly-used version of modern C.
-std=c11: the most recent C standard.
-std=c++98: old-school C++ (g++ default).
-std=c++11: modern C++, the newest version supported by gcc 4.8.
Your program must compile with the -Wall compiler flag (enable all warnings), without producing any compiler or linker warnings or errors.
Running make clean; make should build your program without any manual intervention.
Every call to parse_command should be matched by a call to free_command on the same pointer.
External commands must be executed using fork(), execvp(), and waitpid(). You should not use system(), posix_spawn(), or similar functions.
Redirections must use the open() system call directly (not fopen or ifstream), followed by dup2() to put the file on the appropriate file descriptor (0 for stdin, 1 for stdout).
Output redirection > should create a new file if necessary, or truncate an existing file. Input redirection < should not.
The prompt and any messages from your shell should be printed to stderr, not stdout.
If any system call or library function fails, your program should report the reason for the failure in an error message to stderr, and either (1) handle and correct the error, (2) terminate the program (or the child process), or (3) (in built-in commands only) end the command without terminating the program. Your program explicitly does not have to check for or handle errors in output functions (fprintf, cerr <<, etc.). If your program does terminate because of an error, it should indicate this by calling exit with a nonzero status code.
Your error messages should generally include two parts, separated by a colon. First, a message indicating the context of the error in the program: that is, what it was trying to do. Then, a message indicating the nature or cause of the error, usually as indicated by the errno variable. For example:
Error executing badcmd: No such file or directory
Could not open /private for output: Permission denied
For consistency with the messages produced by other programs, you should obtain the second part of the message by using one (or both) of the following C library functions:
perror() to print the error message associated with errno, along with a prefix you supply, to stderr.
strerror() to obtain the error message as a (C) string, for use with fprintf, cerr, etc.
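For instance, a two-part message could be built with strerror() as in this small sketch. The helper names format_error and report_error are hypothetical. The error number is passed in explicitly because later library calls may overwrite errno, so it is safest to capture it right after the failing call.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Write "context: reason" into buf, where the reason comes from strerror().
 * Returns the value of snprintf (the length the full message needs).
 */
int format_error(char *buf, size_t size, const char *context, int errnum)
{
    return snprintf(buf, size, "%s: %s", context, strerror(errnum));
}

/* Print the same two-part message, followed by a newline, to stderr. */
void report_error(const char *context, int errnum)
{
    fprintf(stderr, "%s: %s\n", context, strerror(errnum));
}
```

When the context is a fixed string, perror(context) prints the same kind of message using the current value of errno.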
Among the library functions and system calls you are likely to use, at least the following require error checks and messages:
malloc and calloc.
fopen and fclose of the script file, if provided.
Reading a line of input: a failed read can mean either an error or end-of-file, so use ferror() and feof() to tell the difference.
forking a child process.
open of redirected files. Note that terminating the program because of an error in the child process will (correctly) not kill the parent process.
dup2 of the redirected file handle.
execvp to execute a program from the child process. In fact, an error is the only reason an exec function would ever return.
waitpid in the parent process.
chdir in your cd builtin command (do not terminate the shell!)
setenv and unsetenv in your setenv builtin command (do not terminate the shell!)
Most of these functions indicate errors by returning either -1 (for functions that return an integer) or NULL (for functions that return a pointer), but there are a few exceptions, particularly in the stdio functions: check the manual pages to be sure.
Remember to send error messages to stderr (cerr in C++), not stdout.
A script and collection of test cases are provided to help test your program. To run the tests, first make sure you are in the directory that contains your executable named shell and the tests subdirectory, then do:
./run-tests
You should see a list of 25 tests, with "SUCCESS" or "FAILURE" appearing after each. If the test failed, the next line is one of three messages that explains the error:
The script did not produce the expected output to stdout. The message lists the file containing the script's actual output in your shell (for example, tests/output.12), and the file containing the expected output (tests/expected/test.12.redir-out-failure.out).
Note that the output here is the stdout of the commands that your shell executes. If your shell itself prints any output to stdout (as opposed to stderr), that will result in a test failure. Also, the output must match exactly: even a difference in whitespace or line endings will cause a failure.
Your shell was not expected to print any messages to stderr for this script, but it did. The listed file contains the data your shell printed to stderr.
If you print a prompt even when given a script file as input, or if you left debugging output in your shell, that will cause this failure.
Your shell was expected to print an error message to stderr for this script, but it did not. The listed file contains the expected error messages.
Note that run-tests does not check the exact contents of the error messages, only whether there were error messages. It is still your responsibility to make sure the error messages are reasonable.
Each of the twenty-five tests has three associated files. For example, test 12:
tests/test.12.redir-out-failure.in contains the script being tested, which is provided to your shell as a command-line argument.
tests/expected/test.12.redir-out-failure.out contains the expected stdout output of your shell for this test.
tests/expected/test.12.redir-out-failure.err contains the expected stderr output of your shell for this test. As noted above, you do not have to exactly match the error output.
If a test fails, its output and errors are saved to tests/output.NN and tests/error.NN, where NN is the number of the test. If the test succeeds, these two files are deleted.
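Your submission should include a README file. This should be a plain text file with at least the following sections:
A list of the files in your submission with a short description of each, such as "main and helper functions" for a source file, or "README: this file".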
If you have any failing test cases, memory errors, etc., also describe those problems here. Where do the problems occur, what steps did you take to try to solve them, and what further steps would you take if you had more time?
Likewise, if you noticed in your own testing any situations that your shell does not handle well, even if they are not covered by the test cases, describe them here.
The design of the frobulate() function benefitted from discussions with my tutor J. Random Hacker, who suggested doing the bit-twiddling before the loop rather than inside the loop.
Likewise, if you used code snippets from sites such as Stack Overflow, describe what you used, explain how it works in your own words, and provide the name of the author and the URL where they posted the code. For example:
The data-copying loop in perform_magic() is based on code written by C. Guru at: http://answermyprogrammingquestions.com/0xc0debeef/. The loop casts the character pointer to an integer pointer and uses that to copy four bytes of data at a time. If the number of bytes was not divisible by four, the code then copies the remaining 1-3 bytes individually.
Programming assignments are expected to be your own work, so any borrowed code should be very small relative to the total size of your program. You may not borrow code from or share code with other UK students at all.
Submit a zip or tar.gz file containing a directory with the following files:
Your source code, including parser.c.
A Makefile that builds shell. This may be the provided Makefile, a modified version of it, or a brand new Makefile.
To make a .tar.gz archive of the directory program4, you can use a command like:
tar czvf cs485-program4.tar.gz program4/
To make a .zip archive:
zip -r cs485-program4.zip program4/
Submit your .tar.gz or .zip file at the CS Portal website, under course CS485G006 and assignment "Programming Assignment 4 - Shell".
This assignment will be scored out of 100 points:
Compiling without warnings under -Wall is part of the score.
In addition, up to 15 bonus points are available, for implementing background jobs and a few related built-in commands.
The prompt does not end in a newline, so call fflush(stderr) after printing it to flush the output buffer. Otherwise, the output might not appear until a newline is printed.
When calling open with O_CREAT to create a file, you must also supply a third argument for the file mode (permissions). Use the value 0666 (octal, hence the leading zero) so that the file will have the correct default permissions (see man umask to see why this doesn't result in files that are writable by everybody). For example:
if ((fd = open(filename, O_RDWR | O_CREAT, 0666)) < 0) ...
If you dislike magic numbers, you may instead write 0666 as S_IRUSR | S_IWUSR | S_IRGRP | S_IWGRP | S_IROTH | S_IWOTH.
For up to 15 bonus points, add support to your shell for background commands, indicated by an ampersand (&) at the end. When your shell executes a background command, it should not wait for the child process to terminate. Instead, it should add that process's command and PID to a list (which you might store in a global array for example), and print the job number (the index into that list) for future reference.
To assist with using background commands, you should also implement the following built-in commands:
jobs
List the job number, PID, and command name of each running background command. For example:
shell> jobs
[1] 12371: sleep
[2] 12373: cp
fg num
Bring the command with job number num back to the foreground, by waiting for it with waitpid. If num is not a valid job number, or that process has already terminated, report an error message. Like the parent process of an external command, this command should ignore SIGINT while it is waiting, and should print a message if the process's exit status is nonzero.
fg
If no argument is provided, fg operates on the most-recently-started job that is still running. Note that this might not be the last background command executed, because that one may have already terminated.
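One possible shape for the job list behind jobs and fg is sketched below. Everything here is an assumption made for illustration: the fixed capacity MAX_JOBS, the struct job layout, the start counter used to find the most recently started job, and the function names. The table keeps its own copy of each command name so that the entry does not depend on memory owned by the parsed command.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

#define MAX_JOBS 64   /* assumed capacity of the job table */

struct job {
    pid_t pid;              /* 0 marks a free slot */
    char *name;             /* private copy of the command name */
    unsigned long started;  /* larger means started more recently */
};

static struct job jobs[MAX_JOBS + 1];   /* slot 0 unused: jobs start at 1 */
static unsigned long start_counter = 0;

/* Record a background job. Returns its job number, or -1 on failure. */
int add_job(pid_t pid, const char *name)
{
    int i;

    for (i = 1; i <= MAX_JOBS; i++) {
        if (jobs[i].pid == 0) {
            char *copy = strdup(name);
            if (copy == NULL) {
                perror("Could not record background job");
                return -1;
            }
            jobs[i].pid = pid;
            jobs[i].name = copy;
            jobs[i].started = ++start_counter;
            return i;
        }
    }
    fprintf(stderr, "Could not record background job: job table is full\n");
    return -1;
}

/* Forget the job with this PID. Returns its job number, or -1 if unknown. */
int remove_job(pid_t pid)
{
    int i;

    for (i = 1; pid > 0 && i <= MAX_JOBS; i++) {
        if (jobs[i].pid == pid) {
            free(jobs[i].name);
            jobs[i].name = NULL;
            jobs[i].pid = 0;
            return i;
        }
    }
    return -1;
}

/* Job number of the most recently started running job, or -1 if none. */
int most_recent_job(void)
{
    int i, best = -1;

    for (i = 1; i <= MAX_JOBS; i++) {
        if (jobs[i].pid != 0 &&
            (best == -1 || jobs[i].started > jobs[best].started))
            best = i;
    }
    return best;
}

/* Print "[job] pid: name" for every running job. */
void list_jobs(FILE *out)
{
    int i;

    for (i = 1; i <= MAX_JOBS; i++) {
        if (jobs[i].pid != 0)
            fprintf(out, "[%d] %ld: %s\n", i, (long)jobs[i].pid, jobs[i].name);
    }
}
```

Because freed slots are reused, the highest job number is not always the newest job, which is why the sketch tracks start order separately.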
To implement the bonus, you will have to modify struct command to add a flag indicating whether this is a foreground or background command. You will probably also need to modify parse_command to set the flag. Note that the & should not be passed to the command as an argument!
Your program should not leave zombie processes around. That means that you will need to use the signal() function to establish a signal handler for SIGCHLD. In your signal handler you can call wait to get the process ID and exit status of the process that exited, remove the job from the jobs table, then print a message to stderr with the exit status or terminating signal.
[1] 12371: sleep - Command returned 0.
[2] 12373: cp - Command killed: Hangup
Finally, note that the command name in the struct command will be freed by the free_command function. So your table of background jobs will need to maintain a copy of this string, which you can make with the strdup