Shell We C: Building Terminals

Explore the journey of building a powerful C-based shell, from process handling to executing commands with piping.

Shell We C: Building Terminals

This project was created as part of the operating systems course at UW Madison. My sincere thanks to Professor Oliphant for overseeing this project. This was one of my most challenging undertakings and one which really accelerated my understanding of large scale C projects.

Program Specifications

We know it will be hard. We also expect it to be long.

Winston Churchill

The shell is very simple. It asks for input and then executes that command. We exit when we type in exit. The shell can run in interactive mode as well as batch mode. We also have the following built-in commands:

  • exit: simply call in exit
  • cd: takes 1 argument that changes the current working directory
  • jobs: prints the jobs running in the background
  • fg: used for moving processes into the foreground
  • bg used for moving processes into the background

We will also implement handlers for SIGINT (Ctrl+C) and the SIGTSTP (Ctrl+Z).
We allow user to launch background commands with the ampersand operator
sh> <cmd> <args...> &

We will support piping allowing chaining multiple programs to create a command with a complex behavior. Pipe is denoted as |.

Knowing where to start

The journey of a thousand miles begins with a single step

Before we move to implement the core features of our shell, we must create a process structure to store important metadata about the processes our shell is handling. We'll use a heap data structure (check out GeekForGeeks implementation here) to keep track of the free process IDs.

struct processMetadata {
    int id;
    int pgid;
    status stat;
    run_status run_stat;
    char pgm_args[COMMAND_LENGTH];
}
struct processList {
    struct Heap* freeIDs;
    int n_procs;
    struct processMetadata procs[MAX_BACKGROUND_PROCESSES];
}

Lets discuss the constituents of processMetadata

  • id: stores the id of the process
  • pgid: when commands are chained together via piping, they have the same "process group id".
  • stat: whether the process is running in the foreground or background
  • run_stat: whether the process is running or paused

Coming to the processList we have:

  • freeIDs: a min heap containing the ids available. Our shell supports upto a maximum of 256 processes in the background and freeIDs range between 0 to 255.
  • n_procs: number of active processes in the processList
  • procs: array containing metadata information about the processes on the shell

Next we need to create functions to initialize, add, search and remove from the process list. We won't delve into the details but the file describing the implementation is given below.

We now have the processList struct down, we can then move to read the user arguments. Based on this, we'll execute in interactive mode or batch mode. While the basic structure of the code in both remains more or less the same, interactive mode runs in an infinite loop and the batch mode reads lines from a file and executes them.

Built-in Commands

Whilst we yearn for binaries to grace each command, there be those we must, from the very earth, weave into code. Thus do we impart our own flourish upon the script of our shell.

We now need to define and store the built-in commands into a global list. We'll be checking against a global list when we execute commands allowing us to seamlessly add more built-in commands (create a command that allows us to modify this built-in command list!).

We'll now discuss the built-ins in slightly more detail below and leave the implementation details in the ensuing file.

  • exit: One liner really - exit(0)
  • cd: changes our current working directory (surprise surprise!). simply uses chdir to achieve this
  • jobs: iterate through the process list we defined above (and initialised on start up) and print the process ids and the command input
  • fg: move a process to the foreground. if the user passes a process id, use that. otherwise use the max id* in the process list. Use the kill function to resume the function and the waitpid with the WUNTRACED flag to wait till the process exits.
  • bg: move a process to the background. The same as foreground except we'll use waitpid with the WNOHANG flag to simply fall through. We will get a SIGCHLD when we finish the background task

*most likely the most recent process albeit a few caveats hidden in the implementation. we should be able to identify this with a reading

The Signal Handlers

A key feature of our shell is an ability to respond to keystrokes for terminating and pausing as well as receiving a few background signals. We'll go over a high level overview of the various signals our shell needs to handle:

  • SIGCHLD: This signal is sent to a parent process whenever one of its child processes terminates or stops. Now we'd like to handle this signal so that we can remove processes that were running in the background from the process list
  • SIGTSTP: The SIGTSTP signal is an interactive stop signal. We "pause" the process on receiving this signal and add it to our process list. And put it in the background.
  • SIGINT: The SIGINT signal is an interactive interrupt signal. We terminate the process on receiving this signal and remove it from our process list.
  • SIGTTIN: A process cannot read from the user’s terminal while it is running as a background job. When any process in a background job tries to read from the terminal, all of the processes in the job are sent a SIGTTIN signal. We typically disable this before forking and set it to the default behavior once we're done.
  • SIGTTOU: This is similar to SIGTTIN, but is generated when a process in a background job attempts to write to the terminal or set its modes. Again, the default action is to stop the process. Handle this similar to SIGTTIN.

We'll spare you the nitty gritties of the implementation of the signal handlers and instead invite you to check the full code attached below this blog. It would be better understood in context rather than in isolation (as can the rest of the parts!)

Parse and Execute Commands

Now that we're getting commands line by line (in batch mode or interactive mode), we want to be reading them into a buffer. After some amount of cleaning (removing '\n', inserting '\0' at the end of the buffer, etc), we can now move to handle the monster called piping. Since we can have multiple piped commands, we'll have to separate the individual commands before handling them (don't worry we'll discuss this!).

If we don't have any piped commands, we can simply execute the command. To execute, we'll check if the command is in the built-in commands. If so, we'll simply execute the relevant code section. If not, we'll be checking in the PATH list for a binary to execute. Once a matching binary is found, we'll execute the binary(hint: use fork and exec). End of story.

executeCommand: Code flow

But we will have piping sometimes. And we will have to painstakingly redirect the output of one command to the input of the next command till the last command where we'll flush the output to stdout. Now this is easier said than done which is why start small. Chain 2 commands - see if piping works and move to more commands.

forkPipes and pipeHandler (no puns intended!)

You can find a more detailed implementation here:

Conclusion

Every 'exit' is an entrance to new possibilities. All we have to do is look outside our shells

And thus, we have the beginnings of a working shell. I've provided links to the completed code below. There's a ton of limitations in this code - the process list could be optimized to not use sort (think of element insertion in O(1)), the path variable could hold current directory binaries, etc. Perhaps some of us could take this project above and beyond its current form (git repo soon!).