The Unix shell makes it quick to automate interaction with processes and data. In this class we revise basic Unix shell concepts.
Basic commands
The following simple commands are used very often:
- ls: shows the content of the current directory. Option -l displays the content in long format; option -a also displays hidden files (those whose names begin with a .);
- file filename: shows the type of the file named filename;
- pwd: (print working directory) shows the path of the current working directory;
- mkdir name: creates a new directory name in the current working directory;
- cd path: (change directory) moves the working directory to the specified path;
- cat file: shows the file content. If more than one file is specified, the contents are concatenated;
- echo "hello": prints the string “hello”;
- grep word file: looks for word in file and prints all lines that contain it;
- man command: shows the man page of command. The up and down arrows navigate, q exits, / searches (n next hit, N previous hit);
- find path expression: looks for files in path (recursively) matching the specified expression. For example, find / -name "*.c" -print prints all the files whose names end with .c;
- sort file: sorts the lines of a text file;
- strings file: finds the printable strings in a (binary) file.
For most of the above commands, when no filename is specified, input is taken from the terminal (ctrl-D sends an EOF and terminates the input). For example:
    $ grep work
    I'm checking what happens when grep is run without specifying a filename!
    How does this work?
    How does this work?
    ah: matching line are printed out as expected! (ctrl-D terminates)
    $
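The commands above can be tried safely in a scratch directory. The following is a minimal sketch, not a reference session: the names demo, fruit.txt and .secret are made up for illustration, and it assumes that mktemp is available (it is on most Linux and BSD systems).

```shell
# Scratch directory, so the example leaves the real file system untouched.
# (mktemp is assumed to be installed; all file names below are hypothetical.)
workdir=$(mktemp -d)
cd "$workdir"
pwd                                      # prints the path of the scratch directory

mkdir demo                               # new directory inside the current one
cd demo

# Create two small files. The > used here is redirection, covered in the next section.
printf 'pear\napple\nfig\n' > fruit.txt
echo "hidden" > .secret

ls                                       # fruit.txt
ls -a                                    # also lists ., .. and .secret
cat fruit.txt                            # pear, apple, fig (one per line)
sort fruit.txt                           # apple, fig, pear (one per line)
grep ap fruit.txt                        # apple
find . -name "*.txt" -print              # ./fruit.txt
```

Once done, the scratch directory can simply be deleted.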
Redirection
Redirection is a fundamental Unix shell mechanism to redirect program input and output from/to a file. When the program output is redirected to a file (symbol >) any output from the program will be written to the file instead of the terminal; similarly, when program input is redirected from a file (symbol <) the content of the file will be sent as input to the program, in place of what the user writes on the terminal.
The following examples illustrate redirection:
- ls > tmpfile: writes the listing of the current directory into the file tmpfile. Check with cat tmpfile;
- grep shell < tmpfile: the command grep shell, with no file specified on the command line, looks for the word shell in the input given from the terminal. Adding < tmpfile sends the content of the file to grep as its input. The behaviour is the same as that of grep shell tmpfile, whereas grep shell alone waits for the user’s input as explained above;
- date >> tmpfile: appends the current date to the file tmpfile (notice that > overwrites the file instead). Check with cat. Note: overwriting is done silently, so be careful when using redirection with a single >.
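The difference between >, >> and < can be seen in a short experiment. This is only a sketch under stated assumptions: out.txt is a hypothetical file name, and mktemp is assumed to be available to create a scratch directory.

```shell
# Scratch directory (mktemp assumed available; out.txt is a hypothetical name).
workdir=$(mktemp -d)
cd "$workdir"

echo "first line" > out.txt        # > creates out.txt (or truncates it if it exists)
echo "second line" >> out.txt      # >> appends, so the file now has two lines

# < feeds the file to grep on its standard input, as if it had been typed.
matches=$(grep second < out.txt)
echo "$matches"                    # second line

echo "replaced" > out.txt          # a single > silently discards the old content
cat out.txt                        # replaced
```

After the last command the two original lines are gone, which is why a single > deserves care.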
Pipe
Pipes are a fundamental mechanism for process communication in Unix. They are similar to redirection but work between two programs. They constitute a communication channel between processes: a process can write to the pipe and another one can read from it.
In the Unix shell, pipes are specified using |. In particular, cmd1 | cmd2 | … | cmdn executes all the commands, and the output of each command i is given as input to the next command i+1. The output of the last command is printed on the terminal. This is very handy for combining commands and making them operate on data as a pipeline.
A few examples follow:
- ls -l | grep shell: shows the long-format entries of all files whose names contain the word shell (more precisely, every line of the listing that contains shell);
- ls | grep shell | sort -r: as before, but only the file names are shown and they are sorted in reverse alphabetical order (option -r). Notice that in this case we have three programs cooperating;
- ls | grep shell | grep txt: shows all file names that contain both shell and txt.
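The pipelines above can be reproduced on a few empty files. This is a minimal sketch: the file names are invented for the example, touch (not covered in the text) is used only to create empty files, and mktemp is assumed to be available.

```shell
# Scratch directory with three empty files (all names are hypothetical).
workdir=$(mktemp -d)
cd "$workdir"
touch shell-notes.txt shell-intro.md pipes.txt

ls | grep shell                    # shell-intro.md, shell-notes.txt
ls | grep shell | sort -r          # shell-notes.txt, shell-intro.md

# Two grep stages in a row keep only names containing both words.
both=$(ls | grep shell | grep txt)
echo "$both"                       # shell-notes.txt
```

Each stage only sees the lines that survived the previous one, which is why chaining two grep commands behaves like a logical "and".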
Regular expressions
Regular expressions are patterns representing sets of strings. They are very useful to perform advanced searches in which it is necessary to find strings with a particular structure. Command grep allows for specifying regular expressions.
- ^ is the beginning of the line. ls -al | grep '^d' matches all directories in the current directory (d is the flag that indicates a directory). If we omit the ^ symbol, grep will match all lines containing a d, not necessarily in the first position;
- Analogously, $ indicates the end of the line;
- . represents a single character. For example, grep '.ino' will match names such as Nino, Pino, Gino, …;
- c* represents a possibly empty, arbitrary number of occurrences of character c. For example, grep 'smart *card' will match smartcard, smart card, smart  card and so on (the blank space is repeated an arbitrary number of times). Of course, it is possible to use .* to match an arbitrary number of arbitrary characters;
- Similarly, c\+ represents one or more occurrences of c and c\? represents zero or one occurrence of c. Notice that these characters need to be protected by prepending a backslash \ character;
- To find a special character like . or * it is enough to protect it with a backslash \ character. For characters that need a backslash to act as operators, such as \+ and \?, it is instead enough to remove the backslash;
- [0123456789] or equivalently [0-9] represents all digits from 0 to 9. For example, [0-9]\+ is a decimal number of arbitrary length;
- [^0-9] represents anything that is not a digit. Notice the use of ^ for negating the content of a set in square brackets (which is different from the previous usage representing the beginning of a line). For example, grep '^[^0-9]*$' filename finds all lines that do not contain digits in file filename;
- There exist predefined sets of characters. For example, ^[[:alnum:][:blank:]]*$ matches all lines composed of alphanumerics and blanks (spaces and tabs).
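Most of the patterns above can be checked against a small sample file. The sketch below uses grep -c, which prints the number of matching lines instead of the lines themselves. The file sample.txt and its content are made up for the example, mktemp is assumed to be available, and \+ and \? are left out because they are extensions that not every grep supports in this form.

```shell
# Scratch directory and a six-line sample file (content is hypothetical).
workdir=$(mktemp -d)
cd "$workdir"
printf '%s\n' 'Nino' 'smartcard' 'smart   card' 'room 42' 'a.b' 'axb' > sample.txt

grep '.ino' sample.txt                          # Nino
grep 'smart *card' sample.txt                   # smartcard and smart   card
grep '[0-9]' sample.txt                         # room 42

grep -c '^[^0-9]*$' sample.txt                  # 5: every line except room 42
grep -c 'a.b' sample.txt                        # 2: . matches any character (a.b, axb)
grep -c 'a\.b' sample.txt                       # 1: \. matches only a literal dot
grep -c '^[[:alnum:][:blank:]]*$' sample.txt    # 5: every line except a.b
```

Single quotes around each pattern keep the shell from interpreting characters such as *, $ and \ before grep sees them.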