The Unix shell makes it possible to quickly automate interaction with processes and data. In this class we revise basic Unix shell concepts.
Basic commands
The following simple commands are used very often:
- ls: shows the contents of the current directory. Option -l displays the contents in long format; option -a also displays hidden files (those whose names begin with a .);
- file filename: shows the type of the file named filename;
- pwd: (print working directory) shows the path of the current working directory;
- mkdir name: creates a new directory name in the current working directory;
- cd path: (change directory) changes the working directory to the specified path;
- cat file: shows the file content. If more than one file is specified, their contents are concatenated;
- echo "hello": prints the string "hello";
- grep word file: looks for word in file and prints all lines that contain it;
- man command: shows the man page of command. The up and down arrows navigate, q exits, / searches (n next hit, N previous hit);
- find path expression: looks for files in path (recursively) matching the specified expression. For example, find / -name "*.c" -print prints all the files whose names end with .c;
- sort file: sorts the lines of a text file;
- strings file: finds printable strings in a (binary) file.
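The commands above can be tried safely in a scratch directory. The following is only a minimal sketch: the names notes, fruits.txt and .secret are hypothetical, mktemp -d is assumed to be available (it is on most Linux and macOS systems), and the $(...) notation is used only to capture the output of a command in a variable.

```shell
# Work in a throwaway directory so that no existing file is touched.
scratch=$(mktemp -d)
cd "$scratch"

mkdir notes                 # create a new directory
cd notes                    # make it the working directory
pwd                         # print its full path

# Create two small files (the > operator is covered under Redirection).
printf 'pear\napple\nshell script\n' > fruits.txt
echo "hidden" > .secret

visible=$(ls)               # hidden files are not listed
all=$(ls -a)                # also lists ., .. and .secret
matches=$(grep shell fruits.txt)                 # lines containing "shell"
sorted=$(sort fruits.txt)                        # lines in alphabetical order
found=$(find "$scratch" -name "*.txt" -print)    # recursive search by name

echo "ls:      $visible"
echo "grep:    $matches"
echo "find:    $found"
file fruits.txt             # reports the type of the file
```

Comparing the output of ls and ls -a in this directory shows the effect of option -a on the hidden file.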
For most of the above commands, when no filename is specified input is taken from the terminal (ctrl-D sends an EOF and terminates the input). For example:

    $ grep work
    I'm checking what happens when grep is run without specifying a filename!
    How does this work?
    How does this work?
    ah: matching lines are printed out as expected!
    (ctrl-D terminates)
    $
Redirection
Redirection is a fundamental Unix shell mechanism to redirect program input and output from/to a file. When the program output is redirected to a file (symbol >), any output from the program will be written to the file instead of the terminal; similarly, when program input is redirected from a file (symbol <), the content of the file will be sent as input to the program, in place of what the user writes on the terminal.
The following examples illustrate redirection:
- ls > tmpfile: writes the listing of the current directory into the file tmpfile. Check with cat tmpfile;
- grep shell < tmpfile: the command grep shell, with no file specified on the command line, looks for the word shell in the input given from the terminal. Adding < tmpfile redirects the content of the file to the grep command. The behaviour is the same as grep shell tmpfile, whereas grep shell alone waits for the user's input, as explained above;
- date >> tmpfile: appends the current date to the file tmpfile (notice that > overwrites the file instead). Check with cat. Note: overwriting is done silently, so be careful when using redirection with a single >.
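The difference between >, >> and < can be checked with a short experiment. This is a sketch under a few assumptions: it runs in a temporary directory created with mktemp -d, the text written to tmpfile is made up for the demonstration, and $(...) is used only to capture output in a variable.

```shell
# Use a throwaway directory so that an existing tmpfile is never overwritten.
scratch=$(mktemp -d)
cd "$scratch"

echo "first line" > tmpfile          # creates tmpfile (or silently overwrites it)
echo "shell basics" >> tmpfile       # appends a second line

from_stdin=$(grep shell < tmpfile)   # file content arrives as program input
from_arg=$(grep shell tmpfile)       # same search, file given as an argument

echo "second run" > tmpfile          # single > again: the two old lines are lost
after_overwrite=$(cat tmpfile)

echo "via <:           $from_stdin"
echo "via argument:    $from_arg"
echo "after overwrite: $after_overwrite"
```

The last echo shows that only the most recent line survives, which is why a single > deserves care.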
Pipe
Pipes are a fundamental mechanism for process communication in Unix. They are similar to redirection but work between two programs. They constitute a communication channel between processes: a process can write to the pipe and another one can read from it.
In the Unix shell, pipes are specified using |. In particular, cmd1 | cmd2 | … | cmdn executes all the commands, and the output of each command i is given as input to the next command i+1. The output of the last command is printed on the terminal. This is very handy for combining commands and making them operate on data as a pipeline.
A few examples follow:
- ls -l | grep shell: shows, in long format, all the entries whose line contains the word shell (typically the files whose names contain shell);
- ls | grep shell | sort -r: as before, but only file names are listed and they are sorted in reverse alphabetical order (option -r). Notice that in this case we have three programs cooperating together;
- ls | grep shell | grep txt: shows all file names that contain both shell and txt.
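To see these pipelines at work on known data, one can create a few files and compare the results. The sketch below assumes a temporary directory from mktemp -d and three hypothetical file names; it also checks that a pipe gives the same result as passing the data through an intermediate file with redirection.

```shell
# Scratch area: the files live in a subdirectory, the listing file next to it.
scratch=$(mktemp -d)
mkdir "$scratch/files"
cd "$scratch/files"

# Hypothetical file names, created only for this demonstration.
echo a > shell_notes.txt
echo b > shell_intro.md
echo c > readme.txt

both=$(ls | grep shell | grep txt)       # names containing shell and txt
reversed=$(ls | grep shell | sort -r)    # names containing shell, reverse order

# The same filtering done in two steps through a file instead of a pipe.
ls > "$scratch/listing.tmp"
via_file=$(grep shell < "$scratch/listing.tmp")
via_pipe=$(ls | grep shell)

echo "both:     $both"
echo "reversed: $reversed"
```

With a pipe no intermediate file is needed and the two commands run at the same time, but the data that reaches grep is the same in both cases.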
Regular expressions
Regular expressions are patterns representing sets of strings. They are very useful to perform advanced searches in which it is necessary to find strings with a particular structure. The command grep allows for specifying regular expressions.
- ^ is the beginning of a line. ls -al | grep '^d' matches all directories in the current directory (d is the flag that indicates a directory). If we omit the ^ symbol, grep will match all lines containing a d, not necessarily in the first position;
- Analogously, $ indicates the end of a line;
- . represents a single character. For example, grep '.ino' will match names such as Nino, Pino, Gino, …;
- c* represents a possibly empty, arbitrary number of occurrences of character c. For example, grep 'smart *card' will match smartcard, smart card, smart  card and so on (the blank space is repeated an arbitrary number of times). Of course, it is possible to use .* to match an arbitrary number of arbitrary characters;
- Similarly, c\+ represents one or more occurrences of c and c\? represents zero or one occurrence of c. Notice that these characters need to be protected by prepending a backslash \ character;
- To find a special character like . or * it is enough to protect it with a backslash \ character. For characters that need a backslash to act as operators, such as \+ and \?, it is instead enough to remove the backslash: a plain + or ? matches the character itself;
- [0123456789] or equivalently [0-9] represents all digits from 0 to 9. For example, [0-9]\+ is a decimal number of arbitrary length;
- [^0-9] represents any character that is not a digit. Notice the use of ^ for negating the content of a set in square brackets (which is different from the previous usage representing the beginning of a line). For example, grep '^[^0-9]*$' filename finds all lines that do not contain digits in the file filename;
- There exist predefined sets of characters. For example, ^[[:alnum:][:blank:]]*$ matches all lines composed of alphanumerics and blanks (spaces and tabs).
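The patterns above can be tested against a small sample file. The following is a sketch, not a complete reference: sample.txt and its contents are invented for the demonstration, the directory comes from mktemp -d, and the \+ operator is assumed to be supported by the installed grep (it is in GNU grep).

```shell
scratch=$(mktemp -d)
cd "$scratch"

# One sample string per line (hypothetical test data).
printf '%s\n' 'Nino' 'smartcard' 'smart   card' 'room 101' 'a.b' 'axb' > sample.txt

dot=$(grep '.ino' sample.txt)            # any character followed by ino
star=$(grep 'smart *card' sample.txt)    # zero or more spaces between the words
digits=$(grep '[0-9]\+' sample.txt)      # at least one digit (assumes \+ support)
any_char=$(grep 'a.b' sample.txt)        # unescaped . matches any character
literal_dot=$(grep 'a\.b' sample.txt)    # escaped . matches only a real dot
no_digits=$(grep '^[^0-9]*$' sample.txt) # whole line made of non-digits
simple=$(grep '^[[:alnum:][:blank:]]*$' sample.txt)  # only alphanumerics and blanks

echo "lines without digits:"
echo "$no_digits"
echo "lines with only alphanumerics and blanks:"
echo "$simple"
```

Note that 'a.b' selects both a.b and axb, while 'a\.b' selects only a.b, which illustrates why special characters must be protected when they are meant literally.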