Shell scripting cheat sheet
Notes from the Udemy course “Shell Scripting: Discover How to Automate Command Line Tasks” by Jason Cannon
Shebang
- the shebang is located at the top of your script: the characters #! followed by the path to an interpreter, for example #!/bin/bash or #!/usr/bin/python
- it indicates which interpreter to use for the commands listed in the script. The interpreter is the program that executes the commands in your script
- the interpreter is executed and the path used to call the script is passed as an argument to the interpreter
- if a script does not contain a shebang, the commands are executed using your shell. Different shells have slightly varying syntax
Execute the script
chmod +x file_name.sh
./file_name.sh
Variables
- variables are case sensitive and by convention are uppercase. Be sure not to put any spaces around the “=” sign. For example VARIABLE_NAME="value"
- to use a variable in your script, prefix its name with the $ sign. For example:
echo "I like my $MY_SHELL shell"
echo "I like my ${MY_SHELL} shell"
- you can also assign the output of a command to a variable by enclosing the command in $( ). For example: SERVER_NAME=$(hostname)
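A minimal sketch of the rules above. The sentence and the use of uname are illustrative choices, not part of the course notes. It shows where the ${...} form is required: when text follows the variable name directly.

```shell
#!/bin/bash
# MY_SHELL is the example variable used in the notes above.
MY_SHELL="bash"

# Braces mark where the variable name ends. Without them the shell would
# look for a variable called MY_SHELLing instead of MY_SHELL.
SENTENCE="I am ${MY_SHELL}ing on my keyboard"
echo "$SENTENCE"

# Command substitution: $( ) runs a command and captures its output.
KERNEL_NAME=$(uname -s)
echo "This machine runs the $KERNEL_NAME kernel"
```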
Tests
- a test evaluates a condition and returns true (exits with a status of 0) or false (exits with a status of 1)
- syntax: [ condition_to_test ]. For example, this test checks whether /etc/passwd exists: [ -e /etc/passwd ]
File operators (tests)
- -d FILE: True if file is a directory
- -e FILE: True if file exists
- -f FILE: True if file exists and is a regular file
- -r FILE: True if file is readable by you
- -s FILE: True if file exists and is not empty
- -w FILE: True if file is writable by you
- -x FILE: True if file is executable by you
String operators (tests)
- -z STRING: True if string is empty
- -n STRING: True if string is not empty
- STRING1 = STRING2: True if strings are equal
- STRING1 != STRING2: True if strings are not equal
Arithmetic operators (tests)
- arg1 -eq arg2: True if arg1 is equal to arg2
- arg1 -ne arg2: True if arg1 is not equal to arg2
- arg1 -lt arg2: True if arg1 is less than arg2
- arg1 -le arg2: True if arg1 is less than or equal to arg2
- arg1 -gt arg2: True if arg1 is greater than arg2
- arg1 -ge arg2: True if arg1 is greater than or equal to arg2
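The operators listed above can be tried out with a short script. This is only a sketch: the scratch file, the RESULTS variable and the labels are made up for the demonstration. Each test that evaluates to true appends a label to RESULTS.

```shell
#!/bin/bash
# Scratch file created only for this demonstration.
TMP_FILE=$(mktemp)
echo "some content" > "$TMP_FILE"
RESULTS=""

# File operator: -s is true if the file exists and is not empty.
if [ -s "$TMP_FILE" ]; then RESULTS="$RESULTS file-not-empty"; fi

# String operator: -z is true if the string is empty.
NAME=""
if [ -z "$NAME" ]; then RESULTS="$RESULTS name-empty"; fi

# String operator: != is true if the strings differ.
if [ "bash" != "zsh" ]; then RESULTS="$RESULTS strings-differ"; fi

# Arithmetic operator: -lt compares two integers.
if [ 3 -lt 10 ]; then RESULTS="$RESULTS three-less-than-ten"; fi

# A false test: its branch is skipped, so nothing is appended.
if [ 10 -eq 3 ]; then RESULTS="$RESULTS never-added"; fi

echo "Tests that were true:$RESULTS"
rm -f "$TMP_FILE"
```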
If, elif and else statement
if [ condition-if-true ]
then
  command 1
elif [ condition-if-true ]
then
  command 2
else
  command 3
fi
- by convention, we enclose variables in quotes to prevent unexpected side effects, such as word splitting when a value contains spaces
#!/bin/bash
MY_SHELL="bash"
if [ "$MY_SHELL" = "bash" ]
then
  echo "You seem to like the bash shell"
fi
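The example above only uses the if branch. A small sketch with all three branches could look like this. TARGET and KIND are hypothetical variable names, and / is used because it exists on any Unix-like system.

```shell
#!/bin/bash
# TARGET and KIND are made-up names for this illustration.
TARGET="/"

if [ -f "$TARGET" ]
then
    KIND="regular file"
elif [ -d "$TARGET" ]
then
    KIND="directory"
else
    KIND="missing or of another type"
fi

echo "$TARGET: $KIND"
```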
For loop
for VARIABLE_NAME in ITEM_1 ITEM_N
do
  command 1
  command 2
  command N
done
Example:
#!/bin/bash
for COLOR in red green blue
do
  echo "COLOR: $COLOR"
done
It is also common practice to store the list of items in a variable
#!/bin/bash
COLORS="red green blue"
for COLOR in $COLORS
do
  echo "COLOR: $COLOR"
done
Positional parameters
Parameters are given on the command line when executing the script:
$ script.sh parameter1 parameter2 parameter3
Inside the script, they can be found with:
$0: “script.sh”
$1: “parameter1”
$2: “parameter2”
$3: “parameter3”
Note that the script name itself is stored in $0
#!/bin/bash
echo "Executing script: $0"
echo "Archiving user: $1"
You will execute this script with
./archive_user.sh elvis
You can also assign the parameter value to a variable
#!/bin/bash
USER=$1
Access all parameters
You can access all the positional parameters with $@. Quote it as "$@" so that parameters containing spaces stay intact
#!/bin/bash
for USER in "$@"
do
  XXXXXX
done
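To see what quoting changes when looping over $@, here is a sketch that counts loop iterations. The set -- line simulates calling the script with two parameters, and the user names are made up.

```shell
#!/bin/bash
# Simulate running the script with two parameters, one of which contains a space.
set -- "elvis presley" "johnny"

# Unquoted: the shell splits on spaces, so the loop sees three words.
UNQUOTED_COUNT=0
for USER_NAME in $@
do
    UNQUOTED_COUNT=$((UNQUOTED_COUNT + 1))
done

# Quoted: each parameter is kept as one item, so the loop runs twice.
QUOTED_COUNT=0
for USER_NAME in "$@"
do
    echo "Archiving user: $USER_NAME"
    QUOTED_COUNT=$((QUOTED_COUNT + 1))
done

echo "unquoted: $UNQUOTED_COUNT iterations, quoted: $QUOTED_COUNT iterations"
```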
Accepting User Input (STDIN)
read -p "PROMPT" VARIABLE
Example:
#!/bin/bash
read -p "Enter a user name: " USER
echo "archiving user: $USER"
Exit statuses and return code
- every command returns an exit status, which ranges from 0 to 255
- 0 = success; any other value indicates an error condition, which can be used for error checking
- use man or info to find the meaning of a command's exit statuses
- $? contains the return code of the previously executed command
ls /not/here
echo "$?"
if [ "$?" -eq "0" ]
then …
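Because $? always refers to the most recent command, every command in between (even an echo) overwrites it. A common pattern, sketched here with hypothetical variable names, is to copy $? into a variable right away. The path /not/here is assumed not to exist.

```shell
#!/bin/bash
# /not/here is assumed not to exist, so ls fails with a non-zero status.
ls /not/here 2>/dev/null
LS_STATUS=$?          # save it immediately: the next command overwrites $?

echo "ls exited with status $LS_STATUS"
ECHO_STATUS=$?        # this is the status of echo, not of ls

if [ "$LS_STATUS" -ne 0 ]
then
    echo "ls failed"
fi
```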
Logical
- && = AND. second command will be run only if the first command succeeded
- || = OR. The second command runs only if the first command failed
- ; (semicolon) = separates commands so that each one runs regardless of whether the previous one succeeded
cp test.txt /tmp/bak/ ; cp test.txt /tmp
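The three separators can be compared side by side. This is a sketch using a scratch directory. The directory layout and messages are invented for the example.

```shell
#!/bin/bash
# Scratch directory used only for this demonstration.
WORK_DIR=$(mktemp -d)

# &&: echo runs only because mkdir succeeded.
AND_RESULT=$(mkdir "$WORK_DIR/bak" && echo "backup directory created")

# ||: echo runs only because ls failed (the path does not exist).
OR_RESULT=$(ls "$WORK_DIR/missing" 2>/dev/null || echo "nothing to list")

# ;: echo runs whether or not ls succeeded.
SEMI_RESULT=$(ls "$WORK_DIR/missing" 2>/dev/null ; echo "ran anyway")

echo "$AND_RESULT"
echo "$OR_RESULT"
echo "$SEMI_RESULT"
rm -rf "$WORK_DIR"
```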
Exit command
- explicitly defines the return code of your script. If no value is given, it defaults to the exit status of the last command executed
- when the exit command is reached, your script stops running
- examples: exit 0, exit 2, exit 255
#!/bin/bash
HOST="google.com"
ping -c 1 $HOST
if [ "$?" -ne "0" ]
then
  echo "$HOST unreachable"
  exit 1
fi
exit 0
Functions
function hello() {
  echo "Hello!"
}
- note: you call the function without parentheses, i.e. hello
- if you need parameters, pass them after the function name: hello param_1
Positional parameters
- functions can accept parameters
- the first parameter is stored in $1, etc.
- $@ contains all of the parameters
- $0 is the script itself, not the function name
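A short sketch of a function that uses its own positional parameters. The function name greet_all and the names passed to it are hypothetical.

```shell
#!/bin/bash
# Hypothetical function: greets every name passed to it.
greet_all() {
    # Inside the function, $1 and "$@" refer to the function's parameters.
    echo "First parameter: $1"
    for NAME in "$@"
    do
        echo "Hello, $NAME!"
    done
}

greet_all elvis johnny
```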
Variable scope
- by default, variables are global. However, if defined within a function, the variable is not available outside of the function until the function is called and executed
- variables have to be defined before they are used
Local variables
- can only be accessed within the function. Created using the local keyword, i.e. local LOCAL_VAR=1
- only functions can have local variables
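The scope rules can be checked with a small experiment. All names here are made up, and ${VAR:-unset} is used only to print the word "unset" when a variable has no value.

```shell
#!/bin/bash
# Hypothetical function used to contrast global and local variables.
set_values() {
    GLOBAL_VAR="set inside the function"
    local LOCAL_VAR="only visible inside the function"
    echo "Inside:      LOCAL_VAR=$LOCAL_VAR"
}

BEFORE_CALL="${GLOBAL_VAR:-unset}"    # the function has not run yet
set_values
AFTER_CALL="${GLOBAL_VAR:-unset}"     # the global now exists
LOCAL_OUTSIDE="${LOCAL_VAR:-unset}"   # the local variable never leaks out

echo "Before call: GLOBAL_VAR=$BEFORE_CALL"
echo "After call:  GLOBAL_VAR=$AFTER_CALL"
echo "Outside:     LOCAL_VAR=$LOCAL_OUTSIDE"
```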
Exit status
- a function can set its exit status explicitly with return <RETURN_CODE>, for example return 1. Otherwise, its exit status is that of the last command executed in the function
- as with commands, 0 = success, and you access the exit status with $?
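As an illustration, here is a hypothetical dir_exists function that reports its result through return, with the caller reading it from $?. The path /not/here is assumed not to exist.

```shell
#!/bin/bash
# Hypothetical function: returns 0 if the directory exists, 1 otherwise.
dir_exists() {
    if [ -d "$1" ]
    then
        return 0
    fi
    return 1
}

dir_exists /
ROOT_STATUS=$?

dir_exists /not/here
MISSING_STATUS=$?

echo "/ -> $ROOT_STATUS, /not/here -> $MISSING_STATUS"
```

Since the function's exit status behaves like a command's, it can also be used directly in a condition: if dir_exists /tmp; then ...; fi.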
Wildcards
- * matches 0 or more characters like *.txt
- ? matches exactly one character like a?.txt
- [] is a character class. It matches exactly one character, which can be any of the characters between the brackets. For example, ca[nt]* matches cat, can, candy, etc.
- [!] matches exactly one character that is NOT between the brackets. For example, [!aeiou]* matches baseball or cricket
- use [a-g] or [3-6] to create a range in a character class
- you can use predefined named character classes such as [[:alpha:]] for alphabetic characters or [[:alnum:]] for alphanumeric characters. Others are [[:digit:]], [[:lower:]], [[:space:]], [[:upper:]]
- to match a wildcard character literally, escape it with \. For example, *\? matches all files whose names end with a question mark
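The patterns above can be tested safely in a scratch directory. This sketch uses made-up file names. Each pattern is expanded inside the directory by a subshell, so the current directory of the script does not change.

```shell
#!/bin/bash
# Scratch directory populated with a few made-up file names.
WORK_DIR=$(mktemp -d)
touch "$WORK_DIR/cat" "$WORK_DIR/can" "$WORK_DIR/candy" "$WORK_DIR/cot" \
      "$WORK_DIR/a1.txt" "$WORK_DIR/ab.txt" "$WORK_DIR/long.txt"

STAR_MATCHES=$(cd "$WORK_DIR" && echo *.txt)            # any name ending in .txt
QUESTION_MATCHES=$(cd "$WORK_DIR" && echo a?.txt)       # a, one character, then .txt
CLASS_MATCHES=$(cd "$WORK_DIR" && echo ca[nt]*)         # ca, then n or t, then anything
NEGATED_MATCHES=$(cd "$WORK_DIR" && echo c[!a]*)        # c, then anything but a
DIGIT_MATCHES=$(cd "$WORK_DIR" && echo *[[:digit:]]*)   # names containing a digit

echo "*.txt          -> $STAR_MATCHES"
echo "a?.txt         -> $QUESTION_MATCHES"
echo "ca[nt]*        -> $CLASS_MATCHES"
echo "c[!a]*         -> $NEGATED_MATCHES"
echo "*[[:digit:]]*  -> $DIGIT_MATCHES"
rm -rf "$WORK_DIR"
```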
Case statements
case "$VAR" in
  pattern_1)
    commands_go_here
    ;;
  pattern_N)
    commands_go_here
    ;;
esac
Example:
case "$1" in
  start)
    /usr/sbin/sshd
    ;;
  stop)
    kill $(cat /var/run/sshd.pid)
    ;;
  *)
    echo "XXXXXX"
    ;;
esac
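Case patterns accept the same wildcards as file names, and several patterns can share one branch when separated by |. The function below is a hypothetical sketch of that. It only prints what it would do, and the messages are invented.

```shell
#!/bin/bash
# Hypothetical function: maps a command word to a description of the action.
handle_command() {
    case "$1" in
        start|START)
            echo "starting"
            ;;
        stop|STOP)
            echo "stopping"
            ;;
        [yY]*)
            # Character class plus *: anything beginning with y or Y.
            echo "yes-like answer"
            ;;
        *)
            # Catch-all branch for everything else.
            echo "unknown command"
            ;;
    esac
}

handle_command start
handle_command STOP
handle_command Yes
handle_command restart
```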
Logging
The syslog standard uses facilities and severities to categorize messages
- Facilities: kern, user, mail, daemon, auth, local0 through local7. The facility indicates where the message comes from
- Severities: emerg, alert, crit, err, warning, notice, info, debug. The severity indicates how serious the message is
Log file locations are configurable. Common defaults are:
- /var/log/messages
- /var/log/syslog
Logging with logger
By default, the logger utility creates a user.notice message
logger "Message"
logger -p local0.info "Message"
logger -t myscript -p local0.info "Message"
logger -i -t myscript "Message"
- -p: specifies the facility and severity
- -t: tags the message. Typically you use the name of the script as the tag
- -i: includes the process ID (PID), which helps differentiate log entries if you run the same script several times
- -s: also sends the message to the screen in addition to the logging system
While loop
Loop format
while [ condition_is_true ]
do
  command 1
done
Loop a defined number of times
INDEX=1
while [ $INDEX -lt 6 ]
do
  echo "XXXXX"
  ((INDEX++))
done
Here INDEX is incremented by 1 on each pass with ((INDEX++))
Reading a file, line by line
LINE_NUM=1
while read LINE
do
  echo "${LINE_NUM} : ${LINE} "
  ((LINE_NUM++))
done < /etc/fstab
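A slightly more defensive variant of the line-by-line loop, wrapped in a hypothetical number_lines function. It is a sketch, not the course's version. IFS= keeps leading and trailing whitespace of each line, and -r stops read from treating backslashes as escape characters.

```shell
#!/bin/bash
# Hypothetical function: prints each line of the file given as $1 with its number.
number_lines() {
    local LINE_NUM=1
    local LINE
    # IFS= preserves surrounding whitespace; -r keeps backslashes literal.
    while IFS= read -r LINE
    do
        echo "${LINE_NUM}: ${LINE}"
        LINE_NUM=$((LINE_NUM + 1))
    done < "$1"
}

# Scratch file standing in for a real input file such as /etc/fstab.
INPUT_FILE=$(mktemp)
printf '  indented line\nplain line\n' > "$INPUT_FILE"
number_lines "$INPUT_FILE"
rm -f "$INPUT_FILE"
```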
Break and continue
You can use the break and continue statements inside a loop to control its flow
- break: exits the loop before its normal end
- continue: skips the rest of the current iteration and restarts the loop at the next iteration
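Both statements can be seen at work in one loop. In this sketch the numbers 3 and 5 are arbitrary choices, and VISITED collects the values that reach the end of the loop body.

```shell
#!/bin/bash
VISITED=""
INDEX=0
while [ "$INDEX" -lt 10 ]
do
    INDEX=$((INDEX + 1))
    if [ "$INDEX" -eq 3 ]
    then
        continue    # skip 3 and go straight to the next iteration
    fi
    if [ "$INDEX" -gt 5 ]
    then
        break       # leave the loop entirely once INDEX passes 5
    fi
    VISITED="$VISITED $INDEX"
done
echo "Visited:$VISITED"
```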
Debugging
Built-in debugging help
- -x = prints commands as they execute, together with their arguments
- after substitutions and expansions
- called an x-trace, tracing, or print debugging
- #!/bin/bash -x
- in the script: set -x to start debugging, set +x to stop debugging
#!/bin/bash
TEST_VAR='test'
set -x
echo $TEST_VAR
set +x
hostname
- -e = exit on error
- can be combined with other options: #!/bin/bash -ex
#!/bin/bash -e
FILE_NAME='/not/here'
ls $FILE_NAME
echo $FILE_NAME
- -v = prints shell input lines as they are read
- can be combined with other options
#!/bin/bash -v
TEST_NAME='test'
echo $TEST_NAME
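To try -x and -e without affecting the rest of a script, one option is to enable them inside a subshell, written ( ... ). The sketch below does that. The trace file is a scratch file created for the demonstration.

```shell
#!/bin/bash
TRACE_FILE=$(mktemp)

# -x: each command is written to STDERR before it runs. STDERR of the
# subshell is redirected to a file here so the trace can be inspected.
( set -x; echo "hello" > /dev/null ) 2> "$TRACE_FILE"
TRACE=$(cat "$TRACE_FILE")

# -e: the subshell stops at the failing command, so the echo never runs.
( set -e; false; echo "not reached" )
ERREXIT_STATUS=$?

echo "Trace written by -x: $TRACE"
echo "Exit status of the -e subshell: $ERREXIT_STATUS"
rm -f "$TRACE_FILE"
```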
Other information for writing scripts
This is outside of the Udemy course. In addition to the above, the following commands are useful
Pipeline and piping
Source: https://ryanstutorials.net/linuxtutorial/piping.php
When you run a program, you have 3 streams connected to it. A number is associated with each stream and is used for identification
- STDIN (0) — Standard input (data fed into the program)
- STDOUT (1) — Standard output (data printed by the program, defaults to the terminal)
- STDERR (2) — Standard error (for error messages, also defaults to the terminal)
Save the output (STDOUT) to a file with > or >>
“The greater than operator ( > ) indicates to the command line that we wish the programs output (or whatever it sends to STDOUT) to be saved in a file instead of printed to the screen”
ls > test.txt
By default, it will create a new file or if the file already exists, clear its content and save the new output. If we need to append the output to a file, use >>
ls >> test.txt
Feed the input of a program (STDIN) with a file by using <
“Read data from the file and feed it into the program via it’s stream”. Here wc -l counts the number of lines in myoutput
wc -l < myoutput
wc -l < barry.txt > myoutput
Redirect STDERR
You can refer to a stream by its number when redirecting. STDERR is stream number 2, so 2> saves error messages into a file
ls -l video.mpg blah.foo 2> errors.txt
In the example above, if there is an error, the message will be saved into errors.txt.
“to save both normal output and error messages into a single file. This can be done by redirecting the STDERR stream to the STDOUT stream and redirecting STDOUT to a file. We redirect to a file first then redirect the error stream. We identify the redirection to a stream by placing an & in front of the stream number (otherwise it would redirect to a file called 1)”
ls -l video.mpg blah.foo > myoutput 2>&1
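The redirections described in this section can be compared in one script. This is a sketch in a scratch directory, and all file names are made up. One of the two paths given to ls does not exist, so ls writes to both STDOUT and STDERR.

```shell
#!/bin/bash
# Scratch directory; all file names below are made up for the demonstration.
WORK_DIR=$(mktemp -d)

# > overwrites the file, >> appends to it.
echo "first"  > "$WORK_DIR/out.txt"
echo "second" > "$WORK_DIR/out.txt"
echo "third" >> "$WORK_DIR/out.txt"
OUT_CONTENT=$(cat "$WORK_DIR/out.txt")

# Separate files: STDOUT goes to stdout.txt, STDERR goes to errors.txt.
ls "$WORK_DIR/out.txt" "$WORK_DIR/missing" > "$WORK_DIR/stdout.txt" 2> "$WORK_DIR/errors.txt"

# Single file: STDOUT goes to both.txt, then STDERR is pointed at STDOUT.
ls "$WORK_DIR/out.txt" "$WORK_DIR/missing" > "$WORK_DIR/both.txt" 2>&1

STDOUT_LINES=$(wc -l < "$WORK_DIR/stdout.txt")
ERROR_LINES=$(wc -l < "$WORK_DIR/errors.txt")
BOTH_LINES=$(wc -l < "$WORK_DIR/both.txt")

echo "out.txt contains:"
echo "$OUT_CONTENT"
echo "line counts: stdout=$STDOUT_LINES errors=$ERROR_LINES both=$BOTH_LINES"
rm -rf "$WORK_DIR"
```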
Send data from one program to another with |
This is called piping.
ls | head -3
ls | head -3 | tail -1
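The same head and tail combination on predictable input, as a sketch. printf stands in for ls here so that the result does not depend on the contents of a directory.

```shell
#!/bin/bash
# Keep the first three lines, then keep the last of those: the third line.
THIRD_LINE=$(printf 'alpha\nbeta\ngamma\ndelta\n' | head -n 3 | tail -n 1)
echo "$THIRD_LINE"
```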
A few useful commands
source: https://analyticsindiamag.com/top-commands-in-shell-scripting-every-data-scientist-must-know/
Sed
Sed is used to search for a particular string in a file and then apply diverse operations such as replace, delete, insert, etc… So you can edit a file without opening it. It is often used for string replacement
sed “s/[Cc]omputer/COMPUTER/g” file
For example, here we perform a substitution (indicated by s/), replacing all occurrences (indicated by g for global) of computer or Computer with COMPUTER in the given file
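The effect of the g flag is easiest to see on a line that contains the word twice. In this sketch the input is piped in with printf instead of being read from a file, and the sample text is made up.

```shell
#!/bin/bash
# With g: every occurrence on each line is replaced.
WITH_G=$(printf 'my computer\nComputer lab, computer desk\n' | sed 's/[Cc]omputer/COMPUTER/g')

# Without g: only the first occurrence on each line is replaced.
WITHOUT_G=$(printf 'computer and computer\n' | sed 's/computer/COMPUTER/')

echo "$WITH_G"
echo "$WITHOUT_G"
```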
Grep
“Global Regular Expression Print or grep is a command-line tool which is basically used to search for a string of characters in a specified file. The grep filter searches a file for a particular pattern of characters, and displays all lines that contain that pattern”
Unlike sed, it is mainly used to return matching lines from a file
grep "literal_string" filename
Awk
“This command searches for text-based files or data and is basically used for generating information or manipulating data. It also allows users to implement numeric functions, string functions, logical operators, etc. It is useful for the transformation of data files along with creating formatted reports.” It is used for pattern scanning and processing
awk '/manager/ {print}' employee.txt
For example, this prints all lines from the employee.txt file that contain the word “manager”
awk '{print $1,$4}' employee.txt
Prints words 1 and 4 of each line
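Both awk forms, pattern matching and field selection, can be tried on a few lines of sample data. The employee records below are invented for this sketch and are piped in rather than read from employee.txt.

```shell
#!/bin/bash
# Made-up sample data: name, role, department, salary.
EMPLOYEES='ajay manager account 45000
sunil clerk account 25000
varun manager sales 50000'

# Print only the lines that contain the word "manager".
MANAGERS=$(printf '%s\n' "$EMPLOYEES" | awk '/manager/ {print}')

# Print the first and fourth whitespace-separated fields of every line.
NAME_AND_SALARY=$(printf '%s\n' "$EMPLOYEES" | awk '{print $1, $4}')

echo "$MANAGERS"
echo "$NAME_AND_SALARY"
```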
Tee
“Tee command reads the standard input and writes it to both the standard output and one or more files. […] It basically breaks the output of a program so that it can be both displayed and saved in a file”.
The -a option appends to the given file rather than overwriting it.
wc -l output.txt | tee -a file2.txt
In this example, we count the number of lines in output.txt, then display the result on the terminal and append it to file2.txt
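A sketch of tee with and without -a, using a scratch directory and made-up messages. The output that tee passes through is captured in a variable so it can be compared with what ends up in the file.

```shell
#!/bin/bash
WORK_DIR=$(mktemp -d)

# Without -a: tee creates (or overwrites) the file and also passes the text on.
SHOWN=$(echo "first run" | tee "$WORK_DIR/log.txt")

# With -a: the new text is appended to the existing file.
echo "second run" | tee -a "$WORK_DIR/log.txt" > /dev/null

LOG_CONTENT=$(cat "$WORK_DIR/log.txt")
echo "Passed through by tee: $SHOWN"
echo "File content:"
echo "$LOG_CONTENT"
rm -rf "$WORK_DIR"
```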
Cat
“The cat command is used for concatenating files and printing on the standard output. It allows the user to create as well as concatenate files after reading the given file”
cat text1
Read the content of the file text1
cat text1 text2 > log.txt
Concatenate the 2 files together and save the result in log.txt
Export
Allows you to pass environment variables to other processes. You can also export functions. It can be useful to export the PATH for example
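The difference between an exported and a non-exported variable shows up as soon as a child process is started. In this sketch the variable names are hypothetical, sh -c starts the child, and ${VAR:-unset} prints "unset" for a variable the child cannot see.

```shell
#!/bin/bash
# Hypothetical variable names for this illustration.
NOT_EXPORTED="invisible"
export EXPORTED_VAR="visible"

# The child shell only inherits the exported variable.
CHILD_SEES=$(sh -c 'echo "${EXPORTED_VAR:-unset} ${NOT_EXPORTED:-unset}"')
echo "Child process sees: $CHILD_SEES"
```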
Function example in a data science product case
execute_sql() {
  SQL_DIR=$1
  SQL_FILE=$2
  cat "$SQL_DIR/$SQL_FILE" > "./my_folder/execute_$SQL_FILE"
  hive -f "./my_folder/execute_$SQL_FILE"
}