Bash Scripting : hints

Bash hints worth knowing, or how to make complex problems simple

Expansion

Expansion is at the heart of Bash. Before a command is evaluated, the shell replaces expressions prefixed with '$' and special characters such as '*' with their computed values. This feature helps to write shorter scripts. Pathname expansion is an incredibly powerful tool that avoids having to specify exact pathnames in our arguments, or to go looking through the file system for the files we need. Brace expansion uses the same idea but does not depend on the file system.
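To make the difference concrete, here is a small sketch. It assumes a scratch directory created with mktemp, and the file names are only illustrative: pathname expansion produces only names that exist on disk, while brace expansion generates strings whether or not matching files exist.

```bash
#!/bin/bash

# work in a throwaway directory (illustrative names, not from the original page)
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
touch a.txt b.txt

# pathname expansion: only files that exist are listed
globbed=$(echo *.txt)
echo "$globbed"
# result : a.txt b.txt

# brace expansion: pure text generation, c.txt does not need to exist
braced=$(echo {a,b,c}.txt)
echo "$braced"
# result : a.txt b.txt c.txt

# expansion happens before the command runs: the command only sees the final words
set -- *.txt
nwords=$#
echo "$nwords"
# result : 2

cd / && rm -rf "$tmpdir"
```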

Brace expansion hints

#!/bin/bash
 
# basic list
echo {1,2,3,4,5}
# result : 1 2 3 4 5
 
 
# range list
echo {1..5}
# result : 1 2 3 4 5
 
# useful in a for loop
for i in {1..5}  ; do echo "$i" ; done
 
 
# Preamble and postscript ( ie    preamble{case1,case2} or {case1,case2}postscript )
echo file{.txt,.log}
# result : file.txt file.log
 
# brace expansion combined with pathname expansion
ls *{.txt,.log} # equivalent to "ls *.txt  *.log"
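
Ranges also accept zero padding and a step (Bash 4 and later), and several brace groups combine into every pairing. One pitfall is worth a sketch: brace expansion runs before variable expansion, so a range cannot use a variable as a bound. The variable names below are only illustrative.

```bash
#!/bin/bash

# zero-padded range (Bash 4+)
padded=$(echo {01..05})
echo "$padded"
# result : 01 02 03 04 05

# range with a step (Bash 4+)
stepped=$(echo {1..9..2})
echo "$stepped"
# result : 1 3 5 7 9

# two brace groups give every combination
combos=$(echo {a,b}{1,2})
echo "$combos"
# result : a1 a2 b1 b2

# a variable bound is NOT expanded as a range
n=3
literal=$(echo {1..$n})
echo "$literal"
# result : {1..3}

# use an arithmetic for loop instead
counted=""
for (( i = 1; i <= n; i++ )); do
    counted="${counted:+$counted }$i"
done
echo "$counted"
# result : 1 2 3
```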

String operations

String manipulation

#!/bin/bash
 
I=foobar
 
 
# remove a trailing pattern (suffix)
echo "${I%bar}"
# result :foo
 
 
# remove a leading pattern (prefix)
echo "${I#foo}"
# result :bar
 
# pattern substitution (replaces the first match)
STR="the wall is black"
echo "${STR/black/blue}"
 
# result : the wall is blue
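
The operators above remove the shortest match. Doubling them ('##', '%%') removes the longest match, and '//' replaces every occurrence instead of only the first. A short sketch, using a made-up path as input:

```bash
#!/bin/bash

P="/tmp/archive.tar.gz"

# shortest leading match of */ : strips only the first slash
echo "${P#*/}"
# result : tmp/archive.tar.gz

# longest leading match of */ : keeps only the file name
echo "${P##*/}"
# result : archive.tar.gz

# shortest trailing match of .* : strips the last extension
echo "${P%.*}"
# result : /tmp/archive.tar

# longest trailing match of .* : strips every extension
echo "${P%%.*}"
# result : /tmp/archive

# replace all occurrences
STR="black cat, black dog"
echo "${STR//black/blue}"
# result : blue cat, blue dog
```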

Uppercase and Lowercase for bash

#!/bin/bash
 
 
F="hello HELLO"
 
# convert to uppercase (needs Bash 4 or later)
echo "${F^^}"
# result : HELLO HELLO
 
 
# convert to lowercase (needs Bash 4 or later)
echo "${F,,}"
# result :hello hello

Substring and length of string

#!/bin/bash
 
 
F="abcdefgh"
echo ${F:5:3}
 
# result : fgh
 
 
echo ${#F}
# result : 8
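
Offsets can also count from the end of the string. Note the space before the minus sign, which keeps Bash from reading ':-' as the default-value operator. A brief sketch with the same kind of input:

```bash
#!/bin/bash

F="abcdefgh"

# last three characters
echo "${F: -3}"
# result : fgh

# from offset 2 to the end
echo "${F:2}"
# result : cdefgh

# everything except the last two characters (negative length, Bash 4.2+)
echo "${F:0:-2}"
# result : abcdef
```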

Use case: rename

#!/bin/bash
 
F="my_file.log"
echo ${F%.log}.txt
 
# result : my_file.txt
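
The same operators can split a path into its parts without calling dirname or basename. A minimal sketch, where the path and the variable names are invented for illustration:

```bash
#!/bin/bash

F="/var/log/my_file.log"

DIR="${F%/*}"       # directory part
BASE="${F##*/}"     # file name
STEM="${BASE%.*}"   # file name without its last extension
EXT="${BASE##*.}"   # last extension

echo "$DIR"
# result : /var/log
echo "$BASE"
# result : my_file.log
echo "$STEM"
# result : my_file
echo "$EXT"
# result : log

# rebuild the path with a new extension
echo "$DIR/$STEM.txt"
# result : /var/log/my_file.txt
```

If the name contains no dot, the pattern does not match and EXT ends up holding the whole file name, so that case should be checked before relying on it.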

Detailed documentation for sed doc_sed.odt
Detailed documentation for vim doc_vim.odt

Processing many files at once

Full use case: rename

#!/bin/bash
 
 
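# note: looping over a command substitution splits on whitespace,
# so this only works when no path contains spaces
for F in $(find . -name "*.java") ; do
  echo "$F"
  # rename .java to .txt
  mv "$F" "${F%.java}.txt"
done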

Processing larger datasets or files

Let's look at two ways to loop over the output of find: a 'for' loop and a 'while' loop. The 'for' loop is fine for small cases and both versions work in general, but the second variant ('while') is better for the following reasons:

It starts working sooner. In the first case the find command has to finish before any directory is processed. This isn't noticeable for small directory hierarchies, but it becomes very noticeable for large ones. In the second case find streams its results through the pipe and the while loop processes them as they arrive.
If directory names contain spaces, the second version will work, while the first won't, because the for loop splits the output of find on every whitespace character. Using 'IFS= read -r' in the while loop also preserves leading or trailing blanks and backslashes in names.

for

for i in $(find . -type d)
do
     # do some processing on the found directory
     echo "$i"
done

Should be replaced by

while

find . -type d | while IFS= read -r i
do
    # do some processing on the found directory
    echo "$i"
done
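
Two refinements to the while version are worth sketching. First, '-print0' together with "read -d ''" separates names with NUL bytes, so even names containing newlines survive. Second, a loop on the right side of a pipe runs in a subshell, so variables set inside it are lost; feeding the loop through process substitution avoids that. The directory names and counters below are invented for the demonstration, and '-print0' and '-mindepth' are GNU/BSD find options rather than POSIX.

```bash
#!/bin/bash

# scratch tree with an awkward directory name (illustrative)
tmpdir=$(mktemp -d)
mkdir -p "$tmpdir/plain" "$tmpdir/with space"

# pipe version: the loop runs in a subshell, so the counter is lost afterwards
count_pipe=0
find "$tmpdir" -mindepth 1 -type d -print0 | while IFS= read -r -d '' d ; do
    count_pipe=$((count_pipe + 1))
done
echo "after pipe: $count_pipe"
# result : after pipe: 0

# process substitution: the loop runs in the current shell
count_ps=0
while IFS= read -r -d '' d ; do
    count_ps=$((count_ps + 1))
    echo "found: $d"
done < <(find "$tmpdir" -mindepth 1 -type d -print0)
echo "after process substitution: $count_ps"
# result : after process substitution: 2

rm -rf "$tmpdir"
```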

When more cores are available and parallel processing is needed, GNU parallel can replace the 'for' loop. The GNU parallel method should be preferred, as the tool was developed specifically for this purpose.

parallel loop

#!/bin/bash
 
# standard sequential for loop (breaks on paths containing whitespace)
for F in $(find . -name '*.mp3') ; do mpg -w "$F" "${F%.mp3}.wav" ; done
 
 
# using GNU parallel: create a bz2 archive of every *.bin file, in parallel
find . -name '*.bin' 2>/dev/null | parallel tar  -cjf {1}.tar.bz2 {1}
 
 
# parallel, but without GNU parallel: run at most N processes concurrently
N=4
 
for i in {1..100}; do
    (
        # .. do your stuff here
        echo "starting task $i.."
        sleep $(( (RANDOM % 3) + 1))
    ) &
 
    # allow at most $N jobs to run in parallel
    if [[ $(jobs -r -p | wc -l) -ge $N ]]; then
        # wait for any one job to finish (wait -n needs Bash 4.3 or later)
        wait -n
    fi
 
done
 
# wait for pending jobs
wait
 
echo "all done"
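
Where GNU parallel is not installed, 'xargs -P' is another common way to cap the number of concurrent processes. This is a different tool from the ones shown above, offered only as a sketch: '-0' and '-P' are available in GNU and BSD xargs but are not part of POSIX, and the file names and the gzip command are placeholders for real work.

```bash
#!/bin/bash

# scratch directory with a few sample files (illustrative)
tmpdir=$(mktemp -d)
for i in 1 2 3 4 5 6 ; do echo "data $i" > "$tmpdir/file$i.bin" ; done

# compress every *.bin file: one file per gzip process, at most 4 processes at a time
find "$tmpdir" -name '*.bin' -print0 | xargs -0 -n 1 -P 4 gzip

# count the compressed files that were produced
compressed=$(find "$tmpdir" -name '*.bin.gz' | wc -l)
echo "compressed files: $compressed"

rm -rf "$tmpdir"
```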

Special trick

Special and internal variables

#!/bin/bash
 
 
# all arguments passed to the script, joined into a single string
echo "$*"
 
# all arguments passed to the script, each kept as a separate word
echo "$@"
 
#number of arguments passed to current script
echo $# 
 
# name of the shell script, as it was invoked
echo $0 
 
# argument 1
echo $1
 
#pid of the current shell
echo $$
 
 
 
 
bash -c "exit 24"
#last exit status
echo $?
 
# PID of the most recent background job (empty if none has been started)
echo "PID :"$!
cat /proc/cpuinfo >/dev/null &
echo "PID :"$!
 
 
#the (input) field separator 
echo -n "$IFS" | od -abc
 
SET="1 2 3 4 5"
 
for I in $SET ; do  echo "I="$I ;done
 
# result : 
#I=1
#I=2
#I=3
#I=4
#I=5
 
IFS=':'
echo -n "$IFS" | od -abc
 
for I in $SET ; do   echo "I="$I ;done
# result : 
#I=1 2 3 4 5
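
The difference between $* and $@ only shows up inside double quotes. A small sketch, using a hypothetical helper function count_args that reports how many arguments it received:

```bash
#!/bin/bash

# hypothetical helper: prints the number of arguments it receives
count_args() { echo "$#" ; }

set -- "first arg" "second arg"

count_args "$@"
# result : 2   (each argument stays a separate word)

count_args "$*"
# result : 1   (all arguments joined into one word)

count_args $*
# result : 4   (unquoted: split again on whitespace)

# "$*" joins the arguments with the first character of IFS
OLDIFS=$IFS
IFS=','
joined="$*"
IFS=$OLDIFS
echo "$joined"
# result : first arg,second arg
```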

Replace spaces with underscores

touch 'test test'
# split the output of find on newlines only, so names with spaces stay whole
IFS=$'\n'
for F in $(find . -name "* *") ; do
   # replace every space in the path with an underscore
   NEWF="${F// /_}"
   echo "$F"
   echo "$NEWF"
   mv "$F" "$NEWF"
done
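
Changing IFS for the rest of the script, as above, can surprise later commands. One way to limit the change, sketched here with invented sample data and variable names, is to set IFS for a single read command only:

```bash
#!/bin/bash

LINE="alice:x:1000"

# IFS applies to this read only; the shell-wide value is untouched
IFS=':' read -r NAME PASS USERID <<< "$LINE"

echo "$NAME"
# result : alice
echo "$USERID"
# result : 1000

# the default separator is still in effect afterwards
SET="1 2 3"
COUNT=0
for I in $SET ; do COUNT=$((COUNT + 1)) ; done
echo "$COUNT"
# result : 3
```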