Tuesday, November 22, 2011

Shell Script Errors. Part1. Basic syntax


Pitfalls of shell scripting. Part1

In this post (and in the next few posts) I'll try to cover some pitfalls where beginners to shell scripting might otherwise waste hours looking through the code to find the error.

Variable assignment
  •   No spaces may appear between the variable name, the equals sign, and the value.

  eg, var1=10     (correct)
      var1 = 10   (incorrect)
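
  With the spaces, the shell treats "var1" as a command name and "=" and "10" as its arguments, so the attempt fails with something like this (the exact wording varies by shell):

  $ var1 = 10
  -bash: var1: command not found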

      
Output redirection
  •   '>' operator writes the output to the file name mentioned.

  eg., date > date.txt

  •   '>>' appends to the file. Normal '>' overwrites the previous contents.

  eg., date >> date.txt

  
Input redirection
  •   '<' reads input from the named file

  eg., sort < data

  •   '<<' is a powerful operator that allows inline input redirection. Instead of specifying a file to read from, with this operator one can specify the input in the code itself. The format is that after the << symbol one places a marker, continues giving data, and ends the data with the same marker on a line of its own (this is the shell's "here document").

  eg., sort << datalist
  data1
  data2
  data3
  data4
  ...
  datan
  datalist

  
  The last datalist signifies that our data has ended with this marker.

Mathematical expressions
  •   Legacy style is to use the expr command to evaluate mathematical expressions.

  eg., var3=`expr $var1 + $var2`

  •   Note that we need the backtick operator (`) for the shell to recognize expr as a command to execute, with its output assigned to the variable var3. But this becomes tedious, and too much typing, as the expressions get big and ugly.
  •   So the $[ ] form can be used to avoid all of this. The same thing can be easily written as

  var3=$[$var1 + $var2]

  •   The part inside the brackets can be as complex as needed, with "()" used to mark off sub-expressions and so on, as sketched below.
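
  For instance, a minimal sketch with sub-expressions (values chosen only for illustration):

  var1=10
  var2=5
  var3=$[ ($var1 + $var2) * 2 ]
  echo $var3    # prints 30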
  
  LIMITATION
  •   The problem with bash shell is that it only supports integer arithmetic. Therefore
  var1=$[100 / 45] will store 2 as the result.
  
  To overcome this we can use bc, an arbitrary-precision calculator command (a separate utility, not part of bash), as it can handle decimal values and a much richer set of operators.

  Level1 change 


  var1=`echo " 100 / 5" | bc`
  echo $var1

  • This is a rather roundabout way of feeding input to bc, but it is the first one most people think of. 
  • bc is a tool that reads expressions from its input, so we echo the required expression and pipe it to the bc command.
  • We need to enclose the whole pipeline in backticks (`) for the shell to interpret it as a command.
  • The cool part is that bc can handle variables as well and supports the complex mathematical operations required in real-life applications.
  Level2 change

Instead of doing all this, there is a simpler way: enter the "<<" operator discussed above.
  

  var1=100
  var2=45
  var3=`bc << end_of_data
  a = ( $var1 + $var2 )
  b = (a * 2)
  b
  end_of_data
  `
  echo the output is $var3

  
  Note a few points in this example.
  •     the shell expands $var1 and $var2 inside the here-document, so bc works with variables set in the main code block
  •     an assignment in bc prints nothing; the lone b before the end marker is what makes bc print the final value into var3
  •     inside bc there is no restriction on spaces around the = sign when assigning
  •     values inside the brackets are not limited to integers (a restriction that, as you'll see later, applies to normal bash arithmetic)
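
  One more point worth noting: by default bc, too, truncates division to an integer; its special scale variable sets how many decimal digits to keep. A minimal sketch:

  var1=`echo "scale=2; 100 / 45" | bc`
  echo $var1    # prints 2.22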
Exit status.

  UNIX provides a special variable "$?" denoting the exit status from the last command run. But one must use it immediately after the command for which the exit status is to be inspected.
  eg.,


  $ echo hello
  $ echo $?
  (Gives the status of echo)
  $ touch file1
  $ ls file1
  $ echo $?

  (Gives the status of ls, not touch.)

  Generally in the UNIX world, some special codes have universal meaning across all commands.

 
Code   Meaning
0      Successful completion of the command
1      General unknown error
2      Misuse of shell command
126    The command can't execute
127    Command not found
128    Invalid exit argument
130    Command terminated with Ctrl-C
255    Exit status out of range

  One should make use of this when designing a script: it is handy for determining how a command exited, and for continuing or cancelling further operations based on the value, as sketched below.
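
  A minimal sketch of the idea (the file names here are just placeholders):

  cp data.txt /backup/data.txt
  if [ $? -ne 0 ]
  then
    echo "copy failed, aborting"
    exit 1
  fi
  echo "copy succeeded"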

Shell Script Errors. Part2. Control Structures


Pitfalls of shell scripting. Part2

Control structures

If statement


  • General programming languages let you write a condition after the if keyword, but in shell scripting one cannot write a bare condition there. 
  • It has to be a command, and the EXIT VALUE of that command is what the if condition tests. 
  • If the command returns 0, meaning successful completion, execution enters the "then" block; for any other value it skips the "then" block and moves on (to an "elif", "else" or "fi").
  • There can be more than one command on the if line; all of them are executed in the order specified, but the exit code of the last command alone decides the "if" condition. This is a common source of error.
  • There can be an "elif" statement (equivalent to the generic else-if) and an "else" statement (which should, of course, be the last conditional). 
  • But there has to be a "fi" statement at the end of the "if" block. 
  • One cannot put simple checks, like we do in other programming languages, on the "if" line. It has to be a command. So for checks like

    $var1 is greater than $var2,

we have to use the test command, which evaluates an expression and exits with status 0 for true and 1 for false.
    So we have to use this command to get around checks in the "if" block.
    eg., if test $var1 -gt $var2

  • There is a better substitute for the test command, which reduces typing and makes code more readable. [ ] operator is of use here.

    eg., if [ $var1 -gt $var2 ]

  • One hugely important point to note here is that there must be a space after the opening bracket and a space between the last character and the closing bracket. This is a very common source of error. A full sketch follows this list.
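
    Putting these rules together, a minimal sketch (the values are chosen only for illustration):

    var1=10
    var2=20
    if [ $var1 -gt $var2 ]
    then
      echo "var1 is bigger"
    elif [ $var1 -eq $var2 ]
    then
      echo "both are equal"
    else
      echo "var2 is bigger"
    fi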

    There are 3 classes of tests available in UNIX shell scripting.

1. Numeric comparisons

Operation                  Operator   Example
Less Than                  -lt        $var1 -lt $var2
Greater Than               -gt        $var1 -gt $var2
Less Than or Equal to      -le        $var1 -le $var2
Greater Than or Equal to   -ge        $var1 -ge $var2
Equal to                   -eq        $var1 -eq $var2
Not Equal to               -ne        $var1 -ne $var2


  • One notable shortcoming of the bash shell is that its own arithmetic cannot handle anything other than integers. So this would give an error:
      var1=`echo "scale=2; 10 / 3" | bc`
      if [ $var1 -gt 3 ]
      then
        echo something
      fi


It will give an error at the comparison, because var1 is 3.33 and test expects an integer. The [ ] test itself has no way around this.
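
The comparison itself can, however, be handed over to bc, which prints 1 for a true relation and 0 for a false one. A sketch:

      var1=`echo "scale=2; 10 / 3" | bc`
      if [ `echo "$var1 > 3" | bc` -eq 1 ]
      then
        echo something
      fi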

2. String comparisons

Operation      Operator   Example
Greater Than   >          $var1 > $var2
Less Than      <          $var1 < $var2
Equal to       =          $var1 = $var2
Not Equal to   !=         $var1 != $var2

  • Note that the equality test is only a single "=" unlike most other programming languages.
  •   Also note that one has to escape the ">" and "<" characters with a backslash (\); otherwise the shell treats them as file redirection operators.
    There are 2 handy operations available for strings
      -n = Tests if a string has length greater than 0.
      -z = Tests if a string has 0 length.
    eg.,

    var1="hello"
    var2=""
    if [ -n "$var1" ]   # Returns true
    if [ -z "$var2" ]   # Returns true
    (Quote the variables here: with an empty, unquoted $var2 the test would see no operand at all.)

    
    3. File comparisons
      This kind of comparison is useful when manipulating files and directories.

Operation                                 Operator   Example
Check if file exists and is a file        -f         -f file1
Check if file exists and is a directory   -d         -d file1
Check if file exists                      -e         -e file1
Check if file is writable                 -w         -w file1
Check if file is readable                 -r         -r file1
Check if file is executable               -x         -x file1
Check if file1 is newer than file2        -nt        file1 -nt file2
Check if file1 is older than file2        -ot        file1 -ot file2

Compound condition testing
  The normal [ ]  style can also handle multiple testing operations. 

   eg., [ $var1 -gt $var2 ] && [ $var3 -le $var4 ]

  
  This comes in handy when operating on files: one would like to test that a file exists and is writable before attempting to write to it. Checking first makes the program more robust, letting it fail early and loudly when something is not as expected, which is a desirable design policy.

  eg., if [ -f $file1 ] && [ -w $file1 ]

  
checks that the name in variable file1 exists, is a regular file, and is writable.

    There are 2 variations of the [ ]  operation.
      1. [[ ]] This is useful for string comparisons because it provides the ability for pattern matching.
      eg., if [[ -e /folder/file* ]]
      2. (( )) This is useful for numeric comparisons, as it can perform advanced operations like ** (exponentiation), << and >> (bitwise shifts), and & and | (bitwise AND/OR). A couple of sketches follow.
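
      A couple of minimal sketches of both forms (the variable names are only illustrative):

      if [[ $name == rc* ]]     # pattern match: true when $name starts with "rc"
      then
        echo "matched"
      fi

      if (( var1 ** 2 > 90 ))   # exponentiation, which [ ] cannot do
      then
        echo "big square"
      fi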

case command

  The syntax for the case command is
  

  case variable in
  pattern1 | pattern2 ... | patternN) command_Set1;;
  patternX) command_Set2;;
  *) default commands;;
  esac

  
  One nice feature is that multiple patterns can share one branch by separating them with the "|" operator.
  
  eg.,

  
  case $v in
  [0-4])echo lower half nos
        echo "skipping the rest";;
  [5-8])echo higher half nos
        echo "skipping the rest";;
  *)echo "It is 9!!!!";;
  esac

  
  Observe that the commands of each branch are listed one after the other as usual; the ";;" symbol appears only at the end of the command set.



Shell Script Errors. Part3. Looping constructs


Pitfalls of shell scripting. Part 3

Looping constructs.

FOR

  for var in list
  do
    commands
  done

  • list can be a space separated set of values on which we want to operate. 
  • One limitation here is that any word containing a space has to be enclosed in double quotes ("") for the shell to treat it as a single element of the list.
  • The power of this loop is that the list does not have to be hard coded. It can be the output of a command.
  eg.,
 
  for word in I don\'t know if "this'll" work
  do
    echo "WORD = $word"
  done
 

  OR
 
  list="word1 word2 word3 word4"
  list=$list" word5"
  for word in $list
  do
    echo "WORD = $word"
  done
 

  Observe the second line of the listing. This is how concatenation happens in shell scripting.

  OR
 
  for word in `cat words`
  do
    echo "WORD = $word"
  done

  • A problem with this (or any of the above methods) is that if the file or the listing contains spaces, newlines, etc., every space-separated value is treated as an individual element and processed separately, which is not what we wanted.
  • The reason is something more internal to UNIX: the "Internal Field Separator" (IFS), which determines which characters act as delimiters. 
  • By default IFS is set to space, tab, and newline, so whenever the shell encounters one of these, what follows becomes a separate element.
  • If in our case we want each newline alone to be considered a separator, we have to set IFS to "\n" explicitly before running the for loop.
  IFS=$'\n'
  for word in `cat words`
  ...
 

  An escape sequence such as \n has to be written with the $'...' syntax,
  IFS=$'\n'
  whereas a plain character can simply be assigned directly,
  IFS=:
  if we only want the ":" to be the field separator.
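
  Since a changed IFS stays in effect for everything that follows, a common defensive pattern is to save the old value and restore it afterwards. A minimal sketch:

  OLDIFS=$IFS
  IFS=$'\n'
  for word in `cat words`
  do
    echo "WORD = $word"
  done
  IFS=$OLDIFS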

  Read this post to find out some effective programs with this technique.

  There is also a C-style for loop, kept in order to allow C-programmers to be at ease. :-)

  for (( variable assignment; condition; iteration process ))
  eg.,

  for (( a = 1, b = 8; a <= 8; a++, b-- ))
  do
    echo "$a - $b"
  done

  This has multiple advantages over the normal for loop.
  1. Spaces are allowed around the assignments (unlike plain variable assignment, where spaces are forbidden).
  2. Multiple variables can be initialized.
  3. Accessing variables doesn't require a dollar($) sign.

WHILE

  This is a construct similar to the FOR loop.
 
  while test command
  do
    commands
  done
 

  The key to the while loop is that the exit status of the command tested on the first line has to change over the course of the loop; otherwise it becomes an infinite loop.
  eg.,
 
  var1=1
  while [ $var1 -lt 10 ]
  do
    echo "Something"
    var1=$[ $var1 + 2 ]
  done
 

Like IF, if several commands appear on the first line, only the exit status of the last one is used to decide whether the loop keeps running.

Controlling the loop
  • Extending the traditional programming styles, bash also provides the two commands "break" and "continue", which serve pretty much the same function as their counterparts in other famous languages.
  • One additional feature of both these commands is that they can be told to act across "n" levels of nesting (the current loop being level n = 1). Consider this example. 
  for (( a = 1; a < 4; a++ ))
  do
    echo "Outer: $a"
    for (( b = 1; b < 5; b++ ))
    do
      echo "Inner: $b"
      if [ $b -gt 3 ]
      then
        break 2 # tells to break 2 levels.
      fi
    done
  done

  This will give:

  Outer: 1
  Inner: 1
  Inner: 2
  Inner: 3
  Inner: 4
 

Since it breaks out of the outer loop, the whole execution stops. "continue" accepts a level count in the same way, as sketched below.
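
  For instance, with a level count (a sketch mirroring the break example above):

  for (( a = 1; a < 4; a++ ))
  do
    for (( b = 1; b < 5; b++ ))
    do
      if [ $b -gt 2 ]
      then
        continue 2   # resume the next iteration of the OUTER loop
      fi
      echo "$a - $b"
    done
  done

  This prints the pairs 1-1, 1-2, 2-1, 2-2, 3-1 and 3-2: every time b passes 2, control jumps straight to the outer loop's next iteration.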

Processing output of loops

There is one small piece of beauty in the looping constructs: instead of echo-ing things from within the loop, one can redirect the loop's entire output to a file or pipe it to a command.

  eg.,
 
  for file in /home/user/*
  do
    if [ -d "$file" ]
    then
      echo "$file is a directory"
    else
      echo "$file is a file"
    fi
  done > fileList.txt
 

  Instead of printing the output, it will write all of it to the file mentioned at the end of the loop.
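
  The same trick works with a pipe; for example, sorting whatever the loop produces (a minimal sketch):

  for file in /home/user/*
  do
    echo "$file"
  done | sort -r    # the loop's entire output flows through sort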

Shell Script. Part4. File handling


Small utility scripts.

Reading directories using wildcards

  for file in /home/user/*
  do
    if [ -d "$file" ]
    then
      echo "$file is a directory"
    elif [ -f "$file" ]
    then
      echo "$file is a file"
    else
      echo "$file is neither a file nor a directory"
    fi
  done

  Observe that in the checks $file is enclosed in double quotes: file names can contain spaces, and without the quotes a name with spaces would be seen by the test command as several separate words.

Reading and displaying the contents of the passwd file in some meaningful format.

Now the passwd file contains lines of records in the following format, denoting the information about each of the users(both visible and system)

login_name:password:user_id:group_id:description_of_account:HOME_directory_location:default_shell_for_user

We would like to present each user's information in a tree-like structure, which makes things more readable. (You could also redirect the output to a file for future reference, as shown earlier.)

  IFS=$'\n'
  for line in `cat /etc/passwd`
  do
    echo "Values: $line"
    IFS=:
    for field in $line
    do
      echo "  Field: $field"
    done
  done

Shell Script examples for user Input. Part6


Handling User Inputs from the shell - Code snippets


1.  Grab all parameters
  
    count=1
    echo "No of parameters = $#"
    while [ $count -le $# ]
    do
      echo "param[$count] = ${!count}"
      count=$[$count+1]
    done
   

    A direct reference like ${$count} is invalid, and $10 is read by the shell as ${1}0, so plain positional references break down beyond 9 parameters; the indirect form ${!count} used above avoids both problems.

2.  Grab all parameters
   
    count=1
    for var in "$*"
    do
      echo "Param[$count] = $var"
      count=$[$count+1]
    done
 
    count=1
    for var in "$@"
    do
      echo "param[$count] = $var"
      count=$[$count+1]
    done
 
    Note the double quotes: unquoted, $* and $@ behave identically; the quoted forms show the difference below.
    Using this with
    ./file a b c d1
 
    will produce
    Param[1] = a b c d1
    param[1] = a
    param[2] = b
    param[3] = c
    param[4] = d1
   

    For explanation look HERE.


3.  Use of shift to process parameters

    count=1
    while [ -n "$1" ]
    do
      echo "param[$count]: $1"
      count=$[$count + 1]
      shift
    done
   

    For explanation look HERE.

4.  Using shift to process options and parameters

    Suppose your code expects options a, b (with 2 parameters), c, d, and then a set of parameters.
   
    count=1
    while [ -n "$1" ]
    do
      case "$1" in
      -a) echo "Found -a option";;
      -b) echo "Found -b option with value $2 and $3"
          shift 2;;       # since we know we have consumed 3 values from the list of parameters.We shift one at the end of each iteration
      -c) echo "Found -c option";;
      -d) echo "Found -d option";;
      --) shift           # We use the -- separator to delineate options from parameter set
          break;;
      *)  echo "$1 not an option";;
      esac
      shift     # shift out whichever option we just handled
    done
 
    count=1
    for param in "$@"     # now only params remain
    do
      echo "Param[$count] = $param"
      count=$[$count + 1]
    done
   

    The dependency on the "--" is the only drawback of this method.
    For explanation look HERE.

5.  using getopt to process parameters

    set -- `getopt ab:c:d "$@"`
    while [ -n "$1" ]
    do
      case "$1" in
        -a) echo "Found -a option" ;;
        -b) echo "Found -b option with parameter $2"
            shift ;;
        -c) echo "Found -c option with parameter $2"
            shift;;
        -d) echo "Found -d option";;
        --) shift           # this marker means the options have ended
            break;;
        *)  echo "Not supported option $1";;
      esac
      shift
    done
 
    count=1
    for param in "$@"     # now only params remain
    do
      echo "Param[$count] = $param"
      count=$[$count + 1]
    done
   

    For explanation look HERE.
    But this fails when the parameters have spaces in them.

6.  using getopts to process parameters.

    count=1
    while getopts ab:c:d opt
    do
      case "$opt" in
        -a) echo "Found -a option" ;;
        -b) echo "Found -b option with parameter $OPTARG";;
        -c) echo "Found -c option with parameter $OPTARG";;
        -d) echo "Found -d option";;
        *)  echo "Found unknown option $opt";;
      esac
    done
 
    shift $[ $OPTIND - 1 ]
 
    count=1
    for param in "$@"     # now only params remain
    do
      echo "Param[$count] = $param"
      count=$[$count + 1]
    done
   

    For explanation look HERE.
     

Shell Scripting errors. Part5. Handling User Input


Pitfalls of shell scripting. Part 5

Handling User Input.

  1. Command line parameters
  $ ./file 10 20 30
  •   The way to capture these parameters is to use the variables $1, $2, $3, ... $9, ${10}, ${11}, ... in your code.
  •   Note: If the program expects 3 parameters and is passed fewer, $3 expands to nothing, which can cause run-time errors in the commands and tests that use it. Hence it is always better to check for a parameter before using it.
  if [ -n "$1" ]
  then
    # Use $1
  fi

  •   The program name can be read with $0.
  •   The bare program name, irrespective of the path it was invoked with, can be found by
  name=`basename $0`

  $ ./file  #name=file
  $ /home/user/dir/file   #name=file
 

  2. Counting parameters.
  •   $# provides the number of parameters passed. This can be used to check if the program was called with the required no of parameters, and further execution can be stopped otherwise.
  if [ $# -lt 4 ]
  then
    echo "Usage: file a b c d"
  else
    #Commands
  fi
 

  3. Grabbing the last parameter.
  •   We cannot use echo "Last param= ${$#}" because a $ cannot appear inside the {} braces. Another common source of bugs.
  •   To get it, we have to replace the $ with !.
  echo "Last param= ${!#}"

  Look HERE to see a way to grab all the parameters.

  4. Grabbing all parameters
  •   The shell also provides Perl-like variables which hold all the parameters passed in.
  •   $* and $@ are two such variables.
  •   $* stores the entire param set as a single string.
  •   $@ keeps each value passed as a separate element, like a list, and can be iterated over. (The difference between the two shows up when they are quoted.)
  Look HERE to see their usage and differences.

  5. Playing with parameters.
  • shift is a command which lets us consume one parameter at a time: it removes $1 from the set of parameters and moves the remaining ones one place to the left. 
  • Hence $1 always holds the next unprocessed parameter when we walk the param list in a loop.
  • shift can also move multiple parameters together.
  eg., echo "$1, $2, $3, $4"
       shift 4    # Now that we have used the 4 params, remove them all at once
     

  Look HERE to use shift and elegantly process all parameters.

  6. Options
  •   We can continue using the shift to work on the options and their parameters, if any.
  •   But this is rather tedious and we need to manually take care of each of the options.

  7. Separating options from parameters.
  • The standard technique of doing this is providing a "--" after the options have ended and then list the parameters. 
  • So the code knows to operate on the option list first using shift and grab their parameters, till it reaches --. 
  • After that, $@ holds only the remaining true parameters, and we can operate on them as in example 4 above.
  Look HERE to see a piece of code doing this.

This can handle even complex situations, but again everything has to be checked and coded manually, which is not a practical way to go about it.

  8. getopt command.
  •   getopt is a command (an external utility, not a shell builtin) that separates the options from the parameters. In effect it produces the same options / "--" / parameters layout that we built by hand in #7.
  •   Syntax:
getopt options optstring parameters
  •   We list each option letter and place a colon(:) after the one that requires a parameter.
  eg.,   getopt ab:cd -a -b test1 -cd test2 test3
  •   Here test1 is the parameter of b and test2, test3 are program parameters.
  •   The way to use it in the code is
     set -- `getopt ab:cd: "$@"`
  •   We specify the list $@ as the set of parameters
  Look HERE to look at a piece of code doing this.
  •   But this fails to handle real-world cases where the parameter has spaces and has been enclosed in double quotes("") at the command line.

  9. getopts command
  •   getopts is the ultimate tool for processing the command line options, howsoever nested, complex and disparate in nature.
  •   It processes the parameters one by one as it detects them and thus can be used in a loop, thereby removing the need to write code for each option.
  •   Syntax:
getopts optstring variable
  • It places the current option letter in "variable". It is powerful because an option's argument is made available in the environment variable $OPTARG, and the index of the next argument to be processed is kept in another env variable, $OPTIND.
  • After it has run out of options from the optstring, we are left with the parameters, and we know it has consumed $OPTIND - 1 arguments. 
  • So we can shift that many values, and assume that the remaining parameter set (available with $@) is our set of true parameters meant for the program. 
  • They can be easily operated in another loop, as we have done above.
  Look HERE to see a piece of code doing this.

  Generally in UNIX world, there is a practice of standardizing the options, which can be found HERE.

  10. Getting user Input


    a)  read command can be used to read values and place them in a variable of our choice
        eg.,
        echo -n "Enter Name: "    # -n doesn't create a newline after the echo
        read name                 # $name now contains whatever the user typed till pressing Enter


    b)  Multiple values can be stored in multiple variables.
        echo -n "First, Last Name: "
        read first last

     
    c)  Input process can be timed out using -t so that program does not wait forever for the user's input
        read -t 10 -p "Enter a no: "      # times out after 10 secs.
     

        -p allows giving a prompt inline. Without a variable name, the input is available in the environment variable $REPLY.
     
    d)  read can also be limited to a specific no of characters using the -n option.
        This is useful for Y/N responses from the user.
        As soon as the user has typed that many characters, the read finishes without waiting for Enter.
        eg., read -n1 -p "Continue? (Y/N): " answer    # answer stores Y/N/any other key pressed, so it must be checked before use
     
    e)  Silent reading is useful when reading passwords, using the -s option
        eg., read -s -p "Enter password: " pass
     
        What the shell does in these reads is make the typed text the same color as the background, so it cannot be read.
     
    f)  Reading from a file.
     
        eg.,
       
        count=1
        cat file | while read line
        do
          echo "Line $count: $line"
          count=$[ $count + 1 ]
        done
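
        An equivalent form avoids the extra cat process by redirecting the file straight into the loop. It also keeps count visible after the loop ends, since no pipeline subshell is involved:

        count=1
        while read line
        do
          echo "Line $count: $line"
          count=$[ $count + 1 ]
        done < file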

Monday, November 21, 2011

Hack for functions inside loop in Javascript


In one of my earlier blogs, I had mentioned that it is not advisable to create functions inside a loop in Javascript, as any strict environment, such as a JsLint installation, will flag it as an error in the JS file.

I had also cited an alternative approach to call a function inside the loop, instead of creating it inline.

But in some cases, it becomes imperative to have this functionality and even calling the function is not helpful.

Consider this piece of code


var obj = [ { a: ".link1", b: ".div1", c: ".div2"},
{ a: ".link4", b: ".div3", c: ".div4"},
...
];

for (var i = 0; i < obj.length; i++) {

  var o = obj[i];
  $(o.a).click(function(e) {
    e.preventDefault(); // some more operations...
  });
}
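
One widely used workaround (a sketch, not necessarily the exact hack intended here) is to build each handler through a factory function defined outside the loop, so every iteration captures its own copy of o instead of all handlers sharing the last one:

function makeClickHandler(o) {   // returns a fresh handler bound to this "o"
  return function (e) {
    e.preventDefault();
    // some more operations using o.b, o.c ...
  };
}

for (var i = 0; i < obj.length; i += 1) {
  $(obj[i].a).click(makeClickHandler(obj[i]));
}

Since no function literal is created inside the loop body, strict checkers like JsLint are satisfied as well.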



Monday, November 14, 2011

Smart UNIX commands


The power of UNIX commands is that each of them comes with an array of parameters and options. But practically speaking, it becomes difficult to remember them all and use them effectively.

In the UNIX world, some options carry a common meaning across most commands:


Option   Meaning
-i       Ignore text case.
-v       Verbose mode. Echoes the steps of operation.
-a       Show all objects.
-c       Produce a count.
-d       Specify a directory.
-f       Specify a file to read from.
-h       Display a help message for the command.
-l       Produce a long format of output.
-o       Specify an output file.
-q       Run in quiet mode. Doesn't echo messages and warnings.
-r       Process directories and files recursively.
-s       Run in silent mode.
-y       Answer yes to all questions.

Some commands which are used frequently have some useful option set, of which I will be talking here.

1) ls - The first command that we use on opening the terminal. It has a lot of options, which make it one of the most powerful tools, but remembering them all is a task in itself and only a few are used frequently.

A handy pack is called the "-sail" set (Go ahead and attach meaning to it)

This set of options generally produces the most effective listing with the most common details.
  • -s = Block size of files
  • -a = list all files
  • -i = lists the file serial number, popularly known as the inode number, which uniquely identifies the file at the system level. This value can be used in machine-level implementations and internal representation scenarios. (Go ahead and think of ways you can use it.)
  • -l = long format, which produces the classical listing of files with their extended information.


A snapshot helps in understanding (with the normal -l option):

[screenshot: ls -l listing]

And combined with the above options:

[screenshot: ls -sail listing]
2) touch - Used to create an empty file of size 0 bytes. Useful when we want to write to it later.

[screenshot: ls -l listing after touch]

Observe the "0" after the "staff", which indicates the size of the file.

3) cp - copy command. Some useful options are
  • -p = preserve the file access and modification times of the original file. Basically the copy keeps the same timestamps as the original file, instead of being stamped with the time of its own creation.

A snapshot without the -p option (pay close attention to the timestamps of the files):

[screenshot: cp without -p]

    With the -p option:

[screenshot: cp with -p]
Also, rcp is a command used for remote file copy. Of course, one needs to have at least read access to the remote path. That is a different story altogether.
4) ps - process listing. This is another powerful command with a whole bunch of options to choose from. But it becomes again difficult to remember all of them and their meanings. So a concise set is -ef if we want to see everything running on our system, where
  • -e = show all processes
  • -f = show formatted listing for processes.
Without the -e option one gets a very minimal set of processes (those belonging to the current user and running on the current terminal):

[screenshot: plain ps]

With the -ef option:

[screenshot: ps -ef]

Observe the difference.

The columns displayed with the full listing have the following meaning:-

  • UID = User Id. (assigned by the system to each user)
  • PID = Process Id. (assigned by the system to each running process)
  • PPID = Parent Process Id. (Id of the process which spawned the current process. 0 if it is an independent process)
  • C = CPU utilization over the lifetime of the process
  • STIME = the time at which the process started
  • TTY = Terminal device from which the process was launched. For most processes it is ?? as they are system processes and not launched from any terminal. For processes that were launched by users explicitly or indirectly one can get values for this column.
  • TIME = Cumulative CPU time the process has used.
  • CMD = The command that started the process (everything in a UNIX-based OS runs as a process).