Tuesday, November 22, 2011

Shell Script Errors. Part1. Basic syntax


Pitfalls of shell scripting. Part1

In this post (and in the next few posts) I'll try to cover some pitfalls where beginners to shell scripting can waste hours looking through their code to find the error.

Variable assignment
  •   No spaces are allowed between the variable name, the equals sign, and the value.

  eg, var1=10     (correct)
      var1 = 10   (incorrect)

      
Output redirection
  •   '>' operator writes the output to the file name mentioned.

  eg., date > date.txt

  •   '>>' appends to the file. Normal '>' overwrites the previous contents.

  eg., date >> date.txt

  
Input redirection
  •   '<' reads from a file mentioned

  eg., sort < data

  •   '<<' is a powerful operator that allows inline input redirection. Instead of specifying a file to read from, one can supply the input in the script itself. After the << symbol one places a marker, then the data, ending with the same marker on a line of its own.

  eg., sort << datalist
  data1
  data2
  data3
  data4
  ...
  datan
  datalist

  
  The final datalist marks the point where our data ends.

Mathematical expressions
  •   Legacy style is to use the expr command to evaluate mathematical expressions.

  eg., var3=`expr $var1 + $var2`

  •   Note that we need the backticks (`) so that the shell runs expr as a command and assigns its output to the variable var3. But this becomes tedious and too much typing as expressions get big and ugly.
  •   So the $[ ] syntax can be used to avoid all of this. The same thing can be written as

  var3=$[$var1 + $var2]

  •   The part inside the brackets can be as complex as needed, with "()" used to mark off sub-expressions and so on.
  
  LIMITATION
  •   The problem with the bash shell is that this only supports integer arithmetic. Therefore
  var1=$[100 / 45] will store 2 as the result.
  
  To overcome this we use the bc command-line calculator, as it handles arbitrary precision and a much richer set of operators.

  Level1 change 


  var1=`echo " 100 / 5" | bc`
  echo $var1

  • This is a rather convoluted way of giving input to bc, but it is the first way one thinks of. 
  • bc is a tool that needs input, so we echo the required expression and pipe it to the bc command.
  • We need to enclose the whole line in backticks (`) for the shell to interpret it as a command.
  • The cool part is that bc can handle variables as well, and it supports the complex mathematical operations required in real-life applications.
  Level2 change

Instead of doing all this, there is a simpler way: the "<<" (here-document) operator discussed above.
  

  var1=100
  var2=45
  var3=`bc << end_of_data
  a = ( $var1 + $var2 )
  b = (a * 2)
  b
  end_of_data
  `
  echo the output is $var3

  
  Note a few points in this example.
  •     shell variables from the surrounding script are expanded inside the here-document, so bc sees their values
  •     values assigned to variables are not limited to integers
  •     values placed inside the brackets are not limited to integers either (a restriction you'll see later on in normal bash shell scripts)
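  As a side note, bc only prints fractional digits once its scale variable is set. A minimal sketch, assuming bc is installed (the variable names and values are only illustrative):

  var1=100
  var2=45
  var3=`bc << end_of_data
  scale = 4
  $var1 / $var2
  end_of_data
  `
  echo "100 / 45 = $var3"    # prints 2.2222
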
Exit status.

  UNIX provides a special variable, "$?", which holds the exit status of the last command run. But one must inspect it immediately after the command whose exit status is to be examined.
  eg.,


  $ echo hello
  $ echo $?
  (Gives the status of echo)
  $ touch file1
  $ ls file1
  $ echo $?

  (Gives the status of ls, not touch.)

  Generally in the UNIX world, some special codes have universal meaning across all commands.

 
  Code   Meaning
  0      Successful completion of the command
  1      General unknown error
  2      Misuse of a shell command
  126    The command cannot execute
  127    Command not found
  128    Invalid exit argument
  130    Command terminated with Ctrl-C
  255    Exit status out of range

  One should use these codes when designing a script. They are also handy for determining how a command exited and for resuming or cancelling further operations based on the value.
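  For instance, a minimal sketch of inspecting $? interactively (the paths are only illustrative):

  $ ls /no/such/path
  $ echo $?       # prints a non-zero value: the ls failed
  $ ls /tmp > /dev/null
  $ echo $?       # prints 0: the command succeeded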

Shell Script Errors. Part2. Control Structures


Pitfalls of shell scripting. Part2

Control structures

If statement


  • General programming languages let you put a condition after the if keyword, but in shell scripting one cannot write a bare condition. 
  • It has to be a command, and the EXIT VALUE of that command is what the if condition tests. 
  • If the command returns 0, meaning successful completion, control enters the "then" block; for all other values it skips past the "then" block.
  • There can be more than one command on the if line; all of them are executed in the order specified, but only the exit code of the last command is used to check the "if" condition. This is a common source of error.
  • There can be an "elif" statement(equivalent to the generic-else if), an "else" statement(which should, of course be the last conditional). 
  • But there has to be a "fi" statement at the end of the "if" block. 
  • One cannot put simple checks, like we do in other programming languages, on the "if" line; it has to be a command. So for a check like

    $var1 is greater than $var2,

we have to use the test command, which evaluates an expression and exits with status 0 for true and 1 for false.
    So we use this command to express checks in the "if" block.
    eg., if test $var1 -gt $var2

  • There is a better substitute for the test command, which reduces typing and makes the code more readable: the [ ] operator.

    eg., if [ $var1 -gt $var2 ]

  • One hugely important point to note here is that there must be a space after the opening bracket and a space between the last character and the closing bracket. This is a huge source of common errors.
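
    Putting these pieces together, a minimal sketch of a complete if/elif/else block (the values are only illustrative):

    var1=10
    var2=20
    if [ $var1 -gt $var2 ]
    then
      echo "var1 is greater"
    elif [ $var1 -eq $var2 ]
    then
      echo "they are equal"
    else
      echo "var2 is greater"
    fi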

    There are 3 classes of tests available in UNIX shell scripting.

1. Numeric comparisons

  Operation                  Operator   Example
  Less than                  -lt        $var1 -lt $var2
  Greater than               -gt        $var1 -gt $var2
  Less than or equal to      -le        $var1 -le $var2
  Greater than or equal to   -ge        $var1 -ge $var2
  Equal to                   -eq        $var1 -eq $var2
  Not equal to               -ne        $var1 -ne $var2


  • One notable shortcoming of the bash shell is that these tests cannot handle anything other than integers. So this operation would give an error.
      var1=`echo "scale=2; 10 / 3" | bc`
      if [ $var1 -gt 3 ]
      then
        echo something
      fi


It will give an error at the comparison, since var1 is 3.33 and the test expects an integer. And there is no workaround for this inside [ ]. :-(

2. String comparisons

  Operation      Operator   Example
  Greater than   >          $var1 > $var2
  Less than      <          $var1 < $var2
  Equal to       =          $var1 = $var2
  Not equal to   !=         $var1 != $var2

  • Note that the equality test is only a single "=" unlike most other programming languages.
  • Also note that one has to escape the ">" and "<" characters with the usual backslash (\), otherwise the shell treats them as the file redirection operators.
    There are 2 handy operations available for strings
      -n = Tests if a string has length greater than 0.
      -z = Tests if a string has 0 length.
    eg.,

    var1="hello"
    var2=""
    if [ -n "$var1" ] # Returns true
    if [ -z "$var2" ] # Returns true

    
    3. File comparisons
      This kind of comparison is useful when manipulating files and directories.

     
  Operation                                Operator   Example
  Check if file exists and is a file       -f         -f file1
  Check if file exists and is a directory  -d         -d file1
  Check if file exists                     -e         -e file1
  Check if file is writable                -w         -w file1
  Check if file is readable                -r         -r file1
  Check if file is executable              -x         -x file1
  Check if file1 is newer than file2       -nt        file1 -nt file2
  Check if file1 is older than file2       -ot        file1 -ot file2

Compound condition testing
  The normal [ ] style can also combine multiple test operations. 

   eg., [ $var1 -gt $var2 ] && [ $var3 -le $var4 ]

  
  The curious reader might observe that this comes in handy when operating on files, as one would like to test that a file exists and is writable before attempting to write to it. This makes the program more robust: it fails hard and loudly when something is not as expected, which is a desirable design policy.

  eg., if [ -f $file1 ] && [ -w $file1 ]

  
checks whether the file named in $file1 exists, is a regular file, and is writable.

    There are 2 variations of the [ ] operation.
      1. [[ ]] This is useful for string comparisons because it provides pattern matching.
      eg., if [[ -e /folder/file* ]]
      2. (( )) This is useful for numeric comparisons, as it can perform complex operations like ** (exponentiation), << and >> (bitwise shifts), & and | (bitwise AND/OR), etc.
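
      For example, a small sketch of an arithmetic test with (( )); note that no $ is needed for variables inside the double parentheses (the values are only illustrative):

      var1=2
      if (( var1 ** 3 > 7 ))
      then
        echo "2 cubed is greater than 7"
      fi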

case command

  The syntax for the case command is
  

  case variable in
  pattern1 | pattern2 ... | patternN) command_Set1;;
  patternX) command_Set2;;
  *) default commands;;
  esac

  
  One good thing is that one can list multiple patterns for the same branch, separating them with the "|" operator.
  
  eg.,

  
  case $v in
  [0-4]) echo lower half nos
         echo "skipping the rest";;
  [5-8]) echo higher half nos
         echo "skipping the rest";;
  *)     echo "It is 9!!!!";;
  esac

  
  Observe that the commands in each branch are listed as-is; the ";;" symbol appears only at the end of each command set.



Shell Script Errors. Part3. Looping constructs


Pitfalls of shell scripting. Part 3

Looping constructs.

FOR

  for var in list
  do
    commands
  done

  • list can be a space-separated set of values on which we want to operate. 
  • One limitation here is that any word containing a space has to be enclosed in double quotes ("") for the shell to treat it as a single element of the list.
  • The power of this loop is that the list does not have to be hard coded. It can be the output of a command.
  eg.,
 
  for word in I don\'t know if "this'll" work
  do
    echo "WORD = $word"
  done
 

  OR
 
  list="word1 word2 word3 word4"
  list=$list" word5"
  for word in $list
  do
    echo "WORD = $word"
  done
 

  Observe the second line of the listing. This is how concatenation happens in shell scripting.

  OR
 
  for word in `cat words`
  do
    echo "WORD = $word"
  done

  • A problem here, or with any of the above methods, is that if the file or the listing contains spaces, every space-separated value is treated as an individual element and processed separately, which is not what we wanted.
  • The reason is something more internal to UNIX: the shell's "Internal Field Separator" (IFS) determines which characters act as delimiters. 
  • By default IFS is set to the space, the tab and the newline, so whenever one of them is encountered the following text becomes a separate element.
  • If, in our case, we want only a newline to separate elements, we have to set IFS to "\n" explicitly before running the for loop.
  IFS=$'\n'
  for word in `cat words`
  ...
 

  Escape sequences such as the newline need the $'...' form shown above,
  IFS=$'\n'
  otherwise, for a plain character one can simply write
  IFS=:
  if we only want the ":" to be the field separator.
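
  A common precaution, sketched below, is to save the original IFS and restore it after the loop so that later word splitting is not affected (the file name "words" is only illustrative):

  OLDIFS=$IFS
  IFS=$'\n'
  for line in `cat words`
  do
    echo "LINE = $line"
  done
  IFS=$OLDIFS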

  Read this post to find out some effective programs with this technique.

  There is also a C-style for loop, kept in order to allow C-programmers to be at ease. :-)

  for (( variable assignment; condition; iteration process ))
  eg.,

  for (( a = 1, b = 8; a < 9; a++, b-- ))
  do
    echo "$a - $b"
  done

  This has multiple advantages over the normal for loop.
  1. Spaces are allowed around the variable assignments (unlike plain shell assignment, covered in Part 1).
  2. Multiple variables can be initialized.
  3. Accessing variables doesn't require a dollar($) sign.

WHILE

  This is a construct similar to the FOR loop.
 
  while test command
  do
    commands
  done
 

  The key to the while loop is that the condition tested on the first line has to change over the course of the loop, otherwise it becomes an infinite loop.
  eg.,
 
  var1=20
  while [ $var1 -gt 10 ]
  do
    echo "Something"
    var1=$[ $var1 - 2 ]
  done
 

Like IF, only the exit status of the last command on the first line is used to decide whether the loop keeps running.
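
  To illustrate, a small sketch with two commands on the while line; only the exit status of the last one (the test) decides whether the loop continues (the values are only illustrative):

  var1=0
  while echo "checking, var1 = $var1"; [ $var1 -lt 3 ]
  do
    var1=$[ $var1 + 1 ]
  done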

Controlling the loop
  • Extending the traditional programming styles, bash also provides the two commands "break" and "continue", which serve much the same function as their counterparts in other popular languages.
  • One additional feature of both these commands is that they can be told to act across "n" levels in cases of multiple nesting (the current level of nesting being n = 1). Consider this example. 
  for (( a = 1; a < 4; a++ ))
  do
    echo "Outer: $a"
    for (( b = 1; b < 5; b++ ))
    do
      echo "Inner: $b"
      if [ $b -gt 3 ]
      then
        break 2 # tells to break 2 levels.
      fi
    done
  done

  This will give:

  Outer: 1
  Inner: 1
  Inner: 2
  Inner: 3
  Inner: 4
 

Since it breaks out of the outer loop as well, the whole execution stops. "continue" can be used similarly.
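
  A small sketch of "continue 2" in the same spirit (the loop bounds are only illustrative):

  for (( a = 1; a < 4; a++ ))
  do
    for (( b = 1; b < 4; b++ ))
    do
      if [ $b -eq 2 ]
      then
        continue 2    # jump straight to the next iteration of the outer loop
      fi
      echo "a = $a, b = $b"
    done
  done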

Processing output of loops

There is one small piece of beauty in the looping constructs. Instead of echo-ing things from within the loop, one can decide to write them to a file or pipe to a command.

  eg.,
 
  for file in /home/user/*
  do
    if [ -d "$file" ]
    then
      echo "$file is a directory"
    else
      echo "$file is a file"
    fi
  done > fileList.txt
 

  Instead of printing the output, it will write all of it to the file mentioned at the end of the loop.
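
  The same idea works with a pipe; a brief sketch (the directory path is only illustrative):

  for file in /home/user/*
  do
    echo "$file"
  done | sort -r      # the loop's entire output is piped into sort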

Shell Script. Part4. File handling


Small utility scripts.

Reading directories using wildcards

  for file in /home/user/*
  do
    if [ -d "$file" ]
    then
      echo "$file is a directory"
    elif [ -f "$file" ]
    then
      echo "$file is a file"
    else
      echo "$file is neither a file nor a directory"
    fi
  done

  Observe that in the tests $file is enclosed in double quotes, since names can contain spaces, and in that case we want the whole name to be treated as a single element by the FOR loop.

Reading and displaying the contents of the passwd file in some meaningful format.

Now, the passwd file contains one record per line in the following format, giving information about each user (both regular and system users):

login_name:password:user_id:group_id:description_of_account:HOME_directory_location:default_shell_for_user

We would like to present each user's information in a tree-like structure, which makes things more readable. (You can also write the output to a file for future reference.)

  IFS=$'\n'
  for line in `cat /etc/passwd`
  do
    echo "Values: $line"
    IFS=:
    for field in $line
    do
      echo "  Field: $field"
    done
  done

Shell Script examples for user Input. Part6


Handling User Inputs from the shell - Code snippets


1.  Grab all parameters
  
    count=1
    echo "No of parameters = $#"
    while [ $count -le $# ]
    do
      echo "param[$count] = ${!count}"
      count=$[$count+1]
    done
   

    But referring to the parameters directly as $1 ... $9 stops working once the number of command line parameters exceeds 9; beyond that you need the braced form ${10}, ${11}, ... (or the indirect ${!count} used above).

2.  Grab all parameters
   
    count=1
    for var in $*
    do
      echo "Param[$count] = $var"
      count=$[$count+1]
    done
 
    count=1
    for var in $@
    do
      echo "param[$count] = $var"
      count=$[$count+1]
    done
 
    Using this with
    ./file a b c d1
 
    will produce
    Param[1] = a
    Param[2] = b
    Param[3] = c
    Param[4] = d1
    param[1] = a
    param[2] = b
    param[3] = c
    param[4] = d1
   

    For explanation look HERE.


3.  Use of shift to process parameters

    count=1
    while [ -n "$1" ]
    do
      echo "param[$count]: $1"
      count=$[$count + 1]
      shift
    done
   

    For explanation look HERE.

4.  Using shift to process options and parameters

    Suppose your code expects options a, b (which takes 2 values), c, d, followed by a set of parameters.
   
    count=1
    while [ -n "$1" ]
    do
      case "$1" in
      -a) echo "Found -a option";;
      -b) echo "Found -b option with values $2 and $3"
          shift 2;;       # -b consumed two extra values; the shift at the end of the loop removes the option itself
      -c) echo "Found -c option";;
      -d) echo "Found -d option";;
      --) shift           # We use the -- separator to delineate options from parameter set
          break;;
      *)  echo "$1 not an option";;
      esac
      shift     # shift out whichever option we just processed
    done
 
    count=1
    for param in "$@"     # now only params remain
    do
      echo "Param[$count] = $param"
      count=$[$count + 1]
    done
   

    The dependency on the "--" is the only drawback of this method.
    For explanation look HERE.

5.  using getopt to process parameters

    set -- `getopt ab:c:d "$@"`
    while [ -n "$1" ]
    do
      case "$1" in
        -a) echo "Found -a option" ;;
        -b) echo "Found -b option with parameter $2"
            shift ;;
        -c) echo "Found -c option with parameter $2"
            shift;;
        -d) echo "Found -d option";;
        --) shift
            break;;     # this marker means the options have ended
        *)  echo "Not supported option $1";;
      esac
      shift
    done
 
    count=1
    for param in "$@"     # now only params remain
    do
      echo "Param[$count] = $param"
      count=$[$count + 1]
    done
   

    For explanation look HERE.
    But this fails when the parameters have spaces in them.

6.  using getopts to process parameters.

    count=1
    while getopts ab:c:d opt
    do
      case "$opt" in
        a) echo "Found -a option" ;;      # note: getopts strips the leading dash
        b) echo "Found -b option with parameter $OPTARG";;
        c) echo "Found -c option with parameter $OPTARG";;
        d) echo "Found -d option";;
        *) echo "Found unknown option $opt";;
      esac
    done
 
    shift $[ $OPTIND - 1 ]
 
    count=1
    for param in "$@"     # now only params remain
    do
      echo "Param[$count] = $param"
      count=$[$count + 1]
    done
   

    For explanation look HERE.
     

Shell Scripting errors. Part5. Handling User Input


Pitfalls of shell scripting. Part 5

Handling User Input.

  1. Command line parameters
  $ ./file 10 20 30
  •   The way to capture these parameters is to use the variables $1, $2, $3, ... $9, ${10}, ${11}, ... in your code.
  •   Note: If the program expects 3 parameters and is passed fewer, $3 simply expands to an empty string, which usually breaks whatever command uses it. Hence it is always better to check for a parameter before using it.
  if [ -n "$1" ]
  then
    # Use $1
  fi

  •   The program name can be read with $0.
  •   The program name, irrespective of the path used to invoke it, can be found with
  name=`basename $0`

  $ ./file  #name=file
  $ /home/user/dir/file   #name=file
 

  2. Counting parameters.
  •   $# holds the number of parameters passed. This can be used to check whether the program was called with the required number of parameters, and to stop further execution otherwise.
  if [ $# -lt 4 ]
  then
    echo "Usage: ./file a b c d"
  else
    #Commands
  fi
 

  3. Grabbing the last parameter.
  •   We cannot use echo "Last param= ${$#}", because a $ cannot appear inside the {} braces like that. Another common bug.
  •   To get it, we have to replace the inner $ with !.
  echo "Last param= ${!#}"

  Look HERE to see a way to grab all the parameters.

  4. Grabbing all parameters
  •   The shell also provides Perl-like variables which hold all the parameters passed in.
  •   $* and $@ are two such variables.
  •   When double-quoted, "$*" expands to the entire parameter set as a single string.
  •   "$@" expands to each parameter as a separate word, so it behaves like a list and can be iterated over.
  Look HERE to see their usage and differences.
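
  A short sketch of the difference when the two are double-quoted (the argument values are only illustrative):

  set -- one "two three" four    # simulate three command line parameters
  for v in "$*"
  do
    echo "star: $v"              # a single element: "one two three four"
  done
  for v in "$@"
  do
    echo "at: $v"                # three separate elements, the embedded space preserved
  done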

  5. Playing with parameters.
  • shift is a command which lets you use one parameter at a time and then remove it from the set of parameters, shifting the remaining ones one place to the left. 
  • Hence the variable $1 always holds the next unprocessed parameter when we operate on the list of params in a loop.
  • shift can also move multiple parameters together.
  eg., echo "$1, $2, $3, $4"
       shift 4    # Now that we have used the 4 params, remove them all at once
     

  Look HERE to use shift and elegantly process all parameters.

  6. Options
  •   We can continue using the shift to work on the options and their parameters, if any.
  •   But this is rather tedious and we need to manually take care of each of the options.

  7. Separating options from parameters.
  • The standard technique for doing this is to provide a "--" after the options have ended and then list the parameters. 
  • So the code knows to operate on the option list first using shift and grab their parameters, till it reaches --. 
  • After that we can use the $@ to find out how many parameters remain, and they are true params and operate on them as in example 4 above.
  Look HERE to see a piece of code doing this.

This can handle even the most complex arrangements, but again everything has to be checked and coded manually, which is not a practical way to go about it.

  8. getopt command.
  •   getopt is a standard command that separates the options from the parameters; internally we use it to do the same thing as in #7.
  •   Syntax:
getopt options optstring parameters
  •   We list each option letter and place a colon(:) after the one that requires a parameter.
  eg.,   getopt ab:cd -a -b test1 -cd test2 test3
  •   Here test1 is the parameter of b and test2, test3 are program parameters.
  •   The way to use it in the code is
     set -- `getopt ab:cd "$@"`
  •   We specify the list $@ as the set of parameters
  Look HERE to look at a piece of code doing this.
  •   But this fails to handle real-world cases where the parameter has spaces and has been enclosed in double quotes("") at the command line.

  9. getopts command
  •   getopts is the ultimate tool for processing the command line options, howsoever nested, complex and disparate in nature.
  •   It processes the parameters one by one as it detects them and thus can be used in a loop, thereby removing the need to write code for each option.
  •   Syntax:
getopts optstring variable
  • It places the current option letter in "variable". It is powerful because the argument of an option is made available in the variable $OPTARG, and the index of the next parameter to be processed is kept in the variable $OPTIND.
  • After it has run out of options from the optstring, we are left with the plain parameters, and we know we have already moved past $OPTIND - 1 of the arguments. 
  • So we can shift those many values, and assume that the remaining parameter set (available with $@) is our set of true parameters meant for the program. 
  • They can be easily operated in another loop, as we have done above.
  Look HERE to see a piece of code doing this.

  Generally in UNIX world, there is a practice of standardizing the options, which can be found HERE.

  10. Getting user Input


    a)  read command can be used to read values and place them in a variable of our choice
        eg.,
        echo -n "Enter Name: "    # -n doesn't create a newline after the echo
        read name                 # $name now contains whatever the user typed till pressing Enter


    b)  Multiple values can be stored in multiple variables.
        echo -n "First, Last Name: "
        read first last

     
    c)  Input process can be timed out using -t so that program does not wait forever for the user's input
        read -t 10 -p "Enter a no: "      # times out after 10 secs.
     

        -p allows an inline prompt. Without a variable name, the input is available in the variable $REPLY.
     
    d)  read can also be limited to a specific number of characters using the -n option.
        This is useful for Y/N responses from the user.
        As soon as the user has typed that many characters, the input ends and the value is stored.
        eg., read -n1 -p "Continue? (Y/N): " answer    # answer stores Y/N/any other key pressed, so check it before using it
     
    e)  Silent reading is useful when reading passwords, using the -s option
        eg., read -s -p "Enter password: " pass
     
        What the shell does for these reads is make the text colour the same as the background colour, so the typed input cannot be read on screen.
     
    f)  Reading from a file.
     
        eg.,
       
        count=1
        cat file | while read line
        do
          echo "Line $count: $line"
          count=$[ $count + 1 ]
        done

Monday, November 21, 2011

Hack for functions inside loop in Javascript


In one of my earlier blogs, I had mentioned that it is not advisable to create functions inside a loop in Javascript, as any strict environment, such as a JSLint installation, will flag it as an error in the JS file.

I had also cited an alternative approach to call a function inside the loop, instead of creating it inline.

But in some cases, it becomes imperative to have this functionality and even calling the function is not helpful.

Consider this piece of code


var obj = [ { a: ".link1", b: ".div1", c: ".div2"},
            { a: ".link4", b: ".div3", c: ".div4"},
            ...
          ];

for (var i = 0; i < obj.length; i++) {
  var o = obj[i];
  $(o.a).click(function(e) {
    e.preventDefault();
    // some more operations...
  });
}



Monday, November 14, 2011

Smart UNIX commands


The power of UNIX commands is that each of them comes with an array of parameters and options. But practically speaking, it becomes difficult to remember them all and use them effectively.

Mostly in the UNIX world, some options have a common meaning across most commands:


Option   Meaning
-i       Ignore text case.
-v       Verbose mode; echoes the steps of the operation.
-a       Show all objects.
-c       Produce a count.
-d       Specify a directory.
-f       Specify a file to read from.
-h       Display a help message for the command.
-l       Produce long-format output.
-o       Specify an output file.
-q       Run in quiet mode; doesn't echo messages and warnings.
-r       Process directories and files recursively.
-s       Run in silent mode.
-y       Answer yes to all questions.

Some frequently used commands have particularly useful option sets, which I will cover here.

1) ls - The first command we use on opening the terminal. It has a lot of options, which make it one of the most powerful tools, but remembering them all is a task in itself and only a few are used frequently.

A handy pack is called the "-sail" set (Go ahead and attach meaning to it)

This set of options generally produces the most effective listing with the most common details.
  • -s = show the block size of each file
  • -a = list all files, including hidden ones
  • -i = list the file serial number, popularly known as the inode number, which is a unique representation of the file at the system level. This value can be used in machine-level implementations and internal-representation scenarios. (Go ahead and think of ways you can use it.)
  • -l = long format, which produces the classical listing of files with their extended information.


A snapshot helps in understanding (with the normal -l option):

  [screenshot: ls -l listing]

And combined with the above options:

  [screenshot: ls -sail listing]

2) touch - Used to create an empty file of size 0 bytes. Useful when we want to write to it later.

  [screenshot: ls -l listing after touch]

Observe the "0" after the "staff" group column, which indicates the size of the file.

3) cp - copy command. Some useful options are
  • -p = preserve the file access and modification times of the original file. Basically it keeps the original file's timestamps instead of stamping the copy with its own creation time.
A snapshot without the -p option (pay close attention to the timestamps of the files):

  [screenshot: cp without -p]

    With the -p option:

  [screenshot: cp with -p]

Also, rcp is a command used for remote file copy. Of course, one needs to have at least read access to the remote path. That is a different story altogether.
4) ps - process listing. This is another powerful command with a whole bunch of options to choose from, but again it becomes difficult to remember all of them and their meanings. A concise set is -ef, if we want to see everything running on our system, where
  • -e = show all processes
  • -f = show a full-format listing for the processes.
Without the -e option one gets a very minimal set of processes (those belonging to the current user and running on the current terminal):

  [screenshot: plain ps output]

With the -ef option:

  [screenshot: ps -ef output]

Observe the difference.

The columns displayed with the full listing have the following meanings:

  • UID = user id (assigned by the system to each user)
  • PID = process id (assigned by the system to each running process)
  • PPID = parent process id (id of the process which spawned the current one; 0 if it is an independent process)
  • C = processor utilization over the lifetime of the process
  • STIME = the time at which the process started
  • TTY = terminal device from which the process was launched. For most system processes it shows as ?? since they were not launched from any terminal; processes launched by users, directly or indirectly, have a value in this column.
  • TIME = cumulative CPU time used by the process.
  • CMD = the command that started this process (everything in a UNIX-based OS is a process).

Monday, October 24, 2011

Creating functions inside loops in Javascript

JSLint is part of an ongoing effort to make Javascript a more robust programming language, so when one uses it, it complains about a number of bad coding practices.

One of its rarer complaints is about creating functions inside a loop.
Experienced programmers get into the habit of writing code like this:

var i;
for (i = 0; i < someValue; i++) {
  ...
  // some operations

  $(".a").onclick = (function() {
    $(this).href = // some manipulated code.
  });

} // end for


While this has become a standard way of operating in Javascript, it is a bad practice, because you are creating a new function on every iteration of the loop. 
Since functions in Javascript are allocated memory and can be referenced by name anywhere after their point of declaration, this coding practice unnecessarily creates many instances of the function (the word "instance" is itself debatable in Javascript, since object orientation works differently here).


In order to keep the existing functionality, the code can be re-factored to meet the standards in this way:


var doSomething = function() {
  $(this).href = // some manipulated code.
};

var i;
for (i = 0; i < someValue; i++) {
  ...
  // some operations

  $(".a").onclick = doSomething;   // assign the function reference; the function itself is created only once

} // end for


The beauty of this style of coding is more evident when the function becomes complex and takes in values.

Finding more than one record easily in Rails (with Ruby)

In order to find records matching certain criteria in the database, one can use the Rails find(:all) construct, passing it all the required conditions, in a manner like this:

objects = Model.find(:all, :conditions => { :field1 => "value1", :field2 => value2, ... })


But this becomes overkill if you only want to specify two or three conditions.

Let's suppose, the model is called Users and 2 of its fields, user_name and user_email are to be used in the condition.



We could rather use a shorthand form:


objects = Users.find_by_user_name_and_user_email("ABCD","abcd@efgh.com")


Note that the order of values passed is picked up to match the field names mentioned in the calling method.


This has the same result as the more conventional:


objects = Users.find(:all, :conditions => {:user_name => "ABCD", :user_email => "abcd@efgh.com"})


It removes the need to type all the braces and the => operators, which reduces typing and makes the code more readable.


The shorthand can be extended with as many fields as needed, but this is not recommended, as it makes the code all the more difficult to read.



Guidelines to writing specs for ROR.


Specs for
1) Controllers

  a) In most cases it is safe and advisable to set up the test environment before the specs are run. So the following code becomes a habit at the beginning of each controller's spec.

Note that the line numbers indicate the position of code relative to each other in the actual specs file.

1.   describe ExampleController, :type => :controller do
2.   before :each do
3.     request.env["HTTP_HOST"] = "test"
4.     User.delete_all
5.   end
6.   ...
7.   end


Though the part after the name of the controller on the first line is not absolutely necessary, adding it gives the reader of the code a good heads-up.

  
  b) To test any method of the controller, begin by describing it as the following lines in place of line 6 in the code.


6.  describe "#method_name" do
7.    # Tests
8.  end


c)  Then add some steps which you might require before the test begins by setting up pseudo-variables and values.

6.   describe "#method_name" do

7.    before do
8.      @user = Factory.create(:user)
9.      @objects = Factory.create(:object, :parameter_name => parameter_value, ...)
10.    end
11.    # Tests

In this step you should generate all the objects needed for the spec to run properly. This is done to make Rails believe that an actual object is being passed to the controller, so that it can use any of the object's properties it may need in the method. Hence, you have to pass, at the point of creation of that variable, all the values which will be used for that object in the concerned method.
      
    In this case it is @objects, which can be based on any model, and the parameter_names would be the attributes of @object which will be used in the method_name method. You should pass them on to the controller so that it can complete the method execution; without them it will fail, as the object is a dummy one and has no values you don't explicitly assign. Having said that, it is not necessary to pass all the attributes; you need only pass the values which are of specific concern to the method being tested. For example, 
      
    the @user object will generally have a lot of test values assigned to it. If, for any reason, you want the email id to be of some specific domain for your method to complete, then you must pass its value as "abcd@specificdomain.com" at the time of creation of @user.

d)  From this point on, it is time to get into the details of the method.
          Different methods have different natures, and they should be tested in their own ways.
       
          1)  Simple - Methods which are straightforward in their behaviour and proceed along a straight path of execution fall into this category.
                       They only place calls to a few other methods and at the end may or may not render some specific kind of page.
                     
                       You should add an "it" line which gives the reader a sense of what is being tested. A "context" line serves the same purpose.

6.    describe "#method_name" do
12.      (it/context) "should do this and that and return successfully" do
13.       controller.should_receive(:method1_name)
14.       controller.should_receive(:method2_name)


These are the names of the methods which will be called during method_name's execution, so we expect the controller to receive calls to them. If the method calls method_other on some other model, say ModelA, then that should also be tested here, by:



15.       ModelA.should_receive(:method_other)


If you are certain that a method will be called x number of times and want to verify that,

16.     controller/ModelA.should_receive(:method_other).exactly(x).times

2) Complex - These are methods which create a number of objects during their execution and also run statements which find specific objects from other models and tables, then use those to fire other complex operations and eventually run validations on the objects received.
          
                    At this point you will have to create a number of fake objects by either stubbing or mocking them from their models.

@user = Factory.stub(:user)
# OR
@object_name = mock(ModelName, :parameter_name1 => parameter_value, ...)

Note that parameter_name1 has to be prefixed with a : for Rails to properly identify it as an attribute of the ModelName being referred to, while parameter_value is mostly a plain value, such as "value1" for string types or 123 for number types and so on.
                    These lines mostly go in the before do section of the method, or in the before :each section of the controller if the values being created are to be used throughout the controller code, which is a rare case.
                    
                    Once you have all the objects ready comes the part where you have to stub method definitions, which is the most complex part.
                    
                    We stub methods, meaning we provide fake definitions to the controller for those methods which we don't want it to actually run. You do this because you don't write specs to monitor the entire method definition, but only to test specific portions of it. So if some methods cannot run on the server/local machine because of setup limitations, then instead of letting the spec fail, one should stub them out.
                    We do this, generally in the before do section, by:
                    
8.  ModelName.stub(:additional_method_name).and_return(@variable)
                
                  Now the second part (.and_return) is extremely crucial if method_name expects additional_method_name to return some values and uses them in later validations. If you simply stub additional_method_name, any dependency on its return value would cause the spec to fail. Hence, if additional_method_name returns an object of type ModelC, we generally apply the following procedure to stub it out safely:

6.   before do
7.     @objectC = mock(ModelC, list_of_required_parameters...)    
8.     ModelName.stub!(:additional_method_name).and_return(@objectC)

This tells Rails to assume that additional_method_name returned the object @objectC and to use it for future operations. 
               
               Once all of this is set up, you are ready to place a call to method_name. It can be done in one of two ways:

17.    get :method_name, :parameter_name => parameter_value ...

which is the standard and most common way of calling the method with the set of values it expects.
                    
                    But sometimes there are validations in method_name which check whether the request type was POST and only then continue; in that case the previous statement becomes:

17.    post :method_name, :parameter_name => parameter_value ...

Now that calling the method is done, it becomes essential to see the return from the method and test it against various conditions, which may be among the following:

18.  response.response_code.should == 200                      # checks for a successful return
     response.should render_template('template_name')          # checks that, after the method, the expected template is rendered
     response.should redirect_to(:controller => 'controller_name', :action => 'action_name', ...)

(this line checks that the page should be redirected to the action_name method of the controller_name controller and with a set of parameters. This line is mostly a copy of the statement in the controller method, wherein we verify if the expected thing actually happened.)
It might also be possible that the method_name creates objects of certain types, say ModelX, and you need to check if it really got created.

18.   object_variable = ModelX.find(:all, :conditions => { list of condition hash})

this line tries to find the object being created by passing a known set of conditions. Since the method must have created such an object, we verify it by:

19.   object_variable.should_not == nil

But to be 100% sure that this object did not already exist in the DB before this method ran, you have to clear the table in the before do section of the method. Rest assured this does not harm real data, as all of this creation and deletion happens only in the test environment.

7.    ModelX.delete_all 

2) Models

    1.  describe ModelClassName, :type => :model do
    
    Note that ModelClassName must be the same as the name of the model for which the spec is being written.
    
    In models, the only difference is in the way the method is called on line 17, which becomes
    
    17.   ModelName.method_name(list of parameters)
    
3) Helpers

  If you need to stub the helper with some values,

  1.  describe  HelperClassName, :type => :helper do
  
  Note that HelperClassName must be the same as the name of the helper for which the spec is being written.

  helper.stub!(:param_name).and_return({list of values in hash style})
  
  In order to call the method, you would
  
  helper.method_name.should == #something.
  
  and the method expectancy tests go on.








XOR trivialities

Shortcuts for XOR and XNOR



The classical formula for XOR on two variables remains

result = a.b' + b.a'

(where . means logical AND and + means logical OR and ' means logical NOT).

On the possible set of values, the table becomes something like :

a b result
0 0 0
0 1 1
1 0 1
1 1 0

which, for two variables, boils down to the interpretation ONE BUT NOT BOTH. For more variables, the chain
a ^ b ^ c ^ d ^ e ^ ....... x ^ y ^ z
equals 1 exactly when an odd number of the variables are 1, and 0 otherwise (^ denotes the XOR operation).

So consider applying the XOR formula to two variables, say a and b, and storing the result in a variable, say x.

The starting point would be
x = (a AND !b) OR (b AND !a).

While this is correct, problems may arise when a and b are themselves functions which return true/false. In that case each function would be evaluated twice just to compute the value of x. Where this is not desired, we can stick to the shorthand equivalent for XOR.

x = (a != b)

Which says that the result will be 1 (true) only when the two operands are not the same, as can be inferred from the truth table above.

It solves many problems:
  • Reduces typing effort
  • Doesn't evaluate functions twice
  • Reduces complexity when a & b themselves become large expressions.
  • Makes code easy to read and understand and maintain
Similar interpretation could be extended to the function XNOR, which in its formula looks like:

result = (a XOR b)', where ' means logical complement.

  result = (a.b' + b.a')'
         = (a'+b) . (b'+a)
         = a'b' + b.b' + b.a + a'.a
         = a.b + a'.b'

In plain words, we could simply derive that
  a XNOR b  = !(a XOR b)
            = !(a != b)
  
a XNOR b = (a == b)


But for the curious reader, the truth table should be enough proof:

Taking the derived formula
result = a.b + a'.b'

a b result
0 0 1
0 1 0
1 0 0
1 1 1


Readers to this point can observe that the result is 1(true) only when (a == b).