Saturday, March 21, 2009

Perl Regular Expressions - Searching Array Without Loops

Iterating over an array normally means writing a loop.  Arrays are processed for many reasons, but one of the most common is searching for a particular element.  If all you want is to check whether a particular word is among the array elements, you can avoid an explicit loop in the following way.

The first thing to know is a feature of Perl arrays: if you enclose an array variable in double quotes, it is interpolated as a string containing all the elements of the array, separated by spaces by default.
Like this:

safeer@enjoyfast-lx:~/Scripts$ perl -e '@words=("tom","jerry","atom");print "@words\n"';
tom jerry atom
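
The separator used in this interpolation is Perl's list separator variable $", which defaults to a single space.  The following is a minimal sketch of that behaviour; the variable names ($interpolated, $joined, $with_commas) are only illustrative.

```perl
use strict;
use warnings;

my @words = ("tom", "jerry", "atom");

# Interpolating an array in double quotes joins its elements with the
# list separator variable, which defaults to a single space.
my $interpolated = "@words";
my $joined       = join(" ", @words);
print "$interpolated\n";    # tom jerry atom

# Changing the list separator (localised to this block) changes the result.
my $with_commas;
{
    local $" = ',';
    $with_commas = "@words";
}
print "$with_commas\n";     # tom,jerry,atom
```

So "@words" behaves like join(" ", @words) unless the list separator has been changed.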

Now we have a string containing all the elements of the array.  All we need to do is match it against the search string.  If your search term is allowed to be a substring of an array element, things are pretty easy.  You can match it directly against the array converted to a string.
Like this:

if ( "@words" =~ /jer/ )
This will match the element jerry and the search will be successful.  But if you are looking for an exact match of your search string with an array element, you will need a bit of regular expression technique.  In this case there are four different patterns to handle.

1) The array has multiple elements and the search string is the first element.  In this case the left-hand side of the element will be the beginning of the string (denoted by "^") and the right-hand side a space (denoted by "\s").
2) The array has multiple elements and the search string is the last element.  In this case the left-hand side of the element will be a space (denoted by "\s") and the right-hand side the end of the string (denoted by "$").
3) The array has more than two elements and the search string is somewhere in the middle of the string.  In this case both sides of the element will be a space ("\s").
4) The array has a single element.  The LHS of the element will be the start of the string ("^") and the RHS will be the end of the string ("$").

So a matching expression will look like:

if ( "@words" =~ /(^|\s)$searchterm(\s|$)/ )
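
As a rough check, the four cases above can be exercised with one small sketch.  The helper name has_word is hypothetical, and the sketch assumes that no element contains whitespace and that the search term has no regex metacharacters.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if $term equals one whole element of @list.
# Assumes elements contain no whitespace and $term has no regex metacharacters.
sub has_word {
    my ($term, @list) = @_;
    return "@list" =~ /(^|\s)$term(\s|$)/ ? 1 : 0;
}

my @words = ("tom", "jerry", "atom");

print has_word("tom",   @words), "\n";   # case 1: first element, ^ then space
print has_word("atom",  @words), "\n";   # case 2: last element, space then $
print has_word("jerry", @words), "\n";   # case 3: middle element, space on both sides
print has_word("tom",   ("tom")), "\n";  # case 4: single element, ^ and $
print has_word("jer",   @words), "\n";   # substring only, so no match
```

The first four calls print 1 and the last prints 0, because "jer" is never surrounded by a boundary on both sides.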

Let's write a full script and try some options.  This script accepts the search term from the command line.

safeer@enjoyfast-lx:~/Scripts$ cat regex-search.pl
#!/usr/bin/perl -w
use strict;

my @words = ("tom", "jerry", "atom");

if (defined $ARGV[0])
{
    # Match the search term only as a whole, space-delimited element.
    if ( "@words" =~ /(^|\s)$ARGV[0](\s|$)/ )
    {
        print "Match!! \n";
    }
    else
    {
        print "No Match \n";
    }
}
else
{
    print "Provide a Search Term\n";
}

safeer@enjoyfast-lx:~/Scripts$ ./regex-search.pl tom
Match!!
safeer@enjoyfast-lx:~/Scripts$ ./regex-search.pl jerry
Match!!
safeer@enjoyfast-lx:~/Scripts$ ./regex-search.pl atom
Match!!
safeer@enjoyfast-lx:~/Scripts$ ./regex-search.pl jerr
No Match
safeer@enjoyfast-lx:~/Scripts$ ./regex-search.pl to
No Match


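The downside of this approach is that the search gives wrong results if an array element contains whitespace: an element such as "tom cat" would match a search for "tom".  Also note that the search term is interpolated into the pattern, so any regex metacharacters in it are treated as pattern syntax.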
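
One possible workaround, sketched here under assumptions rather than taken from the script above, is to join the elements with a separator that never occurs in the data and to quote the search term with \Q...\E so it is matched literally.  The helper name has_element and the choice of a NUL byte as separator are assumptions for illustration.

```perl
use strict;
use warnings;

my @names = ("tom cat", "jerry", "axb");

# Hypothetical helper: joins the list with a NUL byte (assumed never to
# appear in the data) and matches the term literally using \Q...\E.
sub has_element {
    my ($term, @list) = @_;
    local $" = "\x00";
    return "@list" =~ /(^|\x00)\Q$term\E(\x00|$)/ ? 1 : 0;
}

print has_element("tom cat", @names), "\n";  # whole element, including its space
print has_element("tom",     @names), "\n";  # only part of an element
print has_element("a.b",     @names), "\n";  # "." is literal here, so "axb" is not matched
```

The first call prints 1 and the other two print 0.  Without \Q...\E the last search would match "axb", since "." would stand for any character.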
