Saturday, January 5, 2008

Script to Split Words On A Null Delimiter

Today's script deals with a problem I've run into from time to time: splitting a word into an array of its individual characters. In a Unix shell script, it's easy, using tools like awk, to split lines into arrays of words; but splitting a single word into an array of characters can be difficult, if not impossible, given the limitations of the tools at your disposal.
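For comparison, here is a minimal sketch of the "easy" case, splitting a line into words with awk. The function name split_words is made up for this illustration and is not part of today's script.

```shell
#!/bin/sh
# split_words is a hypothetical helper, used only for illustration.
# It prints each whitespace-separated word of its argument on its own line.
split_words() {
    printf '%s\n' "$1" | awk '{
        # split() with no third argument uses the default field separator
        n = split($0, words)
        for (i = 1; i <= n; i++)
            print words[i]
    }'
}

split_words "split these four words"
```

There is a natural separator between words for awk to work with. Between the characters of a single word there is none, which is the problem the script below works around.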

Today's Unix shell script is written in sh, for maximum portability between systems. Because of this, we're forced to use some old-style methods to get the results we want. The Bourne (and/or POSIX) shell, as wonderful as it is, doesn't provide a lot of the conveniences, such as arrays and substring expansion, that we've come to expect from the more advanced shells.
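One of those old-style methods is faking an array with numbered variables and eval. The sketch below shows the general idea under a couple of assumptions: the helper names array_set and array_get are invented for this example, and the variable prefix "array" is simply the one today's script happens to use.

```shell
#!/bin/sh
# array_set and array_get are hypothetical helpers for illustration.
# They emulate array[index] with variables named array0, array1, ...

array_set() {
    # $1 = index, $2 = value. The value is only expanded inside eval,
    # so quotes or semicolons in it are not re-parsed as shell code.
    eval "array$1=\$2"
}

array_get() {
    # $1 = index. Prints the stored value followed by a newline.
    eval "printf '%s\n' \"\$array$1\""
}

array_set 0 "a"
array_set 1 "b c"
array_get 1
```

The key design choice is escaping the dollar sign so that eval, not the calling line, expands the value. That keeps odd characters in the data from being treated as shell syntax.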

Take a look at today's script and notice the prevalent use of expr, which handles the string length, the character extraction and the counter arithmetic. There are a million ways you can use it, as a tool in your Unix shell scripting arsenal, to simulate much of what the more advanced shells can do. In fact, it would probably be more correct to say that the more advanced shells build these same sorts of operations in as user-friendly built-in commands and hide the mechanics from the user. It is, after all, a matter of convenience. No sense in re-inventing the wheel unless you need to ;)
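As a rough illustration of the kind of thing expr can do, here are a few idioms built on its ":" (regex match) and "+" operators. The sample value "shell" and the variable names are made up for this example.

```shell
#!/bin/sh
# Illustrative only: "word" is a sample value, not taken from the script.
word=shell

# Length: ":" prints how many characters the regular expression matched.
length=`expr "$word" : '.*'`

# First character: with \( \) in the pattern, ":" prints the captured text.
first=`expr "$word" : '\(.\)'`

# Everything after the first character.
rest=`expr "$word" : '.\(.*\)'`

# Integer arithmetic, the same way the script bumps its counters.
next=`expr "$length" + 1`

echo "$length $first $rest $next"
```

Run as is, this should print "5 s hell 6". One caveat worth knowing: expr can misread an operand that looks like one of its own operators (a lone "+" or "(", for example), so real-world input may need extra care.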

Hopefully, you'll find this interesting and useful. Tomorrow, we'll look at an equally nitty-gritty script that will do the exact opposite.

Cheers,


This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 United States License

#!/bin/sh

############################################
# shsplit - split words with null delimiter.
#
# 2008 - Mike Golvach - eggi@comcast.net #
#
# Usage - shsplit string
#
# Notes - If string contains spaces, be sure
# to quote it. If you're trying to split a
# string with a delimiter, use awk.
#
# Creative Commons Attribution-Noncommercial-Share Alike 3.0 United States License
#
############################################

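argvcount=$#
expr=/path/to/your/expr

# Require exactly one argument: the string to split.
if [ "$argvcount" -ne 1 ]
then
    echo "Usage: $0 string" >&2
    exit 1
fi

string=$1
# "expr STRING : REGEX" prints the number of characters matched,
# so matching against '.*' gives us the length of the string.
letters=`$expr "$string" : '.*'`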

basecount=0
dotcount=1

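# Walk the string one character at a time. Each character is stored in a
# numbered variable (array0, array1, ...), since sh has no real arrays.
# Note: "expr substr" is an extension that POSIX does not require, so
# make sure the expr you point to above supports it.
while [ "$basecount" -ne "$letters" ]
do
    dots=`$expr substr "$string" $dotcount 1`
    case $dots in
    ' ')
        # A space is stored as the two characters \0
        eval "array$basecount='\\0'"
        ;;
    *)
        # \$dots is expanded inside eval, so special characters are safe
        eval "array$basecount=\$dots"
        ;;
    esac
    dotcount=`$expr $dotcount + 1`
    basecount=`$expr $basecount + 1`
done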

basecount=0

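# Print the stored characters, one per line. Depending on your echo,
# a stored \0 is printed literally or as a NUL byte.
while [ "$basecount" -ne "$letters" ]
do
    # Quote the expansion so characters like * are not glob-expanded
    eval "echo \"\$array$basecount\""
    basecount=`$expr $basecount + 1`
done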
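
If your expr doesn't support substr (POSIX doesn't require it), the same split can be sketched with nothing but the ":" operator, peeling one character off the front of the string on each pass. This is an alternative approach rather than the method used above; the function name split_chars is invented for the example, and it assumes a single-line string.

```shell
#!/bin/sh
# split_chars is a hypothetical helper, not part of shsplit.
# It prints each character of its argument on its own line,
# using only the ":" operator of expr.
split_chars() {
    rest=$1
    while [ -n "$rest" ]
    do
        # The leading X keeps expr from mistaking the operand
        # for one of its own operators.
        first=`expr "X$rest" : 'X\(.\)'`
        rest=`expr "X$rest" : 'X.\(.*\)'`
        printf '%s\n' "$first"
    done
}

split_chars "sh it"
```

Unlike the script above, this variant prints a space as a space, because every expansion is quoted and printf is used in place of echo.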


Mike