Wednesday, August 6, 2008

Using Bash To Feed Command Output To A While Loop Without Using Pipes!

Hey There,

Today's post regards something I just picked up. In fact, it's something that's been driving me nuts for a long, long time (see our earlier post on piped variable scoping in Linux or Unix). I haven't been mulling over how to "do it myself" creatively so much as wondering "why has this feature never existed when it seems so essential?" As it turns out, this feature "has" existed, although it was a little hard to find in bash 2.x. With bash 3.x, it's brought to the fore and given the attention it deserves (including its own name ;)

HUGE HELPFUL HINT: If you don't care about the process I went through to find the answer for bash 2.x, and just want to know how to do it, skip down to the PROBLEM SOLVED section which is named appropriately and in the same SCREAMING typeface ;)

The issue is that of command line output (or, if you prefer to think about it the other way, STDIN command line input) redirection and handling. For instance, if you wanted to avoid a problem with scoping in bash and you were reading your input from a file, you could change this block of code:

cat FILE | while read line
do
  echo "$line"
done

to the more appropriate and efficient:

while read line
do
  echo "$line"
done <FILE
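
To make the scoping difference concrete, here's a minimal sketch. The counter names and the temporary file are made up for illustration, and it assumes bash with its default settings (no lastpipe option), where the loop on the receiving end of a pipe runs in a subshell:

```shell
#!/usr/bin/env bash
# Hypothetical demo file with three lines of input.
tmpfile=$(mktemp)
printf 'a\nb\nc\n' > "$tmpfile"

# Piped form: the while loop runs in a subshell, so its counter is lost.
piped_count=0
cat "$tmpfile" | while read -r line; do
    piped_count=$((piped_count + 1))
done

# Redirected form: the loop runs in the current shell, so the counter survives.
redirected_count=0
while read -r line; do
    redirected_count=$((redirected_count + 1))
done < "$tmpfile"

echo "piped=$piped_count redirected=$redirected_count"
rm -f "$tmpfile"
```

Run under bash, this should print piped=0 redirected=3: the piped loop did its counting in a subshell, and that count vanished when the subshell exited.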

On the other hand, if you were dealing with command output, you "couldn't" switch this block of code:

ls -1d * | while read line
do
  echo "$line"
done

with this:

while read line
do
  echo "$line"
done < ls -1d *

...but that's just common sense, since the redirection input operator would expect ls to be a file and you'd get an error like:

./program: syntax error near unexpected token `-1d'
./program: line 4: `done < ls -1d *'

So, other than the file descriptor exec workaround (which is really just a fancy way of redirecting your process's streams to a file and reading from it; completely contrary to the spirit of having this work naturally), the following might seem like a reasonable way to "feed" your while loop the command output. Only the last line of each code block is shown, for brevity's sake, as we roll through the error scenarios:

done < `ls -1d *`

but, this results in:

./program: `ls -1d *`: ambiguous redirect

and you'd get the same thing using bash's built-in $( ) form of command substitution (done < $(ls -1d *)):

./program: $(ls -1d *): ambiguous redirect

Doubling the redirect doesn't work either, since << is the here-document operator (done << ls -1d * gives the error below; done << `ls -1d *` just returns an error code with no output):

./program: line 4: syntax error near unexpected token `-1d'
./program: line 4: `done << ls -1d *'

And we've considered putting the command in a subshell right after the <, which doesn't work either:

./program: line 4: syntax error near unexpected token `*)'
./program: line 4: `done <(ls -1d *)'


PROBLEM SOLVED:

But here's a really neat trick for getting this to work in bash 2.x. If you change your program to be structured like so:

while read line
do
  echo "$line"
done < <(ls -1d *)

Your outcome will result in success!! You've got the command output and you didn't have to use a pipe to feed it to the while loop!

NOTE: The two most important things to remember about doing this are that:

1. The space between the first < and second < is mandatory! Although, it should be noted that, between the two <'s, you can have as many spaces as you want. You can even use a tab between the two <'s, they just can't be directly connected.

2. The command whose output you want to use as fodder for the while loop needs to be run in a subshell (placed between parentheses, just like the ones surrounding this sentence), and the left parenthesis must immediately follow the second <, with "no" space in between! Bash calls this <(command) form "process substitution."

We've already looked at what happens if you ignore rule number 1 and use << instead of < <. If you ignore rule number 2, you'll get:

./program: line 4: syntax error near unexpected token `<'
./program: line 4: `done < < (ls -1d *)'
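
Here's a small sketch of what the working form buys you, assuming bash. The printf stands in for any command, and count is just an illustrative name. Because the loop isn't on the receiving end of a pipe, it runs in the current shell and its variables survive:

```shell
#!/usr/bin/env bash
count=0
# The command in <( ) runs in a subshell; the while loop stays in this shell.
while read -r line; do
    count=$((count + 1))
done < <(printf 'one\ntwo\nthree\n')

# The counter is still set after the loop finishes.
echo "count=$count"
```

This should print count=3, which is exactly what the piped version can't give you.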

And here's the "even better part" - In bash 3.x, you don't have to worry about all that spacing anymore, as they've added a new feature which gets you the same result for a loop like this (or is it really just an old feature dressed up to make it seem fabulous? ;) In bash 3.x, you can use the triple-< operator. Actually, I believe the <<< syntax is referred to as a "here string," but that's purely academic. They could call it "fudge," as long as it works ;)

So, in bash 3.x, you could write a while loop that takes input from a command without using a pipe like so:

while read line
do
  echo "hi $line"
done <<< "`ls -1d *`"

NOTE: The space between the <<< and your backticked (or otherwise expanded) command output is optional: you can leave it out, or use as much space as the shell can stand between those two parts of the "here string." Of course, the three <'s need to be all clumped together with no space in between them.
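
One caveat worth sketching: the here string and the < <( ) form aren't quite interchangeable. With a here string, the command substitution runs to completion first and bash then feeds the resulting string (plus a trailing newline) to the loop. The variable names below are illustrative only, and true stands in for any command that prints nothing:

```shell
#!/usr/bin/env bash
# Here string: the command finishes first, then its output is fed to the loop.
here_count=0
while read -r line; do
    here_count=$((here_count + 1))
done <<< "$(printf 'one\ntwo\nthree\n')"

# A command that prints nothing still yields one empty line via a here string...
empty_here=0
while read -r line; do
    empty_here=$((empty_here + 1))
done <<< "$(true)"

# ...whereas process substitution yields no lines at all.
empty_proc=0
while read -r line; do
    empty_proc=$((empty_proc + 1))
done < <(true)

echo "here_count=$here_count empty_here=$empty_here empty_proc=$empty_proc"
```

This should print here_count=3 empty_here=1 empty_proc=0, so if your command might produce no output at all, the < <( ) form is the safer of the two.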

I hope this has been helpful and/or enlightening for everyone out there, like me, who've been stumped by this issue for a while and always ended up doing some half-arsed workaround. It's a problem that's been bugging me forever. It turns out I was "this close" with bash 2.x, but I'm very happy to see that bash 3.x actually includes the functionality and makes finding it as simple as RTFMP ;)


, Mike

Thanks For This Comment From Richard Bos, which points out a flaw in this post that has been corrected per his remarks:

So your example should actually read:

while read line
do
  echo "hi $line"
done <<< "`ls -1d *`"

One thing, though: I use done <<< "$(ls -1d *)"
This construct is also used on this example page

Thanks, also, for this comment from Douglas Huff, which helps to clarify the underbelly of the process:

A friend of mine pointed me to this article and the
previous one in the series that you wrote [on variable scoping]...

I had two comments on these articles but you seem to have
comments disabled, so I figured I'd email them to you.

First, calling it a "scoping" issue is a bit misleading.
While technically true, understanding the underlying
reasons why this doesn't work as "expected" is key to
understanding how you can work around it in POSIX sh or in
ksh without the zsh/bash syntactical sugar for doing so.

What's going on is that a process cannot modify the
environment of its parent.

When you do:

something | while read blah; do blah; done

What the shell is doing is first executing a subshell
(separate process) that runs the while with stdin
redirected to read from the unnamed pipe. Then in another
subshell it runs "something" with standard out redirected
to the unnamed pipe.

Knowing this it's quite easy to replicate the behaviour
from bash 2/3 and zsh in POSIX sh and ksh with a bit of
understanding of the underlying mechanics. The trick is to
keep the while inside of the original process (since it is
run by the interpreter and does not require a separate
process) and execute the other command in a subshell.
Which is exactly what the syntactical sugar does for you
behind the scenes in bash2&3/zsh.
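
Douglas doesn't show the POSIX workaround he has in mind, so the following is only one possible sketch and not necessarily his approach. It assumes a POSIX sh and uses a here-document whose body is a command substitution: the command runs in a subshell, while the loop itself stays in the original process. The printf and the count variable are placeholders for illustration:

```shell
#!/bin/sh
count=0
# The loop runs in the current shell; only the $( ) command runs in a subshell.
while read -r line; do
    count=$((count + 1))
done <<EOF
$(printf 'one\ntwo\nthree\n')
EOF

# The counter survives because no pipe put the loop in a subshell.
echo "count=$count"
```

This should print count=3 without relying on any bash or zsh extensions.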

Thanks, also, to Vincenzo Di Massa, for shedding even more light on the subject :

the reason why there is the space between < and < in
done < <(ls *.txt)

is the following.

<(ls *.txt) gets expanded into a filename

for example try:
$ echo <(ls *.txt)

it will print something like /dev/fd/63

the meaning is that <(ls *.txt) gets replaced by the filename
of a special file attached to the output of the ls command.

thus < <( ls *.txt) gets replaced by
< /dev/fd/63
and thus the standard input redirection takes place.

Best Regards Vincenzo
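
Vincenzo's point can be checked with a short sketch, assuming bash. The variable names are illustrative, and the exact path printed depends on your system (it may be a /dev/fd entry or a named pipe):

```shell
#!/usr/bin/env bash
# <( ) expands to a path name; capture it to see what the shell substituted.
fd_path=$(echo <(true))
echo "path: $fd_path"

# Since it is just a path, any command that takes a file name can read from it.
contents=$(cat <(printf 'hello\n'))
echo "contents: $contents"
```

The second command shows why the extra < is needed in the while loop: cat accepts a file name as an argument, but done only accepts a redirection, so the path has to be handed to it with <.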