Tuesday, July 1, 2008

Using Strings To Safely Get Program Usage Information On Linux And Unix

Hey There,

We've posted quite a bit about the "strings" command in various past posts, running the gamut from using strings to extract RPM header information to using the basic strings construct in C to make running shells on network sockets possible. Today we're going to take a look at the "strings" command in an entirely new light.

Imagine that you were tasked with running a particular command named, for the sake of argument, BLARG. Unfortunately, in our manufactured situation, BLARG has no man page, and searches for it on Google and other search engines turn up no useful information. Also, your boss just said that you needed to run it and left it at that, with no further instruction (he also can't be reached. What's wrong with this guy? ;) BLARG is also a compiled binary.

Your basic inclination might be to just run it without any arguments, as many commands (like "mkdir") will give you the usage information you need if you use this method, like so:

host # mkdir
usage: mkdir [-p] [-m mode] dirname ...


However, lots of other programs don't, so it's not the wisest choice. Remember that BLARG could potentially be a very harmful program. Running it without arguments may destroy things you can't afford to lose.

Other options you have include (but are not limited to) the following, each coupled with its possible undesirable outcome:

1. You could give the command a bogus switch line, like "BLARG -xKECVDSLdlske" : Assuming that that command line is indeed bogus, lots of programs silently ignore bogus switches and run their default instructions anyway.

2. You could cat the command : This will probably just fill your terminal with unreadable garbage. Redirecting standard error to /dev/null won't help, since cat writes the file's raw bytes to standard output, and odds are those bytes include a lot of funky characters that might cause more harm than good. You might also note that, when a program does print a usage message, a lot of the time it goes to standard error and not standard output!

3. You could use eval to run the program, like "eval BLARG" : Unfortunately, even though it seems counterintuitive, eval doesn't just inspect a command. It has the shell parse its arguments and then execute the result, so BLARG gets run anyway.

4. You could use commands like crash to get the information : This can be a great way to find out the information you need. By typing "crash -h BLARG" you should, theoretically, get a dump of all the help information you need. Unfortunately, not all distros of Linux and Unix include it by default, and not all distros' versions of crash operate the same way. Some require you to be proficient in running a debugger against a dump file afterward. Way too much hassle.
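To see why option 3 doesn't buy you any safety, here's a tiny sketch you can try without any risk. The function blarg below is a made-up, harmless stand-in for the real binary (an assumption for illustration only); the point is just that eval executes whatever it's handed:

```shell
# blarg is a hypothetical stand-in for the unknown binary.
# It does nothing except report that it was executed.
blarg() { echo "BLARG ran"; }

# eval joins its arguments into one string, has the shell parse it,
# and then executes the result. Nothing is merely "inspected".
result=$(eval blarg)

echo "$result"   # prints: BLARG ran
```

If that had been the real BLARG, whatever it does by default would already have happened.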

So far, we've gone through about 5 options, going from worse to better. There are probably a lot more than I'm thinking up here as I type (email them to me at eggi@comcast.net with comments if you'd like, as I'd love to do a follow-up to this post with more of that kind of information).

One way I've found that is virtually foolproof, and works in every distro I've tested, is to use the "strings" command to extract usage information. If you've ever used strings before, you know that distilling what it spits out when you run it against a command to a universally acceptable output of help information for any and/or all binaries is next to impossible. The Linux version of the crash command comes much closer to doing this, and doing it better. But, for the rest of us (even those without the privilege to run "crash"), we can still get the information we need using "strings", like so:

host # strings BLARG 2>/dev/null|egrep -i 'usage|help' <-- Note that strings takes a file path and does not search your PATH, so give it the location of the binary, like /bin/BLARG or ./BLARG
usage: %s [-abcdefGHIJKv] [file ...]
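If you find yourself doing this a lot, the path lookup and the filter can be wrapped up together. What follows is only a rough sketch: resolve_binary and usage_hints are made-up helper names, not standard commands, and the sketch assumes a POSIX-style shell where "command -v" is available. Note that "command -v" returns just the name (not a path) for shell builtins and functions, so this is only meant for real files on disk:

```shell
# resolve_binary (hypothetical helper): print a path that strings can open.
# It never executes the target.
resolve_binary() {
    case "$1" in
        */*) printf '%s\n' "$1" ;;   # already a path, e.g. ./BLARG or /bin/BLARG
        *)   command -v "$1" ;;      # bare name: look it up in PATH
    esac
}

# usage_hints (hypothetical helper): pull likely usage/help lines
# out of the binary's embedded text without running it.
usage_hints() {
    target=$(resolve_binary "$1") || return 1
    strings "$target" 2>/dev/null | grep -Ei 'usage|help'
}

# Example call (left commented out, since BLARG is imaginary):
# usage_hints BLARG
```

The grep -Ei here is just the more portable spelling of the egrep -i used above.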

You can even add the common "%s" printf format specifier to your egrep pattern if you want to catch more of the lines that might contain useful help information, in case you're not sure that the usage message is limited to a single line of output. This has the side effect of sometimes making the output a little messy. Then again (as some of you may have noted), the usage display above, while better than nothing, doesn't really help you much. You'll probably be right 99% of the time if you guess that the -v flag stands for verbose or version, but you never know. Using strings and grabbing all the lines with %s can provide more insight, along with a more distracting view of the binary's guts (of course, this output is from another command entirely ;)

host # strings BLARG 2>/dev/null|egrep -i 'usage|help|%s'
%s: %s
%s: directory causes a cycle
%s %*u %-*s %-*s
ls: %s: %s
%s/%s
usage: %s [-abcdefGHIJKv] [file ...]
%ld%s-blocks
%s: unknown blocksize
%s: minimum blocksize is 512
%s:
%s: %m
netgroup: Cycle in group `%s'
%s.%s
(%s,%s,%s)
option requires an argument -- %s
unknown option -- %s
stack overflow in function %s
%.3s %.3s%3d %2.2d:%2.2d:%2.2d %s
%H:%M:%S
%a %b %e %H:%M:%S %Z %Y
%I:%M:%S %p
%s/%s.%d
YP server for domain %s not responding, still trying
<; errno = %s
%s: %s - %s
%s/bt.XXXXXX
%s/_hash.XXXXXX
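Another way to catch a multi-line usage message, without dragging in every line that has a %s in it, is to print each match plus the few lines that follow it. This is only a sketch, and it leans on an assumption that won't always hold: that the compiler stored the rest of the help text right next to the usage line in the binary. The name context_after is made up, and awk is used instead of grep's -A flag only because -A isn't available everywhere:

```shell
# context_after (hypothetical helper): read lines on standard input and
# print every line matching PATTERN plus the N lines after it.
#   $1 = PATTERN, a lowercase extended regex (input is lowercased for matching)
#   $2 = N, number of trailing lines to show (defaults to 3)
context_after() {
    awk -v pat="$1" -v n="${2:-3}" '
        tolower($0) ~ pat { left = n + 1 }   # (re)start the window on a match
        left > 0          { print; left-- }  # print while the window is open
    '
}

# Example call (left commented out, since BLARG is imaginary):
# strings ./BLARG 2>/dev/null | context_after 'usage|help' 3
```

You still never run the binary. You're only changing how much of the strings output gets through the filter.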


Worst case, you can just run something like:

host # strings BLARG >OUTPUT 2>&1

and safely cruise the lines of text in the OUTPUT file to manually find what you need. You may have to ;)
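If OUTPUT turns out to be huge, one way to cruise it is to get line numbers for anything promising and then look at a small window around each hit. Here's a minimal sketch of that; show_around is a made-up helper name, and the default radius of 5 lines is an arbitrary choice:

```shell
# show_around (hypothetical helper): print the lines of FILE from
# LINE - RADIUS through LINE + RADIUS.
#   $1 = FILE, $2 = LINE, $3 = RADIUS (defaults to 5)
show_around() {
    file=$1
    line=$2
    radius=${3:-5}
    first=$((line - radius))
    if [ "$first" -lt 1 ]; then
        first=1                      # don't ask sed for line 0 or below
    fi
    last=$((line + radius))
    sed -n "${first},${last}p" "$file"
}

# Typical use (commented out, since OUTPUT only exists after the step above):
# grep -n -i 'option' OUTPUT      # note the line numbers of interesting hits
# show_around OUTPUT 120 5        # then look at the neighborhood of line 120
```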

In any event, you've got a great tool at your disposal to find out what you need to know the hard way. And, sometimes, that's the only way to be absolutely sure :)

Cheers,

, Mike