Showing posts with label find. Show all posts

Sunday, December 7, 2008

Unix and Linux Horror Stories And Actual Help

Hope everyone's having a great Sunday!

For today's humor post I found a nice page from Pedro Diaz' Technical University of Madrid Homepage. I've actually only included a very small portion of his page on Unix and Linux Horror Stories. The rest of it is well worth the lengthy read.

The few parts I included below are mostly humorous, but you should definitely check out The Entire Horror Stories Page. It contains a lot of mailed-in information that ranges from the funny to the straight-up informational, so, while you're chortling (nobody uses that word anymore - time to bring back some of the old jargon ;) you may just learn something.

Enjoy :)




*NEW*

From: samuel@cs.ubc.ca (Stephen Samuel)
Organization: University of British Columbia, Canada

Some time ago, I was editing our cron file to remove core files more than a day
old. Unfortunately, through recursing into vi sessions, I ended up saving an
intermediate (wrong) version of this file with an extra '-o' in it.

find / -name core -o -atime +1 -exec /bin/rm {} \;

The cute thing about this is that it leaves ALL core files intact, and
removes any OTHER file that hasn't been accessed in the last 24 hours.

Although the script ran at 4AM, I was the first person to notice this,
in the early afternoon. I started to get curious when I noticed that
SOME man pages were missing, while others weren't. Up till then, I was pleased
to see that we finally had some free disk space. Then I started to notice
the pattern.

Really unpleasant was the fact that no system backups had taken place all
summer (and this was a research lab).

The only saving grace is that most of the really active files had been
accessed in the previous day (thank god I didn't do this on a Saturday).
I was also lucky that I'd used tar the previous day.

I still felt sick having to tell people in the lab what happened.
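The precedence trap in that cron line is worth spelling out: in find, -a (AND) binds tighter than -o (OR), so the broken command parsed as "core files, OR (old files AND delete them)". Here's a sketch you can run safely to see both behaviors; it uses -print instead of rm, a throwaway directory, and touch -a to fake old access times:

```shell
#!/bin/sh
# Demonstrate find's operator precedence with a harmless -print action.
# Without parentheses, the cron line parsed as:
#   ( -name core )  OR  ( -atime +1 AND -exec rm ... )
d=$(mktemp -d)
touch "$d/core" "$d/notes.txt"
touch -a -t 202001010000 "$d/core" "$d/notes.txt"   # give both an old atime

# The broken parse: only notes.txt reaches the (-atime +1 AND -print)
# branch, so the core file is spared and the innocent file is selected.
find "$d" -type f -name core -o -atime +1 -print

# The intended command, with explicit grouping: old core files only.
find "$d" -type f \( -name core -a -atime +1 \) -print

rm -rf "$d"
```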



-----------------------------------------------------------------------------



From: Stephen Samuel
Organization: University of British Columbia, Canada

As some older sys admins may remember, BSD 4.1 used to display unprintable
characters as a question mark.

An unfortunate friend of mine had managed to create an executable with a
name consisting of a single DEL character, so it showed up as "?*".

He tried to remove it.

"rm ?*"

He was quite frustrated by the time he asked me for help, because
he'd had such a hard time getting his files restored. Every time he walked
up to a sys-admin type and explained what happened, they'd go "you did
WHAT?", he'd explain again, and they'd go into a state of uncontrollable
giggles, and he'd walk away. I only giggled controllably.

This was at a time (~Star Wars) when it was known to many as "the mythical
rm star".
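For anyone who hits the same wall today, one way out that doesn't involve typing the hostile name at all is to delete the file by inode number. A rough sketch (the temp directory and the DEL-character name are fabricated for the demonstration):

```shell
#!/bin/sh
# Remove a file with an unprintable name without ever typing that name:
# look up its inode with ls -i, then let find delete exactly that inode.
d=$(mktemp -d)
( cd "$d" && touch "$(printf 'a\177b')" )    # a name containing a DEL byte

inum=$(ls -i "$d" | awk '{print $1; exit}')  # first column is the inode
find "$d" -inum "$inum" -exec rm -- {} \;    # remove only that inode

ls -A "$d"    # the directory is empty again
rm -rf "$d"
```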



-------------------------------------------------------------------------------



From: jjr@ctms.gwinnett.com (J.J. Reynolds)
Organization: Consolidated Traffic Management Services (CTMS)

The SCO man page for the rm command states:

It is also forbidden to remove the root directory of a given
file system.

Well, just to test it out, I decided one day to try "rm -r /" on one of our
test machines. The man page is correct, but if you read carefully, it
doesn't say anything about all of the files underneath that filesystem...



-------------------------------------------------------------------------------



From: bcutter@pdnis.paradyne.com (Brooks Cutter)

A while back I installed System V R4 on my 386 at home for development
purposes... I was compiling programs both in my home directory, and
in /usr/local/src ... so in order to reduce unnecessary disk space I
decided to use cron to delete .o files that weren't accessed for
over a day...

I put the following command in the root cron...

find / -type f -name \*.o -atime +1 -exec /usr/bin/rm -f {} \;

(instead of putting)

find /home/bcutter -type f -name \*.o -atime +1 -exec /usr/bin/rm -f {} \;
find /usr/local/src -type f -name \*.o -atime +1 -exec /usr/bin/rm -f {} \;

The result was that a short time later I was unable to compile software.
What the first line was doing was zapping files like /usr/lib/crt1.o
... and, as I later found out, all the kernel object files...

OOPS! After this happened a second time (after re-installing the files
from tape) I tracked down the offending line and fixed it....

Yet another case of creating work by trying to avoid extra work (in this
case a second find line)
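For what it's worth, the two-line fix can be collapsed back into a single crontab entry, since find accepts multiple starting points; the real lesson is scoping the search, not starting at /. A safe stand-in (temporary directories in place of the real source trees from the story) might look like:

```shell
#!/bin/sh
# Scope the .o cleanup to the intended trees by giving find two starting
# points, instead of sweeping the whole filesystem from /.
src1=$(mktemp -d); src2=$(mktemp -d)   # stand-ins for the real source dirs
touch "$src1/a.o" "$src2/b.o" "$src2/keep.c"
touch -a -t 202001010000 "$src1/a.o" "$src2/b.o"   # age the object files

find "$src1" "$src2" -type f -name '*.o' -atime +1 -exec rm -f {} \;

ls "$src1" "$src2"    # keep.c survives; the stale .o files are gone
rm -rf "$src1" "$src2"
```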




, Mike




Please note that this blog accepts comments via email only. See our Mission And Policy Statement for further details.

Thursday, June 5, 2008

Enumerating Files In The Linux or Unix Shell - More Improvements

Hey There,

Today, we're going to take a little more time and devote it to the folks on all the boards and networking sites who've made excellent suggestions for improving some of the shell/Perl scripts and other posts we've put out here over time. As mentioned the first time we started posting these suggestions and improvements, rather than updating old posts that never get any attention, we'll put the good stuff out here with updated timestamps, so anyone who reads this blog never has to wonder whether an issue they've found has been addressed. And, of course, it's also a thank-you for some great tips for refinement.

Today we're going to look at a couple of improvements and refinements to the shell one-liner to enumerate file types, suggested by folks via email and on the boards over at linuxtoday.com, LXer.com, fsdaily.com, and many other venues.

Again, since our policy regarding privacy is to regard everyone's privacy as equal and well deserved, we will only be referring to the folks who contributed by the nicknames and/or screen-names they used in "talkbacks" which are already posted on the internet. And, then, only if it's relevant and unavoidable.

The suggestions for changing this one-liner were excellent, numerous, and more concise than the original. They also indirectly pointed out the fact that, when I wrote it, I was obviously obsessing over awk ;) The original one-liner was this:

find . -print|xargs file|awk '{$1="";x[$0]++;}END{for(y in x)printf("%d\t%s\n",x[y],y);}'|sort -nr

To this, it was first suggested that names with spaces in them wouldn't work. This is absolutely true, and can be countered using a variation on "xargs" in the command line, like this:

find . -print|xargs -Ivar file "var"|awk '{$1="";x[$0]++;}END{for(y in x)printf("%d\t%s\n",x[y],y);}'|sort -nr

This effectively "double quotes" the arguments passed to xargs, but, while I was thinking about that I realized that there would also be additional work you'd need to do for single quotes/apostrophes, etc, to keep them from screwing up the command chain, as well. It was beginning to seem more and more like a solution that could definitely use some re-tooling.

So, naturally, one suggestion I received was to do it without using xargs. Good deal: One less hassle, to my way of thinking, and removing a whole lot of issues that didn't have to exist. The difference here is the use of the -exec flag with the find command, rather than piping to xargs:

find . -exec file {} \;|awk '{$1="";x[$0]++;}END{for(y in x)printf("%d\t%s\n",x[y],y);}'|sort -nr

The next suggestion I received was to do it without using awk, but keeping xargs. This has the advantage of removing one additional external command (standard though it may be) from the process. And removing awk can make things a lot less confusing for most folks (myself included, which is probably why I used it originally. Not out of a twisted desire to cause myself grief, but to try and get more comfortable with it ;) That suggestion looked like this:

find . -print | xargs file -b | sort | uniq -c | sort -nr

But, this took us back to the xargs quoting and space-in-filename issue. This is the final suggestion that came from that community debate, which I think is probably the best (as in most succinct and utile) since it does it without awk, addresses the issues with xargs and can handle all the issues raised above:

find . -print0 | xargs -0r file -b | sort | uniq -c | sort -nr
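To convince yourself that this final version really is immune to the quoting problems, you can throw some deliberately nasty names at it; the throwaway directory here is just for the demonstration:

```shell
#!/bin/sh
# The -print0/-0 pairing delimits names with NUL bytes, so spaces,
# apostrophes and ampersands in filenames pass through xargs intact.
d=$(mktemp -d)
touch "$d/plain.txt" "$d/with space" "$d/it's & tricky"

find "$d" -type f -print0 | xargs -0r file -b | sort | uniq -c | sort -nr
# All three (empty) files get typed, with no "Missing quote" errors.

rm -rf "$d"
```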

If you want to check out this interaction, to gain some more insight into the thought behind each version, you can find it here on linuxtoday.com (at least for a while, assuming it will get moved eventually).

And, once again, a huge "Thank you" to anyone and everyone whose helpful criticism proved that, not only is there more than one way to skin a cat, there are far more efficient ways to do it than you or I may imagine in one sitting (but, please, don't skin any cats ;)

There's probably someone out there who knows a way to do it even better. Which is, of course, the beauty of Linux and Unix and why I enjoy working in the shell. It can be as simple or as complicated as you need it to be, and is flexible enough to allow users the creativity to determine the path and the outcome of virtually everything that can be accomplished using either OS (or both :)

Have a great morning/day/afternoon/evening,

, Mike

Thursday, May 29, 2008

Simple Shell One-Liner To Enumerate File Types In Linux and Unix

Hey there,

Lately, we've been focusing a lot on Perl "one liners," from mass file time syncing to name and IP resolution and I thought it was only fair that we should write a post about a shell "one liner" for once. After all, most standard Unix and Linux shells are perfect for that purpose :)

Here's a very quick way to take an inventory of all the different file types (including directories, sockets, named pipes, etc) that exist in a given directory tree and provide a tally of each file type. I'm not entirely sure that I care if I have 15 ASCII text files and 2 Perl scripts in my current working directory, but this little piece of code must be able to help someone somewhere accomplish something, even if it is only to use as a smaller part of a larger organism ;)

This works in pretty much any standard shell on every flavour of Linux and/or Unix I've tested (ash, sh, bash, jsh, ksh, zsh, even csh and tcsh, which is huge for me since I never use those shells. One day, soon, I will redouble my efforts and just learn how to use them well, which, hopefully, won't result in my writing a whole bunch of posts about "cool" stuff that everyone has known about for the past few decades ;).

This one-liner could actually be written as a script, to make it more readable, like this:

#!/bin/sh
find . -print | xargs -I var file "var"|
awk '{
$1="";
x[$0]++;
}
END {
for (y in x) printf("%d\t%s\n", x[y], y);
}' | sort -nr


But, since I'm a big fan of brevity (which is about as far away from obvious as possible if you consider my writing style ;), I would run it like this:

host # find . -print|xargs file|awk '{$1="";x[$0]++;}END{for(y in x)printf("%d\t%s\n",x[y],y);}'|sort -nr

In my terminal, that all comes out on one line :) And here's the sort of output you can expect:

host # find . -print|xargs file|awk '{$1="";x[$0]++;}END{for(y in x)printf("%d\t%s\n",x[y],y);}'|sort -nr
23 Bourne-Again shell script text
11 ASCII text
10 perl script text
5 pkg Datastream (SVR4)
4 directory
4 ASCII English text
2 UTF-8 Unicode English text, with overstriking
2 Bourne shell script text
1 RPM v3 bin i386 m4-1.4.10-1


and, just to confirm the file count, we'll change the command slightly (to total up everything, instead of being particular -- another "one-liner" that's, admittedly, completely unnecessary unless you're obsessive/compulsive like me ;) and compare that with a straight-up find:

host # find . -print|xargs file|awk 'BEGIN{x=0}{x++}END{print x}'
62
host # find .|wc -l
62


Good news; everything appears to be in order! Hope this helps you out in some way, shape or form :)

Cheers,

, Mike

Saturday, February 2, 2008

Shell Script To Fake Linux Locate On Solaris

Hey There,

For this lazy weekend, I thought I'd put together a little shell script to address an issue I run into commonly enough.

If you've used RedHat Linux (or pretty much any Linux distro nowadays), you may have come to enjoy the convenience of the "locate" and "slocate" commands. Basically, all the command really does, in one mode, is create a database of every directory and file on a system (dated right then and there) and, in the most common usage mode, search that database file.
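The basic split that the script below automates can be sketched in two commands (the database name here is made up for the example): one pass snapshots the tree into a flat-file database, and every later search is just a grep against that snapshot:

```shell
#!/bin/sh
# A bare-bones locate/updatedb emulation: build a text database of the
# tree once, then search the snapshot instead of the live filesystem.
d=$(mktemp -d)
mkdir -p "$d/proj"
touch "$d/proj/report.txt" "$d/notes.md"

find "$d" -print > "$d/filedb"    # the "updatedb" step: build the database
grep 'report' "$d/filedb"         # the "locate report" step: search it

rm -rf "$d"
```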

Adhering to these basic concepts, I wrote up a quick script to kind-of do the same thing on Solaris (As per above, this should work on pretty much any Unix system, as well).

Basically, you'll invoke it once to create file and directory databases (using simple find and ls commands), and run the generated scripts to search those text databases. This script has only 3 options:

f - to make a file-listing text database and create a built in script to emulate searching for files.
d - to make a directory-listing text database and create a built in script to emulate searching for directories.
nodat - to not create the additional tree.dat file that is only a directory database; no built-in script.

Depending upon what you want, if you were to run:

host # ./gfind f d nodat

you'd end up with two files, which you could invoke like this:

host # ./filefind FILENAME
host # ./treefind FILENAME


and these two commands would list out all the files matching your query (grep is used under the covers, so you can use shortened versions of your desired search string to get back more results). It works just like locate/slocate in that, when you run it (just like updatedb for locate/slocate), the results you get are what was on the system at the time you ran it.

If you opt to not include the "nodat" argument on your command line (only valid with the "d" option), the script will also create a totally separate directory database that you can run whatever crazy grep-like commands you want to against it.

Hope it helps you out to a good degree :)

Cheers,



#!/bin/ksh
#
# gfind - file and directory search program producer and
# directory tree mapper
#
# 2008 - Mike Golvach - eggi@comcast.net
#
# Creative Commons Attribution-Noncommercial-Share Alike 3.0 United States License
#

trap 'echo "";echo "Cleaning up and bailing out!!!";rm -f treefind filefind tree.dat johno tempmap maptemp newmap oldmap;exit' 1 2 3 15

rm -f treefind filefind tree.dat johno tempmap maptemp newmap oldmap

function make_treeexec
{
print "#!/bin/ksh" >> treefind
print "" >> treefind
print "for x" >> treefind
print "do" >> treefind
print "grep \"\$x\" treefind" >> treefind
print "done" >> treefind
print "exit" >> treefind
print "" >> treefind
chmod 777 treefind
}

function make_fileexec
{
print "#!/bin/ksh" >> filefind
print "" >> filefind
print "for x" >> filefind
print "do" >> filefind
print "grep \"\$x\" filefind" >> filefind
print "done" >> filefind
print "exit" >> filefind
print "" >> filefind
chmod 777 filefind
}

function make_filefind
{
find / -print -depth 2>/dev/null >> johno
sed '/^\/proc/d' johno >> filefind
rm johno
}

function make_treefind
{
ls -R / 2>/dev/null >> tempmap
sed -n '/\//p' tempmap >> maptemp
sed -e '/^\/dos/d' -e '/^\/proc/d' maptemp >> treefind
rm tempmap
}

function make_dirtree
{
awk '{if ($0 ~ /^\/[^\/]*\/[^\/]*$/)
{print " "$0}
else if ($0 ~ /^\/[^\/]*\/[^\/]*\/[^\/]*$/)
{print " "$0}
else if ($0 ~ /^\/[^\/]*\/[^\/]*\/[^\/]*\/[^\/]*$/)
{print " "$0}
else if ($0 ~ /^\/[^\/]*\/[^\/]*\/[^\/]*\/[^\/]*\/[^\/]*$/)
{print " "$0}
else if ($0 ~ /^\/[^\/]*$/)
{print $0}
else
{print " "$0}
}' maptemp >> newmap
sed 's/^ \/[^\/]*/ /' newmap >> oldmap
sed 's/^ \/[^\/]*\/[^\/]*/ /' oldmap > newmap
sed 's/^ \/[^\/]*\/[^\/]*\/[^\/]*/ /' newmap > oldmap
sed 's/^ \/[^\/]*\/[^\/]*\/[^\/]*\/[^\/]*/ /' oldmap > newmap
sed 's/^ \/[^\/]*\/[^\/]*\/[^\/]*\/[^\/]*\/[^\/]*/ /' newmap > oldmap
rm maptemp newmap
mv oldmap tree.dat
}

if [ $# -lt 1 ] || [ $# -gt 3 ]
then
echo "Must specify at least one option!!!!"
echo ".. and no more than three..."
echo "Usage: gfind ( d ) ( f ) ( nodat )"
echo "Option \"nodat\" only valid with \"d\""
exit 1
fi
if [ $# -eq 3 ]
then
if [[ $1 = d && $2 = f && $3 = nodat ]] || [[ $1 = f && $2 = d && $3 = nodat ]] || [[ $1 = d && $2 = nodat && $3 = f ]] || [[ $1 = f && $2 = nodat && $3 = d ]] || [[ $1 = nodat && $2 = d && $3 = f ]] || [[ $1 = nodat && $2 = f && $3 = d ]]
then
make_fileexec
make_filefind
make_treeexec
make_treefind
rm -f maptemp
echo "File \"treefind\" created successfully!"
echo "File \"filefind\" created successfully!"
exit
else
echo "Hmm... Something wasn't right there"
echo "Your only options are \"d\" \"f\" or"
echo "\"nodat\", which is only good with \"d\""
exit 1
fi
fi
if [ $# -eq 2 ]
then
if [[ $1 = d && $2 = f ]] || [[ $1 = f && $2 = d ]]
then
make_fileexec
make_filefind
make_treeexec
make_treefind
make_dirtree
echo "File \"treefind\" created successfully!"
echo "File \"filefind\" created successfully!"
echo "File \"tree.dat\" created successfully!"
exit
elif [[ $1 = d && $2 = nodat ]] || [[ $1 = nodat && $2 = d ]]
then
make_treeexec
make_treefind
rm -f maptemp
echo "File \"treefind\" created successfully!"
exit
else
echo "Ooops, you can only possibly combine"
echo "\"d\" and \"f\" or \"d\" and \"nodat\""
echo "with only two options taken!!!"
exit 1
fi
fi
if [ $# -eq 1 ]
then
if [[ $1 = d ]]
then
make_treeexec
make_treefind
make_dirtree
echo "File \"treefind\" created successfully!"
echo "File \"tree.dat\" created successfully!"
exit
elif [[ $1 = f ]]
then
make_fileexec
make_filefind
echo "File \"filefind\" created successfully!"
exit
elif [[ $1 = nodat ]]
then
echo "Option \"nodat\" only works with option \"d\"!"
exit 1
else
echo "I Don't know that option!!! Try \"d\" or \"f\""
exit 1
fi
fi


, Mike




Tuesday, November 27, 2007

Using find and xargs to locate Windows Files

A lot of times, when you're asked to find something on a machine, and you only have a moderate idea of what you're specifically looking for, you'll use the obvious command: find. find is a great command to use because you can use wildcards in your expression argument. So, if you know that you're looking for something like "theWordiestScriptEver," and you have no idea where it's located on your box, you could find it by typing just this:

find / -iname "*word*" -print

This will find every file on the system (even on non-local mounts, if you have them set up) and only print the results for files with the word "word" in the name. Note that the "-iname" option matches without regard to case, so h and H both match. This option isn't available in all versions of find. If you don't have it available, you'll get an error when you run the above line (just use "-name" instead). The standard Solaris find does not do "case insensitive" pattern matching, so your best bet is to find the smallest substring that you're sure of the case on, or to search on another attribute (like -user for the userid or -atime for the last access time). Alternatively, you could spend hours stringing together a bunch of "or" conditions for every conceivable combination of upper and lower case letters in your expression.
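One middle ground between -iname and hours of "or" conditions is a bracket class for each letter in the -name pattern; it's verbose, but it's portable to finds that lack -iname. Demonstrated here on a throwaway directory:

```shell
#!/bin/sh
# Case-insensitive matching without -iname: bracket classes in the -name
# pattern match either case of each letter.
d=$(mktemp -d)
touch "$d/Word.txt" "$d/sWORDfish" "$d/other"

find "$d" -type f -name '*[wW][oO][rR][dD]*' -print
# Matches Word.txt and sWORDfish, but not "other".

rm -rf "$d"
```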

Now suppose you needed to perform an action on a file you found. You could use find's built-in exec function, like so:

find / -iname "*word*" -print -exec grep spider {} \;

This will perform the command "grep spider" on all files that match the expression. Which brings us around to the next predicament: what do you do if you have to find something simple, have no idea where it is on your box, "and" that box hosts file systems that Windows users are allowed to write files to? The above example should work just fine on those. My own advice is, if you can get away with just using find, do so, since it handles all of the rogue characters, tabs and spaces in Windows files on its own.

Now, if you have to do something much more complicated (or convoluted), you'll want to pipe to a program like xargs, which is where all those funny Windows file names and characters (some of which are special to your shell) start to cause issues. Again, this would return ok:

# find . -name "*word*" -print
./word - file's
./word file
./word & file's
./word file's


But this will become an issue if you pipe it to xargs, as shown below:

# find . -name "*word*" -print|xargs ls
xargs: Missing quote: files


Ouch! xargs doesn't deal with those spaces, tabs and special characters very well. You can fix the space/tab problem very simply by using xargs' "named variable" option. Normally, xargs acts on the input it receives (thus "xargs ls," above, is running ls on each file name find sends it), but you can alter how it deals with that data in a simple way (at least as far as the spacing issue is concerned). Example below:

# find . -name "*file" -print|xargs -ivar ls "var"
./word file
# find . -name "*word*" -print|xargs -ivar ls "var"
xargs: Missing quote: ./word - files


But, in the second invocation above, you see that it still can't handle the "shell special" characters, like the single quote (') or the double quote ("), so it's time to step it up. I prefer to just sanitize everything that's not kosher, even though I know I don't technically have to worry about the \ / : * ? " < > characters, since Windows won't allow them as parts of file names. It seems easier just to act on anything that isn't a letter or number and pass it along with enough escapes (backslashes) that xargs can parse it correctly and get you back good information. Here's how to do that, using sed (and a little grep, to keep it neat):

# find . -name "*word*" 2>&1|grep -iv denied|sed "s/\([^A-Za-z0-9]\)/\\\\\1/g"|xargs -ivar ls "var"
./word - file's
./word file
./word & file's
./word file's


And now you can use find, combined with xargs, on all the files you have permission to see, no matter what goofy characters are in them :)
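As a footnote, on systems with GNU find and xargs there's a simpler escape hatch than the sed scrub: NUL-terminated names. This is an alternative technique, not the one used above, and it needs the -print0/-0 options to be available:

```shell
#!/bin/sh
# NUL-delimited names sidestep the quoting problem entirely: -print0 ends
# each name with a NUL byte, and xargs -0 splits only on NULs.
d=$(mktemp -d)
touch "$d/word - file's" "$d/word & file's"

find "$d" -type f -name '*word*' -print0 | xargs -0 ls
# Both names come back intact, with no "Missing quote" error.

rm -rf "$d"
```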

, Mike