Monday, October 20, 2008

Bash Script To Access Wikipedia

Hey There,

This week's Monday Linux/Unix bash shell script brings us one step closer to the end of our "web-based-info-site command line program" project (and we'll never again hire anyone who gets paid by the word to name a project for us ;)

Previous entries you may be interested in, in reverse chronological order, include our posts on accessing the Farmer's Almanac, accessing the International Dictionary, checking out the world's weather, spewing out famous quotations on pretty much any subject, doing encyclopedia lookups, accessing the online Thesaurus, translating between different languages and, of course, using the online dictionary.

This script gets its content from the English Wikipedia (en.wikipedia.org) and accepts a variable number of arguments. The only real caveat, when executing the program, is to put any query containing "special" characters (apostrophes, etc.) in double quotes. Also, you'll note that this script includes a "pager" variable (which we've set to /usr/bin/more) since the output you get back will most definitely be looooooong. You can point that variable at your favorite pager by editing that one line. Otherwise, it can be run easily, like so (one example command line per general execution mode):

host # ./wikipedia.sh linux
host # ./wikipedia.sh linux kernel
host # ./wikipedia.sh "fermat's last theorem"
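
If you're wondering why the quotes matter: an unquoted apostrophe opens a quoted string as far as the shell is concerned, so bash just sits there waiting for the closing one. The snippet below is only a rough sketch of how the arguments can be turned into a single query and how a pager can be picked. The function names build_query and pick_pager are made up for illustration, and falling back from the PAGER environment variable to /usr/bin/more is our assumption, not something the script itself does.

```shell
#!/bin/bash

# build_query (hypothetical helper): join every argument into one string
# and swap spaces for underscores, the way Wikipedia article URLs are written.
build_query() {
    local joined="$*"              # "$*" joins the arguments with single spaces
    printf '%s\n' "${joined// /_}"
}

# pick_pager (hypothetical helper): use $PAGER if it is set and non-empty,
# otherwise fall back to /usr/bin/more (an assumption for this sketch).
pick_pager() {
    printf '%s\n' "${PAGER:-/usr/bin/more}"
}

build_query linux kernel                # prints linux_kernel
build_query "fermat's last theorem"     # prints fermat's_last_theorem
```

Either way you call it, several separate words or one quoted string, you end up with the same kind of underscore-joined query.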


Below are two pictures of the output (from the first two examples above - linux and linux kernel). You'll note that the second example is already on the second page of output. This was done intentionally so that you could see a little bit more of the output and, also, because the first page was incredibly boring ;)

Click the pictures below and prepare to be totally amazed ...but only if you're easily impressed ;)





Have fun using this script. I haven't had time to QA it totally (the downside of that work-for-pay part of life ;), but it's fairly solid. One thing I didn't have time to check thoroughly was how it responds to foreign Wikipedia queries.

NOTE: We're currently working on a way to grab television listings based on your zip code and service provider. We were hoping to release it this week, but it's been a bigger PITA than anticipated. Everyone here has got the next week off and, hopefully, that will allow us the time to nail it. Plus, we really need to catch up on our email. We promise to respond to everyone who's written in as soon as humanly possible :)

Cheers,


This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 United States License

#!/bin/bash

#
# wikipedia.sh - Forget the regular encyclopedia
#
# 2008 - Mike Golvach - eggi@comcast.net
#
# Creative Commons Attribution-Noncommercial-Share Alike 3.0 United States License
#

numargs=$#

if [ $numargs -lt 1 ]
then
    echo "Usage: $0 Your Wikipedia Query"
    echo "Ex: $0 linux"
    echo "Ex: $0 \"linux kernel\""
    echo "Quotes only necessary if you use apostrophes, etc"
    exit 1
fi

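# Join all the arguments into one query, then replace spaces with
# underscores to match Wikipedia's article URLs. This is done
# unconditionally so a single quoted multi-word argument is handled, too.
args=`echo "$*"|sed 's/ /_/g'`
wget=/usr/bin/wget
pager=/usr/bin/more

echo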

$wget -nv -O - "http://en.wikipedia.org/wiki/${args}" 2>&1|grep -i "Wikipedia does not have an article with this exact name" >/dev/null 2>&1

anygood=$?

if [ $anygood -eq 0 ]
then
    # Turn any underscores back into spaces for a readable message
    args=`echo "$args"|sed 's/_/ /g'`
    echo "No results found for $args"
    exit 2
fi

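# Fetch the page, strip the HTML tags, trim the header and the trailing
# "See also"/disambiguation sections, squeeze blank lines and page the rest.
$wget -nv -O - "http://en.wikipedia.org/wiki/${args}" 2>&1 |
    sed -e :a -e 's/<[^>]*>/ /g;/</N;//ba' |
    sed -e '1,/Jump to:/d' -e '/^$/N;/\n$/N;//D' -e '/^.*\[.*edit.*\].*See also.*$/,$d' -e '/This *disambiguation *page/,$d' -e '/^$/N;/\n$/D' |
    $pager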

exit 0
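
The first sed command in that last pipeline does most of the heavy lifting, so here's a small sketch of it on its own. It replaces every complete HTML tag with a space and, if a tag is still open at the end of a line, pulls in the next line and tries again. The strip_tags function name and the sample HTML are made up for illustration, and we're assuming GNU sed here.

```shell
#!/bin/bash

# strip_tags (hypothetical helper): the same tag-stripping sed expression
# used in the script, wrapped in a function so it can be tried by itself.
#   :a               label to loop back to
#   s/<[^>]*>/ /g    replace each complete tag with a space
#   /</N             if a "<" is left over, append the next input line
#   //ba             and, if a "<" is still present, jump back to :a
strip_tags() {
    sed -e :a -e 's/<[^>]*>/ /g;/</N;//ba'
}

# Made-up sample input with one tag split across two lines
printf '%s\n' '<p>Linux is a <a' 'href="/wiki/Kernel">kernel</a>.</p>' | strip_tags
```

Run it and you should get the words back with the markup gone, including the anchor tag that was split over two lines.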

Mike




Please note that this blog accepts comments via email only. See our Mission And Policy Statement for further details.