Monday, October 27, 2008

Get Your Local TV Listings From The Bash Command Line

Hey There,

For this week's Monday Linux/Unix bash shell script, we're finally starting to go after online TV listings. If you've already checked out all of our other bash CLI scripts aimed at saving you from opening your web browser, please skip the following paragraph. It's redundant, to say the least ;)

Our previous web-based Bash scripts, in reverse chronological order, include our posts on accessing Wikipedia, accessing the Farmer's Almanac, accessing the International Dictionary, checking out the world's weather, spewing out famous quotations on pretty much any subject, doing encyclopedia lookups, accessing the online Thesaurus, translating between different languages and, of course, using the online dictionary.

This script gets its content from TV Listings At Zap2It.com and accepts a maximum of two arguments. The US Zip Code is required (we haven't checked whether this works for overseas locations but, at this point, we highly doubt it), and you can also pass the script a "nohd" argument so that only standard TV listings are returned. By default, the script returns all local television results (including HD channels) for the current time period. Once we crack the dynamic object code, we'll definitely post an updated version that lets you pick different "begin times" and widen the time period your output covers.

You'll also note that this script, again, includes a "pager" variable (which we've set to /usr/bin/more), since the output you get without the "nohd" option is extremely long and repetitive. You can easily edit that variable to point at your favorite pager. Otherwise, the script can be run simply, like so (one example command line per general execution mode):

host # ./localtv.sh 60015
host # ./localtv.sh 60015 nohd <-- Strongly recommended for now, even if you have HD. It seems that most program listings are repetitious, although getting the output with the HD listings (default) can help you figure out what HD channels are broadcasting in your area.
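
If you'd rather not edit the "pager" variable by hand, something along these lines might work. It's only a sketch: "pick_pager" is a made-up helper name (not part of localtv.sh), and it assumes you're happy to fall back from $PAGER to less, then more, then plain cat:

```shell
#!/bin/bash
# Sketch: choose a pager without hard-coding its path.
# pick_pager is a hypothetical helper, not part of localtv.sh.
# Order of preference: $PAGER, then "less", then "more", then plain "cat".
pick_pager() {
    local candidate
    for candidate in "$PAGER" less more cat
    do
        # Skip empty entries (PAGER unset) and commands that are not installed
        if [ -n "$candidate" ] && command -v "$candidate" >/dev/null 2>&1
        then
            echo "$candidate"
            return 0
        fi
    done
    return 1
}

pager=`pick_pager`
echo "Using pager: $pager"
```

The result could then be used wherever the script pipes its output to $pager.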

Below are two pictures of the output (from the first two examples above - "60015" and "60015 nohd"). As you can see, we had to clip the HD version, since there are about 5 HD channels for every "regular" channel ;)

[Image: HD TV listings]

[Image: TV listings without HD]

I hope this script (in its infancy) is somewhat helpful or, at least, amusing for you ;) If you question the output, you can verify it by checking the TV Listings At Zap2It.com. It's highly possible that you may get output that we didn't get when running (literally, running ;) this script through its paces. Feel free to remove the 2 or 4 self-nullifying sed expressions in the script, as well ;)

Cheers,


This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 United States License

#!/bin/bash

#
# localtv.sh - Get your local regular and HD Tv listings
#
# 2008 - Mike Golvach - eggi@comcast.net
#
# Creative Commons Attribution-Noncommercial-Share Alike 3.0 United States License
#

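numargs=$#
hour=`date "+%H"`
minute=`date "+%M"`
# Force base 10 so that values like "08" and "09" are not read as octal
hour=$((10#$hour))
minute=$((10#$minute))
if [ $hour -lt 12 ]
then
    ampm="AM"
else
    ampm="PM"
    let hour=$hour-12
fi
# 0 o'clock is 12 on a 12-hour clock (midnight and noon)
if [ $hour -eq 0 ]
then
    hour=12
fi
# Round down to the start of the current half-hour block
if [ $minute -lt 30 ]
then
    nicetime="${hour}:00 $ampm"
else
    nicetime="${hour}:30 $ampm"
fi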

nohd=0

# The Zip Code is required, so reject zero arguments as well as more than two
if [ $numargs -lt 1 -o $numargs -gt 2 ]
then
    echo "Usage: $0 USZipCode [nohd]"
    exit 1
fi
if [ $numargs -eq 2 ]
then
    if [ "$2" == "nohd" ]
    then
        nohd=1
    else
        echo "Usage: $0 USZipCode [nohd]"
        exit 1
    fi
fi

args="$1"
wget=/usr/bin/wget
pager=/usr/bin/more
nicedate=`date "+%m/%d/%y"`

echo
echo "Television Listings for $nicedate - $nicetime"
echo

if [ $nohd -eq 1 ]
then
$wget -nv -O - "http://tvlistings.zap2it.com/tvlistings/ZCGrid.do?method=decideFwdForLineup&zipcode=${args}&setMyPreference=false&lineupId=PC:${args}" 2>&1|sed -e :a -e 's/<[^>]*>/ /g;/</N;//ba' |sed -e '/^[ \t]*$/d' |sed -e '1,/Forgotten password/d' -e '/isFavoritesAvailable/,$d'|sed -e '/[ECMP][SD]T/,+6d' -e "s/'/'/" -e 's/&/\&/' -e '/zc.getAdFrame/d' -e 's/^[ \t]*//;s/[ \t]*$//'| sed -n '/^[0-9][0-9]*$/,+2p'|sed 's/^\([0-9][0-9]*\)$/\n\1/'|sed '/^[0-9][0-9]*$/ {
N
N
s/ *\n/\t/g
}'|$pager
else
$wget -nv -O - "http://tvlistings.zap2it.com/tvlistings/ZCGrid.do?method=decideFwdForLineup&zipcode=${args}&setMyPreference=false&lineupId=PC:${args}" 2>&1|sed -e :a -e 's/<[^>]*>/ /g;/</N;//ba' |sed -e '/^[ \t]*$/d' |sed -e '1,/Forgotten password/d' -e '/isFavoritesAvailable/,$d'|sed -e '/[ECMP][SD]T/,+6d' -e "s/'/'/" -e 's/&/\&/' -e '/zc.getAdFrame/d' -e 's/^[ \t]*//;s/[ \t]*$//'| sed -n '/^[0-9][0-9]*\.*[0-9]*$/,+2p'|sed -e 's/^\([0-9][0-9]*\.*[0-9]*\)$/\n\1/'|sed '/^[0-9][0-9]*\.*[0-9]*$/ {
N
N
s/ *\n/\t/g
}'|$pager
fi

exit 0


Mike




MikeS had this to add regarding the script. Some of it has been worked into our updated script and some of it will definitely be added to the to-do list!

Hey I love your local TV bash script. I only noticed one
problem: it does not provide other options, for example if you
have digital service through a local provider or, say, DISH. I
am pretty new to using bash, and sed gives me a headache. I
noticed on their page that if you click on TV listings and
enter your zip code it gives you a list of options consisting
of "cable", "satellite" and "local." I wonder how hard it
would be to parse that out first, then let the user select the
best option. I hard coded my option into your code and it
works great, so I can't imagine it not working well. If
you're not interested in working on it, that's cool. I'll just
have to delve into bash and sed to get it figured out, maybe
throw in a little zenity. Anyway, it's great. If I do make any
changes I'll bounce it back to you. Oh, there are some text
issues too, like how the & and " are displayed.

htmlspecialchars can be a pain to handle.
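
For the text issues MikeS mentions, one way to handle the most common entities in the shell might look like the following. Treat it as a sketch: "decode_entities" is a hypothetical name, the list of entities is our own guess at what shows up most often, and it assumes the listings only contain these few:

```shell
#!/bin/bash
# decode_entities: hypothetical filter, not part of localtv.sh.
# Translates a handful of common HTML entities read from standard input.
decode_entities() {
    sed -e 's/&quot;/"/g' \
        -e "s/&#39;/'/g" \
        -e 's/&lt;/</g' \
        -e 's/&gt;/>/g' \
        -e 's/&amp;/\&/g'    # decode &amp; last so "&amp;quot;" is not decoded twice
}

echo 'Law &amp; Order: &quot;Pilot&quot;' | decode_entities
```

Anything beyond this short list (numeric references in general, accented characters and so on) is better left to a proper entity table.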

EDITOR'S NOTE: I'm leaving this part out and redirecting to our post on posting code on Blogger since trying to pull this off again makes me nuts ;)

Later,

Mike


Russ had this to add, which ended up being another major contributing factor in the update of this script!

I just took a look at the zap2it page; I see that the GET parameter in the URL for a selected time of 20:00 PDT today is: ?fromTimeInMillis=1225249200000. That appears to be Unix milliseconds from the epoch. (The page source shows a number of day/time values for that variable embedded in JavaScript.) I tried plugging it into the URL in the script and it ran just fine, giving me the 8:00 listing. So the object would be to obtain the proper millisecond value for the desired time from a script date argument. I know how to do that in Python, but not in the shell.

Regarding the 3-hour time block, it appears that all the data is on the page. Each program is in an HTML anchor, with the time in milliseconds given as a tag's "sch" attribute and the channel in a "chn" attribute. So it seems it would be possible to loop through each channel for the time periods and display each one with its clock time. That's quite a bit more complicated than the current script, though, and I don't think I'll tackle it. On the other hand, perhaps someday I'll give it a try in Python, which is my language.

Best,
-Russ
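
For the shell side of Russ's question, the conversion between clock times and millisecond values could be sketched roughly like this. It assumes GNU date (the -d option isn't available in every date implementation), and "to_millis" and "from_millis" are made-up helper names rather than anything in the script:

```shell
#!/bin/bash
# Hypothetical helpers (not part of localtv.sh). Both assume GNU date.

# to_millis: print milliseconds since the epoch for a date/time string
to_millis() {
    local seconds
    seconds=`date -d "$1" "+%s"` || return 1
    echo "${seconds}000"
}

# from_millis: print the local clock time for a millisecond value
from_millis() {
    date -d "@$(( $1 / 1000 ))" "+%I:%M %p"
}

# Example: 8:00 PM today, local time
millis=`to_millis "today 20:00"`
echo "fromTimeInMillis=${millis}"
from_millis "$millis"
```

As a sanity check, the value Russ quotes (1225249200000) corresponds to 03:00 UTC on October 29, 2008, which is 20:00 PDT on October 28.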


Russ also submitted these script add-ons:

Code for "clean.py"

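#! /usr/bin/python
""" Converts HTML entities to their corresponding characters.
"""

import sys, re

try:
    import htmlentitydefs                   # Python 2
except ImportError:
    import html.entities as htmlentitydefs  # Python 3
    unichr = chr

pattern = re.compile(r"&(#?\w+?);")

def descape_entity(m, defs=htmlentitydefs.name2codepoint):
    # callback: translate one entity to its corresponding character
    name = m.group(1)
    try:
        if name.startswith('#x') or name.startswith('#X'):
            return unichr(int(name[2:], 16))    # hex reference, e.g. &#x27;
        if name.startswith('#'):
            return unichr(int(name[1:]))        # decimal reference, e.g. &#39;
        return unichr(defs[name])               # named entity, e.g. &amp;
    except (KeyError, ValueError, OverflowError):
        return m.group(0)                       # unknown entity: leave it untouched

def descape(string):
    """ Processes the string for HTML entities. """
    return pattern.sub(descape_entity, string)

txt = sys.stdin.read() # Read the incoming text from the pipe
sys.stdout.write(descape(txt)) # Process and write to standard out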


and code for "tube.sh" - Note that some settings are hard-coded

#!/bin/bash

#
# localtv.sh - Get your local regular and HD Tv listings
#
# 2008 - Mike Golvach - eggi@comcast.net
#
# Creative Commons Attribution-Noncommercial-Share Alike 3.0 United States License
#

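numargs=$#
hour=`date "+%H"`
minute=`date "+%M"`
# Force base 10 so that values like "08" and "09" are not read as octal
hour=$((10#$hour))
minute=$((10#$minute))
if [ $hour -lt 12 ]
then
    ampm="AM"
else
    ampm="PM"
    let hour=$hour-12
fi
# 0 o'clock is 12 on a 12-hour clock (midnight and noon)
if [ $hour -eq 0 ]
then
    hour=12
fi
# Round down to the start of the current half-hour block
if [ $minute -lt 30 ]
then
    nicetime="${hour}:00 $ampm"
else
    nicetime="${hour}:30 $ampm"
fi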

args="97045"
wget=/usr/bin/wget
pager=/bin/more
nicedate=`date "+%m/%d/%y"`

echo
echo "Television Listings for $nicedate - $nicetime"
echo

# Get the HTML from the Web site and run through the common filters
$wget -nv -O - "http://tvlistings.zap2it.com/tvlistings/ZCGrid.do?method=decideFwdForLineup&zipcode=${args}&setMyPreference=false&lineupId=PC:${args}" 2>&1|sed -e :a -e 's/<[^>]*>/ /g;/</N;//ba' |sed -e '/^[ \t]*$/d' |sed -e '1,/Forgotten password/d' -e '/isFavoritesAvailable/,$d'|sed -e '/[ECMP][SD]T/,+6d' -e "s/'/'/" -e 's/&/\&/' -e '/zc.getAdFrame/d' -e 's/^[ \t]*//;s/[ \t]*$//' > fil

# Filter channels with no HD listings and direct to the display file (disp)
sed -n '/^[0-9][0-9]*$/,+2p' fil|sed 's/^\([0-9][0-9]*\)$/\n\1/'|sed '/^[0-9][0-9]*$/ {
N
N
s/ *\n/\t/g
}'|grep -Ew '^2|^6|^8|12|22|32|49' > disp
# Filter channels with HD listings and append to the display file
sed -n '/^[0-9][0-9]*\.*[0-9]*$/,+2p' fil|sed -e 's/^\([0-9][0-9]*\.*[0-9]*\)$/\n\1/'|sed '/^[0-9][0-9]*\.*[0-9]*$/ {
N
N
s/ *\n/\t/g
}'|grep -Ew '^10' >> disp
# Convert HTML entities and page the result
./clean.py < disp |$pager
# Delete the files
rm fil disp
exit 0


