Wednesday, May 21, 2008

Simple Name And IP Resolution Using Perl On Linux Or Unix

Greetings,

Today we're going to take a look at a simple way to demonstrate something that happens pretty much every time you use the web, check your email or even turn on your PC so you can open up whatever software package you're using to read this page right now.

The "something" that always seems to be happening is name/IP resolution. It happens on Linux, Unix, Windows and just about any operating system or device that does any sort of networking, and you can do it from just about any language, like Perl, Bash and C.

Previously, we've looked at it in the context of much more complicated structures in posts on using Perl to run a shell on a socket and non-maliciously scanning for open network ports. In a simplistic sense, someone once described the whole situation like this: "Computers read numbers. Humans read English." Of course, English was his native tongue. Humans generally read whatever languages they have the ability, or need, to.

But, the statement is fairly valuable in its compact way of expressing a much larger and more complicated issue. I've never been one to either defend, or rage against, the nature of this whole process. It's the way it is, because that's the way it is. And, with the assumption that it is that way (because it is), I usually just ask folks if they'd rather remember 130+ dotted quads or 130+ relevant names? Humans generally prefer names. The computers, playing it smart, are not coming out with any stated position on either side of the debate (although they still insist on translating every name to a number ;)

Knowing that we, being the humans and not the computers, "have" to work with both numbers and names makes understanding the translation process all the more valuable. Probably most readers are familiar with the standard "nslookup" or "dig" commands, which provide a nice frontend to the process. For instance, with nslookup (much the same as dig), you can find out a host's IP address if you know the name, like so:

host # nslookup www.lycos.com
Server: dns.xyz.com
Address: 10.1.0.1 <--- This is the IP address of the server that's doing the name/IP resolution

Non-authoritative answer:
Name: lycoshome.bos.lycos.com <--- Note that this is the name that will show up when we resolve the IP
Address: 209.202.230.30 <--- This is the IP address of www.lycos.com
Aliases: www.lycos.com <--- This is the name we looked up, but it's noted here that this name is an alias for the name listed two lines above (the "actual" hostname, ...probably)

Conversely, we could look up that IP address using the exact same method, with (what should be) equal, but inverse, results:

host # nslookup 209.202.230.30
Server: dns.xyz.com
Address: 10.1.0.1

Name: lycoshome.bos.lycos.com <--- and, yes, the name here matches the "actual" hostname from the first lookup, not the www.lycos.com alias we typed in :)
Address: 209.202.230.30

While this type of hostname/IP translation is valuable and highly useful, it's not necessarily the best tool to use when doing network programming or scripting. For instance, if we were writing a Perl script that took either an IP address or a hostname as its only argument, calling out to nslookup or dig would make our system do more work than it needs to. For today, we're going to do all of our work using "perl -e" to run mini-scripts (or execution statements) from the command line. Any of this stuff could easily be inserted into a Perl "script":

host # perl -e '$name = `nslookup www.lycos.com|grep Address|sed 1d`;chomp($name);print "$name\n";'
Address: 209.202.230.30


That one line of code populated the $name variable with the value "Address: 209.202.230.30". That's a value we'd have to massage even more to remove the unsightly characters "Address: " that appear on the same line as the actual address, which is all we want.
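For the curious, here's a rough sketch of what that extra massaging might look like if we did it all inside Perl. It works on a canned copy of the nslookup output from above (so it needs no network), and the strip_label name is just something made up for this illustration. It assumes the output looks exactly like the sample, which is a big assumption:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Canned copy of the nslookup output shown earlier (no network needed).
my $output = <<'END';
Server: dns.xyz.com
Address: 10.1.0.1

Non-authoritative answer:
Name: lycoshome.bos.lycos.com
Address: 209.202.230.30
Aliases: www.lycos.com
END

# Hypothetical helper: pull the dotted quad off a line like
# "Address: 209.202.230.30". Returns undef if the line doesn't fit.
sub strip_label {
    my ($line) = @_;
    return $line =~ /^Address:\s*(\d+\.\d+\.\d+\.\d+)/ ? $1 : undef;
}

# Mimic "grep Address | sed 1d": keep the Address lines, then drop
# the first one, which belongs to the DNS server itself.
my @lines = grep { /^Address/ } split /\n/, $output;
shift @lines;

my $ip = strip_label( $lines[0] );
print "$ip\n";
```

With the canned text this prints 209.202.230.30, but only because the output is laid out exactly the way the regular expression expects.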

If you're using Perl to do your network scripting, doing hostname/IP resolution in this manner is counter-productive for two big reasons (there are probably more little ones):

1. The backtick operators cause Perl to invoke a system shell in order to execute the "nslookup" command, and all the other commands within the backticks. This is detrimental (to a varying degree) because we logged onto our system in a shell, from which we invoked Perl as a child process. This code asks Perl to open yet another shell underneath itself, and that shell has to start nslookup, grep and sed, all to perform a function that Perl is equipped to handle on its own. If you find that you're using system() or backtick calls a lot in your Perl code, you should, perhaps, consider rewriting it as a shell script. If you just do this once, you won't notice any difference in the time it takes your script to run, but the overhead adds up with every subsequent call.

2. The variable hacking and chopping that goes along with the pipe-chain, which contains at least one grep and one sed command, is fragile. If the output of the nslookup command isn't exactly what we expect, our assumptions about how to parse the data will produce a result we didn't expect, either.

So, basically, I'm saying that these sorts of methods are slower (they cause the system to do more work) and aren't guaranteed to produce meaningful results.

Fortunately, as we noted, Perl can take care of hostname-to-IP resolution, and the opposite IP-to-hostname resolution, easily, using built-in functions and methods.

Now let's take a look at those two translations. We'll, again, be executing both Perl commands directly from the shell command line for ease of execution:

First, let's look up www.lycos.com again:

host # perl -e 'use Socket; $packed = gethostbyname($ARGV[0]) or die "Cannot resolve $ARGV[0]\n"; $ip = inet_ntoa($packed); print "$ip\n";' www.lycos.com
209.202.230.30


That was simple :) Now let's look up that IP address and see if we can't get the hostname:

host # perl -e 'use Socket; $name = gethostbyaddr(inet_aton($ARGV[0]), AF_INET); print "$name\n";' 209.202.230.30
lycoshome.bos.lycos.com


And again, we have a fairly simple response. The only thing we don't see in the above examples is that www.lycos.com is an alias for the hostname lycoshome.bos.lycos.com. Perl can tell us that, as well, but it isn't worth our time (assuming it's at a premium), since it's easier to do a double-verification: resolve lycoshome.bos.lycos.com, too, and make sure that both names point at the same IP address and we aren't being "tricked" by a false DNS return:

host # perl -e 'use Socket; $packed = gethostbyname($ARGV[0]) or die "Cannot resolve $ARGV[0]\n"; $ip = inet_ntoa($packed); print "$ip\n";' lycoshome.bos.lycos.com
209.202.230.30


Great news: they both resolve to the same IP (209.202.230.30), so we can be reasonably sure the data is accurate. There are way too many levels to drill down through in order to make "absolutely" sure.
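If you do have the time to spare, here's a minimal sketch of how the alias information and the double-verification could be pulled out of Perl directly. It leans on the fact that gethostbyname, called in list context, hands back the name, the aliases and all of the addresses at once. The host_details and share_address names are made up for this example, "localhost" is only a stand-in default, and exactly what lands in the alias list depends on your system's resolver:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Socket;

# In list context gethostbyname returns
#   ($name, $aliases, $addrtype, $length, @addrs)
# where $aliases is a space-separated string and @addrs holds
# packed addresses. Returns nothing useful if the lookup fails.
sub host_details {
    my ($host) = @_;
    my ( $name, $aliases, $addrtype, $length, @addrs ) = gethostbyname($host);
    return unless defined $name;    # lookup failed

    my %info = (
        name    => $name,
        aliases => [ split ' ', ( defined $aliases ? $aliases : '' ) ],
        addrs   => [ map { inet_ntoa($_) } @addrs ],
    );
    return \%info;
}

# True if the two names resolve to at least one common address.
sub share_address {
    my ( $first, $second ) = @_;
    my $first_info  = host_details($first)  or return 0;
    my $second_info = host_details($second) or return 0;

    my %seen   = map { ( $_ => 1 ) } @{ $first_info->{addrs} };
    my @common = grep { $seen{$_} } @{ $second_info->{addrs} };
    return @common ? 1 : 0;
}

# 'localhost' is only a stand-in so the sketch runs without arguments.
my $host = @ARGV ? $ARGV[0] : 'localhost';
if ( my $info = host_details($host) ) {
    print "Name:    $info->{name}\n";
    print "Aliases: @{ $info->{aliases} }\n";
    print "Address: $_\n" for @{ $info->{addrs} };
}
else {
    warn "Cannot resolve $host\n";
}
```

Something like share_address('www.lycos.com', 'lycoshome.bos.lycos.com') would then stand in for the two separate lookups above, with the same caveat: it only tells you that the two names agree with each other, not that the answer is "absolutely" right.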

Hopefully, you'll be able to integrate Perl's built-in functions for translating hostnames to IP addresses, and vice versa, into your own scripts (or use them on the command line) and save yourself some time and trouble :)
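As a starting point, here's a rough sketch of the kind of script mentioned earlier: one that takes either a hostname or an IP address as its only argument and resolves it in the appropriate direction. All of the subroutine names are invented for this example, it only deals with IPv4 dotted quads, and "localhost" is just a stand-in default so it runs with no arguments:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Socket;

# True if the argument looks like an IPv4 dotted quad (four octets, 0-255).
sub is_dotted_quad {
    my ($arg) = @_;
    my @octets = $arg =~ /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/
        or return 0;
    for my $octet (@octets) {
        return 0 if $octet > 255;
    }
    return 1;
}

# Hostname to dotted quad. In scalar context gethostbyname returns
# a packed address, or undef if the lookup fails.
sub name_to_ip {
    my ($host) = @_;
    my $packed = gethostbyname($host);
    return defined $packed ? inet_ntoa($packed) : undef;
}

# Dotted quad to hostname, or undef if there is no reverse entry.
sub ip_to_name {
    my ($ip) = @_;
    my $packed = inet_aton($ip);
    return undef unless defined $packed;
    my $name = gethostbyaddr( $packed, AF_INET );
    return $name;
}

# Resolve in whichever direction the argument calls for.
sub resolve {
    my ($arg) = @_;
    return is_dotted_quad($arg) ? ip_to_name($arg) : name_to_ip($arg);
}

# 'localhost' is only a stand-in so the sketch runs without arguments.
my $arg    = @ARGV ? $ARGV[0] : 'localhost';
my $answer = resolve($arg);
print defined $answer ? "$answer\n" : "No answer for $arg\n";
```

Saved under a name of your choosing and run with www.lycos.com or 209.202.230.30 as the argument, it should behave like the two one-liners above, minus the ugly error message when a lookup fails.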

Cheers,

Mike