## Saturday, May 31, 2008

### Looking At Formal Systems On Linux and Unix

Hey There,

For today's post, we're going to put out a brain-teaser that won't get solved (at least on this blog) until tomorrow. It's based on a formal system (which is basically a set of rules, with specific applications, used as a method to solve an equation or problem) invented by Emil Post back in the 1920s and is most commonly referred to as a "Post production system."

This basic formal system was re-popularized by Douglas R. Hofstadter in his book "Gödel, Escher, Bach: An Eternal Golden Braid" when he introduced the concept to folks, who weren't especially crazy about such abstract notions, by way of a little puzzle.

For today's post, we're going to reproduce (paraphrased, of course) that puzzle and let you see if you can solve it (or, if you can prove that it can't be solved, which is a possibility).

For tomorrow's post, we'll be putting up a script for Linux or Unix that will make this whole process much easier and will give you the option to "attack" this problem as quickly or slowly as you prefer. While the script will relieve you of the pleasure of completing this mental exercise (if it's at all possible to complete), it will either get you to the answer very quickly or make you wonder how long it might possibly take to solve. Considering the ingredients, that may turn out to be a huge chunk of your time ;)

And here we go. I hope you enjoy this puzzle as much as I did:

We begin with an absolute. You are starting out with a single string (which is absolutely defined, within the context of this formal system and puzzle, as a set of characters in a specific order - e.g. MI is not the same as IM). That absolute (or starting point) is simply "MI." (Note that all punctuation marks are "not" parts of the strings ;)

Given the string "MI," your goal will be to convert that string into the string "MU." Please note that, although a string like MIMUIM is a valid member of the M I U formal system, given the starting string of MI, you will never be able to have an M anywhere but in the first position. Sadly, this does not make finding the answer easy ;)

There are 4, and only 4, rules in this formal system, and you can only correctly solve the puzzle by applying any and/or all of them, one at a time, for as long as it takes to get you to "MU."

Rule 1: If your string ends with an "I", you can add a "U" to the end of it.

Ex: MI can become MIU.

Rule 2: If you have a string of the form "Mx," you can change that to "Mxx." Note here that the variable x can refer to a string (not necessarily just one character) and that only the letters M, I and U will ever exist in any string you produce by application of these rules. x is simply meant to be used as a variable notation.

Ex: MI can become MII
MIU can become MIUIU

The one thing to remember about this rule is that x always stands for everything that follows the "M," so each invocation of the rule doubles the whole remainder of the string, and only once. You can't pick out a smaller piece of the string and duplicate just that piece. For instance, neither of these is acceptable:

Ex: MIU cannot become MIUIUII (duplicating "IU" and then duplicating the "I" on its own)
MIU cannot become MIIU (duplicating only the "I")

You can, however, apply the rule again to the result, as a separate step:
MIU can become MIUIU, which can then become MIUIUIUIU

Rule 3: If the substring "III" appears in your string, you can replace "III" with "U," but you may not do the opposite:

Ex: MIII can become MU
Ex: MU cannot become MIII

Rule 4: If "UU" occurs within your string (at any point), you can remove it from the string:

Ex: MIUUI can become MII

Using these 4 "rules of production" (or "rules of inference") can you take your initial string "MI" and change it to "MU"? Also, if it's not possible, is that provable? And, if it is possible, what's the fastest way to do it?
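If you'd like something to check your moves against while you work the puzzle by hand, the four rules can be roughly sketched as small shell functions. This is not the script promised for tomorrow, and it doesn't search for a solution. The function names (rule1 through rule4) are made up for this example, and rules 3 and 4 only act on the leftmost match here, whereas the real rules let you pick any occurrence:

```shell
# rule1..rule4 are made-up helper names. Each one prints the new string,
# or prints nothing and returns 1 when the rule does not apply.

rule1() {   # Rule 1: a string ending in I may have a U added
    case "$1" in
        *I) printf '%s\n' "${1}U" ;;
        *)  return 1 ;;
    esac
}

rule2() {   # Rule 2: Mx becomes Mxx (x is everything after the M)
    case "$1" in
        M*) printf '%s\n' "M${1#M}${1#M}" ;;
        *)  return 1 ;;
    esac
}

rule3() {   # Rule 3: III becomes U (leftmost occurrence only, in this sketch)
    case "$1" in
        *III*) printf '%s\n' "$1" | sed 's/III/U/' ;;
        *)     return 1 ;;
    esac
}

rule4() {   # Rule 4: UU is removed (leftmost occurrence only, in this sketch)
    case "$1" in
        *UU*) printf '%s\n' "$1" | sed 's/UU//' ;;
        *)    return 1 ;;
    esac
}

rule2 MI       # prints MII
rule1 MII      # prints MIIU
```

Each function takes the current string as its only argument, so you can chain moves by feeding the output of one call into the next.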

Have fun trying to figure it out. If you already own the aforementioned book, you may know the solution already (it's hidden somewhere in the approximately 700+ pages). Sometimes, though, your individual path to the solution will teach you a lot more than you'd learn by being told the answer :)

Enjoy,

, Mike

## Friday, May 30, 2008

### Troubleshooting Veritas Cluster Server LLT Issues On Linux and Unix

Hey There,

Today's post is going to steer away from the Linux and/or Unix Operating Systems just slightly, and look at a problem a lot of folks run into, but have problems diagnosing, when they first set up a Veritas cluster.

Our only assumptions for this post are that Veritas Cluster Server is installed correctly on a two-node farm, everything is set up to failover and switch correctly in the software and no useful information can be obtained via the standard Veritas status commands (or, in other words, the software thinks everything's fine, yet it's reporting that it's not working correctly ;)

Generally, with issues like this one (the software being unable to diagnose its own condition), the best place to start is at the lowest level. So, we'll add the fact that the physical network cabling and connections have been checked to our list of assumptions.

Our next step would be to take a look at the next layer up on the protocol stack, which would be the LLT (low latency transport protocol) layer (which, coincidentally, shares the same level as the MAC, so you may see it referred to, elsewhere, as MAC/LLT, or just MAC, when LLT is actually meant!) This is the base layer at which Veritas controls how it sends its heartbeat signals.

The layer-2 LLT protocol is most commonly associated with the DLPI (all these initials... man. These stand for the Data Link Provider Interface). Which brings us around to the point of this post ;)

Veritas Cluster Server comes with a utility called "dlpiping" that will specifically test device-to-device (basically NIC-to-NIC or MAC-to-MAC) communication at the LLT layer. Note that if you can't find the dlpiping command, it comes standard as a component in the VRTSllt package and is generally placed in /opt/VRTSllt/ by default. If you want to use it without having to type the entire command, you can just add that directory to your PATH environment variable by typing:

host # PATH=$PATH:/opt/VRTSllt;export PATH

In order to use dlpiping to troubleshoot this issue, you'll need to set up a dlpiping server on at least one node in the cluster. Since we only have two nodes in our imaginary cluster, having it on only one node should be perfect.

To set up the dlpiping server on either node, type the following at the command prompt (unless otherwise noted, all of these Veritas-specific commands are in /opt/VRTSllt and all system information returned, by way of example here, is intentionally bogus):

host # getmac /dev/ce:0 <--- This will give us the MAC address of the NIC we want to set the server up on (ce0, in this instance). For this command, even if your device is actually named ce0, eth0, etc, you need to specify it as "device:instance"
/dev/ce:0 00:00:00:FF:FF:FF

Next, you just have to start it up and configure it slightly, like so (Easy peasy; you're done :)

host # dlpiping -s /dev/ce:0

This command runs in the foreground by default. You can background it if you like, but once you start it running on whichever node you start it on, you're better off leaving that system alone so that anything else you do on it can't possibly affect the outcome of your tests. Since our pretend machine's cluster setup is completely down right now anyway, we'll just let it run in the foreground. You can stop the server, at any time, by simply typing a Ctrl-C:

^C
host #

Now, on every other server in the cluster, you'll need to run the dlpiping client. We only have one other server in our cluster, but you would, theoretically, repeat this process as many times as necessary; once for each client. Note, also, that for the dlpiping server and client setups, you should repeat the setup-and-test process for at least one NIC on every node in the cluster that forms a distinct heartbeat-chain. You can determine which NICs these are by looking in the /etc/llttab file.

host # dlpiping -c /dev/ce:0 00:00:00:FF:FF:FF <--- This is the MAC address from the output of the getmac command we issued on the dlpiping server host.

If everything is okay with that connection, you'll see a response akin to a Solaris ping reply:

00:00:00:FF:FF:FF is alive

If something is wrong, the output is equally simple to decipher:

no response from 00:00:00:FF:FF:FF
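Since the test should be repeated for every heartbeat NIC listed in /etc/llttab, here's a small sketch for pulling those devices out of the file. It assumes the usual layout, in which each heartbeat line starts with the word "link" followed by a tag and then the device; the function name llt_links is made up for this example, so check it against your own file before trusting it:

```shell
# llt_links is a hypothetical helper: print the device field (third column)
# of every "link" line in an llttab-style file (default: /etc/llttab).
llt_links() {
    awk '$1 == "link" { print $3 }' "${1:-/etc/llttab}"
}
```

Running "llt_links" with no argument reads /etc/llttab, and each device it prints is a candidate for the getmac and dlpiping commands shown above.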

Assuming everything is okay, and you still have problems, you should check out the support site for Veritas Cluster Server and see what they recommend you try next (most likely testing the IP layer functionality - ping! ;)

If things don't work out, and you get the error, that's great (assuming you're a glass-half-full kind of person ;) Getting an error at this layer of the stack greatly reduces the possible-root-cause pool and leaves you with only a few options that are worth looking into. And, since we've already verified physical cabling connectivity (no loose or poorly fitted ethernet cabling in any NIC) and traced the cable (so we know NICA-1 is going to NICB-1, as it should), you can be almost certain that the issue is with the quality or type of your ethernet cabling.

For instance, your cable may be physically damaged or improperly pinned-out (assuming you make your own cables and accidentally made a bad one - mass manufacturers make mistakes, too, though). Also, you may be using a standard ethernet cable, where a crossover (or, in some instances, rollover) cable is required. Of course, whenever you run into a seeming dead-end like this, double check your Veritas Cluster main.cf file to make sure that it's not in any way related to a slight error that you may have missed earlier on in the process.

In any event, you are now very close to your solution. You can opt to leave your dlpiping server running for as long as you want. To my knowledge it doesn't cause any latency issues that are noticeable (at least in clusters with a small number of nodes). Once you've done your testing, however, it's also completely useless unless you enjoy running that command a lot ;)

Cheers,

, Mike

## Thursday, May 29, 2008

### Simple Shell One-Liner To Enumerate File Types In Linux and Unix

Hey there,

Lately, we've been focusing a lot on Perl "one liners," from mass file time syncing to name and IP resolution and I thought it was only fair that we should write a post about a shell "one liner" for once. After all, most standard Unix and Linux shells are perfect for that purpose :)

Here's a very quick way to take an inventory of all the different file types (including directories, sockets, named pipes, etc) that exist in a given directory tree and provide a tally of each file type. I'm not entirely sure that I care if I have 15 ASCII text files and 2 Perl scripts in my current working directory, but this little piece of code must be able to help someone somewhere accomplish something, even if it is only to use as a smaller part of a larger organism ;)

This works in pretty much any standard shell on every flavour of Linux and/or Unix I've tested (ash, sh, bash, jsh, ksh, zsh, even csh and tcsh, which is huge for me since I never use those shells. One day, soon, I will redouble my efforts and just learn how to use them well, which, hopefully, won't result in my writing a whole bunch of posts about "cool" stuff that everyone has known about for the past few decades ;).

This one-liner could actually be written as a script, to make it more readable, like this:

    #!/bin/sh

    find . -print | xargs -I var file "var" | awk '{
            $1="";
            x[$0]++;
    } END {
            for (y in x) printf("%d\t%s\n", x[y], y);
    }' | sort -nr

But, since I'm a big fan of brevity (which is about as far away from obvious as possible if you consider my writing style ;), I would run it like this:

host # find . -print|xargs file|awk '{$1="";x[$0]++;}END{for(y in x)printf("%d\t%s\n",x[y],y);}'|sort -nr

In my terminal, that all comes out on one line :) And here's the sort of output you can expect:

    host # find . -print|xargs file|awk '{$1="";x[$0]++;}END{for(y in x)printf("%d\t%s\n",x[y],y);}'|sort -nr
    23       Bourne-Again shell script text
    11       ASCII text
    10       perl script text
    5        pkg Datastream (SVR4)
    4        directory
    4        ASCII English text
    2        UTF-8 Unicode English text, with overstriking
    2        Bourne shell script text
    1        RPM v3 bin i386 m4-1.4.10-1
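To see how the counting half of that pipeline works on its own, here's a minimal sketch that feeds it a few canned lines in the "name: description" format that file prints. The function name tally_types is made up, and this version strips everything up to the first colon instead of blanking the first field, which is a slightly different technique from the one-liner: it copes with spaces in file names, but it will trip over file names that contain a colon:

```shell
# tally_types is a hypothetical name; it reads "name: description" lines
# on standard input and prints one "count<TAB>description" line per type.
tally_types() {
    awk '{
        sub(/^[^:]*:[ \t]*/, "")     # drop the file name and the colon after it
        count[$0]++                  # one counter per distinct description
    } END {
        for (t in count) printf("%d\t%s\n", count[t], t)
    }' | sort -nr
}

printf '%s\n' 'a.txt: ASCII text' 'b.txt: ASCII text' 'bin: directory' | tally_types
```

The array is keyed on the description text itself, which is why no list of file types has to be known in advance.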

and, just to confirm the file count, we'll change the command slightly (to total up everything, instead of being particular -- another "one-liner" that's, admittedly, completely unnecessary unless you're obsessive/compulsive like me ;) and compare that with a straight-up find:

host # find . -print|xargs file|awk 'BEGIN{x=0}{x++}END{print x}'
62
host # find .|wc -l
62

Good news; everything appears to be in order! Hope this helps you out in some way, shape or form :)

Cheers,

, Mike

## Wednesday, May 28, 2008

### Simple Arithmetic In Bash, Perl and Awk - More Porting

Greetings,

Today we're going to continue along on our series of posts dealing with porting code between the bash shell, Perl and awk. In previous posts, we've looked at the basics regarding simple string variables, simple one-dimensional arrays and associative arrays (sometimes referred to as hashes or "lookup tables.").

Before we move on to logical programming constructs (such as if-conditionals and while-loops), it's important to go over a few other basic concepts that require translation in order to work in the same manner in all three languages. Today we're going to demonstrate how each of bash, Perl and awk deal with simple arithmetic. Note that we won't be dealing specifically with floating point math (although it is possible to do) since that's slightly beyond the scope of this article (but it will be -- huge hint -- the subject of our next post in this series... variable scope, that is ;)

For our purposes today, we're going to assume that we need to solve five different arithmetic equations in each of our three languages. And, coincidentally enough, all five equations need to serve a distinctly different arithmetic purpose. We're going to have to take two integers and perform addition, subtraction, multiplication and division on them, and also extract the remainder of any imperfect divisions (defined herein as any division which doesn't have a remainder of 0). Note that, for our "expr" examples, the spaces between the numbers and the operators are "required." The bash double-parentheses form, Perl and awk will all produce results with or without spaces between the operands.

1. Bash: In the bash shell, all of these actions are very simple to perform, and you have more than a few options at your disposal depending on how you prefer to do them. The one way you can perform shell arithmetic that will work in older versions of the bash shell (or most any other shell) is to use the "expr" command. While this is, technically, an entirely separate program, it can come in handy if you get a hold of an older shell that can't perform arithmetic on its own. The syntax for our equations would be the following (We'll be assuming that the side explanations will be the same for Perl and awk to save on space):

host # expr 9 + 2
11
host # expr 9 - 2
7
host # expr 9 \* 2
18
host # expr 9 / 2 <--- Here you can see a limitation of simple integer math in the shell. 9 divided by 2 should be 4.5, but the shell doesn't understand the fraction.
4
host # expr 9 % 2 <--- And here's the reason we want to see the remainder of that division. This shows us that a remainder of 1 was dropped in the imperfect division.
1

The bash shell, however, will allow you to do simple arithmetic in a much more direct way. And this is it (we'll use the shell built-in "echo" to print the output to the screen):

host # echo $((9+2))
11
host # echo $((9-2))
7
host # echo $((9*2))
18
host # echo $((9/2))
4
host # echo $((9%2))
1
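Since the quotient and the remainder usually get used together, here's a minimal sketch that wraps both operations in one shell function. The name divide is made up for this example, and it assumes both arguments are integers and the second one isn't 0:

```shell
# divide is a hypothetical helper name: print the integer quotient and the
# remainder of its two integer arguments.
divide() {
    echo "$(($1 / $2)) remainder $(($1 % $2))"
}

divide 9 2     # prints: 4 remainder 1
```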

2. Perl: Perl, as we've mentioned in previous posts, provides easy access to the shell via the backtick operators and "system" function, but there's almost no need to ever use those, since Perl can do simple arithmetic on its own. In fact, the more heavily you use a "system" resource from within a Perl program, the slower and more cumbersome that program will become. Using Perl's built-in functions and methods is almost always more efficient.

Here's an example of how we'd do the same arithmetic with Perl, using the command line execution statement (-e flag) method (again, we'll use a print function to spit the output to the terminal):

host # perl -e 'print 9 + 2 . "\n";'
11
host # perl -e 'print 9 - 2 . "\n";'
7
host # perl -e 'print 9 * 2 . "\n";'
18
host # perl -e 'print 9 / 2 . "\n";' <--- Note, here, how Perl naturally deals with fractions!
4.5
host # perl -e 'print 9 % 2 . "\n";'
1

3. Awk: Awk makes performing arithmetic operations incredibly easy, as well. It, like Perl, also understands simple fractions and accounts for simple floating point arithmetic right out-of-the-box:

host # echo |awk '{print 9+2}'
11
host # echo |awk '{print 9-2}'
7
host # echo |awk '{print 9*2}'
18
host # echo |awk '{print 9/2}'
4.5
host # echo |awk '{print 9%2}'
1
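As a side note, awk can also run entirely from a BEGIN block, which needs no input at all, so the leading "echo |" isn't strictly required. Here's a small sketch that uses that to hand two shell values to awk when bash's integer math isn't enough. The function name quotient is made up, and it assumes an awk that supports the -v option (any POSIX awk, nawk, gawk or mawk should):

```shell
# quotient is a hypothetical helper name: divide two numbers in awk and
# print the result with two decimal places.
quotient() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf("%.2f\n", a / b) }'
}

quotient 9 2    # prints: 4.50
```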

And, that's all there is to it. As you can see, the difference between performing simple arithmetic in all three languages isn't that great. As we continue to examine porting code between these languages, you'll notice a lot of similarities (and a few minor differences ;) which will make the translation of one language to another very simple for you with a little practice and patience.

Our next porting post will deal with variable scope, which we'll both define and demonstrate.

Until then!

, Mike

## Tuesday, May 27, 2008

### Using Perl On Linux To Do Mass Synchronization Of File Time Stamps

Hey There,

Today we're going to take a look at another quick and simple Perl command line execution statement that you can use to save yourself lots of time (If time really does equal money, this post is going to be a lot more valuable than I originally thought ;)

In a previous post, we looked at using Perl to figure out how old all our files really are. This post is a little twist on that; and it's a lot less complicated. It will run on whatever version of Perl you have (unless it's ancient ;) and produces equal results on all tested Linux and Unix flavours I could get my hands on.

Today we're going to be using Perl's stat and utime functions (similar to the way we did in our previous Perl post), but instead of using them to determine the ages of all of our files, we're going to use them to make all of our files conform to the specific time and date of one representative file. This trick can come in really handy if you wanted to make a system backup consisting only of files that, let's say, are a certain number of days old, and only some of the files that you know you need to backup are slightly older or newer.

The trick is very simple. All you need to do is find one specific file that has a time stamp you want to mimic, and then apply its time stamp attributes to the group of files that you want to modify the time stamp on. So (for a simple example) if you had a directory consisting of 5 files, like this:

host # ls
file1 file2 file3 file4 file5

and you needed them all to share the same time stamp as the file "file1," I'd first suggest that you take some form of online-backup, using tar or cpio (assuming that you may need to reverse the process later):

host # tar cpf /tmp/backout.tar file1 file2 file3 file4 file5

or

host # tar cpf /tmp/backout.tar f* <--- I rarely glob this globally, but, for our example, all the files we want to back up start with "f" and there aren't any others. Worst case, on a tar backup, you can just write over it if you accidentally tar-copy something you didn't want to.

Now, let's take one more look at those files. We'll use "ls -l" to show the time stamp at this time:

    host # ls -l
    total 4
    -rw-r--r--  1 eggi  newpeople  90 May 13 22:27 file1
    -rw-r--r--  1 eggi  newpeople   0 May 25 14:25 file2
    -rw-r--r--  1 eggi  newpeople   0 May 25 14:25 file3
    -rw-r--r--  1 eggi  newpeople   0 May 25 14:25 file4
    -rw-r--r--  1 eggi  newpeople   0 May 25 14:25 file5

Now, we'll make them all have the same time stamp.

We need files 2 through 5 to match the time stamp of file1, as this is how the system is going to determine what to back up later in the day (fill in your own hypothetical situation here ;)

And here's all we have to do to make all the files have the same time stamp as file1 :)

host # perl -e '$x=utime ((stat($ARGV[0]))[8,9], @ARGV);print $x' file1 file[2345]

And we should be all set!

    host # ls -l
    total 4
    -rw-r--r--  1 eggi  newpeople  90 May 13 22:27 file1
    -rw-r--r--  1 eggi  newpeople   0 May 13 22:27 file2
    -rw-r--r--  1 eggi  newpeople   0 May 13 22:27 file3
    -rw-r--r--  1 eggi  newpeople   0 May 13 22:27 file4
    -rw-r--r--  1 eggi  newpeople   0 May 13 22:27 file5

Now all the files have the same time stamp! You'll note, in the command above, that we only modified the atime and mtime of the files. That's all utime can do: whenever it changes a file's time stamps, the system automatically sets that file's ctime (or inode change time) to the current date and time, and there's no argument you can pass to override that.

So, if you ever think you need to copy the ctime, don't bother adding the 10th index of the array returned by stat to your command line, like this:

host # perl -e '$x=utime ((stat($ARGV[0]))[8,9,10], @ARGV);print $x' file1 file[2345]

No error will be generated, but utime treats everything after the first two values as a file name, so the ctime value is just taken as the name of a (presumably nonexistent) file and quietly skipped.

And, of course, you can use this same process on directories as well, and the reference file doesn't have to be the same type as its targets (e.g. copying file time stamps to directories, vice versa and any combination between file types that support time stamps will work :)
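If Perl isn't handy, the same idea can be sketched in plain shell with the -r (reference file) option of touch, which POSIX specifies for copying the access and modification times of one file onto others. The function name sync_stamps is made up for this example, and, as with utime, the ctime of each target ends up set to the current time:

```shell
# sync_stamps is a hypothetical helper: the first argument is the reference
# file, and every remaining argument receives its atime and mtime.
sync_stamps() {
    ref=$1
    shift
    touch -r "$ref" "$@"
}
```

With the example directory above, "sync_stamps file1 file[2345]" would be the equivalent call.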

Here's hoping this helps you out (at least, a little ;) This simple command can be a very convenient way for you to make massive changes by applying the time stamp of any one file to as many other files as you like. And, no matter how many files you need to do this for, as long as you can wait a few moments for ingenuity to strike, the command line shouldn't get too much longer :)

Cheers,

, Mike

## Monday, May 26, 2008

### How To Fake Associative Arrays In Bash

Greetings,

As promised in our previous post on working with associative arrays in Linux and Unix, we're back to tackle the subject of associative arrays in bash. As was noted, we're using bash version 2.05b.0(1) and (to my knowledge) bash ( up to, and including, bash 3.2 ) does not directly support associative arrays yet. You can, of course, create one-dimensional (or simple index) arrays, but hashing key/value pairs is still not quite there.

Today we'll check out how to emulate that same functionality in bash that can be found in Perl and Awk. First we'll initialize our array, even though we don't necessarily have to:

host # typeset -a MySimpleHash

To begin with, we'll have to consider what bash already does for us and how we want that to change. For our first example, let's take a look at what happens if we just make assignments to a bash array with, first, a numeric and then an alpha value:

host # MySimpleHash["bob"]=15
host # echo ${MySimpleHash["bob"]}
15
host # MySimpleHash["joe"]="jeff"
host # echo ${MySimpleHash["joe"]}
jeff

This seems to be working out okay, but if we look at the values again, it seems that MySimpleHash["bob"] gets reassigned after we assign the alpha value "jeff" to the key "joe" :

host # echo ${MySimpleHash["bob"]}
jeff

This behaviour repeats itself whether or not we mix integers with strings. Bash treats the subscript of a regular array as an arithmetic expression, so "bob" and "joe" are both read as the names of unset variables, both evaluate to 0, and both assignments land in element 0. Bash can't handle this natively (but, it never claimed it could :)

host # MySimpleHash["bob"]="john"
host # echo ${MySimpleHash["bob"]}
john
host # MySimpleHash["joe"]="jeff"
host # echo ${MySimpleHash["joe"]}
jeff
host # echo ${MySimpleHash["bob"]}
jeff
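To make it clearer what's going on in that session, here's a minimal sketch (using a throwaway array named demo, which isn't part of the original example) that shows both assignments ending up in the same slot. It assumes bash, and that no shell variables named bob or joe already hold numeric values:

```shell
# demo is a throwaway indexed array. Both string subscripts are evaluated
# arithmetically, so each assignment writes to element 0.
typeset -a demo
demo["bob"]="john"
demo["joe"]="jeff"

echo "elements: ${#demo[@]}"    # prints: elements: 1
echo "element 0: ${demo[0]}"    # prints: element 0: jeff
```

Only one element ever exists, which is why the second assignment appears to overwrite the first.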

This looks as though it's going to necessitate an "eval" nightmare much much worse than faking arrays in the Bourne Shell! Ouch! However, we might be able to get around it with a little bit of "laziness" if we just construct two parallel arrays. This, of course, would necessitate keeping the values in both arrays equal and consistent. That is, if "bob" is the third key in our associative array, and "joe" is "bob"'s value, both need to be at the exact same numeric index in each regular array. Otherwise translation becomes not-impossible, but probably a real headache ;)

To demonstrate we'll create a simple "3 key/value pair" associative array using the double-regular-array method, like so:

host # typeset -a MySimpleKeys
host # typeset -a MySimpleVals
host # MySimpleKeys[0]="abc";MySimpleKeys[1]="ghi";MySimpleKeys[2]="mno"
host # MySimpleVals[0]="def";MySimpleVals[1]="jkl";MySimpleVals[2]="pqr"

Now, we should be able to "fake" associative array behaviour by calling the common index from each array (In this fake associative array we have key/value pairs of abc/def, ghi/jkl and mno/pqr). Now that we have the key/value pairs set up, we need to set up the "associative array" so that we can request the values by the key names, rather than the numeric array index. We'll do this in a script later, so our command line doesn't get any uglier:

    host # key=$1
    host # for (( x=0 ; x < ${#MySimpleKeys[@]}; x++ ))
    > do
    >     if [ "$key" == "${MySimpleKeys[$x]}" ]
    >     then
    >         echo "${MySimpleVals[$x]}"
    >     fi
    > done

Testing this on the command line produces suitable, but pretty lame looking results:

host # ./MySimpleHash ghi
jkl

We're going to need to modify the script so that it takes its keys and just returns a value that doesn't need to be cropped (using "echo -n" will solve this nicely):

echo "${MySimpleVals[$x]}"

changes to

echo -n "${MySimpleVals[$x]}" <--- This is helpful if we want to use the result as a return value :)

Still, the shell is going to balk at whatever we do to try and pass an argument to the script like this:

host # ./MySimpleHash{ghi}
-bash: ./MySimpleHash{ghi}: No such file or directory

So, for now, we'll just make it so it's "almost" nice looking:

host # a=`./MySimpleHash {abc}`
host # echo $a
def

It'll get the job done, and could be used in a pinch, but is way too clunky to replace Awk or Perl's built-in associative array handling. Nevertheless, at least we have a way we can get around it if we "have" to :)
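If calling a separate script for every lookup feels like too much, the same parallel-array lookup can be sketched as a shell function instead. The name hash_get is made up for this example, and it's just one way to package the loop above, not a replacement for real associative arrays. It assumes bash, since it relies on bash arrays and the arithmetic for-loop:

```shell
# The same two parallel arrays: the key at index N pairs with the value at N.
typeset -a MySimpleKeys MySimpleVals
MySimpleKeys=(abc ghi mno)
MySimpleVals=(def jkl pqr)

# hash_get is a hypothetical name: print the value stored under key $1 and
# return 0, or print nothing and return 1 if the key is not in MySimpleKeys.
hash_get() {
    local key=$1 x
    for (( x = 0; x < ${#MySimpleKeys[@]}; x++ )); do
        if [ "$key" = "${MySimpleKeys[$x]}" ]; then
            printf '%s' "${MySimpleVals[$x]}"
            return 0
        fi
    done
    return 1
}

a=$(hash_get ghi)
echo "$a"        # prints: jkl
```

The return status lets a caller tell a missing key apart from a key whose value happens to be empty.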

I've included the script written during this post below, but, if you want to check out a more complex (and, consequently, much more elegant) script to fake associative arrays in bash, you should take a look at this associative array hack for bash and also this bash hack that comes at the issue from a whole different angle.

I think you'll like either one of these scripts a lot better than this one, but hopefully we've both learned, at least a little bit, about the way associative arrays work (and how they differ from one dimensional arrays) in the process :)

Cheers,

    #!/bin/bash

    #
    # TerribleAAbashHack.sh
    # What was I thinking?
    #
    # 2008 - Mike Golvach - eggi@comcast.net
    #
    # Creative Commons Attribution-Noncommercial-Share Alike 3.0 United States License
    #

    # Two parallel arrays: the key at index N pairs with the value at index N
    typeset -a MySimpleKeys
    typeset -a MySimpleVals
    MySimpleKeys[0]="abc";MySimpleKeys[1]="ghi";MySimpleKeys[2]="mno"
    MySimpleVals[0]="def";MySimpleVals[1]="jkl";MySimpleVals[2]="pqr"

    # Strip any curly braces from the argument, so {abc} becomes abc
    key=`echo $@|sed -e 's/{//g' -e 's/}//g'`

    # Walk the keys and print the value stored at the matching index
    for (( x=0 ; x < ${#MySimpleKeys[@]}; x++ ))
    do
            if [ "$key" == "${MySimpleKeys[$x]}" ]
            then
                    echo -n "${MySimpleVals[$x]}"
            fi
    done

, Mike

## Sunday, May 25, 2008

### Safely Patching Your Veritas Root Mirror Disk On Linux Or Unix

Hey there,

It's been a long time since we've taken a look at anything "Veritas" (almost a few months now since we published a few posts regarding disk groups and volume groups in Veritas Volume Manager for Linux and/or Unix). Given the relatively broad nature of this blog, I sometimes wonder how we can ever stay entirely focused on any "one" thing for too long ;)

But, enough about us... For this "Lazy Sunday" post we're going to take a look at patching a root (or boot) mirror disk in VxVM safely. And by safely, we mean that you'll be able to fail-back to your root mirror disk as if nothing ever happened. That is, if something awful actually does happen.

The basic concept is simple, and applies to all brands and methods of root disk mirroring. When you're faced with having to apply patches to your OS (which invariably involves changes to your root disk), you always want to make sure that your root mirror is "golden" before you begin. You also want to make sure that it's taken out of the equation for the initial patch run, so you'll have a perfect failback device (less sweat, no accounting for tears ;)

The first thing you'll want to do, as per above, is to validate your root disk's mirror disk. For Veritas Volume Manager, every volume associated with the root disk must (well, technically, "should") have, at least, a single subdisk for each and every plex on the root disk and the root mirror disk.

For our example today, we'll consider that our root disk is c0t0d0s2 and its mirror is c1t0d0s2. They both belong to the default Veritas disk group: rootdg. Please also note that a lot of this output is "mocked up" to a certain degree since I'm not in a position to actually disassociate volumes on the computers I'm using for the sake of this post :)

You can check the state of your volumes with the "vxprint" command, like so (we'll use the ellipses (...) to indicate output that I've trimmed to keep this post under 50,000 words ;) :

host # vxprint -htqg rootdg <--- This output has been truncated as well, to highlight the mostly one-to-one relationship between subdisks (sd) and plexes (pl). As you can see, each of our two volumes on our rootdisk has at least one subdisk associated with each plex. We're going to ignore root_disk-B0 for this post (or not go into it too much) as this isn't really a "volume" but a way Veritas gets around the fact that it uses the part of the disk that most operating systems reserve (the bootblock - This, again, is enough material for another post entirely)

    Disk group: rootdg

    dg rootdg       default ...

    dm root_disk    c0t0d0s2 ...
    dm root_mirror  c1t0d0s2 ...

    sd root_diskPriv        - ...

    v  root_volume       - ...
    pl root_volume-01   root_volume ...
    sd root_disk-B0 root_volume-01 ...
    sd root_disk-02 root_volume-01 ...
    pl root_volume-02   root_volume ...
    sd root_mirror-01       root_volume-02 ...

    v  swap_volume       - ...
    pl swap_volume-01   swap_volume ...
    sd root_disk-01 swap_volume-01 ...
    pl swap_volume-02   swap_volume ...
    sd root_mirror-02       swap_volume-02 ...
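If you'd rather not eyeball that listing, here's a rough sketch that counts the plexes attached to each volume. It only assumes the layout shown above, where plex lines start with "pl" and carry the plex name and then the volume name. The function name count_plexes is made up, and a count of 2 only tells you that both plexes exist, not that they're healthy:

```shell
# count_plexes is a hypothetical helper: it reads vxprint -ht style lines on
# standard input and prints "volume plex-count" for every volume it sees.
count_plexes() {
    awk '$1 == "pl" { plexes[$3]++ }
         END { for (v in plexes) printf("%s %d\n", v, plexes[v]) }' | sort
}
```

You could feed it live output with "vxprint -htqg rootdg | count_plexes" and look for any volume that reports fewer than 2.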

Now that we know we're good, even though it may have already been done, I find it's always good practice to install a new bootblock on the root mirror disk from the main root disk. The worst case scenario (assuming no typos ;) would be that you updated an existing bootblock with one that should, theoretically, be an exact match for your primary root disk (which is what we want) :

host # /usr/lib/vxvm/bin/vxbootsetup -g rootdg root_mirror

If you have other partitions on your root disk, that aren't listed in your vxprint output of the rootdg above, you can define them with the vxmksdpart command. You might have your /opt partition on the root disk, but not in the rootdg. Sometimes you'll see /home or even /var on the rootdisk but not associated with the rootdg. While it's considered "best practice" by Veritas to add these partitions to the rootdg before separating the disks, I've found that it's never actually been "necessary." The idea is that you associate the partitions, just so you can disassociate them a few minutes later (???)

Next, we'll disassociate (see what I mean ;) the root mirror disk plexes from the root disk, like so (you can verify that, for instance, swap_volume-02 is associated with the mirror disk in the vxprint output above):

host # vxplex -g rootdg dis root_volume-02
host # vxplex -g rootdg dis swap_volume-02

Now, we'll simply mount the root filesystem from the disassociated mirror disk on a temporary directory on the root disk and make a few quick file backups and edits, like so (Note that, for most Linux flavours, /etc/system noted below is actually /etc/sysctl.conf and /etc/vfstab is /etc/fstab):

host # mkdir /vxtmp
host # mount /dev/dsk/c1t0d0s0 /vxtmp
host # cp /vxtmp/etc/system /vxtmp/etc/system.old
host # cp /vxtmp/etc/vfstab /vxtmp/etc/vfstab.old
host # cp /vxtmp/etc/vfstab.prevm /vxtmp/etc/vfstab
host # touch /vxtmp/etc/vx/reconfig.d/state.d/install-db

Now, in the /vxtmp/etc/system file, we'll comment out the following two lines (remember that in the /etc/system file the "*" is the comment character. You probably already know that, but I feel responsible ;) -- Edit the following two lines so that they are now commented:

* rootdev ...
* set vxio ...
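If you'd rather not make that edit by hand, the same change can be sketched with sed. This is only a sketch: the comment_out_vx function name is made up for illustration, and it assumes the two entries begin their lines as "rootdev:" and "set vxio:", which is worth confirming against your own /etc/system before trusting it.

```shell
# comment_out_vx: hypothetical helper. Reads an /etc/system style file on
# stdin and writes a copy to stdout with the two Veritas root entries
# commented out ("*" is the comment character in /etc/system).
# Assumption: the entries start in column one as "rootdev:" and "set vxio:".
comment_out_vx() {
    sed -e 's/^rootdev:/* &/' \
        -e 's/^set vxio:/* &/'
}
```

Since we already saved a copy as system.old, one way to use it would be: comment_out_vx < /vxtmp/etc/system.old > /vxtmp/etc/system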

Then we'll unmount the root mirror disk on /vxtmp:

host # umount /vxtmp

and we're ready to patch! Assuming that everything goes swimmingly, all we need to do is reattach the root mirror disk plexes to the root disk, like this:

host # vxplex -g rootdg att root_volume root_volume-02
host # vxplex -g rootdg att swap_volume swap_volume-02

The root disk should sync itself up so that the root mirror disk gets updated (which you can monitor with "vxtask"). And you're all set :)

Now... If things go bad... The official explanation is so long and ridiculous (and differs for versions up to 3.5 and newer versions), that I'll refer you to an actual official document from Veritas online support that will show you a neat trick to get around having to jump through 15 or 16 hoops to get this all over with ;) Another glorious example of the system raging against itself :)

Cheers,

, Mike

## Saturday, May 24, 2008

### Simple Perl Script To Demonstrate DNS Lookups In Linux

Hey There,

This weekend post is somewhat of a look back at a previous post we did on simple IP and hostname resolution with Perl.

The script we're putting out today is only about half as good as some of the standard hostname/IP checkers out there, and it was specifically written only to accept hostnames and not IP's. If there's a request for it, we could write out the other half, but it's probably best to check out our Perl IP/hostname resolution post and do it for yourself. It's pretty much already written, you just have to reverse the logic (or invert it) somewhat ;)

This Perl script should run on any flavour of Linux, Unix and probably even Windows, assuming you have some sort of Cygwin or MKS *nix-on-Windows setup (or want to go through and manually edit the script to switch the backslashes to forward slashes, etc, so you can use ActivePerl for Windows). Note that, of these three options, only MKS isn't freeware, so if you're working on the cheap, stick with Cygwin or ActivePerl. No offense meant to MKS; it's a fine product, but a single developer license starts out at around 400 or 500 dollars.

The script only takes one argument, the hostname to check, like so:

host # ./double.pl hostname

The script is basically meant to reinforce Perl's basic network lookup functions that we went over a post or so ago, and provide a relatively lame means of double-checking a lookup to determine if it's bogus.

The skeleton of the script (logically) is this:

1. Look up the IP of the hostname supplied.
2. Look up the hostname using the IP we got in step 1.
3. Determine whether the hostname we supplied and the hostname we got in step 2 match.
4. If they do, then we quit; assuming everything's okay.
5. If they don't match, we look up the IP of the second hostname (the one we got in step 2).
6. Determine whether the IPs (gotten in steps 1 and 5) match.
7. If they do, then we quit; assuming all is well.
8. If they don't, we still quit; we just complain about it ;)
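The skeleton above can also be sketched in plain shell, which makes the branching easy to see before reading the Perl. Everything named here (lookup_ip, lookup_name, double_check) is hypothetical, and the two lookup helpers assume your system has getent and that it returns the address family you care about. Unlike the Perl script, this sketch returns a non-zero status when the addresses disagree.

```shell
# Hypothetical resolvers. Both lean on getent, which is an assumption about
# the host; swap in whatever lookup tool you actually have.
lookup_ip()   { getent hosts "$1" | awk 'NR == 1 { print $1 }'; }
lookup_name() { getent hosts "$1" | awk 'NR == 1 { print $2 }'; }

# double_check: Name -> IP -> Name, falling back to comparing IPs.
double_check() {
    host1=$1
    ip1=$(lookup_ip "$host1")
    [ -n "$ip1" ] || { echo "error: no IP for $host1"; return 1; }
    host2=$(lookup_name "$ip1")
    [ -n "$host2" ] || { echo "error: no name for $ip1"; return 1; }

    # The supplied name came straight back, so we're done.
    if [ "$host1" = "$host2" ]; then
        echo "ok: $host1 resolves back to itself"
        return 0
    fi

    # Names differ: check that the second name points at the same IP.
    ip2=$(lookup_ip "$host2")
    if [ "$ip1" = "$ip2" ]; then
        echo "ok: $host1 and $host2 share $ip1"
        return 0
    fi
    echo "warning: $host1 ($ip1) and $host2 ($ip2) disagree"
    return 2
}
```

Keeping the lookups in their own small functions means you can replace them with canned answers and exercise every branch without touching real DNS.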

This script should also aid in examining the opposite of a double-reverse-lookup (The double-reverse-lookup being IP to Name to IP mapping, with this being Name to IP to Name mapping). Some sites use Double Reverse Lookups as a security measure, but whether or not the process protects anything at all, or is just a waste of resources, is debatable. Just do a search for it online and you'll be inundated with polemic for as long as you can stand to read.

For our purposes, this is just another way to learn more about how things work with Perl's networking functions and how you can better work with them :)

This is what you can expect to see from this script (approximately):

Checking Reverse...

64.233.167.99 and 64.233.167.99 match
Everything is probably ok!

Here's hoping this helps you out, and best of luck if you decide to write the opposite (which would be the "real" Double Reverse Lookup :)

Cheers,

```perl
#!/usr/bin/perl

#
# double.pl - Double Check That Hostnames Match The IP's They're Advertising
#
# 2008 - Mike Golvach - eggi@comcast.net
#
# Creative Commons Attribution-Noncommercial-Share Alike 3.0 United States License
#

use Socket;

if ( $#ARGV != 0 ) {
        print "Usage: $0 hostname\n";
        exit(1);
}

$entry = $ARGV[0];
$hostname1 = $entry;
$ip1 = gethostbyname($hostname1) || die "error - gethostbyname: $!\n\n";
$hostip1 = inet_ntoa($ip1) || die "error - inet_ntoa: $!\n\n";
print "\nHostname: $hostname1 = IP: $hostip1\n\n";
$hostname2 = gethostbyaddr(inet_aton($hostip1),AF_INET) || die "error - inet_aton: $!\n\n";
print "IP: $hostip1 = Hostname: $hostname2\n\n";
if ( $hostname1 eq $hostname2 ) {
        print "$hostname1 and $hostname2 Match!\n";
        print "Good Deal!\n\n";
        exit(0);
} else {
        print "$hostname1 and $hostname2 Do Not Match\n";
        print "Checking Reverse...\n\n";
        $ip2 = gethostbyname($hostname2) || die "error - gethostbyname: $!\n\n";
        $hostip2 = inet_ntoa($ip2) || die "error - inet_ntoa: $!\n\n";
        if ( $hostip1 eq $hostip2 ) {
                print "$hostip1 and $hostip2 match\n";
                print "Everything is probably ok!\n\n";
                exit(0);
        } else {
                print "$hostip1 and $hostip2 don't match\n";
                print "This DNS may be bogus or setup incorrectly!\n\n";
                exit(0);
        }
}
```

, Mike

## Friday, May 23, 2008

### Using Who To Find What And When On Linux and Unix

Hello again,

Today's post is yet another in a somewhat disjointed series of posts on "stuff you might not know and you might find interesting" regarding very common commands. And they don't get much more common than the "who" command.

Generally, "who" is used like the last command that we looked at in our previous post. It's generally issued at the command line to determine who (yes, it's not just a clever name ;) is logged on "right now," if anyone is at all.

Unlike "last," however, the "who" command has quite a number of options that make it a great troubleshooting, and statistics gathering, command. And, as luck would have it, the four options that we're going to look at today are exactly the same on SUSE Linux 9.x, Solaris 9 Unix and even Solaris 10 :) We'll go through the options from most to least used (in my experience). Not that it matters. We're only looking at four options, so it's going to be hard to get lost ;) All example output will be from SUSE Linux 9.x

1. "who -r" - Prints the current runlevel. This is somewhat similar to the functionality of the last command that we posted about before, but it gives more limited information. This command is excellent for a quick overview of the system's current runlevel, previous state and last state-transition time. For instance, take the following example:

```
host # who -r
         run-level 3  Feb 27 16:06                   last=S
```

This shows us that our system is currently at "run level 3," was in "Single User" mode (S) previous to that, and that the transition from "Single User" to "run level 3" occurred approximately February 27th at 16:06. I say approximately because, if we look at last's output (as we did in our previous post on using last to its full potential), we can see that this was actually a reboot.

The last state will usually appear as "S" on a reboot, since it's the last recorded state the system is at before it switches to "run level 3" (Of course run level 2 is executed on a normal boot to run level 3). All the information about switching from "run level 3" to "run level 6," and from "run level 6" to "run level S", and all the reboot and shutdown commands are not reported. Again, we don't know the year, but, since this command reads from wtmpx, you can check out a few older posts on user deletion with Perl and the relevant mods for Linux if you want to use Perl to grab that information, as well.
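If you want those two values in a script rather than on your screen, here's a minimal sketch that picks them out with awk. The parse_who_r name is made up, and it assumes the Linux-style line shown above (a "run-level" word followed by the level, and a "last=" field); Solaris lays its "who -r" output out differently, so this would need adjusting there.

```shell
# parse_who_r: hypothetical helper. Reads "who -r" output on stdin and prints
# the current and previous run levels.
# Assumption: Linux-style output containing "run-level N" and "last=X".
parse_who_r() {
    awk '{
        for (i = 1; i <= NF; i++) {
            if ($i == "run-level") level = $(i + 1)
            if ($i ~ /^last=/)     prev  = substr($i, 6)
        }
        print "current=" level " previous=" prev
    }'
}
```

You'd call it as "who -r | parse_who_r", which, for the example line above, prints current=3 previous=S.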

2. "who -b" - Prints the system boot time. Didn't I just get through a really long-winded explanation of all the information missing from "who -r"? ;) Well, here's some of that. This invocation of "who" prints out the last time the system was booted. Note that this doesn't differentiate between a reboot and a power-cycle:

```
host # who -b
         system boot  Feb 27 16:06
```

3. "who -d" - Prints out a list of all the dead processes on your system. This invocation of the who command is really only useful if you're looking for a problem process and can't seem to find it. Generally, you'd use either lsof or ptree/pfiles to find the rogue process, but, if you don't have those (or find them too messy), this command can sometimes help. Mostly though, it's just a listing of processes which are no longer running and still in memory. Note that, for our example below, all of these processes aren't even in the process table anymore!

```
host # who -d
                      Feb 27 16:06              2134 id=si    term=0 exit=0
                      Feb 27 16:07              4410 id=l3    term=0 exit=0
         pts/2        Apr 14 10:40             24532 id=ts/2  term=0 exit=0
         pts/1        May  2 20:29             20407 id=ts/1  term=0 exit=0
```

4. "who -t" - Prints out the last time the System Clock was changed. Like I mentioned, I saved the least used, and/or obvious, invocation of who for last. You may never have to run the who command with this argument. Still, it's nice to know it's there. As far as I can tell, this setting is not affected by NTP or any similar software you might have running on your machine (xntpd, etc) to keep the OS clock set correctly. If someone with root (or equivalent) privilege decides to run the "date" command on the server to set an incorrect (or correct) time, this command's output will note it. Unfortunately, it's been a while on the machine I'm using as a test case, and the default output (assuming no change) is nothing. On the bright side, we can be reasonably certain that no one's been goofing with the system clock :)

```
host # who -t
host #
```

Enjoy the rest of your day, and have a great Memorial Day weekend ;)

Cheers,

, Mike

## Thursday, May 22, 2008

### Using Last To Its Full Potential On Linux

Hey There,

This probably comes as no surprise to most Unix or Linux administrators out there (at least this first thing), but I find it's always interesting how rarely the "last" command is used to determine anything other than the users logged in "now" and the "last" time a user logged in.

Granted; the last command doesn't offer too much in the extra-functionality department, but it does have one very useful feature. Normally, if you were to run last, you'd see output like the following:

```
reboot   system boot  2.6.5-7.283-smp  Thu Jan 25 18:06          (00:21)
user1    pts/1        host.xyz.com    Thu Jan 25 08:03 - down   (00:27)
reboot   system boot  2.6.5-7.283-smp  Thu Jan 25 08:01          (00:29)
user1    pts/1        host.xyz.com    Thu Jan 25 07:50 - down   (00:06)
```

But, if you add the "-x" switch to the "last" command, it gives you a lot more detailed information about system run-level changes, which makes it a more accurate way to determine what happened if, and when, your system ever goes down unexpectedly! Here's output from that same swatch of time using "last -x":

```
runlevel (to lvl 3)   2.6.5-7.283-smp  Thu Jan 25 18:06 - 18:28  (00:21)
reboot   system boot  2.6.5-7.283-smp  Thu Jan 25 18:06          (00:21)
shutdown system down  2.6.5-7.283-smp  Thu Jan 25 08:31 - 18:28  (09:56)
runlevel (to lvl 6)   2.6.5-7.283-smp  Thu Jan 25 08:31 - 08:31  (00:00)
user1    pts/1        host.xyz.com    Thu Jan 25 08:03 - down   (00:27)
runlevel (to lvl 3)   2.6.5-7.283-smp  Thu Jan 25 08:01 - 08:31  (00:29)
reboot   system boot  2.6.5-7.283-smp  Thu Jan 25 08:01          (00:29)
shutdown system down  2.6.5-7.283-smp  Thu Jan 25 07:57 - 08:31  (00:33)
runlevel (to lvl 6)   2.6.5-7.283-smp  Thu Jan 25 07:57 - 07:57  (00:00)
user1    pts/1        host.xyz.com    Thu Jan 25 07:50 - down   (00:06)
```

Interestingly enough, the "-x" flag still isn't available in Solaris, even in all the versions of the 10.x strain that I've checked out. There are other methods to get the information, but they are more tedious and require the user, or admin, to do enough work that they may as well script it out (or write a wrapper for "last" that allows for a "-x" flag ;)

Generally, you'll notice that this extra information is assigned to the "user" with the name of your "kernel" revision (usually the value of "uname -r"; 2.6.5-7.283-smp, in our case) so you can run:

`last -x|grep `uname -r``

to restrict your output to this system information and ignore all the user logins/logouts :)
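If the kernel release ever changes between boots, matching on it will miss the older entries. As a hedged alternative, here's a small sketch that keys on the first column instead. The system_events name is made up, and it assumes the Linux "last -x" layout shown above, where those pseudo-users are named runlevel, reboot and shutdown.

```shell
# system_events: hypothetical helper. Reads "last -x" output on stdin and
# keeps only the run-level, reboot and shutdown records.
# Assumption: the record type is the first whitespace-separated field.
system_events() {
    awk '$1 == "runlevel" || $1 == "reboot" || $1 == "shutdown"'
}
```

You'd use it as "last -x | system_events". A real account that happened to be named "reboot" would slip through, but that's hopefully not a problem you have ;)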

While the information that "last -x" provides may seem extraneous and not generally worthwhile, I'd say that it's exactly the opposite. For instance, in our first, straight-up, last command, we only get the reboot time of (we'll take the last one) January 25th at 8:01 a.m. ( The year is 2008 since we're taking this from the top of the output).

Interestingly enough, again, last does not print the year, although you can get that information if you really want it. For more info on that, check out our previous posts on scripting out user deletion on Unix and the modifications for Linux, which both include Perl routines for tearing open wtmpx so you "can" get the "year" data if you want it :)

With "last -x," for that very same reboot, we know that the reboot command was issued by the system on January 25th at 8:01 a.m. (this helps put into perspective what last, without arguments, is "really" reporting. The "beginning" of the reboot process). We can then see that (and, just as a reminder, we're reading from the bottom of the output up!) the request to switch to "run level 6" (which is "reboot") was actually issued at 7:57 a.m.

The "shutdown" information on the next line is an all-encompassing time. It should always match the entire amount of time spent in all of the states we're looking at. It starts with the switch to "run level 6" at 7:57 a.m. and ends with the switch to "run level 3" (this system's default run level) at 8:31 a.m. Finally, after the "reboot" line, we see the switch to "run level 3" which happens from 8:01 a.m. (the time the "reboot" was called) until 8:31 a.m. (the time the system fully got back to "run level 3").

As you can see, just knowing the "reboot" time doesn't give a very accurate report of the time involved in the reboot, at a glance. We just know that it happened at 8:01 a.m. If we wanted more information, we might need to go look at system logs.

"last -x," however, makes it so that we can, just by reviewing that output, see that the reboot process actually began at 7:57 a.m. and didn't complete until 8:31 a.m. That may not be a long time for this machine (If it is, you'll know to look at the system logs, now :), but the length of time required for a normal reboot is very system-dependent and also depends on what sorts of scripts and programs are run on a controlled reboot, etc.

And that's the last I have to say about that ;)

Best wishes,

, Mike

## Wednesday, May 21, 2008

### Simple Name And IP Resolution Using Perl On Linux Or Unix

Greetings,

Today we're going to take a look at a simple way to demonstrate something that happens pretty much every time you use the web, check your email or even turn on your PC so you can open up whatever software package you're using to read this page right now.

The "something" that always seems to be happening is name/IP resolution. It happens on Linux, Unix, Windows and just about any operating system, device or language that does any sort of networking, like Perl, Bash, and C.

Previously, we've looked at it in the context of much more complicated structures in posts on using Perl to run a shell on a socket and non-maliciously scanning for open network ports. In a simplistic sense, someone once described the whole situation like this: "Computers read numbers. Humans read English." Of course, English was his native tongue. Humans generally read whatever languages they have the ability, or need, to.

But, the statement is fairly valuable in its compact way of expressing a much larger and more complicated issue. I've never been one to either defend, or rage against, the nature of this whole process. It's the way it is, because that's the way it is. And, with the assumption that it is that way (because it is), I usually just ask folks if they'd rather remember 130+ dotted quads or 130+ relevant names? Humans generally prefer names. The computers, playing it smart, are not coming out with any stated position on either side of the debate (although they still insist on translating every name to a number ;)

Knowing that we, being the humans and not the computers, "have" to work with both numbers and names makes understanding the translation process all the more valuable. Probably most readers are familiar with the standard "nslookup" or "dig" commands, which provide a nice frontend to the process. For instance, with nslookup (much the same as dig), you can find out a host's IP address if you know the name, like so:

host # nslookup www.lycos.com
Server: dns.xyz.com
<--- This is the IP address of the server that's doing the name/IP resolution

Name: lycoshome.bos.lycos.com
<--- Note that this is the name that will show up when we resolve the IP
Aliases: www.lycos.com <--- This is the name we looked up, but it's noted here that this name is an alias for the name listed two lines above (the "actual" hostname, ...probably)

Conversely, we could look up that IP address using the exact same method, with (what should be) equal, but inverse, results:

host # nslookup 209.202.230.30
Server: dns.xyz.com

Name: lycoshome.bos.lycos.com
<--- and, yes, the name here matches the one we originally searched for :)

While this type of hostname/IP translation is valuable and highly useful, it's not necessarily the best tool possible to use when doing network programming or scripting. For instance, if we were writing a Perl script that took the input of either an IP address or a hostname as its only argument, using nslookup or dig would cause our system to have to do more work. For today, we're going to do all of our work using "perl -e" to run mini-scripts (or execution statements) from the command line. Any of this stuff could easily be inserted into a Perl "script":

host # perl -e '$name = `nslookup www.lycos.com|grep Address|sed 1d`;chomp($name);print "$name\n";'

That one line of code populated the $name variable with the value "Address: 209.202.230.30", a value which we'd have to massage even more to remove the unsightly characters "Address: " that appear on the same line as the actual address, which is all we want.

If you're using Perl to do your network scripting, doing hostname/IP resolution in this manner is counter-productive for two big reasons (there are probably more little ones):

1. The backtick operators cause Perl to invoke a system shell in order to execute the "nslookup" command, and all the other commands within the backticks. This is obviously detrimental (to a varying degree) because we logged onto our system in a shell, from which we invoked Perl, to be executed in a subshell of that initial shell. This code is asking to open yet another subshell, within the subshell from which we're running, in order to perform a function that Perl is equipped to handle on its own. If you find that you're using system() or backtick calls a lot in your Perl code, you should, perhaps, consider rewriting it as shell script. If you just do this once, you won't notice any difference in the time it takes your script to run, but the execution time gets longer with every subsequent call.

2. The variable hacking and chopping that goes along with the pipe-chain which contains, at least, one grep and one sed command, is more or less unreliable. If the output of the nslookup command isn't exactly as we expect, our assumptions about how we should parse the data will result in an outcome we didn't expect, also.

So, basically, I'm saying that these sorts of methods are slower (they cause the system to do more work) and aren't guaranteed to produce meaningful results.

Fortunately, as we noted, Perl can take care of hostname-to-IP resolution, and the opposite IP-to-hostname resolution, easily, using built-in functions and methods.

Now let's take a look at those two translations. We'll, again, be executing both Perl commands directly from the shell command line for ease of execution:

First, let's look up lycos.com again:

host # perl -e 'use Socket; $ip = gethostbyname($ARGV[0]); $name = inet_ntoa($ip);print "$name\n";' www.lycos.com
209.202.230.30

That was simple :) Now let's lookup that IP address and see if we can't get the hostname:

host # perl -e 'use Socket; $name = gethostbyaddr(inet_aton($ARGV[0]),AF_INET); print "$name\n";' 209.202.230.30
lycoshome.bos.lycos.com

And again, we have a fairly simple response. The only thing we don't see, in the above examples, is that www.lycos.com is an alias for the hostname lycoshome.bos.lycos.com. This can be found out using Perl, as well, but isn't worth our time (assuming it's at a premium) since we can, more easily, do a double-verification of that hostname to make sure that www.lycos.com and lycoshome.bos.lycos.com both resolve to the same IP address and we aren't being "tricked" by a false DNS return:

host # perl -e 'use Socket; $ip = gethostbyname($ARGV[0]); $name = inet_ntoa($ip);print "$name\n";' lycoshome.bos.lycos.com
209.202.230.30

Great news, they both resolve to the same IP (209.202.230.30) so we can be reasonably sure the data is accurate. There are way too many levels to drill down in order to make "absolutely" sure.

Hopefully, you'll be able to integrate Perl's built-in functions for translating hostnames to IP addresses, and vice versa, into your own scripts (or use them on the command line) and save yourself some time and trouble :)

Cheers,

, Mike

## Tuesday, May 20, 2008

### Tainted Perl On Linux or Unix - Helping You Protect You From Yourself

Hey there,

Generally, when you're writing a Perl script to help you automate any Unix or Linux tasks, you don't really need to worry about security. Aside from the fact that you could delete everything on your system or write an infinitely recursing loop that will chew up all the CPU... on second thought, thinking about security is probably a good idea most of the time ;) Super-heavy security checking isn't really necessary for small things, but is always a good idea to work toward when writing scripts to execute important system function and/or for the use of others.

This is where Perl's "Taint" comes into play. It's kind of like "-w"'s less-tolerant cousin. While running Perl with "-w" will print out all sorts of warnings if your code is suspect, Taint will shut you down. Some builds of Perl may have a "-t" option that acts more like the "-w" flag (stronger checking, but only prints out the warnings).

Perl's Taint mode can be added to any script by simply changing the shebang line from:

#!/usr/bin/perl

to

#!/usr/bin/perl -T

Not too much extra work to get it set up, although now you'll have more things to consider when you write your script ;) Taint mode will cause your script to fail, now, if it feels the script is not secure enough! Interestingly enough, Perl will generally turn Taint mode on automatically if you change any Perl script's permissions to setuid or setgid (which is when it's probably needed the most :)

One of the most important things Perl's Taint does is "not" allow external data (input) to be used in any routine, or action, that will affect other data (or whatever else) external to your script, unless you sanitize that input first.

For instance, the following would not be allowed by Perl Taint (Note that most actions that cause Taint errors are system calls, backtick operations or exec calls):

$variable = $ARGV[0]; <--- We've assigned the first argument on our script's command line to the $variable variable.
system("$variable"); <--- and now we're executing that argument as a command from within the Perl script!

Obviously, in this example (assuming the name of our script is PerlScript and it runs as a user of sufficient privilege), we could do something like the following and cause a big problem:

host # ./PerlScript "rm -rf *"

Ouch! This sort of thing is actually seen a lot in CGI programming, with the assumption being that the "nobody" user that most folks run their Web Server as, can't do all that much damage. Consider, however, that (in the "nobody" example) the "nobody" user probably does have permission to delete all of your html and cgi-bin files. That would be headache enough, even though you'd still get to keep your operating system ;)

Taint also acts to protect you comprehensively. In our limited example above, adding the -T flag would have protected us against that contaminated $variable variable. However, Taint will also work on arrays and hashes and, even better, will treat them as collections of scalar variables (which they, essentially, are). So you can, reasonably, end up in a situation where only some values in an array or hash are Tainted, while the rest of them are considered perfectly safe (or unTainted).

If you have the -T flag enabled, whenever Perl runs into a situation where the data (input or output) is considered Tainted, it will terminate the execution of your script with a brief explanation (like a description of what variables are insecure and/or what insecure dependencies they have on other variables) describing why you might be in trouble if you run your script as-is.

A great thing for your script (and your security) is that it takes relatively little effort to sanitize a Tainted variable.

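For instance, in our example above, $variable was considered Tainted as soon as it was assigned the value of $ARGV[0] (You wouldn't get the actual error until you tried to use that Tainted variable data). If you wanted to clean that variable before running the system call on the next line, you'd need to run it through a regular expression match and use the text that match captures. Simply editing the value in place (stripping out semicolons with s///, for instance) leaves it Tainted.

So, while this chunk of code would be considered unsecure (or Tainted):

$variable = $ARGV[0];
system("$variable");

This chunk of sanitized code (with a sanitized $variable and a cleaned-up environment, since Taint mode also insists on a trusted PATH before it will run an external command) would be considered secure (or unTainted):

$ENV{PATH} = "/bin:/usr/bin";
delete @ENV{qw(IFS CDPATH ENV BASH_ENV)};
$variable = $ARGV[0];
if ( $variable =~ /^([\w.-]+)$/ ) { $variable = $1; } else { die "Unsafe input\n"; }
system("$variable");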

Obviously, there's a lot more to go into when it comes to Perl's Taint, but, hopefully, this has served as an easy-to-understand introduction. Now you can feel free to read the 50 pages of detailed specifications ;)

At the very least, you can use the Taint flag to check your Perl scripts while you write them, and then remove it when you want to put your work out there for everyone to use.

Best wishes,

, Mike

## Monday, May 19, 2008

### Masking Your HTTP Make And Version In Apache For Linux Or Unix

Hey there,

In an older post, we went into some small detail about ignoring HTTP headers when using bash to access the network. Today we're going to look at that from another angle and consider how you can, at least somewhat, protect yourself from a clearly defined outside attack by mocking up your HTTP headers using Apache.

For this simple test, we used the latest version from each of the three separate build trees of the Apache HTTPD Server: versions 1.3.41, 2.0.63 and 2.2.8. All three versions were compiled on RedHat Linux, SUSE Linux and Solaris Unix.

While all of these versions provide for some measure of protection within the httpd.conf file (such as turning the server signature on or off, allowing you to fake your hostname, letting you fool with server tokens, etc), if you want to truly utilize security-through-obscurity with Apache, your starting point should be at the source code level. Of course, if you use a pre-packaged binary, you might not be able to do any of this, short of recompiling a vendor package, which might cause more harm than good...

As a quick note to source-code builders who run Apache with mod_ssl, please keep in mind that mod_ssl generally checks the Apache include files to determine whether or not the version of Apache it's being compiled against is the correct one. If you "do" use this sort of setup, be sure to compile mod_ssl first, make these changes and then compile Apache. Or you could compile Apache this way, change the include back to the way it was, compile mod_ssl and go from there. Whatever suits your particular style is OK :)

The main (and by main, I mean "only" ;) trick here is to change what the Apache HTTPD server thinks it is (The version that it spits out of the httpd binary). All of the configuration file mangling you do won't stop Apache from reporting its true version given a particular set of circumstances.

Luckily, this modification is very easily achieved, and can be set in one file (before running configure and make). Depending on how crazy you want to get, you can mislead attackers so that they attempt to use outdated exploits against your site or (if you get too creative) let them know that they can't be sure what version of Apache you're running... if you're running Apache at all. The biggest trick in all this, after a while, might just be remembering what version you really "are" running ;)

For Apache 1.3.x, you can change these fields in the source_dir (wherever you untar the source files) under the src/include directory in the httpd.h file:

#define SERVER_BASEVENDOR "Apache Group"
#define SERVER_BASEPRODUCT "Apache"
#define SERVER_BASEREVISION "1.3.41"

If you want, you could make this:

#define SERVER_BASEVENDOR "HTTPD Consortium Group"
#define SERVER_BASEPRODUCT "SpyGlass"
#define SERVER_BASEREVISION "3.2"

And, instead of a user seeing this when they hit the right error page (or nail your server directly):

Server: Apache/1.3.41 (Unix)

They'd see this:

Server: SpyGlass/3.2 (Unix)

And that's just from a telnet to port 80. The other information could be gotten other ways. In any event, whoever gets it will be misinformed :)

For both the 2.0.x and 2.2.x strains of Apache's HTTPD server, the file you'll need to modify is ap_release.h, in the include directory under the source_dir.

For 2.0.x - Modify these lines to suit your taste:

#define AP_SERVER_BASEVENDOR "Apache Software Foundation"
#define AP_SERVER_BASEPRODUCT "Apache"
#define AP_SERVER_MAJORVERSION_NUMBER 2
#define AP_SERVER_MINORVERSION_NUMBER 0
#define AP_SERVER_PATCHLEVEL_NUMBER 63

For 2.2.x, the file name and location are the same, and there is only one additional line added that you can/should manipulate:

#define AP_SERVER_BASEVENDOR "Apache Software Foundation"
#define AP_SERVER_BASEPROJECT "Apache HTTP Server"
#define AP_SERVER_BASEPRODUCT "Apache"
#define AP_SERVER_MAJORVERSION_NUMBER 2
#define AP_SERVER_MINORVERSION_NUMBER 2
#define AP_SERVER_PATCHLEVEL_NUMBER 8
#define AP_SERVER_DEVBUILD_BOOLEAN 0

For the AP_SERVER_DEVBUILD_BOOLEAN, you can change the value to 1 and the string "-dev" will be added. This is essentially the same as the AP_SERVER_ADD_STRING variable in 2.0.x, but slightly less flexible.
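If you rebuild often, you may prefer to script the edit rather than redo it by hand for every release. Here's a minimal sketch using sed. The mask_release name is made up, and it assumes the defines appear one per line in the form shown above; check your copy of ap_release.h before relying on it.

```shell
# mask_release: hypothetical helper. Reads ap_release.h on stdin and writes a
# copy to stdout with the product name and version numbers replaced.
# Arguments: product major minor patch.
# Assumption: the product contains nothing special to sed ("/", "&", "\").
mask_release() {
    sed -e "s/^\(#define AP_SERVER_BASEPRODUCT\)[[:space:]].*/\1 \"$1\"/" \
        -e "s/^\(#define AP_SERVER_MAJORVERSION_NUMBER\)[[:space:]].*/\1 $2/" \
        -e "s/^\(#define AP_SERVER_MINORVERSION_NUMBER\)[[:space:]].*/\1 $3/" \
        -e "s/^\(#define AP_SERVER_PATCHLEVEL_NUMBER\)[[:space:]].*/\1 $4/"
}
```

For example, "mask_release SpyGlass 3 2 0 < ap_release.h > ap_release.h.new" writes the modified header to a new file, so you can look it over (and keep the original around for mod_ssl) before swapping it in and running configure and make.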

Here's to having fun making up server names or (even better) resurrecting old ones that haven't been around since I was a teenager :)

Cheers,

, Mike

## Sunday, May 18, 2008

### Doing Search And Replace In Multiple Files With Unix and Linux Perl - Easy Or Hard?

Hey There,

For this week's "Lazy Sunday" post we're going to take a look at the versatility of Perl on Linux or Unix. While I'm fairly certain there's little argument that it's the best tool for most "extraction" and "reporting" functions when used in complicated situations, it's always been more interesting to me because of the wide range of ways you can complete any sort of task.

Today we're going to take a look at two different ways to do search and replace in multiple files using strictly Perl. The first way will be obnoxiously long and the second way will be almost invisible ;)

For both situations, we'll assume that we have 15 files all in the same directory. We'll also assume that we're logged into our favorite flavour of Linux or Unix OS and, coincidentally, in the same directory as those files. All the files are text files and are humungous. And, finally, all of the files are stories where the main character's name is Waldo, they've never been published and the writer's had a change of heart and decided to name his main character Humphrey. It could happen ;)

1. The hard way (or, if you prefer, the long way):

We'll write a script to read in each file and scour it, line by line. For lines on which the name Waldo appears, we'll replace that with Humphrey. We're taking into account, also, that Waldo may be named more than once on any particular line and that the name Waldo may have accidentally been mistyped with a leading lowercase "w," which needs to be corrected. That script would look something like this:

```perl
#!/usr/bin/perl

#
# replace_waldo.pl - change Waldo to Humphrey in all files.
#
# 2008 - Mike Golvach - eggi@comcast.net
#
# Creative Commons Attribution-Noncommercial-Share Alike 3.0 United States License
#

@all_files = `ls -1d *`;
$search_term = "Waldo";
$replace_term = "Humphrey";

foreach $file (@all_files) {
 $got_one = 0;
 chomp($file);
 # only work on regular files; skip directories and the like
 next unless ( -f $file );
 open(FILE, "<$file") or next;
 @file=<FILE>;
 close(FILE);
 foreach $line (@file) {
  # s///gi returns the number of replacements made on this line
  if ( $line =~ s/$search_term/$replace_term/gi ) {
   $got_one = 1;
  }
 }
 # only rewrite files that actually changed
 if ( $got_one ) {
  open(NEWFILE, ">$file.new") or die "Can't write $file.new: $!\n";
  print NEWFILE @file;
  close(NEWFILE);
  rename("$file.new", "$file");
 }
}
```

2. The easy way (or, again, the short way):

Assuming the exact same convoluted situation, here's another way to do it (which we've covered in a bit more detail in this older post on using Perl like Sed or Awk):

From the command line we'll type:

host # perl -p -i -e 's/Waldo/Humphrey/gi' *

And we're done :)

Of course, the longer method is better suited for situations in which there are other extenuating circumstances. Or, perhaps, even more work to do. For the sort of limited situation we've laid out today, I will almost always go with the second method (Who wants pie? :)... Unless I have lots of time on my hands ;)

Cheers,

, Mike

## Saturday, May 17, 2008

### Ignoring All Standard Characters Using Perl In Linux Or Unix

Hey there,

Every once in a while, you have to reverse the order of your thinking. For today's example of code to do cherry-picking of values, we'll demonstrate just that. In most scripts, you're either looking for a specific set of values (or a relatively specific set), or you're trying to ignore them. Unix and/or Linux will attempt to print any character it finds in a script if you ask it to (even if it's, technically, unrepresentable, like a control-character sequence), which can make for some interesting output. Here, we're going to look for everything that we don't want to find (??? ;)

Basically, our script attached to today's post is going to look for every character in a given file that counts as a "special" character. And, by special, I mean goofy :) Since we have no idea what kind of insane characters we might not want to see, we have to begin by defining everything that we know and excluding all of that, so that the only things we match are the things we don't know about.

This starts out simple, of course. We know we want to ignore the alphabet (upper and lower case) and the regular set of numerals. The next step is relatively simple as well: We know we want to ignore all the other "normal" characters. If you recall from our post on generating all possible passwords using Perl, there are 94 regular characters (including the alphabet and numbers, noted already) that we need to ignore, plus simple stuff like spaces, tabs, newlines, carriage returns and bells. There may be more... we'll never know until we don't ignore it ;)
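In case the number 94 seems pulled from a hat: it's the count of printable ASCII characters from code 33 (the exclamation point) through code 126 (the tilde), since 126 - 33 + 1 = 94. Here's a small sketch that lists them, using a made-up function name, printable_ascii, and nothing beyond the shell's own printf.

```shell
#!/bin/sh
# printable_ascii (hypothetical helper name): print the 94 "regular"
# characters, ASCII codes 33 through 126, on one line.
printable_ascii() {
    i=33
    while [ "$i" -le 126 ]; do
        # Turn the code into a three-digit octal escape such as \041
        # and let printf emit the matching character.
        printf "\\$(printf '%03o' "$i")"
        i=$((i + 1))
    done
    printf '\n'
}
```

Piping the output through `tr -d '\n' | wc -c` is a quick way to count them for yourself.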

The trickiest part of script-work like this is the Hell of backslash-escaping that you'll inevitably get caught up in. Hopefully, the script we've attached today will help you out in that regard. If you feed this script any file, it should print out only the lines with "bizarre" characters in them.

For example, given a file with these contents:

0O
BBC
hey there
little ÿ¾åbÿ
756_RATT
LiP_$3rv1c3
ENTER

Running this Perl script will produce the following result (printing only lines with strange characters in them):

host # ./freaky.pl FILE
little ÿ¾åbÿ

Here's to finding out what you don't know :)

Cheers,

```perl
#!/usr/bin/perl

#
# freaky.pl - only print out lines with unknown chars
#
# 2008 - Mike Golvach - eggi@comcast.net
#
# Creative Commons Attribution-Noncommercial-Share Alike 3.0 United States License
#

if ( $#ARGV != 0 ) {
 print "Usage: $0 FileName\n";
 exit(1);
}

$filename = $ARGV[0];
open(FILE, "<$filename") or die "Can't open $filename: $!\n";
while (<FILE>)  {
        # the negated class matches any character NOT in our list of known characters
        if ( $_ =~ /[^A-Za-z0-9\s\t\r\a`\-=\[\]\\;\',\.\/~!\@#\$%^&\*\(\)_+\{\}\|:\"<>\?)]/ )
        {
                print $_
        }
}
close(FILE);
```