Tuesday, March 31, 2009

Measuring Heavy CPU Usage Over Time On Linux And Unix

Hey There,

Today's bash script is going to be somewhat related to our previous script which tracked idle process time on Linux and Unix, insofar as it deals with trying to rid your system of troubling processes as automatically as possible. Of course, there's no substitute for an eyeball-inspection (of the system, I mean. Unless your eyeballs are hurting ;) but, once you've got a few things down and feel reasonably safe, the more you can take off of your daily plate, the better. ...Just don't make your job so incredibly simple that a machine (or a pre-schooler from a third-world country) could do it ;)

This script is, like most of the stuff we put out here, incredibly easy to run (especially since you set all the variables inside it - change as you see fit), like so:

host # ./munchies

And you're off. In the screenshots below, we'll walk through some basic examples of simple usage, assuming the script's built-in parameters. Any process consuming more than 10 percent of the CPU gets added to the blacklist, and any process that shows up in the blacklist 10 times consecutively will get killed (No screwin' around here ;)
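The bookkeeping behind that rule can be sketched in a few lines. This is only a minimal illustration, not the munchies script itself: the function names (count_strikes, forget_pid, note_muncher) are made up for this example, and it assumes a state file with one PID per line, like /tmp/munchiestats.

```shell
#!/bin/bash
# Hypothetical helpers, for illustration only.

# count_strikes FILE PID: print how many times PID is already listed in FILE.
count_strikes() {
    local file="$1" pid="$2"
    [ -f "$file" ] || { echo 0; return 0; }
    # -x matches whole lines only, so PID 499 never matches 4990.
    grep -cx -- "$pid" "$file" || true
}

# forget_pid FILE PID: drop every entry for PID from FILE.
forget_pid() {
    local file="$1" pid="$2" tmp
    [ -f "$file" ] || return 0
    tmp=$(mktemp "${file}.XXXXXX") || return 1
    grep -vx -- "$pid" "$file" > "$tmp" || true
    mv "$tmp" "$file"
}

# note_muncher FILE PID LIMIT: record one more sighting of PID.
# Prints "kill" when this sighting reaches LIMIT, otherwise "strike N".
note_muncher() {
    local file="$1" pid="$2" limit="$3" seen
    seen=$(count_strikes "$file" "$pid")
    if [ $((seen + 1)) -ge "$limit" ]; then
        forget_pid "$file" "$pid"
        echo "kill"
    else
        echo "$pid" >> "$file"
        echo "strike $((seen + 1))"
    fi
}
```

Calling note_muncher /tmp/munchiestats 499 10 on every run would print "strike 1" through "strike 9" and then "kill" on the tenth sighting, at which point the PID's entries are cleared from the file.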

In the first screenshot, we've isolated process id 499 (which happens to be the X server), since it's the only process on the box that meets the "CPU percentage" criteria. Once the script finds that process, it adds it to the default temporary file (the simple way to maintain state ;). We then populate the /tmp/munchiestats file with a whole bunch of other PIDs (some real, some non-existent) and multiple instances of PID 499 (but fewer than 9, so we don't trigger the kill on the next execution) and cat that so you can see the contents:

Click on the picture below. Like water on a sponge ;)

munchies script output 1

In the second screenshot, we run munchies again and see it clear all the PIDs in the temp file that are legitimate, but aren't using over 10% of the CPU anymore. We also free any PIDs in the temp file that don't exist any more (possibly, from a process exiting, but - in this case - because we just made them up ;). The final run executes the kill of PID 499 and removes it from the temp file:

Click on the picture below and brace yourself for the HUGEness ;)

munchies script output 2

Of course, the script has its faults. The most blatant pain in the arse (to our thinking, at this point - with very little QA'ing done ;) is that we've hardcoded the percentage of CPU (10%) and the number of times a PID is allowed to use that much (10 times), and not made them command line options or variables set at the top of the script. If you want to change them in the script, just modify these lines:

For the CPU percentage limit:

if [[ $cpu_percentage_integer -gt 10 ]]

And for the number of consecutive times you'll allow the offending PID to get away with it before you murder (I mean, kill... ;) it:

if [[ $chronic_muncher -gt 8 ]] # This is set to 8 since, if the pre-existing number of additions of a certain PID is over 8, it's (at best) 9, and this go 'round will put it at the limit of 10!
elif [[ $chronic_muncher -lt 10 ]]
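If you'd rather not edit the tests by hand, here is a rough sketch of how the two limits could become command line options. Everything named here (cpu_limit, strike_limit, parse_limits, is_muncher, and the -c and -n flags) is hypothetical and not part of the munchies script as posted.

```shell
#!/bin/bash
# Hypothetical defaults matching the hardcoded values in the post.
cpu_limit=10      # percent of CPU that counts as "munching"
strike_limit=10   # consecutive sightings allowed before the kill

# parse_limits [-c cpu_percent] [-n strikes]: override the defaults.
parse_limits() {
    OPTIND=1
    while getopts "c:n:" opt; do
        case "$opt" in
            c) cpu_limit=$OPTARG ;;
            n) strike_limit=$OPTARG ;;
            *) echo "Usage: munchies [-c cpu_percent] [-n strikes]" >&2
               return 1 ;;
        esac
    done
}

# is_muncher PERCENT: succeed if the whole-number part of PERCENT
# is greater than cpu_limit (so 10.9 does NOT exceed a limit of 10).
is_muncher() {
    local whole="${1%%.*}"
    [ "${whole:-0}" -gt "$cpu_limit" ]
}
```

With something like this in place, the script would call parse_limits "$@" near the top, and the two hardcoded tests would compare against $cpu_limit and $((strike_limit - 2)) instead of 10 and 8.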

Another maybe-flaw is that we don't have it set to run backgrounded, or as a daemon. In other words, you need to run it on your own schedule. We have it running in cron every 5 minutes, so a process can abuse the CPU for about 50 minutes before we kill it. If you run it every minute, you can kill it in 10. Of course, all of this is "variable" and you can change it to suit your needs.
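If cron isn't an option, a bare-bones loop can stand in for it. This is just a sketch under assumptions: run_every is a made-up helper name, and the ./munchies path in the usage note is only an example.

```shell
#!/bin/bash
# run_every SECONDS ROUNDS COMMAND [ARGS...]
# Hypothetical helper: run COMMAND a fixed number of times,
# sleeping SECONDS between runs (but not after the last one).
run_every() {
    local interval="$1" rounds="$2" i=0
    shift 2
    while [ "$i" -lt "$rounds" ]; do
        "$@"
        i=$((i + 1))
        if [ "$i" -lt "$rounds" ]; then
            sleep "$interval"
        fi
    done
}
```

For example, run_every 300 12 ./munchies & would run the script every 5 minutes for roughly an hour in the background. With the default limit of 10 sightings, the kill would still land on the tenth run in which the process is over the CPU limit.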

And, if you consider this a flaw, the script was written in bash on Solaris 10, but should be easily portable to other Unix and Linux distros. Let us know if you'd like to see a version for RedHat or Ubuntu!

Here's hoping this helps you out in some way, shape or form. It's probably translatable to a lot of other work-type performance-tuning situations, as well.

Cheers!


Creative Commons License


This work is licensed under a
Creative Commons Attribution-Noncommercial-Share Alike 3.0 United States License

#!/bin/bash

#
# munchies - eat up processes using over 10 percent of the cpu over 10 iterations...
#
# 2009 - Mike Golvach - eggi@comcast.net
#
# Creative Commons Attribution-Noncommercial-Share Alike 3.0 United States License
#

cpu_munchers_file="/tmp/munchiestats"
sed=`which sed`
awk=`which awk`
ps="/usr/ucb/ps" # Using /usr/ucb/ps on purpose for %CPU stats
grep=`which grep`
mv=`which mv`
wc=`which wc`
sort=`which sort`
xargs=`which xargs`
kill=`which kill`
# kill_signal="-9" # Don't set this if "kill -TERM"/"kill -15" - i.e. "plain vanilla"kill - is acceptable

while read a b c d
do
munch_pid="$b"
cpu_percentage="$c"

if [[ -z "$cpu_percentage" ]]
then
echo "munch_pid $munch_pid is either non-existent or is using less than zero percent of the cpu!"
continue
else
cpu_percentage_integer=$(echo "$cpu_percentage"|$sed 's/^\([^\.]*\)\..*$/\1/')
fi

if [[ $cpu_percentage_integer -gt 10 ]]
then
echo "Got A Bad One Here - munch_pid $munch_pid Is Using $cpu_percentage_integer Percent Of Our Cpu"
if [[ -f $cpu_munchers_file ]]
then
echo "Checking cpu_munchers_file $cpu_munchers_file For munch_pid $munch_pid"
chronic_muncher=$(echo `$grep -w $munch_pid $cpu_munchers_file|$wc -l`)
if [[ $chronic_muncher -gt 8 ]]
then
echo "munch_pid $munch_pid Count Is $chronic_muncher - This Will Put It At 10 Or Higher"
echo "Issuing \"$kill $kill_signal $munch_pid\" And Removing From $cpu_munchers_file now!"
temp_variable=$$
### $kill $kill_signal $munch_pid
$grep -vw $munch_pid $cpu_munchers_file >${cpu_munchers_file}.$temp_variable
$mv ${cpu_munchers_file}.$temp_variable $cpu_munchers_file
elif [[ $chronic_muncher -lt 10 ]]
then
echo "munch_pid $munch_pid, with $cpu_percentage_integer cpu usage, Being Added, Possibly Again, To cpu_munchers_file $cpu_munchers_file"
echo "$munch_pid" >>$cpu_munchers_file
fi
else
echo "No Cpu-Munchers Exist. Creating cpu_munchers_file $cpu_munchers_file And Adding munch_pid $munch_pid"
echo "$munch_pid" >>$cpu_munchers_file
fi
else
if [[ -f $cpu_munchers_file ]]
then
chronic_muncher=$(echo `$grep -w $munch_pid $cpu_munchers_file|$wc -l`)
if [[ $chronic_muncher -gt 0 ]]
then
echo "munch_pid $munch_pid Is Ok And Is In $cpu_munchers_file - Removing"
temp_variable=$$
$grep -vw $munch_pid $cpu_munchers_file >${cpu_munchers_file}.$temp_variable
$mv ${cpu_munchers_file}.$temp_variable $cpu_munchers_file
else
:
fi
else
:
fi
fi
done <<< "`$ps -aux|$awk '{print $1,$2,$3,$NF}'|$sed 1d`"

echo "Checking $cpu_munchers_file For Non-Existent munch_pids"
if [[ -f $cpu_munchers_file ]]
then
muncher_array=( $($sort -u $cpu_munchers_file) )
for possible_lost_pid in "${muncher_array[@]}"
do
is_this_muncher_real=$(echo `$ps -aux|$grep -w $possible_lost_pid|$grep -v grep|$wc -l`)
if [[ $is_this_muncher_real -eq 0 ]]
then
echo "Lost munch_pid $possible_lost_pid Is No Longer Running. Removing From $cpu_munchers_file"
temp_variable=$$
$grep -vw $possible_lost_pid $cpu_munchers_file >${cpu_munchers_file}.$temp_variable
$mv ${cpu_munchers_file}.$temp_variable $cpu_munchers_file
fi
done
echo "All Possible Injustices Have Been Remedied"
fi
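One variation on the final cleanup pass: instead of grepping the full ps listing for each stored PID (where the number could also match some other column), the shell's own kill -0 can be used to test whether a PID still exists. This is a sketch, not what the script above does. The names pid_alive and prune_dead_pids are made up, and it assumes the script runs as a user allowed to signal the processes in question (kill -0 also fails on a live process you don't have permission to signal).

```shell
#!/bin/bash
# pid_alive PID: succeed if PID is a positive number and a process
# with that PID can be signalled. Signal 0 sends nothing; it only checks.
pid_alive() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;
    esac
    [ "$1" -gt 0 ] && kill -0 "$1" 2>/dev/null
}

# prune_dead_pids FILE: rewrite FILE (one PID per line), keeping
# only the entries whose process still exists.
prune_dead_pids() {
    local file="$1" tmp pid
    [ -f "$file" ] || return 0
    tmp=$(mktemp "${file}.XXXXXX") || return 1
    while read -r pid; do
        if pid_alive "$pid"; then
            echo "$pid"
        fi
    done < "$file" > "$tmp"
    mv "$tmp" "$file"
}
```

Calling prune_dead_pids /tmp/munchiestats would do the same job as the "Lost munch_pid" loop, keeping duplicate entries for live PIDs so their strike counts survive.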



, Mike







Please note that this blog accepts comments via email only. See our Mission And Policy Statement for further details.

Monday, March 30, 2009

Idle Process Time On Linux And Unix: How To Find It Again

Hey There,

In our final installment on "finding a process's idle time on Linux or Unix," (last touched upon in our post on echo debugging) we looked at a whole lot of ways one could go wrong trying to find the idle time of a process on Linux or Unix. This post is a little more upbeat ;)

All of the previous issues with "who -T" have been worked out. Basically, this means that I've gone over it every which way and could find no good reason to use it, as opposed to "w." Of course, in our particular case, we are looking, specifically, for a single process's idle time (as opposed to a user process's idle time; reported by "who -T"). And, although it's a little bit of a pain (initially), short of programming in C (accessing the pstatus struct on Solaris, to be exact - the name and location may vary from distro to distro of proprietary, or free, Unix and/or Linux), linking the pty information from ps with the "idle time" information from w, seems to be the best way to get this information. So far, it's the most efficient way I could find using simple bash scripting.

Attached to today's post is the final "blog" version of this script. It comes with a few notes (possibly of caution) and may need to be modified for your system/OS (There's the first one ;)

The script runs very simply, and you only need to supply it with a PID. You can, optionally, supply a username as a second argument:

host # ./rip 17787

If you just run it with no arguments, you'll get a usage screen, which may or may not help ;)

host # ./rip
Usage: ./rip PID [user]
User defaults to the value
of $LOGNAME if not specified


Please see our previous post on echo debugging this script for more detailed sample output.

I hope you find some good use for this script, and, without further ado, the oft-dreaded notations of explanation ;)

1. This script has been rewritten to be self-contained. Please see the bottom line for any substitute command you may want to use. Actually, making this command a variable might be a good idea. Just call me "Lazy" ;)

2. You can remove the explicit PATH definition if you like. I put it in there specifically to make sure that the "which ps" variable assignment didn't accidentally grab /usr/ucb/ps on Solaris.

3. You can comment out the SIGNAL variable as well, since plain old kill sends SIGTERM (signal 15). The only real reason to set this would be if you wanted to always run kill with a different signal (like SIGKILL, or 9, for example).

4. I changed the minimum idle time to 30 minutes from 45 (in the previous revisions)

5. All variables appearing in this work are fictitious. Any resemblance to real variables, living or dead, is purely coincidental ;)
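Since the kill lines in the script ship commented out, another way to keep the "business end" safe is a dry-run switch. The sketch below is only an illustration: dry_run and maybe_kill are made-up names, and signal mirrors the variable from note 3.

```shell
#!/bin/bash
dry_run=1        # hypothetical switch: 1 = only report, 0 = really send the signal
signal="-15"     # leave empty to let kill use its default (SIGTERM)

# maybe_kill PID: report or send the signal, depending on dry_run.
maybe_kill() {
    local pid="$1"
    if [ "$dry_run" -eq 1 ]; then
        # ${signal:+...} expands to nothing when signal is empty.
        echo "would run: kill ${signal:+$signal }$pid"
    else
        kill ${signal:+"$signal"} "$pid"
    fi
}
```

With dry_run=1, maybe_kill 17787 just prints "would run: kill -15 17787", which makes it easy to watch a few runs before trusting the script with real processes.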

Cheers,


Creative Commons License


This work is licensed under a
Creative Commons Attribution-Noncommercial-Share Alike 3.0 United States License

#!/bin/bash

#
# rip - Kill any processes that we know have been idle for more than 30 minutes
#
# 2009 - Mike Golvach - eggi@comcast.net
#
# Creative Commons Attribution-Noncommercial-Share Alike 3.0 United States License
#

PATH="/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin"
procowner="rftprocowner"
prog="\./rft"
sed=`which sed`
awk=`which awk`
ps=`which ps`
grep=`which grep`
kill=`which kill`
signal="-15"

while read a b c d
do
pid=$b
pid_not_var=$(echo "$pid" | $grep '[^0-9]')

if [[ ! -z $pid_not_var ]]
then
echo "pid $pid contains non-numeric characters!"
continue
fi

pid="$b"
pid_pty="$c"

if [[ -z "$pid_pty" ]]
then
echo "pid $pid is either non-existent, not owned by \"$procowner\" or not attached to a p/tty!"
continue
elif [[ "$pid_pty" = "?" || "$pid_pty" = "console" ]]
then
echo "pid $pid is not attached to a pty!" # kill OR LEAVE IT?
continue
else
pty_num=$(echo "$pid_pty"|$sed 's/^[^\/]*\///')
fi

proc_time=$(w -sh $procowner|grep $pty_num|grep -v grep|$awk '{if ( $2 == '"$pty_num"' && NF == 4 ) print $3;else if ( $2 == '"$pty_num"' && NF == 3) print "0"}')

proc_is_num=$(echo $proc_time | $grep '[A-Za-z]')
if [[ ! -z $proc_is_num ]]
then
unset proc_time
fi

ext_proc_time=$(echo $proc_is_num | $grep '[A-Za-z]')

if [[ ! -z "$ext_proc_time" && -z "$proc_time" ]]
then
echo "killing $pid - $d Up Over 24 Hours: $ext_proc_time $proc_time"
### $kill $signal $pid
elif [[ "$proc_time" = "0" ]]
then
:
else
proc_idle_time=$(echo $proc_time|$grep -v "[:]")
if [[ -z $proc_idle_time || $proc_idle_time -gt 30 ]]
then
echo "killing $pid - $d Up More Than 30 Minutes: $ext_proc_time $proc_time $proc_idle_time"
### $kill $signal $pid
fi
fi
done <<< "`$ps -fu $procowner -o user,pid,tty,comm|$grep "$prog"|$grep -v grep`"
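A side note on the "is it really a PID" test: the same check can be done without spawning grep at all, using a shell pattern. This is a sketch with a made-up function name (is_pid), not part of the script above.

```shell
#!/bin/bash
# is_pid VALUE: succeed only if VALUE is a non-empty string of digits.
is_pid() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit
        *)           return 0 ;;
    esac
}
```

Testing for "anything that is not a digit" is stricter than testing for letters, since it also rejects punctuation such as an underscore or a slash.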


, Mike





Sunday, March 29, 2009

Star Wars Vs. Star Trek: Some Slightly Off-Topic Humor

Happy Sunday to everyone,

Hopefully you're still asleep while you're reading this, I guess, kind of like I am while I'm writing this ;)

This Sunday, I stumbled upon this goofy, but fun, video on YouTube. It's an HQ mish-mosh of Star Trek and Star Wars. Actually, pretty clever.

I, for the record, am a bigger fan of the original Star Trek, since I watched it as a kid. Then, as a 20-something (into a few questionably-legal pastimes ;) I fell in love with it again, mostly because of the bad sets, and blatantly stereotypical characters. You had a womanizer, a grumpy old man, a soulless intellectual and a raging drunk, just to get you started. You can't make up T.V. like that ;)

Enjoy. I'm going back to bed ;)







, Mike





Saturday, March 28, 2009

Some Off-Season Linux And Unix Humor

Hey There,

Today's Linux/Unix humor post is a little early in the arriving. It's only March and I found a decent Christmas joke. If I were in advertising, promo'ing Santa Claus in the first quarter would probably be laudable ;)

Hope you enjoy this little joke, which I found over at BOHOL. Of course, this isn't the only joke they've got up there, but it might be the best. Give 'em a look if you need a good pre-pre-pre-pre-pre-holiday laugh ;)

Cheers,



A penguin Christmas

Similarities Between Santa and Sysadmins

1. Santa is bearded, corpulent, and dresses funny.

2. When you ask Santa for something, the odds of receiving what you wanted are infinitesimal.

3. Santa seldom answers your mail.

4. When you ask Santa where he gets all the stuff he's got, he says, "Elves make it for me."

5. Santa doesn't care about your deadlines.

6. Your parents ascribed supernatural powers to Santa, but did all the work themselves.

7. Nobody knows who Santa has to answer to for his actions.

8. Santa laughs entirely too much.

9. Santa thinks nothing of breaking into your $HOME.

10. Only a lunatic says bad things about Santa in his presence.




, Mike





Friday, March 27, 2009

Unix and Linux Humor: RTFM Man Page

Hey there,

As I prepare for another fun-filled weekend of taking over-the-counter stimulants (to enhance the effects of the caffeine and nicotine already in my system ;) and working from 8am until 4 in the morning the next day, I thought I'd go out and find some more funny stuff on the net.

Today, I happened upon jaegers.net's Unix and Linux humor page, which is packed with goodies. Go check it out. They've got everything from a few fake man pages you may not have seen yet (I opted not to go with the Knife man page, although I liked the description of the Sysadmin Network Interrupt Protocol (SNIP) ;) to jokes and even pictures that don't require words to understand. And that's just part of the site. I'll be checking more of it out later, since they have multiple humor sections and I like to laugh at anything funny (no matter whose feelings get hurt; even if they're my own ;)

Enjoy this man-page-usage man page. Read The Fine Manual, I believe it stands for. Or some fuckin' thing like that ;)

Cheers,

The man page below can be found, in its original context, online at jaegers.net



rtfm(1) UNIX Programmer's Manual rtfm(1)


NAME


rtfm - read the fucking manual


SYNOPSIS


rtfm


OPTIONS


None, you have to read the manual for an answer.


DESCRIPTION


Used when lazy people ask stupid questions. Normally cried out in vain.


FILES


/dev/null


ENVIRONMENT


Any.


SEE ALSO


man(1)


DIAGNOSTICS


Is a diagnostic. Since you are reading this you are getting the idea.


BUGS


Ha!





, Mike





Thursday, March 26, 2009

Simple, But Effective. Echo Debugging On Linux And Unix

Hey there,

I took some time and did some simple "echo debugging" and found that the warning I issued about yesterday's script to find a process's idle time was completely backward. Fortunately, it turns out that my mistaken judgement meant that I had a lot less to worry about, in terms of damage control, from the flaw I perceived in my script (I'm not saying there aren't others, of course ;)

It turns out that the problem was not that the script would sometimes consider an active process, that didn't have an idle column value in the "w -s" output, to be idle and worth terminating. The actual problem was that it would consider processes that had been up for more than a day to be active and not worth terminating. This was a much better situation. At least I wouldn't be killing off active processes!

A sample of just using the DEBUG statements I put in yesterday's script pointed out the error very obviously, like so:

DEBUG::::: PIDTTY 6892 pts/119
DEBUG::::: W user1 119 2days ./program
DEBUG::::: LONGTIME TIME
PID 6892 is OK - Not Idle At All - Remove this message!
------------------------
DEBUG::::: PIDTTY 581 pts/232
DEBUG::::: W user1 232 2days ./program
DEBUG::::: LONGTIME TIME
PID 581 is OK - Not Idle At All - Remove this message!
-----------------------------------


And, yes, I felt like a complete moron when I finally took a second to actually look at the output ;) It's amazing what a few simple echo statements in a script can tell you about what problems it has :)

From that point, I found several other issues and worked on them accordingly:

1. ISSUE WITH IDLE ALPHA DAYS NOTATION = FIX BY CHECKING FOR NON-NUMERIC TYPES

2. ISSUE WITH NO-IDLE MISSING COLUMN = FIX BY SETTING EMPTY VALUE TO NULL PADDED

3. ISSUE WITH MISSING COLUMN ERROR OUTPUT = FIX BY CHECKING COLUMN COUNT IN TIME

4. MUCH BETTER - "NOT IDLE AT ALL" EXCEPTION NEVER CAUGHT - UNNECESSARY NOW - REMOVED

5. REWORKED TIME HANDLING AND SET TO AMBIGUOUS ALPHA MATCH
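The idle-column cases in that list (a bare number, a colon-separated time, a "days" value, or nothing at all) can be folded into one conversion step. The sketch below is not the approach the script takes, and it rests on assumptions you should check against your own "w -s" output: that a bare number means minutes, that H:MM means hours and minutes, and that longer periods show up as something like "2days". The function name idle_to_minutes is made up.

```shell
#!/bin/bash
# idle_to_minutes IDLE: convert an idle string to whole minutes.
# Assumed formats (verify on your system): "", "45", "1:15", "2days".
idle_to_minutes() {
    local idle="$1"
    if [[ -z $idle ]]; then
        echo 0                                   # no idle column: active
    elif [[ $idle =~ ^([0-9]+)days?$ ]]; then
        echo $(( 10#${BASH_REMATCH[1]} * 1440 )) # days to minutes
    elif [[ $idle =~ ^([0-9]+):([0-9]+)$ ]]; then
        # 10# forces base 10 so "08" is not read as bad octal.
        echo $(( 10#${BASH_REMATCH[1]} * 60 + 10#${BASH_REMATCH[2]} ))
    elif [[ $idle =~ ^[0-9]+$ ]]; then
        echo $(( 10#$idle ))
    else
        return 1                                 # unrecognized format
    fi
}
```

With every case reduced to a number of minutes, the kill decision becomes a single numeric comparison against the idle limit, and an unrecognized format can be reported rather than silently treated as "not idle."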


Pardon my hysterical notes ;) Most of my problem stemmed from the fact that I switched from full-fledged "w" to "w -s" and made some mistakes in updating the relevant columns that I needed to assign to variables.

I should note that I also considered using "who -T" to get around the one time-stealer in this script. Although it did bring the script down to under a second (processing approximately 100 records), "who" only reports on the "user process." This is a huge consideration, since the "user process" can be (and usually is) the parent process of the process you want to check the idle time on. I ultimately decided to stick with "w" since using "who" would mean I'd have to check the parent process, cross reference that with the grep output associated with the pty and then end up back at "w" again to get the process's idle time. A lot of extra work for a lot of extra uncertainty. I didn't want to end up in a situation where the "user process" was idle because the user kicked off a script that ran for 6 hours and then terminate the user's main process (which would kill the kids) based on the idle time of the user's session. Sometimes, lack of precision like that can cause you headaches you never imagined you could have ;)

As you can see below, the updates weren't all that impressive, but I did get the execution time down to 30 seconds from 2 minutes. The only way I could get it lower (that I've figured out so far ;) was to compromise the integrity of the script and remove the one awk statement that was holding it back. Notice the last step I took, just to see what would happen, that proved the awk if/else conditional in the script was responsible for a majority of the execution time:

TRIMMED CODE - REMOVED DEBUG AND UNNECESSARY ECHO STATEMENTS - USING BASH TEST AND OPERATORS
OLD SCRIPT EXECUTION TIME FOR 178 PROCS = 1m27.430s
NEW SCRIPT EXECUTION TIME FOR 179 PROCS = 0m56.517s
NEW SCRIPT EXECUTION TIME FOR 110 PROCS = 0m48.940s
SELF-CONTAINED SCRIPT EXECUTION TIME FOR 111 PROCS = 0m51.048s
ADDED TTY TO PS SCRIPT EXECUTION TIME FOR 101 PROCS = 0m29.703s
REMOVING AWK TTY STATEMENT SCRIPT EXECUTION TIME FOR 100 PROCS = 0m29.991s
ADDED ?, console and "continue" SCRIPT EXECUTION TIME FOR 102 PROCS = 0m33.018s
REMOVED W HEADING (REM SED) AND EXPLICIT USER SCRIPT EXECUTION TIME FOR 101 PROCS = 0m33.382s
TEST - HARDCODED UPTIME AND REMOVED AWK STATEMENT - SCRIPT EXECUTION TIME FOR 170 PROCS = 0m8.512s!!!!!!!!!!!
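One untested idea for that bottleneck: capture the "w -s" output once, then look up each pty in the captured text with shell builtins instead of piping through awk for every PID. The sketch below is an assumption-laden illustration, not a measured fix. The name lookup_idle is made up, and it shares the script's assumption that a line with an idle value has four fields and a line without one has three (so a "what" column containing spaces would confuse it).

```shell
#!/bin/bash
# lookup_idle CACHED_W_OUTPUT TTY
# Print the idle field for TTY, or 0 if the line has no idle column.
# Fails (returns 1) if TTY is not found in the cached output.
lookup_idle() {
    local cached="$1" tty="$2" user line_tty third rest
    while read -r user line_tty third rest; do
        if [ "$line_tty" = "$tty" ]; then
            if [ -n "$rest" ]; then
                echo "$third"    # user tty idle what
            else
                echo 0           # user tty what (idle column missing)
            fi
            return 0
        fi
    done <<EOF
$cached
EOF
    return 1
}
```

In the script this might look like cached=$(w -s | sed 1d) before the loop and lookup_idle "$cached" "$TTYNUMBER" inside it.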


I'm going to work on it some more, because I believe it can be improved tremendously, but - to satisfy any curiosity, here's some of the mid-work that fixed that issue and made the bash script report correctly. I'll post the one with the fixes noted above (and more, I'm sure ;) once I've thoroughly tested them and removed a lot of the redundancy in this script. Redundancy really gets under my skin. I mean it; redundancy really irritates me. Plus, I don't much care for redundancy ;)

Cheers,


Creative Commons License


This work is licensed under a
Creative Commons Attribution-Noncommercial-Share Alike 3.0 United States License

#!/bin/bash

#
# rip - Kill any processes that we know have been idle for more than 45 minutes - v2-alpha
#
# 2009 - Mike Golvach - eggi@comcast.net
#
# Creative Commons Attribution-Noncommercial-Share Alike 3.0 United States License
#

if [[ $# -lt 1 ]]
then
echo "Usage: $0 PID [user]"
echo "User defaults to the value"
echo "of \$LOGNAME if not specified"
exit 1
fi

PID=$1
ISITAPID=$(echo "$PID" | grep '[^0-9]')

if [[ ! -z $ISITAPID ]]
then
echo "PID $1 contains non-numeric characters!"
echo "-----------------------------------"
exit 2
fi

PID="$1"
USER=${2:-$LOGNAME}

PIDTTY=$(/usr/bin/ps -fu $USER -o pid,tty |/usr/bin/grep -w $PID|/usr/bin/grep -v grep)

echo DEBUG::::: PIDTTY $PIDTTY

if [[ -z "$PIDTTY" ]]
then
echo "PID $PID is either non-existent, not owned by \"$USER\" or not attached to a p/tty!"
echo "-----------------------------------"
exit 3
else
TTYNUMBER=$(echo "$PIDTTY"|/usr/bin/sed '/TT/d'|/usr/bin/awk -F"/" '{print $2}')
fi

if [[ -z "$TTYNUMBER" ]]
then
echo "PID $PID is not attached to a p/tty!"
echo "KILL OR NOT-----------------------------------"
exit 4
fi

echo DEBUG::::: W $(w -s|/usr/bin/sed 1d|/usr/bin/awk '{if ( $2 == '"$TTYNUMBER"' ) print $0}')

TIME=$(w -s|/usr/bin/sed 1d|/usr/bin/awk '{if ( $2 == '"$TTYNUMBER"' && NF == 4 ) print $3;else if ( $2 == '"$TTYNUMBER"' && NF == 3) print "0"}')
#TIME=$(w -s|/usr/bin/sed 1d|/usr/bin/awk '{if ( $2 == '"$TTYNUMBER"' ) print $3}')
#WCOLUMNS=$(w -s|/usr/bin/sed 1d|/usr/bin/awk '{if ( NF == 4 ) print "4";else print "3"}')

ISITANUMBER=$(echo $TIME | grep '[A-Za-z]')
if [[ ! -z $ISITANUMBER ]]
then
unset TIME
fi

LONGTIME=$(echo $ISITANUMBER | grep '[A-Za-z]')

echo DEBUG::::: LONGTIME $LONGTIME TIME $TIME

if [[ ! -z "$LONGTIME" && -z "$TIME" ]]
then
echo "PID $PID is ancient - Idle for $LONGTIME... Killing $PID"
# KILLKILLKILL
elif [[ "$TIME" = "0" ]]
then
echo "PID $PID is OK - Not Idle At All - Remove this message!"
else
TIMEIDLE=$(echo $TIME|grep -v "[:]")
echo DEBUG::::: TIME $TIME
if [[ -z $TIMEIDLE ]]
then
echo "PID $PID has been idle way too long - $LONGTIME $TIME so far... Killing $PID"
# KILLKILLKILL
elif [[ $TIMEIDLE -gt 45 ]]
then
echo "PID $PID has been idle too long - $TIMEIDLE minutes so far... Killing $PID"
# KILLKILLKILL
else
echo "PID $PID is OK - Only idle for $TIME minute(s) - Remove this message!"
fi
fi
echo "-----------------------------------"


, Mike





Wednesday, March 25, 2009

Finding A Process's Idle Time On Linux And Unix

Hey There,

Hopefully yesterday's rant on the simplicity of complexity wasn't too much of a bitter pill. If it was, here's hoping you didn't swallow it ;)

Today, I finally found some time to make a little headway on this project (which should be a lot simpler than it is). Basically, what I'm looking to do is create a way to track specific processes' idle times at any given point in time on any given Linux or Unix system. As I mentioned yesterday, there are C structures in Solaris' /proc/PID/status data files (for one example), but that's just another thing that ended up frustrating me more. As I noted, parts of the OS that are included should be available for use. The structure is used by the OS, in some shape or fashion, to determine idle times (as we'll see below), but no specific "tool" exists to do what I wanted. Of course, this is limited "to my knowledge." If anyone out there knows of a standard program or command that's managed to elude me, please feel free to email me and tell me all about it. I promise to not get offended if you feel the need to belittle me for not having the common sense to look for it where it was at in the first place ;)

Attached to today's post is a rough-draft bash script that attempts to grab a process's idle time. It won't work in all instances, although I've tried to capture as many of those instances as possible. The one big gotcha in this whole mess is that you can't take the output of ps and directly retrieve the idle time for a process from the listing, even if you do your own formatting (I wrote this on Solaris 10 and looked at SUSE Linux 9, but found no love :) Instead, I found that I needed to run ps, extract the pty associated with the process from that (if it existed - which is an exception the script catches) and then use either who or w to retrieve the idle time associated with the pty.

See what I mean? Shouldn't it be a little bit less of a hassle than that?
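The first half of that hop (PID to pty) can be sketched like this. It is a simplified illustration, not the attached script: pid_tty and tty_number are made-up names, and it assumes your ps accepts the POSIX "-o tty=" and "-p" options.

```shell
#!/bin/bash
# pid_tty PID: print the controlling terminal ps reports for PID,
# e.g. "pts/119", or "?" when there is none.
pid_tty() {
    ps -o tty= -p "$1" 2>/dev/null | tr -d ' '
}

# tty_number TTY: print the part after the last slash ("pts/119" -> "119").
# Fails for values with no slash, such as "?" or "console".
tty_number() {
    case "$1" in
        */*) echo "${1##*/}" ;;
        *)   return 1 ;;
    esac
}
```

Something like tty_number "$(pid_tty 17787)" would then hand back the number to match against the tty column of "w -s", or fail if the process has no pty to look up.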

Okay. I'll admit, if it was, I wouldn't be having half the fun I'm having now trying to script it all out for myself ;) So far, what I've put together works fairly well, although I'm not 100% certain that it's bullet-proof so I would recommend that you leave the "business end" of the code commented out (The stuff that performs unforgivable actions, like killing ;). I have a hard time reproducing it, but I can swear that this code will (every once in a good while) determine that a process that hasn't been idle at all (which removes a column from the "w -s" output) has been idle too long. I'm still working on that part and welcome any suggestions regarding the script, how to make it better, why I'm doing everything the wrong (and/or hard) way when I don't need to and any other constructive criticism :)

The script runs very simply, and you only need to supply it with a PID. You can, optionally, supply a username as a second argument:

host # ./rip 17787

If you just run it with no arguments, you'll get a usage screen, which may or may not help ;)

host # ./rip
Usage: ./rip PID [user]
User defaults to the value
of $LOGNAME if not specified


And the following is a sample of the output you might get on a specific run. Here, I've written a command line while loop from a pipe to barbarically hammer out multiple instances at a time ;)

host # time ps -ef|grep "[b]ash"|awk '{print $2}'|while read x;do ./rip $x;done
PID 2664 is not attached to a p/tty!
-----------------------------------
PID 10700 is either non-existent, not owned by "root" or not attached to a p/tty!
-----------------------------------
PID 10855 is OK - Not Idle At All - Remove this message!
-----------------------------------
PID 23217 is OK - Not Idle At All - Remove this message!
-----------------------------------
PID 14730 is either non-existent, not owned by "root" or not attached to a p/tty!
-----------------------------------


Here's another example. This time you'll see what you get if you try to run the script specifying a user other than the user that owns the processes or, in this case, a completely bogus user. This "test" in the script really isn't necessary and I only included it as a feeble attempt at damage control. Feel free to remove it if you like:

host # time ps -ef|grep "[b]ash"|awk '{print $2}'|while read x;do ./rip $x joeUser;done
ps: unknown user joeUser
PID 2664 is either non-existent, not owned by "joeUser" or not attached to a p/tty!
-----------------------------------
ps: unknown user joeUser
PID 23633 is either non-existent, not owned by "joeUser" or not attached to a p/tty!
-----------------------------------
ps: unknown user joeUser
PID 10700 is either non-existent, not owned by "joeUser" or not attached to a p/tty!
-----------------------------------
ps: unknown user joeUser
PID 10855 is either non-existent, not owned by "joeUser" or not attached to a p/tty!
-----------------------------------
ps: unknown user joeUser
PID 14730 is either non-existent, not owned by "joeUser" or not attached to a p/tty!
-----------------------------------


I hope you find some good use for this script!

NOTE: Please keep in mind the caveat noted above regarding the sometimes-false-positive I believe this script returns under certain circumstances when it decides a non-idle process (with nothing displayed in the idle column from "w -s" output) has been idle too long! It may never happen again and I may have been seeing spots. Just want to keep you in a "safe" mindset, just in case I'm not completely insane ;)

Cheers,


Creative Commons License


This work is licensed under a
Creative Commons Attribution-Noncommercial-Share Alike 3.0 United States License

#!/bin/bash

#
# rip - Kill any processes that we know have been idle for more than 45 minutes
#
# 2009 - Mike Golvach - eggi@comcast.net
#
# Creative Commons Attribution-Noncommercial-Share Alike 3.0 United States License
#

if [ $# -lt 1 ]
then
echo "Usage: $0 PID [user]"
echo "User defaults to the value"
echo "of \$LOGNAME if not specified"
exit 1
fi

PID=$1
ISITAPID=$(echo "$PID" | grep '[^0-9]')

if [ ! -z $ISITAPID ]
then
echo "PID $1 contains non-numeric characters!"
echo "-----------------------------------"
exit 2
fi

PID="$1"
USER=${2:-$LOGNAME}

PIDTTY=$(/usr/bin/ps -fu $USER -o pid,tty |/usr/bin/grep -w $PID|/usr/bin/grep -v grep)

#echo DEBUG::::: PIDTTY $PIDTTY

if [ -z "$PIDTTY" ]
then
echo "PID $PID is either non-existent, not owned by \"$USER\" or not attached to a p/tty!"
echo "-----------------------------------"
exit 3
else
TTYNUMBER=$(echo "$PIDTTY"|/usr/bin/sed '/TT/d'|/usr/bin/awk -F"/" '{print $2}')
fi

if [ -z "$TTYNUMBER" ]
then
echo "PID $PID is not attached to a p/tty!"
echo "-----------------------------------"
exit 4
fi

#echo DEBUG::::: W $(w -s|/usr/bin/sed 1d|/usr/bin//awk '{if ( $2 == '"$TTYNUMBER"' ) print $0}')

TIME=$(w -s|/usr/bin/sed 1d|/usr/bin/awk '{if ( $2 == '"$TTYNUMBER"' ) print $3}')

ISITANUMBER=$(echo $TIME | grep [A-z])
if [ ! -z $ISITANUMBER ]
then
unset TIME
fi

LONGTIME=$(echo $TIME | grep [A-z])

#echo DEBUG::::: LONGTIME $LONGTIME TIME $TIME

if [ -z "$LONGTIME" -a -z "$TIME" ]
then
echo "PID $PID is OK - Not Idle At All $TIME - Remove this message!"
elif [ ! -z $LONGTIME ]
then
echo "PID $PID is ancient - Idle for $TIME... Killing $PID"
# DO_WHAT_YOU_HAVE_TO_DO_TO_THE_PID_HERE
else
TIMEIDLE=$(echo $TIME|grep -v "[:]")
# echo DEBUG::::: TIME $TIME
if [ -z $TIMEIDLE ]
then
echo "PID $PID has been idle way too long - $LONGTIME $TIME so far... Killing $PID"
# DO_WHAT_YOU_HAVE_TO_DO_TO_THE_PID_HERE
elif [ $TIMEIDLE -gt 45 ]
then
echo "PID $PID has been idle too long - $TIMEIDLE minutes so far... Killing $PID"
# DO_WHAT_YOU_HAVE_TO_DO_TO_THE_PID_HERE
else
echo "PID $PID is OK - Only idle for $TIME minute(s) - Remove this message!"
fi
fi
echo "-----------------------------------"
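
As an aside, the PID sanity check near the top of the script can be done without grep at all. What follows is only a sketch: the function name is_pid is made up for illustration and isn't part of the script above. It uses a plain shell case pattern to reject anything that isn't a non-empty string of digits:

```shell
#!/bin/bash
# is_pid is a hypothetical helper, not part of the original script.
# Returns 0 if the argument is a non-empty string of digits, 1 otherwise.
is_pid() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit
        *)           return 0 ;;
    esac
}

# Example usage:
if is_pid "499"; then
    echo "499 looks like a PID"
fi
if ! is_pid "49x9"; then
    echo "49x9 does not"
fi
```

Since no external command is involved, there's no pattern for the shell to glob-expand by accident, and punctuation gets rejected along with letters.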


, Mike







Please note that this blog accepts comments via email only. See our Mission And Policy Statement for further details.

Tuesday, March 24, 2009

Finding Simple Solutions To Complex Problems In An Insane World

Hey there,

The title of today's post probably doesn't sound all that impressive. And, to my way of thinking, it really shouldn't. Oddly enough, the only reason I'm writing this post is because I had a bastard of a time figuring out a problem today and drove myself nuts in the process. I'm still working on getting it perfect, so this is (SPOILER ALERT) going to be an opinion piece today. I've been in the biz since 1995 and here I was thinkin' all the new-fangled Operating Systems would have a simple command to do everything and anything with ;)

On my way to scripting myself out of a figurative paper bag, I ran across a few other solutions to my problem involving C structures in the proc filesystem that could get you the very information I was looking for, but I wasn't looking to re-invent the wheel, either. Generally, if something exists on your Unix or Linux system (at the lower levels) it's been implemented in a tool. Sometimes finding that tool can be a challenge. Sometimes it's easy. Sometimes, I'll fret for hours wondering why simple things sometimes seem so complicated. Then I wake up the next morning and realize what a tool I was being ;)

And I've always been hyper-aware of how more-and-more-ridiculous our society has become over the last few decades. I see my fair share of panic attacks and what look like suicides-in-the-making all around me. Back when I was a kid, you'd only see that sort of behaviour in unrepentant alcoholics and good hard-working folk who'd either gotten scammed out of (or somehow, otherwise, managed to lose) their entire life's savings.

It used to be that folks like Michael Milken would have to go out of their way to ruin many many other people's lives. Now it seems like it's happening every day in every way. Was that a millisecond delay in our "Virtual Meeting" with the Hong Kong, Helsinki and Bora Bora offices??? That's all it takes now. Just suck it up and watch a few commercials on TV. All those people are so thankful that they can work no matter where they are in the world. Even on vacation! Am I the only person who finds this sickening? If you have kids, try to get them interested in psychiatry or social work. By the time they grow up, it probably won't even be dangerous anymore. The people who live in middle-to-lower-class neighborhoods will probably be much more together than the corporate elite.

13 years or so ago, I was the proud owner of a DX-50 PC. It had a 512MB hard drive, 4MB of memory and a screaming processor that I could overclock to 66MHz. Vroooom!!!!! Some online services still required you to pay a premium to download at 9600 and watching each individual line of a picture paint itself across my screen in slo-mo was captivating. Now I sometimes want to throw my monitor out the window because it takes my computer 2 minutes to boot up. That doesn't even make sense! For some reason, it seems, the monitor is the recipient of a large percentage of misdirected rage in this computer age. The funny thing is that it's usually the only part of your setup that isn't pissing you off. The keyboard usually deserves it ;)

Hopefully (although it's not too late), you got to check out the vi with a Windows Paperclip picture we put up a few days ago. I've always wanted to trounce that thing, or at least bend it in the wrong direction. No matter what you do, it keeps finding a way to reinsert itself back into your life, along with that frickin' dog. I'm having a coronary while he's wagging his tail and licking a book (???) Hopefully we won't see any of this nonsense from OpenOffice. I use that (and UltraEdit) exclusively now. And, even though I screw up majorly every once in a while, at least I don't have to deal with the sideways looks from those two "icons" that loom just a bit too large in our modern Pantheon.

But, as we come back around to the titular portion of this rant, there's always a relatively simple way to do anything. The trick is having the patience to find it. And, even more confusing, the trick to the trick to finding it is to retrain yourself to take stock of what's really necessary in life. You have to be able to see through all the advertising, hype, jargon and pressure and realize that your own (and I'm including myself in the world now ;) insecurity is really the only product that's being traded in the marketplace today. It's like "Keeping up with the Joneses," except now it's really important and you're competing with Joneses from all over the world. What will happen if you make a mistake, or miss a call, or go on vacation to someplace that doesn't have photon beam Internet capability? The answers: You'll feel like an idiot for a little while, you'll eventually call someone back and you'll finally be able to relax on that faraway vacation isle where every asshole and his brother can't reach out and grope you :)

Time is money. In an abstract sense, this is true. I've never worked anywhere they wouldn't pay me to. I can make nothing by exerting a lot less effort ;) Just remember this: When the global economy gets to the point that we can do business in real-time from any point on this earth to any other point on this earth, what will be the next big thing that we'll have to worry about falling behind on? Think about it. There has to be a point where we can go no further (or begin colonizing other planets in our solar system so we can set up remote offices that need to get in touch with us "RIGHT NOW!!!!").

When we reach that point; when we touch that apex, it's going to either be a huge load off of all of our shoulders or (and, my money's on this, since there's money to be made preying upon people's fears) we'll be on a slippery slope to our "global community"'s nadir.

In the meantime, if life seems too hard, consider that you might just be allowing yourself, or others, to be too hard on you. Take a deep breath, relax and come to realize that (if you can get to that place of peace) you're probably going to be the one guy out of 1000 that doesn't have a stress-related breakdown this year. A long time ago, people traded things other than money for services, at a much slower pace, and the fact that we're here now is proof that they didn't screw up all that badly. You see, there's an upside to not freaking out, too.

Simple :)

Cheers,

, Mike





Monday, March 23, 2009

How To Modify Live VCS Clusters On Linux And Unix Without Completely Alienating Your Co-Workers

Hey there,

A happy Monday to everyone and here's hoping you had a halfway-decent weekend. There's nothing quite like 2 days of misery leading up to a Monday, which is why I like to do something fun every weekend that I can. This weekend I slept a lot. It was a blast. I'm awake now, but I'm getting right back up on that horse as soon as I'm done proofing this post ;)

If you work on Veritas Cluster Server a lot (VCS from here on in, and on out ;), or even if you just work on them a little (or sit next to someone who does), you probably know how incredibly annoying it can be to make fixes to them once you've got them up and online. They are, almost by definition, expected to provide high availability for whatever twisted perverse software you run on them and "high availability" almost always translates into "high accountability." Even though a perfectly constructed cluster (of however many individual nodes) is supposed to "make your life easier," I've found that this is hardly ever the case. People tend to go ape$h1t whenever a cluster node experiences an issue, even when it's something as mundane as a flip-flop between NIC's on a single node that are sharing a local floating-IP that is but one of many virtual IP's that the entire cluster uses to float its own virtual IP on top of. If you work on clusters, you're statistically likely to be either incredibly apathetic about these sorts of issues, or a ring-tone away from a total nervous breakdown ;)

In the event that you do have to work on an active Veritas cluster (whether on Linux or Unix or even Windows), there are a few ways you can keep those alarm bells (that you have no choice but to trigger in order to get your work done) from reaching their intended recipients. The optimal end result, of course, being that you get your job done in a timely and efficient manner and your co-workers (who aren't in your department and don't need to know - HUGE ASSUMPTION ;) don't lose any sleep worrying about the money the company is conceivably losing while nothing really "bad" is happening.

We'll run down this short list in order from "best practice" to crude severing of email and other notification avenues and/or faking "upness." You may need to do the group of "last things" on this list, no matter how great your setup is. If you're working in an environment where outside agents (HP OpenView, etc) report on your cluster's condition, turning off notification the proper way may not do you any good at all, anyway ;)

Prior to making any changes to your main.cf (assuming you're just changing it with the intention of restoring it to its exact same state), I would recommend either saving off your main.cf file using a simple copy:

host # cp main.cf main.cf.`hostname`.`date +%m%d%y%H%M%S`

or creating a command file from your cf file, so that you can just rebuild it again if anything terrible happens, like so:

host # hacf -cftocmd . <-- assuming you're in the main Veritas configuration directory. If you're not, and even if you are, you can pass the hacf command the full path to your configuration directory, if you prefer:
host # hacf -cftocmd /etc/VRTSvcs/conf/config <-- The same thing as the previous command, in most situations.

You can then use the .cmd file to recreate your .cf file using the "-cmdtocf" argument, with the rest of the hacf command generally remaining the same.
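
If you find yourself doing that copy a lot, it can be wrapped up in a few lines of shell. This is only a sketch under some assumptions: backup_cf is a made-up function name, the default directory is the /etc/VRTSvcs/conf/config path mentioned above, and it uses "uname -n" in place of "hostname" and puts the year first in the timestamp so the backups sort in order:

```shell
#!/bin/bash
# backup_cf is a hypothetical helper name used only for this sketch.
# Usage: backup_cf [config_directory]
# Copies main.cf to main.cf.<nodename>.<timestamp> and prints the new name.
backup_cf() {
    conf_dir=${1:-/etc/VRTSvcs/conf/config}
    src="$conf_dir/main.cf"
    if [ ! -f "$src" ]; then
        echo "No main.cf found in $conf_dir" >&2
        return 1
    fi
    # Node name plus a year-first timestamp, so "ls" lists backups in order
    dest="$src.$(uname -n).$(date +%Y%m%d%H%M%S)"
    cp -p "$src" "$dest" && echo "$dest"
}
```

Calling it with no arguments would back up the main.cf in the default directory. Passing a directory lets you try it out on a scratch copy first.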

1. The first thing you would "normally" do (on the primary node in your cluster), would be to make sure that VCS's notification system redirects its error messages to an alternate source (like your email instead of everyone else's). So, assuming you have the NotifierMngr resource set up to mail errors to everyone@mycompany.com, you can easily modify this without alarming anyone, like so:

host # haconf -makerw
host # hares -modify NotifierMngr SmtpRecipients "me@nobodyelse.com = Error"
host # haconf -dump -makero


Of course, your setup may be more, or less, complicated, but that should give you the gist of it.
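
Since the open/modify/save dance is the same every time, here's a hedged sketch of a wrapper for it. The DRY_RUN variable and the function names run and vcs_modify are my own inventions for illustration. The only VCS commands used are the three shown above, and with DRY_RUN left at its default of 1 nothing is executed, just printed, so you can eyeball the commands before letting them loose on a live cluster:

```shell
#!/bin/bash
# DRY_RUN, run and vcs_modify are assumptions made for this sketch.
# With DRY_RUN=1 (the default here) commands are printed, not executed.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Open the config, change one attribute on one resource, then save and close.
# Usage: vcs_modify RESOURCE ATTRIBUTE VALUE
vcs_modify() {
    resource=$1 attribute=$2 value=$3
    run haconf -makerw || return 1
    run hares -modify "$resource" "$attribute" "$value"
    status=$?
    # Always try to dump the config and return it to read-only
    run haconf -dump -makero
    return $status
}

# Example: print what would be run for the notifier change
vcs_modify NotifierMngr SmtpRecipients "me@nobodyelse.com = Error"
```

Setting DRY_RUN=0 would run the commands for real. The attribute value is passed through untouched, so whatever quoting your version of hares expects still applies.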

2. The next thing you should consider (still going by the VCS playbook) is that some of your resources may have specific "owners" assigned. These people will be notified of issues with certain resources, as well, and should be neutralized whether or not it actually makes a difference ;) You can do this in approximately the exact same way, like this:

host # haconf -makerw
host # hares -modify SOME_RESOURCE ResourceOwner "me@nobodyelse.com"
host # haconf -dump -makero


3. Those first two steps will only insulate you from VCS's notification services. You still have to worry about the systems you're running on. This step can be a big headache (especially when you see what's coming up) and won't always do you any good. I'm including it for completeness' sake. The next thing you could do would be to disable sendmail/mail on your systems. This can be done a variety of different ways, depending upon what OS and version you're running, but (generally), this simple method should do the trick:

a. Determine the PID's of your active sendmail/mailer daemons and kill them. Alternately, play it safe and shut them down with their respective shutdown scripts (not as much fun ;)
b. Move your sendmail/mail configuration files from their regular location to a different one (or just rename them) so that they can't restart if they try to.
c. Make sure nothing is running on your system, in cron, perhaps, (like cfengine) that will pro-actively fix the problem you've created to keep mail from getting off of the servers.
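
For step b, the important part is being able to put everything back exactly the way you found it. Here's a minimal sketch of a rename-and-restore pair. The function names and the ".parked" suffix are arbitrary choices of mine, and which file you feed it depends entirely on your OS and mailer:

```shell
#!/bin/bash
# park_file and unpark_file are hypothetical names for this sketch.
# The ".parked" suffix is arbitrary; pick whatever won't collide on your box.

# Rename a config file out of the way so the daemon can't find it
park_file() {
    [ -f "$1" ] || { echo "$1: no such file" >&2; return 1; }
    mv "$1" "$1.parked"
}

# Put a parked config file back under its original name
unpark_file() {
    [ -f "$1.parked" ] || { echo "$1.parked: nothing to restore" >&2; return 1; }
    mv "$1.parked" "$1"
}
```

You'd call park_file with the full path to your mailer's configuration file before your maintenance and unpark_file with the same path afterward. Both refuse to do anything if the file they expect isn't there, which beats finding out later that you renamed the wrong thing.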

4. As noted, step 3 can be a hassle and may be a huge waste of time. Now you have to consider that outside, and inside, systems may also be monitoring your cluster's health. Programs like HP OpenView may have agents installed on your local systems and other computer systems may be set up outside of your cluster in order to test its availability from alternate locations. Any combination of these may still trigger errors and alerts once you start faulting a VCS resource on purpose.

At this point you can go with what might be the best option of them all (combined with steps 1 and 2): Stop your cluster dead, properly, so that none of the resources show as faulted or are affected in any way. You can do this very simply by running the following on your cluster's primary node:

host # hastop -all -force

This will bring down your cluster so that you can work on the configuration file, etc, but will not take down any of the resources that it manages (which should buy you some time, since none of them will report as "faulted"). There's still the outside chance that someone (or something) may be monitoring the "state" of your cluster externally. In that case, you're probably screwed and should let whoever's keeping tabs know what you intend to do so that they will feel more comfortable ignoring it ;) Keep in mind, of course, that, when you bring your nodes back online, you should try to bring up the primary node first. This will avoid any issues with your primary node doing a "remote build" of its configuration file from any secondary, tertiary, etc, nodes. Again, depending on the complexity of your setup, this may make very little difference.

5. (Bonus step - Not recommended) Some folks prefer to "freeze" the cluster when they want to do maintenance. This is perfectly acceptable practice, as it will allow you to operate on resources within service groups in VCS without risking taking down the cluster or causing undesired failover of resources. It should be noted, however, that if you cause any of the basic functionality built into a resource to fail (which may happen if you're re-tooling a resource to a good degree), that resource will still show up in VCS as faulted until its basic functionality (as understood by VCS) is restored. This faulted state will most probably cause the NotifierMngr and ResourceOwner's to be notified of the error condition!

And, there you have it. I realize this was somewhat of a quick glossing over of the subject, but, as those of us who work on VCS have probably quite-painfully learned, trying to explain VCS (even at the most basic level) can become a huge task if you want to cover every possible aspect or angle. That's why so many 5000 page books are published every year on how to open the box and find the serial number ;)

Cheers,

, Mike





Sunday, March 22, 2009

Funny Unix and Linux Pictures: Almost Windows

Hey there,

For this week's lazy Sunday post we went and found some great Unix/Linux OS photos at fiveprime.org. Some are jokes and some are just interesting ads from an earlier time. Check out the whole shebang if you have the chance :)

My favorite is the "vi paperclip" ;)

Enjoy your Sunday, I'm a little too tipsy to type much more. I can't believe I'm actually staying up to finish this post ;)

Cheers,



NOTE: If you miss the animation in the first photo of the vi Windows paperclip, just refresh the page. You won't be sorry :) It should run in a loop by default











, Mike





Saturday, March 21, 2009

Google Logos Galore

Hey there,

This, being the day before this post says it got published, is officially the first day of Spring!

What does that mean, exactly? Not a whole lot. From what I can tell, in Chicago, on March 19th, 2009 the weather outside is around 50 degrees and balmy. Beginning March 20th, the weather outside becomes around 50 degrees and chilly ;)

Since Google continues to increase its cache of bizarre logos and tributes to various times of year, famous peoples' birthdays and anniversaries of events that may or may not be important to any of us, I thought this collection of Google logos on marketingjive.com was worth passing along.

I've included a selection below, but check out the site. It's got pretty much every one they've ever put out - up to a certain point in time (which I can't tell yet because I don't think they're officially done doing this).

Enjoy :)



Top 30 Google Logos 1998 - 2008

#30. Google Beta - yup the one that started it all.

#29. 125th Birthday of Walter Gropius - one of the pioneering masters of modern day architecture.

#28. Google's First Holiday Season - this one is from 1999 according to the Google archive.

#27. Google Celebrates its Fifth Birthday 2003 - Five Years ago, Google was half its age and was still an infant in Search that would quickly go through adolescence and become the mature, innovative "adult of Search" that it is today.

#26. Nasa's 50th Birthday - Google celebrates Nasa's 50th birthday in style.

#25. 50th Anniversary of Understanding DNA - Google is known to display logos coinciding with historic events or birthdays. On April 25th, 2003, Google displayed this logo in celebrating the 50th anniversary of mankind understanding Deoxyribonucleic acid better known as DNA.

#24. St. Patrick's Day 2008 - Google has been known to have some great logos for the many holidays celebrated around the world. This one is from earlier this year as Google celebrated St. Patrick's Day on March 17th.

#23. Alexander Graham Bell Birthday Logo - another one from earlier this year: on March 3rd, Google celebrated famed Canadian (although born in Scotland) Alexander Graham Bell, the man who invented the telephone.

#22. Dragon Boat Festival - another favorite of ours is the Google logo for the Dragon Boat festival from June 15 2002.

#21. St. Patrick's Day 2007 - we must have something for St. Patrick's Day because this is the second logo on our list for the St. Patrick's Day holiday. Maybe it has something to do with the green font on the white background.

#20. Fourth of July 2008 - Rounding out the bottom third of our list is Google's logo for Independence Day for our friends in the US. Great logo.

#19. Google Celebrates Einstein's Birthday - on March 14th, 2003, Google paid tribute to one of the greatest men in history as they celebrated Albert Einstein's birthday.

#18. 50th Anniversary of Lego - on January 28th of this year Google celebrated the 50th anniversary of one of the coolest inventions ever... Lego.

#17. Anniversary of the first ascent of Mount Everest - On May 29, 2008 Google displayed this very cool logo representing the anniversary of the first ascent of Mount Everest. Very cool.

#16. Halloween 2001 - Google's had some pretty fun Halloween logos, but this one is plain and simple which is why we like it the most of the spooky Halloween logos that Google has displayed over the years.

#15. Leap Year 2004 - as we reach the mid-way point of our list, we have a logo featuring what else? Yes a couple of frogs celebrating February 29, 2004.

#14. Opening of 2008 Beijing Summer Olympics - last month, Google displayed this fantastic logo signifying the start of the 29th Summer Olympiad, touted as the most expensive in the history of the Olympics.

#13. Veteran's Day 2007 - honoring those who have worked so hard to give us the freedom that we enjoy today. Thank you to all of the men and women in both the US and Canada who have served to protect our great land.

#12. Ray Charles Birthday - on September 23, 2004 Google displayed this logo in honor of the late Ray Charles.

#11. Father's Day 2006 - Google has had many great Mother's Day and Father's Day logos. We selected this one as our favorite.

#10. Independence Day 2007 - Entering the top 10 we have this powerful logo celebrating July 4th in the US.

#9. Mozart's 250th Birthday - in January 2006, Google paid tribute to the great Wolfgang Amadeus Mozart celebrating his 250th birthday. You know that you are great when people are still celebrating your 250th birthday.

#8. da Vinci's Birthday 2005 - on April 15th, 2005, Google paid tribute to Leonardo Da Vinci with this creative logo.

#7. Google Loves Canada - love this logo celebrating Canada Day on July 1, 2001. Canada loves Google too eh!

#6. Google Earth Day 2008 - on April 22, 2008, Google was spreading the news about the green initiative with this inspiring Earth Day logo.

#5. Picasso's Birthday - on October 25th, Google celebrated Pablo Picasso's birthday with this interesting logo.

#4. Michelangelo's Birthday - Keeping with the artists theme, Google displayed this logo on March 6th, 2003 in honor of Michelangelo's birthday.

#3. National Teachers Day 2005 - where would we be without our teachers? On May 3, 2005, Google paid tribute to all of the teachers out there with this classic logo.

#2. Martin Luther King times two - This man is so important that we actually had a tie with two of Google's logos for Martin Luther King Day. From 2006 and 2008, Google paid tribute to the man who was an inspiration to many.

#1. Edvard Munch Birthday 2006 - this is simply our favorite Google logo. There is something about it that just seems to be mesmerizing. This logo was featured on December 12, 2006. Featuring "The Scream", this composition was created by Edvard Munch and is said by some to symbolize the human species overwhelmed by an attack of existential angst.

Well, there you have it: our favorite Google logos from the past ten years. We look forward to seeing many more of these Google logos. In fact we have some ideas for some additional Google logos:

  • Logos paying tribute to the Beatles
  • Logo paying tribute to the original members of the band Kiss (the four original members released solo albums on the same day in September 1978... thirty years ago this month.)
  • a logo paying tribute to the 100th Anniversary of the Montreal Canadiens who will be celebrating this milestone this year

  • Heck I'd like to see Google celebrate my birthday on June 21st.... lol well ok but we're still excited to see some of their future logos.






, Mike





Friday, March 20, 2009

Another Linux/Unix Flash Movie. Still Funny?

Hey there,

Another rough night fuggling with VCS - That'll be my next post. Tonight, I'm going to tap a brain cell and leak whatever's left of it down the drain ;) Enjoy this little animation from ubergeek.tv. I know it's dated, which is why it caught my attention (lots of flames out there). I figure anything that upsets so many people so deeply "needs" to be on this blog at least once ;)

Cheers,



From Ubergeek.tv




, Mike





Thursday, March 19, 2009

Getting Started With CFEngine's cfagent.conf On Linux And Unix

Hey there,

It seems like a year since primary on-call duty came by and knocked me off my high horse. But, as they say, there's a reason we all fall down. It's so that we can learn how to deal with public humiliation, come to know the true meaning of pain and realize just how alone we are in this great big universe ;) It's either that or so we can learn to get back up again. The latter seems more like common sense, though. Does anyone reading this blog need to learn this lesson (after the first time)? Unless you prefer sitting, or spending your days sprawled-out face-down in the dirt, the odds are you've gotten back up before and you'll get back up again. Class dismissed ;)

Today, we're going to take a look at cfengine. It's a great (free) program that can be used for a wide variety of things (think Tripwire, Nagios, Jumpstart/Kickstart pre/post installation helper, etc). It's a decent product, that can be configured to run as simple or as complicated a setup as you prefer. So far, in my usage, I've come nowhere near the boundaries that cfengine can reach. I don't even know, for sure, what those boundaries are, since (as a side effect of not using it to its fullest potential) I've never fully tested its limitations.

That being what it is (and that is what it's being ;), today we're going to look at a skeleton of a configuration file for the cfagent program (part of cfengine, along with cfenvd, cfrun, etc). The cfagent program will allow you to hit the ground running and begin to see what cfengine can do for you. We're assuming that you've already installed cfengine, which you can download from cfengine.org or, if you're running a Linux distro, odds are there are packages available for your OS. That means that installation could just be a command away ("rpm -i cfengineXXXX.i386.rpm" or something like that). If you do need to build it from source, we'll be sure to address that in a future post on this blog. For now, we'll just assume it's already been installed, you can install it or you can find someone who'll install it for you. Whatever works for you. We'll wait ;)

For today we're going to go with a setup so basic, it may be of no use whatsoever. But, then, that's not the point :) The file we'll be using today (to control the actions of cfagent) is called cfagent.conf (Don't be fooled by the clever name ;) If we make a very simple configuration file, we could get away with this (running on a single server, just to keep things simple):

control:
   any::
      actionsequence = (
         files
      )

files:
   any::
      /bin/ls=0555 owner=root group=bin action=fixall


And that's it. We're not going to get into specifics (like we'll only run this command on this set of servers, etc). The basic structure of our config file is pretty much this:

CONTROL:
   CLASSES::
      VARIABLE/FUNCTION = (
         LIST OF FUNCTION OR FACILITY VALUES
      )

FACILITY:
   CLASSES::
      OBJECT RULES


So, today, we'll just examine what those CAPITALIZED references mean. Obviously, they correspond to the contents of the cfagent.conf above :) It should be noted that all of the names we used in our sample configuration file are special to cfengine and don't require any specific definition by you. The only line where we really take any license is with the OBJECT RULES. More on that below!

CONTROL: This is a fundamental section in any cfagent.conf. Without this section, your cfagent run won't do anything! The colon at the end of the name "control:" indicates that the name is complete. This form can also be used to assign a value to a variable under a different context using a slightly different form (not to confuse... sorry; getting off track)

CLASSES:: This is sometimes known as "GROUPS::" although both will work at version 1.4 and above if I'm anywhere near correct. Hopefully you won't have to go back that far to get your setup working (They're on 3.x right now). The CLASSES:: variable ends with a double colon. Here we're not being as specific as we could be. We're setting the class "any" so that any defined class will match and cause cfagent to proceed deeper within the nest. "any" is pre-defined by cfagent and matches everything.

VARIABLE/FUNCTION In the CONTROL section, this generally defines the main flow of execution for the remainder of the configuration file. In this instance our variable is named "actionsequence." The "actionsequence" FUNCTION simply contains a list of the FACILITY's (pardon the improper spelling :) that we want to run through and in what order. If we had more than one, they would generally be listed one per line and executed from top to bottom. The FACILITY we chose to use is called "files"

FACILITY: As we noted just above, the FACILITY: "files" is the first FACILITY: executed by our CONTROL: section above. The "files" keyword (as also noted higher above) is special to cfagent and denotes files ;) That is to say that there are a lot of basic options you can choose from when you define your files and built-in actions that you can use simply by calling them within the configuration. The FACILITY: definition ends with a colon.

CLASSES:: This works the same as above. "any"-thing will match this!

OBJECT RULES This is where we set up what we want to do with our "files." We've only selected one rule, and it breaks down like this. In standard format (and this can get very complicated if you like) your rule definition would begin with the "name" of the file you wanted to act upon, followed by specific "attributes" of that file (if necessary/required) and the "action(s)" you intend to take upon it. So our RULE

/bin/ls=0555 owner=root group=bin action=fixall


could be picked apart like so:

file name = /bin/ls
file permissions = 0555
(Note that we took the liberty of jamming the file "name" and an "attribute" together here: /bin/ls=0555. This could also be written as: /bin/ls mode=0555)
file "attribute" owner = root
file "attribute" group = bin
file "action" = fixall
<-- Again, this action is built-in and instructs cfagent to fix any problems it finds with /bin/ls. Those problems would be defined as "attributes" of the actual file that differ from the "attributes" set for it in our cfagent.conf rule.


So if we did a:

host # ls -l /bin/ls
-rwxr-xr-x 1 root root 174432 Nov 8 14:21 /bin/ls


and then we ran:

host # cfagent -qv ( the -q option turns off host sleeping - also referred to as "splaying" (???) and the -v option is the expected "verbose")

it would find /bin/ls, note that the file mode was incorrect (0755 instead of 0555 and with the group root instead of bin) and fix that for us, so that our next ls would show the "corrected" file, like so:

host # ls -l /bin/ls
-r-xr-xr-x 1 root bin 174432 Nov 8 14:21 /bin/ls
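
If you want a feel for the comparison that rule boils down to, here's a rough shell sketch of the "check" half of it. To be clear, this is not how cfagent works internally and it fixes nothing. check_file is a made-up name, and it assumes GNU stat (the -c option), which you won't find in that form on every Unix:

```shell
#!/bin/bash
# check_file is a hypothetical helper; it only reports, it never changes anything.
# Assumes GNU stat (the -c option); BSD and Solaris stat differ.
# Usage: check_file PATH MODE OWNER GROUP
# Returns 0 if everything matches, 1 on any mismatch, 2 if stat fails.
check_file() {
    path=$1 want_mode=${2#0} want_owner=$3 want_group=$4
    actual=$(stat -c '%a %U %G' "$path") || return 2
    set -- $actual    # split into: mode owner group
    rc=0
    [ "$1" = "$want_mode" ]  || { echo "$path: mode $1, want $want_mode"; rc=1; }
    [ "$2" = "$want_owner" ] || { echo "$path: owner $2, want $want_owner"; rc=1; }
    [ "$3" = "$want_group" ] || { echo "$path: group $3, want $want_group"; rc=1; }
    return $rc
}
```

Running something like "check_file /bin/ls 0555 root bin" before the cfagent run should complain about the mode and group, and stay quiet afterward. The leading zero on the mode gets stripped because stat reports 555 rather than 0555.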


And I think that's enough for today. You never realize how much there is to explain about something until you try to explain it. But, as they say, that's why we learn to fall down... Or am I mixing my metaphors again? ;)

Cheers,

, Mike



