Dockercraft

I woke up this morning feeling like I wasn’t enough of a hipster admin, so I decided to write this keyword... er... educational post about Docker.

When strace isn’t enough Part 1

An important tool in any Linux admin’s toolkit is the venerable strace command. It gives us insight into what a program is actually doing. As awesome as strace can be, it doesn’t tell us everything. This series of articles will familiarize you with some of the other commands and approaches for gaining insight into program execution.

Quick Tip: View Linux process limits

On several occasions I have needed to troubleshoot issues that turned out to be caused by Linux limiting the number of open files for a given process. This can be annoying to troubleshoot, since many programs do not handle this condition gracefully and, by default, Linux does not log anything that points to it. The same applies to all of the Linux process limits, not just open files.

So what do we do about it?

When you google the problem, you will typically find references to running ulimit as the service user to determine what the existing limits are. You will quickly discover that this doesn’t work. For one thing, most service users don’t have shells. Additionally, as you will see in my next post, many services already have configuration that attempts to set the limits on program startup, so the limits of a running service can differ from anything ulimit shows you in a shell.

Enter the proc filesystem…

cat /proc/805/limits 
Limit                     Soft Limit           Hard Limit           Units     
Max cpu time              unlimited            unlimited            seconds   
Max file size             unlimited            unlimited            bytes     
Max data size             unlimited            unlimited            bytes     
Max stack size            8388608              unlimited            bytes     
Max core file size        0                    unlimited            bytes     
Max resident set          unlimited            unlimited            bytes     
Max processes             7807                 7807                 processes 
Max open files            1024                 4096                 files     
Max locked memory         65536                65536                bytes     
Max address space         unlimited            unlimited            bytes     
Max file locks            unlimited            unlimited            locks     
Max pending signals       7807                 7807                 signals   
Max msgqueue size         819200               819200               bytes     
Max nice priority         0                    0                    
Max realtime priority     0                    0                    
Max realtime timeout      unlimited            unlimited            us

Replace 805 with the PID of the process you want to check.
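
If you only care about one limit, you can pull it out of that file with awk. Here is a minimal sketch, assuming the column layout shown above; open_file_limits is a name I made up for illustration, and it defaults to the current shell's PID when you don't pass one.

```shell
#!/bin/bash
# Print the soft and hard "Max open files" limits for a PID.
# open_file_limits is a hypothetical helper name; with no argument it
# inspects the current shell ($$).
open_file_limits() {
    local pid="${1:-$$}"
    # Fields on that line: Max open files <soft> <hard> files
    awk '/^Max open files/ {print "soft=" $4, "hard=" $5}' "/proc/${pid}/limits"
}

open_file_limits "$$"
```

To check a service instead, pass its PID, for example the output of pgrep for the process name you are interested in.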

Have fun…

Bash Nagios plugin

Today let’s have a look at one way to construct a Nagios plugin in bash. I would usually write these in Perl, but sometimes that is not possible. This plugin is written to be executed using NRPE.

#!/bin/bash
# bash nagios plugin

###
# Variables
###
OK=0
WARNING=1
CRITICAL=2
UNKNOWN=3
TO_RETURN=${OK}
TO_OUTPUT=''

# Print usage information and exit with UNKNOWN (3), so a bad invocation
# is not reported to Nagios as a WARNING
print_usage(){
    echo -e "\n" \
    "usage: ./check_uptime -w 20 -c 30 \n" \
    "\n" \
    "-w <days>    warning value\n" \
    "-c <days>    critical value\n" \
    "-h           this help\n" \
    "\n" && exit 3
}

###
# Options
###

# Loop through $@ to find flags
while getopts ":hw:c:" FLAG; do
    case "${FLAG}" in
        w) # Warning value
            WARNING_VALUE="${OPTARG}" ;;
        c) # Critical value
            CRITICAL_VALUE="${OPTARG}" ;;
        h) # Print usage information
            HELP=1;;
        [:?]) # Print usage information
            print_usage;;
    esac
done

###
# Functions
###

log_date(){
    date +"%b %e %T"
}

error() {
    NOW=$(log_date)
    echo "${NOW}: ERROR: $1"
    exit 1
}

warning() {
    NOW=$(log_date)
    echo "${NOW}: WARNING: $1"
}

info() {
    NOW=$(log_date)
    echo "${NOW}: INFO: $1"
}

# Do something
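get_cmd_output(){
    # Print uptime in whole days (0 if the system has been up less than a day)
    local output days
    output=$(uptime) || error "failed to run command"
    days=$(echo "${output}" | sed -n 's/.*up \([0-9]*\) day.*/\1/p')
    echo "${days:-0}"
}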

###
# Program execution
###
[ "${HELP}" ] && print_usage

if [ "${WARNING_VALUE}" ] && [ "${CRITICAL_VALUE}" ]
then
    CMD_OUTPUT=$(get_cmd_output)
else
    print_usage
fi

if [ "${CMD_OUTPUT}" ] && [ "${CMD_OUTPUT}" -gt "${CRITICAL_VALUE}" ]
then
    TO_RETURN=${CRITICAL}
elif [ "${CMD_OUTPUT}" ] && [ "${CMD_OUTPUT}" -gt "${WARNING_VALUE}" ]
then
    TO_RETURN=${WARNING}
elif [ "${CMD_OUTPUT}" ] && [ "${CMD_OUTPUT}" -ge 0 ]
then
    # 0 days (up for less than a day) still counts as OK
    TO_RETURN=${OK}
else
    TO_RETURN=${UNKNOWN}
fi

if [ $TO_RETURN == ${CRITICAL} ]
then
    TO_OUTPUT="CRITICAL "
elif [ $TO_RETURN == ${WARNING} ]
then
    TO_OUTPUT="WARNING "
elif [ ${TO_RETURN} == ${OK} ]
then
    TO_OUTPUT="OK "
else
    TO_OUTPUT="UNKNOWN "
fi

TO_OUTPUT="${TO_OUTPUT}| uptime=${CMD_OUTPUT};$WARNING_VALUE;$CRITICAL_VALUE"

echo "$TO_OUTPUT";
exit $TO_RETURN;

Let’s break it down…

OK=0
WARNING=1
CRITICAL=2
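UNKNOWN=3

We define some readable names for the return codes. Nagios decides the state of a check from the plugin's exit code: 0 is OK, 1 is WARNING, 2 is CRITICAL and 3 is UNKNOWN.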

TO_RETURN=${OK}

Set the initial return value to OK.

# Do something
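get_cmd_output(){
    # Print uptime in whole days (0 if the system has been up less than a day)
    local output days
    output=$(uptime) || error "failed to run command"
    days=$(echo "${output}" | sed -n 's/.*up \([0-9]*\) day.*/\1/p')
    echo "${days:-0}"
}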

A function to obtain the value we want to check, in this case the uptime in days.
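
Parsing the output of uptime with sed is a bit fragile, because the format changes depending on how long the box has been up. As an alternative (this is not what the plugin above does), here is a rough sketch that reads /proc/uptime instead, where the first field is the number of seconds since boot. uptime_days is a hypothetical name, and the optional file argument is only there to make it easy to try with a fake file.

```shell
#!/bin/bash
# Print whole days of uptime, read from /proc/uptime.
# uptime_days is a hypothetical helper; the optional argument lets you
# point it at another file with the same "seconds idle_seconds" layout.
uptime_days() {
    local file="${1:-/proc/uptime}" seconds
    read -r seconds _ < "${file}" || return 1
    # Strip the fractional part, then integer-divide by seconds per day
    echo $(( ${seconds%.*} / 86400 ))
}

uptime_days
```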

if [ "${CMD_OUTPUT}" ] && [ "${CMD_OUTPUT}" -gt "${CRITICAL_VALUE}" ]
then
    TO_RETURN=${CRITICAL}
elif [ "${CMD_OUTPUT}" ] && [ "${CMD_OUTPUT}" -gt "${WARNING_VALUE}" ]
then
    TO_RETURN=${WARNING}
elif [ "${CMD_OUTPUT}" ] && [ "${CMD_OUTPUT}" -ge 0 ]
then
    # 0 days (up for less than a day) still counts as OK
    TO_RETURN=${OK}
else
    TO_RETURN=${UNKNOWN}
fi

Check the value of uptime against our warning and critical values.
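
The same threshold logic can be pulled into a small function, which makes it easy to try out by hand. This is only a sketch: threshold_state is a name I made up, it assumes the usual Nagios exit codes (0 to 3), and it treats anything that is not a non-negative integer as UNKNOWN.

```shell
#!/bin/bash
# Map a numeric value to a Nagios exit code using warning/critical thresholds.
# threshold_state is a hypothetical helper, not part of the plugin above.
threshold_state() {
    local value="$1" warn="$2" crit="$3"
    # Anything that is not a non-negative integer is reported as UNKNOWN (3)
    [[ "${value}" =~ ^[0-9]+$ ]] || { echo 3; return; }
    if [ "${value}" -gt "${crit}" ]; then
        echo 2    # CRITICAL
    elif [ "${value}" -gt "${warn}" ]; then
        echo 1    # WARNING
    else
        echo 0    # OK
    fi
}

threshold_state 5 20 30     # prints 0
threshold_state 25 20 30    # prints 1
threshold_state 45 20 30    # prints 2
threshold_state "" 20 30    # prints 3
```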

if [ $TO_RETURN == ${CRITICAL} ]
then
    TO_OUTPUT="CRITICAL "
elif [ $TO_RETURN == ${WARNING} ]
then
    TO_OUTPUT="WARNING "
elif [ ${TO_RETURN} == ${OK} ]
then
    TO_OUTPUT="OK "
else
    TO_OUTPUT="UNKNOWN "
fi

Set the human-readable status text of the plugin. Nagios displays this text, but it takes the state of the check from the exit code, not from this string.

TO_OUTPUT="${TO_OUTPUT}| uptime=${CMD_OUTPUT};$WARNING_VALUE;$CRITICAL_VALUE"

Construct the output string according to the Nagios plugin developer guidelines. Everything after the | is performance data in the form label=value;warn;crit.
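
To make the format a little more concrete, here is a small sketch that builds such a line with printf. format_plugin_output is a hypothetical helper, and the wording of the status text before the | is my own choice for this example; only the label=value;warn;crit part comes from the guidelines.

```shell
#!/bin/bash
# Build a plugin output line: "STATUS - text | label=value;warn;crit"
# format_plugin_output is a hypothetical helper for illustration.
format_plugin_output() {
    local status="$1" label="$2" value="$3" warn="$4" crit="$5"
    printf '%s - %s is %s | %s=%s;%s;%s\n' \
        "${status}" "${label}" "${value}" "${label}" "${value}" "${warn}" "${crit}"
}

format_plugin_output OK uptime 5 20 30
# prints: OK - uptime is 5 | uptime=5;20;30
```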

Stay tuned. The Perl version will be out soon.

For more information see:
http://nagiosplug.sourceforge.net/developer-guidelines.html#AEN201

Auto Deploy .bashrc to Linux Servers with Bash

Anyone who has been a sysadmin with more than about 10 servers under their control knows how big a pain it can be to set up your environment on each box. I wrote this shell script to deploy my .bashrc file to some servers that I have an account on. Everything is pretty simple. My favorite feature is the versioning. This script can be used to deploy or set up pretty much anything, because you can run any commands you wish with ssh.

The first thing we do is check whether ssh key authentication has been set up. If key authentication is good, we check whether a scratch directory exists and create it if it does not. After that we copy the existing .bashrc to the scratch directory and append the date to the filename. Once all that is complete, we copy the new file into place. Pretty simple, but incredibly useful (at least to me).

#!/bin/bash
# Deploy some useful files to a new server you have access to.

# Loop through the server hostnames/IPs in the serverlist file
for server in $(cat ./serverlist); do

	# Quick check to make sure the ssh key has been deployed to the server
	if ssh -o BatchMode=yes "$server" 'uptime 2>&1' | grep -q average; then

		# Check to see if there is a scratch directory and create one if there is not
		scratch_result=$(ssh "$server" 'if [ -d ~/scratch ]; then echo 0; else echo 1; fi')
		if [ "$scratch_result" == 1 ]; then
			# Create the directory
			if ! ssh "$server" 'mkdir ~/scratch'; then
				echo "Directory Creation Failed on $server"
				exit 1
			fi
		elif [ "$scratch_result" != 0 ]; then
			echo "Directory Check Failed on $server"
			exit 1
		fi
		# Back up the existing .bashrc on the server, with the date in the filename
		echo -n "Copying bashrc to backup on $server..."
		if ssh "$server" 'cp .bashrc ~/scratch/.bashrc$(date +%F).bak'; then
			echo " Success"
			echo "Copying new bashrc to $server..."
			scp ./.bashrc "$server":
		else
			echo " Fail"
		fi
	fi
done
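
If you want to play with the versioning step without touching a real server, the backup can be tried locally. This is just a sketch of the same idea: backup_with_date is a hypothetical helper, it works on local paths instead of going over ssh, and the example runs against a throwaway directory from mktemp.

```shell
#!/bin/bash
# Copy a file into a scratch directory with today's date appended to its name.
# backup_with_date is a hypothetical helper; it runs locally, not over ssh.
backup_with_date() {
    local file="$1" scratch="$2"
    mkdir -p "${scratch}" || return 1
    cp "${file}" "${scratch}/$(basename "${file}")$(date +%F).bak"
}

# Example: back up a throwaway file into a temporary scratch directory
workdir=$(mktemp -d)
echo 'alias ll="ls -l"' > "${workdir}/.bashrc"
backup_with_date "${workdir}/.bashrc" "${workdir}/scratch"
ls -A "${workdir}/scratch"
```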

Difficult People

Most of us have at one point or another encountered someone we truly did not work well with. These people range from minor annoyances to the bane of existence. It is very easy to place the fault for these failed work relationships squarely on the shoulders of the other person without taking any responsibility. My point here is not to point fingers, but rather to point out a missed opportunity for self-betterment.