Brian Tanaka

This article originally appeared in the December 1998 issue of Sys Admin Magazine.

Writing Maintainable Perl Programs and Shell Scripts

Brian Tanaka


It's all too common for system administrators, especially novices, to fall into the trap of writing sloppy Perl programs and shell scripts. Sloppy programs and scripts are problematic because, while they may function perfectly well, they are harder to maintain, pass on to others, enhance, and debug than tidy, well-organized scripts.

Fortunately, by using a handful of simple techniques, you can write tidy programs and scripts, saving time and aggravation in the long run--while simultaneously finishing your scripts with speed and efficiency in the short run. By using these techniques you will make your job and the jobs of your colleagues easier.

This article is not intended as an exhaustive exploration of good programmatic style. Rather, it describes some techniques that have worked well for me over the years. They are easy to use and go a long way toward making programs and scripts easier to read and maintain.

The techniques I will cover can be summarized as follows:

  • Be consistent

  • Include an informative prologue

  • Use whitespace effectively

  • Comment your code

  • Be terse with caution

  • Include documentation


Consistency of Style

How you indent, line up your braces, use whitespace, place comments, and so on constitutes your style. How you decide to handle each element of your style is up to you, provided it is logical. But, once you find a style you like, use it consistently.

For instance, if you like to brace your while loops in Perl with the opening { on the same line as the while like this:

    while (<FOO>) {
        &dostuff;
    }

then always do it that way. If you are ever faced with the urge to write:

    while (<FOO>)
    {
        &dostuff;
    }

resist it. Being reliably consistent throughout your programs and scripts significantly increases readability.
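
The same principle carries over to shell scripts. As a minimal sketch (the function names here are hypothetical and exist only for illustration), the following keeps "do" and "then" on the same line as their keyword in every construct, and indents every block by the same amount:

```shell
#!/bin/sh
# Hypothetical example: one layout for loops and conditionals,
# used the same way everywhere.

# Print each argument that is longer than three characters.
print_long_words() {
    for word in "$@"; do
        if [ "${#word}" -gt 3 ]; then
            echo "$word"
        fi
    done
}

# Count the lines arriving on standard input.
count_lines() {
    total=0
    while read -r line; do
        total=$((total + 1))
    done
    echo "$total"
}
```

Putting "do" and "then" on their own lines would be just as valid. What matters is that a reader never has to wonder which layout the next block will use.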


Prologue

The beginning of your program or script is very important. A quick review of the first dozen lines or so should provide the reader with certain critical pieces of information that will help her interpret or modify the code.

Here is an example of a good prologue:

#!/usr/bin/perl
# diskhogs - This script generates a report of the ten users who use
#            the most diskspace on this host. 
#
#   Author: Brian Tanaka
#     Date: Mon Jul  6 18:21:14 PDT 1998
#

By skimming this handful of lines, the reader knows which interpreter is being used (Perl), what the name of the script is (diskhogs), what it is supposed to do (report the top disk hogs), who wrote it, and when it was written.

The statement of purpose is particularly important. Don't make your reader read through your whole script just to find out what it's supposed to do. Be as verbose as you need to be, but make sure you have a good clear summary of the purpose of the program.

If revisions are made, use this section to record who made each revision, when it was made, and a brief description of the change.

The top of your program or script is also a great place to put special instructions to the reader. For instance, this is a great place to note that the program uses a configuration file that can be modified to customize behavior, or to mention which variables to check and modify if the program or script is moved to a new host.
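
As a rough sketch of how these pieces might fit together in a shell script, here is a prologue with a revision section and a note for the reader. The script name diskhogs.sh, the HOMEROOT and TOPN variables, and the layout of the revision section are all assumptions made for illustration; the parenthesized entries are placeholders to fill in.

```shell
#!/bin/sh
# diskhogs.sh - Hypothetical shell version of the diskhogs report.
#               Lists the users who use the most disk space under
#               $HOMEROOT on this host.
#
#   Author: (your name here)
#     Date: (creation date here)
#
# Revisions:
#   (date)  (who)  (brief description of the change)
#
# Notes:
#   If you move this script to a new host, check HOMEROOT and TOPN
#   below. Both can be overridden from the environment.
#

HOMEROOT="${HOMEROOT:-/home}"    # Where home directories live
TOPN="${TOPN:-10}"               # How many users to report
```

Keeping the host-specific settings directly under the prologue means the note and the variables it mentions are visible on the same screen.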


Whitespace

Liberal and logical use of whitespace is one of the most effective ways of making your programs and scripts more readable. Be aware of the amount of whitespace between words, between lines, and between groups of lines. Don't crowd everything together.

For instance, you can emphasize the boundaries between functional or logical units within your program by increasing the whitespace between them.

So, instead of:

...
# Write a timestamp to the log file to show when the script ran
$date = `date`;
chomp($date);
open (LOGFILE, ">>$logfile") || die "could not open $logfile\n";
print LOGFILE "$date: ";
# Count how many times the string "deblobulator.html" appears
# in the web access log
open (ACCESS, "$access") || die "could not open $access\n";
while ($line = <ACCESS>) {
    $count++ if ($line =~ m/deblobulator\.html/i);
}
close (ACCESS);
# Write the result to the log file
print LOGFILE "deblobulator.html hits: $count\n";
close (LOGFILE);
...

you might do something like this:

...
#
# Write a timestamp to the log file to show when the script ran
#
$date = `date`;
chomp($date);
open (LOGFILE, ">>$logfile") || die "could not open $logfile\n";
print LOGFILE "$date: ";



#
# Count how many times the string "deblobulator.html" appears
# in the web access log
#
open (ACCESS, "$access") || die "could not open $access\n";
while ($line = <ACCESS>) {
    $count++ if ($line =~ m/deblobulator\.html/i);
}
close (ACCESS);



#
# Write the result to the log file
#
print LOGFILE "deblobulator.html hits: $count\n";
close (LOGFILE);
...

Although this is a trivial example, you can see that the second example is easier to understand because the functional groups are visually differentiated rather than run into one another. In a long complex program, visual clarity becomes even more critical.

Indentation is another important form of whitespace. A chunk of code like this:

while (<FOO>) {
    &dostuff;
    if ( $stuff ne "blah" ) {
        &domorestuff;
    }
}

is much easier to read than:

while (<FOO>) {
&dostuff;
if ( $stuff ne "blah" ) {
&domorestuff;
}
}

Be as predictable as possible so that your use of whitespace reliably means something to the reader (even if only subconsciously). Given that, you should pay attention to your use of whitespace within single lines as well.
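
To illustrate whitespace within a single line, here is a small shell sketch. Both functions (hypothetical names) run the same pipeline and print whichever argument occurs most often; the only difference is the spacing around the pipe symbols and inside the awk program.

```shell
#!/bin/sh
# Hypothetical example: the same pipeline, crowded and spaced.

# Crowded: legal, but the stages run together.
most_common_crowded() {
    printf '%s\n' "$@"|sort|uniq -c|sort -rn|awk 'NR==1{print $2}'
}

# Spaced: each stage of the pipeline stands out.
most_common() {
    printf '%s\n' "$@" | sort | uniq -c | sort -rn | awk 'NR == 1 { print $2 }'
}
```

The shell treats the two identically, so the extra spaces cost nothing at run time. Note that shell whitespace is not always optional: a variable assignment, for example, cannot have spaces around the "=".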


Comments

Inheriting a long, complicated program or script that lacks useful comments is aggravating. Comment your code. Too many comments are better than not enough.

I adhere to the following guidelines:

  1. Comments that introduce a new, major section of code are formatted as follows:
        #
        # Description of following section goes here. It's as detailed
        # as the situation requires.
        #
    

    Note that I use at least three blank lines before this type of comment in order to set it apart from the section above it since it heralds a new distinct logical chunk.

  2. Comments that explain a line or set of lines precede the line or lines and are formatted as follows:
        # Explanation of following line or lines goes here.
    

  3. Very short descriptive comments can be included on the same line, like so:
        $datafile = "/home/btanaka/data";         # Important data file
        $tmpfile = "/var/tmp/deblobulator.$$";    # Temporary blob file

    Note that the short comments line up.

A richly commented program is easier to understand and modify. Get into the habit of explaining how each section of your code works. If you're ever in doubt about whether a section or line requires a comment, go ahead and write one. It's better to err on the side of too much information than not enough.
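
The three guidelines above might look like this when combined in a short shell script. This is only a sketch: the file path, the sample data, and the variable names are made up so that the example can run on its own.

```shell
#!/bin/sh
# Hypothetical example showing the three comment styles together.

datafile="/tmp/example.$$"    # Sample input file (made-up path)
pattern="deblobulator"        # String to count



#
# Build a small sample data file so the example is self-contained.
# A real script would read an existing file instead.
#
printf '%s\n' "GET /deblobulator.html" "GET /index.html" \
    "GET /deblobulator.html" > "$datafile"



#
# Count the lines that mention the pattern, then clean up.
#

# grep -c prints the number of matching lines, not of matches.
hits=$(grep -c "$pattern" "$datafile")

rm -f "$datafile"
echo "$pattern hits: $hits"
```

The section comments are set off by blank lines, the single-line comment sits directly above the line it explains, and the two short trailing comments start in the same column.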


The Long Road vs. The Short Road

As the Perl slogan says, "There's More Than One Way To Do It." This is often true in shell scripts as well. Some ways of doing a given task are more terse than others.

It's up to you to decide how terse any given chunk of code should be. However, bear in mind that in terms of readability, it's better to be less terse unless there's a compelling reason to do otherwise.

If you do have a compelling reason to be very terse you can mitigate the negative effect on readability with good, clear, concise comments.

In general, I prefer to use a less terse style unless by doing so I incur some performance cost that I cannot afford.
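
Here is one way the trade-off can look in a shell script. Both functions (hypothetical names) print the last component of a path. The first is terse and leans on its comment; the second spells out each step.

```shell
#!/bin/sh
# Hypothetical example: the same task written two ways.

# Terse: a single parameter expansion. ${1##*/} removes the longest
# leading match of "*/", which is everything up to the last slash.
file_name_terse() {
    echo "${1##*/}"
}

# Less terse: name each value and use a familiar command.
file_name_plain() {
    path="$1"
    name=$(basename "$path")
    echo "$name"
}
```

For ordinary paths such as /var/tmp/report.txt, both print report.txt. The plain version typically runs basename as a separate process, which is the kind of performance cost that could justify the terse form inside a loop that handles many files.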


Documentation

A well-written program or script is not complete without an appropriate amount of documentation. In many cases the comments in the code are enough. In other cases more thorough documentation is called for and it's worth taking the time to create a README file, a man page, or POD format documentation.
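
For a small shell script, a built-in usage message is another lightweight option in the same spirit. It is not one of the formats listed above, and the option letters shown are purely hypothetical, so treat this as a sketch rather than a recommendation:

```shell
#!/bin/sh
# Hypothetical example: a usage message kept inside the script itself.
# The -n and -h options are invented for illustration.

usage() {
    cat <<'EOF'
Usage: diskhogs [-n COUNT]

Report the users who use the most disk space on this host.

  -n COUNT   number of users to list (default: 10)
  -h         show this message
EOF
}
```

A script would typically call usage and exit when it is given -h or an option it does not recognize.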


Summary

Be consistent, always include an informative prologue, use whitespace effectively, comment your code, be terse with caution, and include documentation. By doing these simple things you will make your Perl programs and shell scripts much easier to read and maintain.

If you would like to learn more about stylistic principles in programming, many programming books devote at least a short section to the subject. For Perl programmers, _Programming Perl, 2nd Edition_ by Wall, Christiansen, and Schwartz (O'Reilly & Associates) covers style issues in Chapter 8, "Other Oddments."


- 30 -