It's all too common for system administrators, especially novices,
to fall into the trap of writing sloppy Perl programs and shell
scripts. Sloppy programs and scripts are problematic because, while they may
function perfectly well, they are harder to maintain, pass on to others,
enhance, and debug than tidy, well-organized scripts.
Fortunately, a handful of simple techniques will let you write tidy programs and scripts. They save time and aggravation in the long run without slowing you down in the short run, and they make your job and your colleagues' jobs easier.
This article is not intended as an exhaustive exploration of good programmatic style. Rather, it describes some techniques that have worked well for me over the years. They are easy to use and go a long way toward making programs and scripts easier to read and maintain.
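The techniques I will cover can be summarized as follows:
- Be consistent.
- Begin every program or script with a prologue.
- Use whitespace to make structure visible.
- Comment generously.
- Avoid needless terseness.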
For instance, if you like to brace your while loops in Perl with the opening { on the same line as the while like this:
    while (<FOO>) {
        &dostuff;
    }
then always do it that way. If you are ever faced with the urge to write:
    while (<FOO>)
    {
        &dostuff;
    }
resist it. Being reliably consistent throughout your programs and scripts significantly increases readability.
Here is an example of a good prologue:
    #!/usr/bin/perl
    # diskhogs - This script generates a report of the ten users who use
    #            the most diskspace on this host.
    #
    # Author: Brian Tanaka
    # Date: Mon Jul 6 18:21:14 PDT 1998
    #
By skimming this handful of lines, the reader knows which interpreter is being used (Perl), what the name of the script is (diskhogs), what it is supposed to do (report the top disk hogs), who wrote it, and when it was written.
The statement of purpose is particularly important. Don't make your reader read through your whole script just to find out what it's supposed to do. Be as verbose as you need to be, but make sure you have a good clear summary of the purpose of the program.
If revisions are made, use this section to record who made them, when, and briefly what changed.
The top of your program or script is also a great place to put special instructions to the reader. For instance, this is a great place to note that the program uses a configuration file that can be modified to customize behavior, or to mention which variables to check and modify if the program or script is moved to a new host.
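The same prologue ideas carry over to shell scripts. What follows is only a sketch: the script name loghits, the placeholder author and revision entries, the ACCESS_LOG and PAGE variables, and the count_hits function are all invented for illustration, and the default log path is an assumption that will differ from host to host.

```shell
#!/bin/sh
#
# loghits - Report how many lines in a web access log mention a
#           given page. (Hypothetical script, for illustration only.)
#
# Author:  (your name here)
# Date:    (date written)
#
# Revisions:
#   (date)  (who)  (one-line description of what changed)
#
# Notes to the reader:
#   - ACCESS_LOG and PAGE below are the only settings you should need
#     to check when moving this script to a new host.
#   - Both can be overridden from the environment.
#

ACCESS_LOG="${ACCESS_LOG:-/var/log/httpd/access_log}"  # Assumed log location
PAGE="${PAGE:-deblobulator.html}"                      # Page name to count

#
# count_hits FILE STRING
# Print the number of lines in FILE that contain STRING, ignoring case.
#
count_hits() {
    grep -F -i -c -- "$2" "$1"
}

# Only report when the log is actually readable on this host.
if [ -r "$ACCESS_LOG" ]; then
    echo "$PAGE hits: $(count_hits "$ACCESS_LOG" "$PAGE")"
fi
```

A reader who opens this file learns what it does, where to record changes, and which two settings to look at first, all before reaching any logic.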
For instance, you can emphasize the boundaries between functional or logical units within your program by increasing the whitespace between them.
So, instead of:
    ...
    # Write a timestamp to the log file to show when the script ran
    $date = `date`;
    chomp($date);
    open (LOGFILE, ">>$logfile") || die "could not open $logfile\n";
    print LOGFILE "$date: ";
    # Count how many times the string "deblobulator.html" appears
    # in the web access log
    open (ACCESS, "$access") || die "could not open $access\n";
    while ($line = <ACCESS>) {
        $count++ if ($line =~ m/deblobulator\.html/i);
    }
    close (ACCESS);
    # Write the result to the log file
    print LOGFILE "deblobulator.html hits: $count\n";
    close (LOGFILE);
    ...
you might do something like this:
    ...

    #
    # Write a timestamp to the log file to show when the script ran
    #
    $date = `date`;
    chomp($date);
    open (LOGFILE, ">>$logfile") || die "could not open $logfile\n";
    print LOGFILE "$date: ";


    #
    # Count how many times the string "deblobulator.html" appears
    # in the web access log
    #
    open (ACCESS, "$access") || die "could not open $access\n";
    while ($line = <ACCESS>) {
        $count++ if ($line =~ m/deblobulator\.html/i);
    }
    close (ACCESS);


    #
    # Write the result to the log file
    #
    print LOGFILE "deblobulator.html hits: $count\n";
    close (LOGFILE);

    ...
Although this is a trivial example, you can see that the second version is easier to understand because the functional groups are visually differentiated rather than running into one another. In a long, complex program, visual clarity becomes even more critical.
Indentation is another important form of whitespace. A chunk of code like this:
    while (<FOO>) {
        &dostuff;
        if ( $stuff ne "blah" ) {
            &domorestuff;
        }
    }
is much easier to read than:
    while (<FOO>) {
    &dostuff;
    if ( $stuff ne "blah" ) {
    &domorestuff;
    }
    }
Be as predictable as possible so that your use of whitespace reliably means something to the reader (even if only subconsciously). Given that, you should pay attention to your use of whitespace within single lines as well.
I adhere to the following guidelines:
    #
    # Description of following section goes here. It's as detailed
    # as the situation requires.
    #
Note that I use at least three blank lines before this type of comment in order to set it apart from the section above it since it heralds a new distinct logical chunk.
    # Explanation of following line or lines goes here.
    $datafile = "/home/btanaka/data";       # Important data file
    $tmpfile = "/var/tmp/deblobulator.$$";  # Temporary blob file
Note that the short comments line up.
A richly commented program is easier to understand and modify. Get into the habit of explaining how each section of your code works. If you're ever in doubt about whether a section or line requires a comment, go ahead and write one. It's better to err on the side of too much information than not enough.
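To show the three kinds of comment working together, here is a small shell sketch. Everything in it is made up for illustration: the variable names datafile and delimiter, the functions count_records and first_fields, and the choice of /etc/passwd as a default input are assumptions, not part of any real tool.

```shell
#!/bin/sh

datafile="${1:-/etc/passwd}"  # File to summarize (assumed default)
delimiter=":"                 # Field separator used in that file



#
# Count the records in a data file. Blank lines and lines that
# start with "#" are treated as non-data and are not counted.
#
count_records() {
    # -c prints a count of selected lines; -v selects the lines
    # that match neither of the two patterns.
    grep -c -v -e '^[[:space:]]*$' -e '^#' -- "$1"
}



#
# Print the first field of every record, one per line, using the
# same definition of "record" as count_records.
#
first_fields() {
    grep -v -e '^[[:space:]]*$' -e '^#' -- "$1" | cut -d "$delimiter" -f 1
}



#
# Report on the data file, but only if it can be read.
#
if [ -r "$datafile" ]; then
    echo "$datafile: $(count_records "$datafile") records"
fi
```

The framed comments mark each new logical chunk, the plain comment explains the one line that is not obvious, and the short trailing comments line up so the eye can run down them.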
It's up to you to decide how terse any given chunk of code should be. However, bear in mind that in terms of readability, it's better to be less terse unless there's a compelling reason to do otherwise.
If you do have a compelling reason to be very terse you can mitigate the negative effect on readability with good, clear, concise comments.
In general, I prefer to use a less terse style unless by doing so I incur some performance cost that I cannot afford.
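As a rough illustration of that trade-off, the shell sketch below does the same job twice: it counts the accounts in a passwd-style file whose login shell matches a given path. The function names users_with_shell_terse and users_with_shell are hypothetical, and the sketch assumes the usual seven colon-separated fields with a newline at the end of every line.

```shell
#!/bin/sh

# Terse: a single awk call. Compact, and likely quicker on a very
# large file, but the reader has to unpack it.
users_with_shell_terse() { awk -F: -v s="$2" '$7==s{n++}END{print n+0}' "$1"; }

#
# Less terse: the same count, spelled out one step at a time.
# Usage: users_with_shell FILE SHELL
#
users_with_shell() {
    passwd_file=$1
    wanted_shell=$2
    count=0

    # Read one colon-separated record at a time; the login shell
    # is the seventh field.
    while IFS=: read -r name pw uid gid gecos home shell; do
        if [ "$shell" = "$wanted_shell" ]; then
            count=$((count + 1))
        fi
    done < "$passwd_file"

    echo "$count"
}
```

If the terse form has to stay, perhaps for speed, a comment like the one above it goes a long way toward making up the difference.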
If you would like to learn more about stylistic principles in programming, many books about programming devote at least a short section to them. For Perl programmers, _Programming Perl, 2nd Edition_ by Wall, Christiansen, and Schwartz from O'Reilly & Associates covers style issues in Chapter 8, "Other Oddments."