Archive for July, 2008

Great post on DSL

I was recently browsing the Geek Smithology site and reread one of his older posts about Domain Specific Adventures. Go check it out if you haven’t read it already. It’s a great read.

Single Exit Philosophy

Some developers prefer to follow the philosophy that any given function should have a single exit point. While this is not as true as the adage “GOTOs are EVIL” that it is derived from it still has some benefits.

The single biggest reason to follow this philosophy according to some of its followers is that It makes debugging easier. It is possible that the developer will miss a return statement in a multiple exit function that can let some bugs crop through. This does not happen with a single exit function because there is only one return, and it is at the bottom of the function most of the time. I will let this slide for now and cover it in a future post.

The benefit that intrigues me most about a single exit function is that it can avoid missing clean up calls. This is language specific because some languages have mechanisms to handle this if used properly (Java with try/finally, and C++ with destructors, etc). I have seen (and maybe even created) a few bugs crop into code during maintenance because of an improperly placed return.

For an example, consider a piece of code that opens a resource, processes items, and closes the resource. The following code snippet does that properly.
r = new Resource()
data = processItems(r.items)
r.close()
return data

Down the road the user notices that processing the items with pie in it always returns empty. As well, it takes a long time to process the items. A quick optimization might be to return empty as soon as we discover that pie is in our results.
r = new Resource()
return empty If (r.items.contains(‘pie’))
data = processItems(r.items)
r.close()
return data

Unfortunately this fix introduces a bug. Any time pie is found it returns and doesn’t clean up after itself. If we had a single return statement then the chances of not making this bug is higher.
r = new Resource()
data = !r.items.contains(‘pie’) ? processItems(r.items) : empty
r.close()
return data

Since we had to get the results of the data to the bottom of the method the cleanup was not missed and a bug was not introduced.

I am not advocating that people follow the single exit philosophy, but it is always good to get to know both ends of the argument. Stay tuned for a follow up article that will list a few of the positive points of multiple exit functions.

Bad Code – Arrays for multiple returns

Java does not allow you to return multiple objects in a single function. To get around this, some people decide to collect all the data into an array and return that. I have to admit that I hate this pattern.

public static final int INDEX_NAME = 0;
public static final int INDEX_JOB = 1;
public Object[] getDetails() {
Object[] o = new Object[2];
o[INDEX_NAME] = name;
o[INDEX_JOB] = job;
return o;
}

The above pattern has one major problem. It ties you down to specific indexes for your data since you cannot guarantee that the invoker of this function will use constants provided by your class. This may not be an issue if it is private and never used anywhere outside of the class, but I like seeing my internal interfaces adhere to the same standards as the external ones.

However, this is not why I hate it. I find that it is ugly. It is readable with the proper use of constants, but it will require a casting operation, and data[INDEX_NAME] is just not as nice as data.getName();

Getting around these issues is as simple as defining a data class. I know it looks like a no brainer, but most great ideas do.
public class Info {
private String name;
private String job;
/** getters and setters */
}

public Info getDetails() {
Info I = new Info;
i.name = name;
i.job = job;
return i;
}

Not only is this more readable, but you are more free to modify the info class in the future if desired.

Stupid Trick #2 – Uniqueness with an associative array

Many modern languages come with an associative array. Java calls it a Map, C# calls it a Dictionary, Perl calls it a Hash, etc. They enable a number of algorithm simplifications in addition to providing a flexible data structure. The following Perl script demonstrates extracting the unique words in a sentence.

#!/usr/bin/perl
use strict;

my $sentence = "say hello world and the world will say hello to you";
my @words = split /\s/, $sentence;
my %hash;

foreach (@words) {
$hash{$_} = 1;
}
@words = keys %hash;

print "Before: $sentence\n";
print "After: ", join(' ', @words), "\n";

Runs with the following output:
Before: say hello world and the world will say hello to you
After: the to you will hello world say and

The thing I like about this algorithm is that is very simple and easy to follow. It can also be reimplemented very similarly in both C# and Java.

My next example solves the problem without the use of hashes. Thanks to Perl, it is almost as concise, and I almost prefer it when using Perl.

#!/usr/bin/perl
use strict;

my $sentence = "say hello world and the world will say hello to you";
my @words = split /\s/, $sentence;

my @uniqueWords;

foreach my $word (@words) {
my $wordFound = 0;
push @uniqueWords, $word unless grep (/$word/, @uniqueWords);
}

print "Before: $sentence\n";
print "After: ", join(' ', @uniqueWords), "\n";

Runs with the following output:
Before: say hello world and the world will say hello to you
After: say hello world and the will to you

Unfortunately, most languages don’t have a grep function. This makes accomplishing this task more verbose since it requires an additional foreach loop making it less readable.

Bad Code – Generics

Generics are really easy to use badly. There are very few places in which they are a good idea, and most of them are implemented for us in the standard library. I ran across one of these usages recently that I cannot resist commenting on. (I’ll only be covering the generics in this post, and not nitpicking over the silliness of this methods existence. e.g. why is there a wrapper for ds to begin with?)

public static T getTypedValue(IDataStore ds, String key, Class clazz) {
if (clazz.equals(String.class)) {
return (T) ds.getString(key);
} else if (clazz.equals(Integer.class) {
return (T) ds.getInteger(key));
} else if (...) {
/* more casting */
}
return (T) ds.getObject(key);
}

// usage:
String value = getTypedValue(ds, "name", String.class);

There is no added type safety in the above code. The generics used here moves the casting from elsewhere in the code base to this method. This can definitely be improved upon.

Lets take a look on how this method would have been created without generics.

public static Object getTypedValue(IDataStore ds, String key, Class clazz) {
if (clazz.equals(String.class)) {
return ds.getString(key);
} else if (clazz.equals(Integer.class) {
return ds.getInteger(key));
} else if (...) {
/* more casting */
}
return ds.getObject(key);
}

// usage:
String value = (String) getTypedValue(ds, "name", String.class);

The method looks cleaner. There isn’t any generic syntax to add clutter, and no casting needed since it returns an object. The casting is done outside the method. I think it is nicer, but not nice enough.

The reason why the generics were introduced to this method is because they were trying to remove casting. This can be done by breaking this method down to smaller ones. Each with its own different return type.

public static String getStringValue(IDataStore ds, String key) {
return ds.getString(key);
}

public static Integer getIntegerValue(IDataStore ds, String key) {
return ds.getInteger(key);
}

// usage:
String value = getStringValue(ds, key);

I do beleive that this is the best way to go. The new methods are trivial and easy to read. There is no generic clutter. Plus there is no casting at all. It is a win on all fronts.