Sunday, November 28, 2010

HashMap of a HashMap of a HashMap Problem -- Part 2 - Adding Complexity

In the previous part we started discussing how even basic uses of collections depend on careful meta-data.  Our example was simple, but now we are going to add some complexity to it.  This complexity will begin to lead us down the path to the problem.
Let us modify our example.  In all likelihood, a person will have more than one nickname, and we want to be able to retrieve all of their nicknames from their name.  Therefore, we change our property to the following:
/// <summary>
/// Gets a map of a user's name (the key) to their nicknames
/// (the value) so we can look up their nicknames quickly
/// </summary>
public Dictionary<string, List<string>> NameToNicknamesMap { get; }

This still is not horrible, but it is more complex.  We now have a nested collection, which means every read or write has to go through two levels of access.  This is not a big deal; it just makes our code less clean.  We can still understand it easily if the property is well documented.
Let’s step back for a second and look at this structure.  Why does it exist?  It exists because we want to connect a name to its nicknames.  Doesn’t that sound an awful lot like what a class is for?  Let’s write a class that provides the same connection:
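To make the extra access code concrete, here is a minimal sketch of reading from and writing to the nested structure. The `NicknameDirectory` class and its `AddNickname` and `GetNicknames` helpers are hypothetical names used only for illustration; only `NameToNicknamesMap` comes from the example above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical container class, for illustration only.
public class NicknameDirectory
{
    public Dictionary<string, List<string>> NameToNicknamesMap { get; } =
        new Dictionary<string, List<string>>();

    // Adding a nickname takes two steps: find or create the inner list,
    // then add to it.
    public void AddNickname(string name, string nickname)
    {
        if (!NameToNicknamesMap.TryGetValue(name, out List<string> nicknames))
        {
            nicknames = new List<string>();
            NameToNicknamesMap[name] = nicknames;
        }
        nicknames.Add(nickname);
    }

    // Reading also has to handle a missing outer key before it can
    // return the inner list.
    public IReadOnlyList<string> GetNicknames(string name)
    {
        return NameToNicknamesMap.TryGetValue(name, out List<string> nicknames)
            ? nicknames
            : new List<string>();
    }
}
```

Every caller that touches the property directly has to repeat the same find-or-create dance, which is where the loss of cleanliness comes from.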
public class Person
{
    public string Name { get; }
    public List<string> Nicknames { get; }
}

We would then also change our property to:
public List<Person> People { get; }

There is one thing missing in this implementation: we’ve lost the ability to quickly look up the nicknames by name.  We will ignore this for now and revisit it later, once we have developed our structure further.
Our class has created a strong relationship between name and nicknames.  The relationship is explicit, and there is no reliance on meta-data to understand it.  With the old structure, if we also wanted to map name to another value, such as birthday, we’d have to create another property like this:
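As a rough illustration of what was lost, finding nicknames in the list now means scanning it. This is only a sketch: the `Person` constructor, the `PersonDirectory` class, and the `FindNicknames` helper are assumptions added for the example, not part of the original code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Person
{
    // The constructor is an assumption; the post only shows the properties.
    public Person(string name, IEnumerable<string> nicknames)
    {
        Name = name;
        Nicknames = new List<string>(nicknames);
    }

    public string Name { get; }
    public List<string> Nicknames { get; }
}

// Hypothetical container class, for illustration only.
public class PersonDirectory
{
    public List<Person> People { get; } = new List<Person>();

    // Walks the list from the start, so each lookup is O(n), versus the
    // average O(1) key lookup the dictionary gave us.
    public List<string> FindNicknames(string name)
    {
        Person match = People.FirstOrDefault(p => p.Name == name);
        return match != null ? match.Nicknames : new List<string>();
    }
}
```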
/// <summary>
/// Gets a map of a user's name (the key) to their birthday
/// (the value) so we can look up their birthday quickly
/// </summary>
public Dictionary<string, DateTime> NameToBirthdayMap { get; }

In creating this property, we have to provide the same detail in the meta-data and naming to keep straight what we are mapping.  To create the same relationship with our class, we just add a new property for the birthday.  Our Person class now looks like this:
public class Person
{
    public string Name { get; }
    public List<string> Nicknames { get; }
    public DateTime Birthday { get; }
}

We now have strong links between the name, nicknames, and birthday.  This is simple object-oriented encapsulation and nothing revolutionary.  We are also using an example that leads us naturally to think of an object: a person is a type we naturally create when we want to represent people in a system.  We know early in design that we want to encapsulate data in a person object because a person has many characteristics that belong together.
We are using a person only as a concrete example.  The situation we are discussing is one where there is originally a single relationship and, over time, similar relationships are added.  If the original relationship is implemented with a map, future relationships will most likely be added as maps as well.  This is natural because programmers tend to follow the pattern of existing code.  Unless someone takes a step back at some point, you could easily wind up with 5, 10, or 20 maps from name to some other value.  The pattern feeds on itself: the more pervasive it becomes, the less incentive there is to change it.
This probably won’t happen with person data, which can clearly be thought of in object terms.  It can easily happen with abstract concepts that are hard to translate into a real-world object, because people probably aren’t thinking of the relationships between values as object-defining relationships.
Let’s circle back to my previous point about circumventing the object-oriented nature of the language.  When you use collections to connect name to nicknames or name to birthday, you are creating a weak relationship.  The relationship exists only as long as that Dictionary or HashMap exists.  You haven’t broken the type system, and syntactically you are still correct, but you aren’t following good object-oriented design practices.  When you create a class that encapsulates the data, you create strong relationships that are self-evident to someone reading and using your code.
Again, this is not the full problem, but another stepping stone toward seeing the problem and the remedy for it.  Handling data with collections alone can still be manageable if there are only a few relationships.  It is also easy to refactor once we recognize that we should encapsulate these relationships together.  Next time we’ll start to see how writing code in this manner quickly becomes a problem as we add more levels.
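One way to picture the difference is the sketch below. With two separate maps, nothing stops one of them from having an entry the other lacks, so the link between a person's nicknames and birthday has to be checked by hand. The `RelationshipDemo` class, the `HasCompleteRecord` helper, and the `Person` constructor are hypothetical and exist only for this illustration.

```csharp
using System;
using System.Collections.Generic;

public class Person
{
    // The constructor is an assumption; requiring all three values means
    // a Person cannot exist with one of its relationships missing.
    public Person(string name, IEnumerable<string> nicknames, DateTime birthday)
    {
        Name = name;
        Nicknames = new List<string>(nicknames);
        Birthday = birthday;
    }

    public string Name { get; }
    public List<string> Nicknames { get; }
    public DateTime Birthday { get; }
}

// Hypothetical helper class, for illustration only.
public static class RelationshipDemo
{
    // With parallel maps, the only thing tying nicknames to a birthday is
    // that both maps happen to use the same key. The compiler cannot
    // check that, so callers have to.
    public static bool HasCompleteRecord(
        Dictionary<string, List<string>> nameToNicknamesMap,
        Dictionary<string, DateTime> nameToBirthdayMap,
        string name)
    {
        return nameToNicknamesMap.ContainsKey(name)
            && nameToBirthdayMap.ContainsKey(name);
    }
}
```

If a name is added to the nicknames map but the birthday map is forgotten, the code still compiles and `HasCompleteRecord` returns false at run time; with the class, the omission is caught when the object is constructed.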
