In this article by Sudipta Mukherjee, author of .NET Generics 4.0 Beginner's Guide, we will take a look at:
- Generic container patterns: There are several patterns that are used more than the others in code bases that use Generics. Here, we shall walk through some of these very popular generic structures.
- Best practices: Here we shall walk through a list of best practices with succinct causes to back them.
Generic container patterns
There are several generic containers such as List, Dictionary, and so on. Now, let's take a look at some of the patterns involving these generic containers that show up more often in code.
How these are organized
Each pattern discussed in this article has a few sections. First is the title. This is written against the pattern sequence number. For example, the title for Pattern 1 is One-to-one mapping. The Pattern interface section denotes the interface implementation of the pattern. So anything that conforms to that interface is a concrete implementation of that pattern. For example, Dictionary is a concrete implementation of IDictionary. The Example usages section shows some implementations where TKey and TValue are replaced with real data types such as string or int. The last section, as the name suggests, showcases some ideas where this pattern can be used.
Pattern 1: One-to-one mapping
One-to-one mapping maps one element to another.
Pattern interface
The following is an interface implementation of this pattern:
IDictionary
Some concrete implementations
Some concrete implementations of this pattern are as follows:
Example usages
The following are examples where TKey and TValue are replaced with real data types such as string or int:
- Dictionary
- SortedDictionary
- SortedList
- Dictionary
Some situations where this pattern can be used
One-to-one mapping can be used in the following situations:
- Mapping some class objects with a string ID
- Converting an enum to a string
- General conversion between types
- Find and replace algorithms where the find and replace strings become key and value pairs
- Implementing a state machine where each state has a description, which becomes the key, and the concrete implementation of the IState interface becomes the value of a structure such as Dictionary
Pattern 2: One-to-many unique value mapping
One-to-many unique value mapping maps one element to a set of unique values.
Pattern interface
The following is an interface implementation of this pattern:
IDictionary>
Some concrete implementations
Some concrete implementations of this pattern are as follows:
- Dictionary>
- SortedDictionary>
- SortedList>
- Dictionary>
Example usages
The following are examples where TKey and TValue are replaced with real data types such as string or int:
Some situations where this pattern can be used
One-to-many unique value mapping can be used in the following situations:
- Mapping all the anagrams of a given word
- Creating spell check where all spelling mistakes can be pre-calculated and stored as unique values
Pattern 3: One-to-many value mapping
One-to-many value mapping maps an element to a list of values. This might contain duplicates.
Pattern interface
The following are the interface implementations of this pattern:
Some concrete implementations
Some concrete implementations of this pattern are as follows:
- Dictionary>
- SortedDictionary>
- SortedList>
- Dictionary>
Example usages
The following are examples where TKey and TValue are replaced with real data types such as string or int:
- Dictionary>
- SortedDictionary>
- SortedList>
- Dictionary>
Some situations where this pattern can be used
One-to-many value mapping can be used in the following situations:
- Mapping all the grades obtained by a student. The ID of the student can be the key and the grades obtained in each subject (which may be duplicate) can be stored as the values in a list.
- Tracking all the followers of a Twitter account. The user ID for the account will be the key and all follower IDs can be stored as values in a list.
- Scheduling all the appointments for a patient whose user ID will serve as the key.
Pattern 4: Many-to-many mapping
Many-to-many mapping maps many elements of a group to many elements in other groups. Both can have duplicate entries.
Pattern interface
The following are the interface implementations of this pattern:
Some concrete implementations
A concrete implementation of this pattern is as follows:
IList>>
Example usages
The following are examples where TKey and TValue are replaced with real data types such as string or int:
Some situations where this pattern can be used
Many-to-many mapping can be used in the following situations:
- If many independent values can be mapped to a set of values, then these patterns should be used. ISet implementations don't allow duplicates while ICollection implementations, such as IList, do.
- Imagine a company wants to give a pay hike to its employees based on certain conditions. In this situation, the parameters for conditions can be the independent variable of the Tuples, and IDs of employees eligible for the hike can be stored in an ISet implementation.
For concurrency support, replace non-concurrent implementations with their concurrent cousins. For example, replace Dictionary with ConcurrentDictionary.
A special Tuple pattern
Since Tuple is a new inclusion in the .NET 4.0 Framework and can prove to be handy in a lot of situations, I thought it would be a great place to showcase an unusual use of this beautiful data representation technique.
Let's say we have situations involving four integers. This is generally represented using nested if-else blocks, as shown in the following code. However, we can use a Tuple to refactor this code, as we shall see in a while:
- Create a console app and add these four variables:
static int x = 11;
static int y = 9;
static int z = 20;
static int w = 30;
- Now, add these nested if-else blocks in the Main() method:
if (x > 9)
{
if (y > x)
{
if (z > y)
{
if (w > 10)
{
Console.WriteLine(" y > x and z > y and w > 10 ");
}
else
{
Console.WriteLine(" y > x and z > y and w }
}
else
{
if (w > 10)
{
Console.WriteLine(" y > x and z 10 ");
}
else
{
Console.WriteLine(" y > x and z }
}
}
else
{
if (z > y)
{
if (w > 10)
{
Console.WriteLine(" y y and w > 10 ");
}
else
{
Console.WriteLine(" y y and w }
}
else
{
if (w > 10)
{
Console.WriteLine(" y 10 ");
}
else
{
Console.WriteLine(" y }
}
}
}
- If you run this, you shall see the following output:
y y and w > 10
- Although this code performs the job, it has the following problems:
- It is very long. Most programmers will lose track of what they were reading.
- It is impossible to re-use any part of the code that is available under a specific branching.
- It is very long. Most programmers will lose track of what they were reading.
Time for action - refactoring deeply nested if-else blocks
Follow the given steps:
- We must acknowledge that by using a maximum of four variables there can be 16 different branchings. Of these only eight are shown here. This type of code can be remodeled using the following structure. Now, delete everything in the Main() method and copy the following code:
//Creating rules list
List>>
rules =
new Liststring>>>();
//adding rules.The benefit is that entire nested if-else is
//flatend.
rules.Add(new Tupleint, string>>
(x > 9, y > x, z > y, w > 10, DoThis));
rules.Add(new Tupleint, string>>
(x > 9, y > x, z > y, w rules.Add(new Tupleint, string>>
(x > 9, y > x, z 10, DoThis));
rules.Add(new Tupleint, string>>
(x > 9, y > x, z rules.Add(new Tupleint, string>>
(x > 9, y y, w > 10, DoThis));
rules.Add(new Tupleint, string>>
(x > 9, y y, w rules.Add(new Tupleint, string>>
(x > 9, y 10, DoThis));
rules.Add(new Tupleint, string>>
(x > 9, y
Console.WriteLine ("Printing using tuple");
//Finding the first rule that matches these conditions.
Tuple>
matchingRule
= rules.First
(
rule =>
rule.Item1 == GetExpression(x, 9)
//represents x > 9
&& rule.Item2 == GetExpression(y, x)
//represents y > x
&& rule.Item3 == GetExpression(z, y)
//represents z > y
&& rule.Item4 == GetExpression(w, 10)
//represents w > 10
);
//Find the Matching function.
Func function = matchingRule.Item5;
//Invoke the function.
Console.WriteLine(function.Invoke(x,y,z,w));
- This will not compile yet because the DoThis() and GetExpression() methods are not defined. So add these methods as follows:
private static string DoThis(int x, int y, int z, int w)
{
string partOne = y > x ? " y > x " : " y string partTwo = z > y ? " z > y " : " z string partThree = w > 10 ? " w > 10 " : " w return partOne + "and" + partTwo + "and" + partThree;
}
private static bool GetExpression(int x, int y)
{
return x > y;
}
What just happened?
Now if you run the program, you shall see the same output as before:
y y and w > 10
So you saw how the if-else blocks were removed while the rules are still intact. Moreover, the code is more readable now.
Let's see how this worked. Notice carefully that the last parameter of the Tuples used in the rules is Func. This means, we can place any function that takes four integer parameters and returns a string. The DoThis() method matches this requirement and so we place it in the list:
Tuple>
The previous list represents a relationship between four Boolean expressions and a function. These Boolean expressions are independent and the associated function is found using the following concern:
Func function = matchingRule.Item5;
Invoke() is a method to call the function.
Best practices when using Generics
The following are the best practices to be followed while using Generics:
- Don't use Generics when you know the situation won't always be generic.
- Use Stack for a Last In First Out (LIFO) list implementation.
- Use Queue for a First In First Out (FIFO) implementation.
- Use List for random access lists with zero-based indexing.
- Use LinkedList to implement a deque because it offers faster insertion and deletion at both ends as opposed to other collections.
- If you have to frequently insert random locations in the list, prefer LinkedList over List because LinkedList is a constant time operation to insert one item in between two elements in a linked list.
- If the random list does not have a duplicate, use HashSet instead. It is the fastest.
- Don't use a for loop over a IDictionary implementation. It can give incorrect results. Use a foreach loop over the keys collection instead.
- Avoid using ElementAt() or, its cousin, the ElementAtOrDefault() method on generic collections that natively don't support zero-based integer indexing, (for example, Stack or Dictionary); since dictionary elements are not guaranteed to be available on an index where they were added.
- Don't use a Tuple with more than seven parameters. Create a class with those parameters and then create a list of that class' objects. Using a Tuple with more than four parameters makes the code look messy and it's not efficient.
- Use SortedDictionary just to get the entries sorted. Don't use it to store simple associative "one-to-one" and "one-to-many" relationships as keeping the keys sorted isn't absolutely necessary. It is very slow compared to a normal dictionary.
- Use HashSet to create a generic set. It's the fastest set implementation available in the .NET Framework.
- Use SortedSet only to create a sorted generic set. Don't use it when you don't want the set to be sorted because it is slower than HashSet.
- Don't expect indexing to work on an array or a list as it would on sets or dictionaries, because sets and dictionaries use hashing or tree-based internal data structures to place elements in the collection.
- Use the available bare minimum implementation for your customized need or resort to creating your own custom collection class from the interfaces. This is a design guideline. Try to make sure your code is as lightweight as possible. If you want a queue, use a Queue instead of a List.
- Use LinkedList when you need to insert elements at arbitrary positions but you don't need to access them randomly via integer indexing.
- Use KeyValuePair to store a key value pair. Avoid using Tuple> for this purpose because KeyValuePair is more efficient.
- Prefer Dictionary> over List> whenever possible for representing a one-to- many relationship between two entities. Lookup speed will be much faster and client code will be less clumsy.
- Prefer List> to represent a many-to-one relationship between two entities over Dictionary,TValue> if there is a duplicate.
- If no more entries need to be added to a collection, call TrimExcess() to free up the extra memory.
- Prefer LINQ Standard Query Operator Where() over Contains(), Exists(), or Find() methods to check whether a value is present in a List instance.
- If you implement IEnumerable in your custom collection, don't forget to implement IEnumerable also. This is to ensure backward compatibility and the Liskov substitution principle.
- Use SortedList if you want a concrete implementation of IDictionary that supports native indexing. However, it is slower than Dictionary. Avoid SortedList when you don't want indexing.
- Don't use the ElementAt() or the ElementAtOrDefault() method on IDictionary implementations except SortedList, because dictionary elements are not guaranteed to be in the index where they are added.
- Avoid conversions between data structure formats (for example, array to list, or vice versa using LINQ operators) as far as possible. Use other language constructs. For example, consider using a foreach loop than converting an IEnumerable instance to a list using the ToList() method before performing some operations on each item in the list.
- Use the OrderBy() or the OrderByDescending() method to sort any IEnumerable.
Selecting a generic collection
Not all generic containers are geared to do each job well, as we already know it. Here is a table that will let you pick a generic container that supports some features listed in the lefthand column.
You can download the GenGuru app from the support website. It's a kind of wizard that will recommend the generic containers available in .NET Generics that you should use. It will ask you few questions and recommend a collection. You can see it in action at http:// sudipta.posterous.com/private/niixaeFlmk.
Best practices when creating custom generic collections
The following are the best practices while creating custom generic collections:
- Try to use built-in generic classes for maintaining internal data structures.
- Try to use interfaces as return type and method arguments for publicly exposed methods for your collection. For example, consider using IEnumerable over List.
- Try to keep the names aligned as much as possible towards the names available in frameworks. For example, if you are creating a concrete implementation of IDictionary then try to use the suffix Dictionary in the name of the class, such as MultiDictionary.
- Try to provide overloaded constructors to offer flexibility such that the class can be created from many diverse data sources.
- Use Generics constraints judiciously. Remember that these constraints are like boomerangs. You should know what exactly you can allow as an input parameter for a constrained generic type. Otherwise, these can backfire and hurt you and the users of your collection.
- Make sure to always implement the IEnumerable, IEnumerable, ICollection, IDisposable, and IClonable interfaces.
- Make sure that your collection confirms to the Liskov substitution principle. Divide into subclasses only when it makes sense, otherwise use composition.
- Offer a constructor overload to create a thread-safe instance of your collection.
- Make sure it supports LINQ, which will be obvious if you implement IEnumearble properly. LINQ is changing the way we see and solve problems. So if you miss LINQ, your collection will probably fail to impress a large section of the target audience.
- Thrive for performance. Make sure that performance is as best as it could be.
- Make sure your collection is as lightweight as possible. If a Queue can do what you want done internally in your collection, use it. Don't consider using a more versatile list implementation such as List.
- Don't write raw events by yourself to monitor the collection; rather expose ObservableCollection to offer this functionality.
- Don't provide functionalities that can be achieved by a simple combination of exposed functions. For example, take a look at the Date Time API from the .NET Framework. There is a method called AddDays() that can take a positive or negative integer such that you can go to a past or a future date. However, the framework doesn't provide a method or a public property to compute Tomorrow() or Yesterday() as they can be easily calculated using the AddDays() method. While exposing public functions for your generic collection, remember to expose only those methods that can be used as building blocks for several reasons.
- However, you have to strike a balance. The framework also has the AddYears() method because it will be cumbersome, although technically achievable, to duplicate AddYears() using the AddDays() method.
Remember these points while coming up with your own generic collection.
Summary
In this article, we took a look at generic container patterns and best practices for Generics. - Create a console app and add these four variables:
- If many independent values can be mapped to a set of values, then these patterns should be used. ISet implementations don't allow duplicates while ICollection implementations, such as IList, do.
- Mapping all the grades obtained by a student. The ID of the student can be the key and the grades obtained in each subject (which may be duplicate) can be stored as the values in a list.
- Dictionary>
- Dictionary>
- Mapping all the anagrams of a given word
- Dictionary>
- Mapping some class objects with a string ID
- Dictionary