Friday, January 28, 2011

The Power of Covariance and Contravariance in Generics

While reading forums and other discussions, I often notice that developers find it difficult to see practical applications of covariance and contravariance in generics. This is due in part to our Greco-Roman approach to learning. In this approach you learn as much as you can, and when you face a real-life problem you dig through your memory archives in a futile attempt to find the pieces of information or theoretical concepts that you learned 10-20 years ago. This approach has not served me well over the years. I had 5 years of advanced math in college, but most of what I remember to this day is the names of the chapters and not much of the content. I recently needed to use Fourier series and could not remember the formula. Thank God for giving wonderful ideas to the people who invented search engines. What would we be doing these days without search engines? I don’t know.

Therefore I am a firm believer in a different approach, often seen in some Middle Eastern and Asian cultures: face the problem first, try to solve it, and learn as you go. I have found that knowledge retention with this approach is much greater. In addition, you see the practical application of a concept right there. Seeing this becomes even easier when your salary depends on solving such problems :). This was the case with covariance and contravariance for me. I vaguely remembered the concept from computer science lectures; while the concept is simple, it is easy to forget without practice. So when I faced a real problem, I did not initially associate it with the concept I had learned earlier, but I went down the discovery route and found the answer!

The “Aha” moment filled my brain :) “this is something I learned before” I thought to myself. So I had to re-introduce myself to the concept.

To cover some of these learning gaps I decided to write this short article starting it with a discussion about a problem!

Let’s say there is some hierarchy of classes in a program and there is class Child which inherits from class Parent. Now there is a need to implement a method which will take a collection of objects of type Parent and do something very generic over these objects. So the method signature might look something like this:

public void DoSomething(List<Parent> collection)
{
    foreach (Parent p in collection)
    {
        p.SomeMethod();
    }
}

Pretty simple, no magic here. The problem comes when you want to pass a collection of Child objects, which also have SomeMethod(). The compiler knows that Child is a Parent and has SomeMethod, so logically there should be no problem, but in reality passing a List<Child> to this method does not compile. When you have something like the following, it works without problems:
// "object" is a reserved word in C#, so the parameter is named "parent".
public void DoSomething(Parent parent)
{
    parent.SomeMethod();
}
Wait a minute, isn’t that simple polymorphism? Yes, it is. So when does passing a List<Child> not work? In every version of C# before 4.0. Why? Because generic types did not support variance at all; in this case the missing piece is covariance. (Strictly speaking, List<T> itself stays invariant even in C# 4.0 and later; it is interfaces such as IEnumerable<T> that became covariant.) While Parent and Child are polymorphic through inheritance, List<Parent> and List<Child> are not “polymorphic” :). Let me get back to this a little later, but for now let me give a real-world example.
Microsoft has implemented a great data binding mechanism in WPF. Suppose you have a common interface which supports binding to List<Parent>, and there is a specific data template which exposes some of the common properties of objects of type Parent. When you build collections of classes inherited from Parent and pass them through the WPF data binding mechanism, the data template will not be applied to the list; instead the implementation of ToString() will be used. So instead of seeing neatly formatted objects you will get a list of type names:
MyNamespace.Child
MyNamespace.Child
MyNamespace.Child
etc.
Here is an example of what you may see when the DataTemplate is not properly applied: [image]
There are some other scenarios in which such behavior may be observed, but that is a separate discussion.
In this case the default implementation of ToString(), unless it is explicitly overridden in the Child class, simply returns GetType().ToString().
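To see the default ToString() behavior in isolation, here is a small sketch. The empty Parent and Child classes stand in for the real ones, and NamedChild is a made-up class added only to show the effect of an override:

```csharp
using System;

namespace MyNamespace
{
    public class Parent { }

    // No ToString() override here, so Object.ToString() is used.
    public class Child : Parent { }

    // Hypothetical variant that overrides ToString().
    public class NamedChild : Parent
    {
        public override string ToString() { return "A child with a name"; }
    }

    public static class Program
    {
        public static void Main()
        {
            Console.WriteLine(new Child());      // MyNamespace.Child
            Console.WriteLine(new NamedChild()); // A child with a name
        }
    }
}
```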
So what most people end up doing is using LINQ with a lambda function to quickly convert objects from type Child to type Parent, like so:
List<Child> list = new List<Child>(); 
// populate list
...
// call DoSomething which accepts List<Parent>
// Select returns IEnumerable<Parent>, so ToList() is needed to get a List<Parent>
DoSomething(list.Select(t => t as Parent).ToList());
Not a big deal in a single case, but on large projects you might end up writing a lot of converters or manually converting collections as in the example above. This conversion also builds a new collection on every call, which may negatively affect performance, although I have not tested it. Calling converters from the WPF data binding pipeline generally reduces performance; avoiding them is one of the best-practice recommendations for Windows Phone 7 development.
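One way to avoid the copy even before C# 4.0 is to make the method generic and constrain its type parameter. The following is only a sketch under assumptions: the body of SomeMethod, the Calls counter, the Demo class and the name DoSomethingGeneric are all made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Parent
{
    // Hypothetical counter so the effect of the calls is observable.
    public static int Calls;
    public virtual void SomeMethod() { Calls++; }
}

public class Child : Parent { }

public static class Demo
{
    // Original shape: accepts only List<Parent>.
    public static void DoSomething(List<Parent> collection)
    {
        foreach (Parent p in collection)
        {
            p.SomeMethod();
        }
    }

    // Generic alternative: accepts List<Parent>, List<Child>, and so on.
    public static void DoSomethingGeneric<T>(List<T> collection) where T : Parent
    {
        foreach (T item in collection)
        {
            item.SomeMethod();
        }
    }

    public static void Main()
    {
        List<Child> list = new List<Child> { new Child(), new Child() };

        // Workaround 1: copy into a new List<Parent> (allocates a second list).
        DoSomething(list.Cast<Parent>().ToList());

        // Workaround 2: no copy, the compiler infers T = Child.
        DoSomethingGeneric(list);

        Console.WriteLine(Parent.Calls); // 4
    }
}
```

The generic version keeps the caller's list as it is, at the cost of a slightly noisier signature.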
Now let’s get back to the earlier example, where DoSomething takes a single Parent, and explain in detail what happens when an instance of Child is passed as the parameter. This is a very common practice among developers and easy to understand, so let’s now define this process.
In theoretical Computer Science the rules for this kind of substitution are described in terms of covariance and contravariance. Here is a definition from Wikipedia (http://en.wikipedia.org/wiki/Covariance_and_contravariance_(computer_science) ):

“Within the type system of a programming language, covariance and contravariance refers to the ordering of types from narrower to wider and their interchangeability or equivalence in certain situations (such as parameters, generics, and return types).

  • covariant: converting from wider (double) to narrower (float).
  • contravariant: converting from narrower (float) to wider (double).
  • invariant: Not able to convert.”
In our example we were implicitly converting from the narrower Child type to the wider Parent type. This is ordinary polymorphism, and it has been supported for a long time in many programming languages.
Let’s take another look at our very first example. It is not simple polymorphism, but it is a good case for variance. In C# terms it is covariance: a generic type built over Child is used where the same generic type built over Parent is expected. Contravariance is the reverse direction, for example using an Action<Parent> where an Action<Child> is expected. Up until C# 4.0, a List<Child> could not be implicitly converted to any generic collection of Parent, even though Child itself converts implicitly to Parent. That is why an explicit conversion from the narrower type to the wider type was needed.
C# 4.0 added support for covariance and contravariance in generics, so such collections can now be converted implicitly, much as in simple polymorphism. There are some limitations, however. Variance applies only to generic interfaces and delegates, and only with reference type arguments, so List<T> itself stays invariant and DoSomething has to accept IEnumerable<Parent> rather than List<Parent>. For more details please refer to the following MSDN article (http://msdn.microsoft.com/en-us/library/dd799517.aspx).
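Here is a minimal sketch of what this looks like in C# 4.0 and later. The Parent and Child bodies and the VarianceDemo class are assumptions made for illustration; IEnumerable<out T> and Action<in T> are the standard covariant and contravariant types from the framework.

```csharp
using System;
using System.Collections.Generic;

public class Parent
{
    public virtual string SomeMethod() { return "Parent"; }
}

public class Child : Parent
{
    public override string SomeMethod() { return "Child"; }
}

public static class VarianceDemo
{
    // IEnumerable<out T> is covariant, so this accepts a List<Child> directly.
    public static int DoSomething(IEnumerable<Parent> collection)
    {
        int count = 0;
        foreach (Parent p in collection)
        {
            p.SomeMethod();
            count++;
        }
        return count;
    }

    public static void Main()
    {
        List<Child> children = new List<Child> { new Child(), new Child() };

        // Covariance: List<Child> is an IEnumerable<Child>, which converts
        // implicitly to IEnumerable<Parent>. No copy is made.
        Console.WriteLine(DoSomething(children)); // 2

        // Contravariance: Action<in T> lets a handler written for Parent
        // stand in for a handler of Child.
        Action<Parent> handleParent = p => Console.WriteLine(p.SomeMethod());
        Action<Child> handleChild = handleParent;
        handleChild(new Child()); // Child

        // Still invariant: the next line does not compile.
        // List<Parent> parents = children;
    }
}
```

List<T> has to stay invariant because it both returns and accepts T: if a List<Child> could be treated as a List<Parent>, someone could add a plain Parent to it.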
Hope this helps. Please let me know if I introduced any misconceptions :) and whether this is another ISolvable<T> problem.

Happy coding!