Friday, December 12, 2008

Understanding Lambda Expressions

C# 3.0 introduced a new feature called lambda expressions. Using this syntax, you can shorten your code immensely. However, if you've never seen it before, it can be extremely confusing. I'm going to do my best in this post to explain how they work, but first, it's important to lay the groundwork. In that spirit, let's have a history lesson...

C# 1.0: Delegates and Methods

Delegates are essentially placeholders for methods. A delegate type defines parameter types and a return type, which together form a specific method signature. Delegates can be constructed, and in the constructor you pass in a reference to a method whose signature matches the signature of the delegate. For instance, take the following delegate:

public delegate int GetLengthDelegate(string value);


This delegate could be constructed just as any class instance can be constructed. Let's say you have the following method:

public int GetLength(string value)
{
    return value.Length;
}


Now, I can do the following:

public void RunDelegate()
{
    GetLengthDelegate getLength = new GetLengthDelegate(this.GetLength);
    Console.WriteLine(this.GetLength("Hello World!"));
    Console.WriteLine(getLength("Hello World!"));
}


In the code above, I'm first calling GetLength without using a delegate, and then I'm calling the very same method via the use of a delegate. In both situations, I'm getting the same result, because I'm passing in the same value.

The joy of delegates is that they can be passed into a method. Let's say I added this code:

public void PrintLengths(GetLengthDelegate del)
{
    Console.WriteLine(del("Hello World!"));
}


And then I changed my RunDelegate method to look like this:

public void RunDelegate()
{
    GetLengthDelegate getLength = new GetLengthDelegate(this.GetLength);
    PrintLengths(getLength);
}


Now, I'm creating a delegate called getLength that points to the method this.GetLength, and then I'm passing it into another method, called PrintLengths, which will then call the delegate.

Delegates can be combined with each other, meaning that you could set up a delegate so that when you call del("Hello World"), you could trigger off several methods, which would be called one after another. This is called combining delegates, and it's a little out of scope for our topic.
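For the curious, here is a minimal sketch of what combining looks like. The names GreetDelegate, CombineDemo, First, Second and Log are made up for this illustration and don't appear anywhere else in this post.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical delegate type, used only for this illustration.
public delegate void GreetDelegate(string value);

public static class CombineDemo
{
    // Records each call so the order of invocation is visible.
    public static readonly List<string> Log = new List<string>();

    static void First(string value) { Log.Add("First: " + value); }
    static void Second(string value) { Log.Add("Second: " + value); }

    public static void Run()
    {
        Log.Clear();
        GreetDelegate del = new GreetDelegate(First);
        del += new GreetDelegate(Second); // combine: del now refers to both methods
        del("Hello World");               // calls First, then Second
    }

    public static void Main()
    {
        Run();
        foreach (string line in Log)
            Console.WriteLine(line);
    }
}
```

Calling del once runs both methods, in the order they were added.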

In any case, all the syntax you see above is old, and has been around since the inception of C# 1.0.

C# 2.0: Anonymous Methods

C# 2.0 introduced anonymous methods, which let programmers write a method inline without giving it a name. An anonymous method can be passed to a delegate's constructor, so a method that takes a delegate as a parameter can be handed logic that is too short or too simple to merit its own named member of the class. Anonymous methods have the following syntax:

delegate(string value) { return value.Length; }


The first thing you'll notice here is the lack of a declared return type. This is because the C# compiler infers the type of the return value from the return statements coded into the body, and checks it against the delegate type that the anonymous method is assigned to.

The second thing you might notice is the use of the word delegate. Yes, the C# team decided that the keyword delegate can be used in two contexts. In one context, delegate defines a delegate type, and in another, it defines an anonymous method. I personally feel they could have named it something else to be less confusing, but it is what it is.

The last thing you should notice about the anonymous method above is that it has the exact same functionality as the GetLength method. In fact, I can use this anonymous method in the constructor of the GetLengthDelegate. This means that I could change my RunDelegate method to this:

public void RunDelegate()
{
    GetLengthDelegate getLength =
        new GetLengthDelegate(delegate(string value) { return value.Length; });
    PrintLengths(getLength);
}


In fact, I could even go further and declare it without explicitly calling the GetLengthDelegate constructor or storing the instance of the delegate in the local getLength variable.

public void RunDelegate()
{
    PrintLengths(delegate(string value) { return value.Length; });
}


As you can see, the anonymous method syntax was a great improvement over the previous way of doing things, where an actual method had to be declared.

One caveat, however: anonymous methods cannot be declared without a context. That is, you can't just type an anonymous method into another method as a statement of its own and expect it to work. They're designed to be assigned to delegates, not called straight out.
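Here's a small sketch of that caveat. The class AnonymousContextDemo and its Measure method are names I've invented for the illustration. GetLengthDelegate is the delegate from earlier, repeated so the snippet stands on its own.

```csharp
using System;

public delegate int GetLengthDelegate(string value);

// Hypothetical class, used only for this illustration.
public static class AnonymousContextDemo
{
    public static int Measure(string text)
    {
        // An anonymous method written as a statement of its own does not compile:
        //     delegate(string value) { return value.Length; };
        // Assigning it to a delegate type gives it a signature to satisfy.
        GetLengthDelegate getLength = delegate(string value) { return value.Length; };
        return getLength(text);
    }

    public static void Main()
    {
        Console.WriteLine(Measure("Hello World!")); // prints 12
    }
}
```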

C# 3.0: Lambda Expressions

In C# 3.0, lambda expressions were introduced. For our purposes, lambda expressions are little more than syntactic sugar built upon anonymous methods. Our hero, the GetLength method, would look like this as a lambda expression:

s => s.Length


That's it. Just like anonymous methods, however, lambda expressions need to be assigned to a delegate. Thus, you could do this to create the same method as we've had above:

GetLengthDelegate getLength = s => s.Length;


Now, your first question is going to be obvious: "What is s?" Let me put it this way: I could have also written the above line as follows:

GetLengthDelegate getLength = value => value.Length;


Do you see it now? s and value actually refer to the string parameter sent into the lambda expression. We know that this is a string parameter, because we're assigning this lambda expression to the delegate GetLengthDelegate, which expects an incoming string parameter. Therefore, the C# compiler infers the type not only of the return value, but also of the incoming parameter.

What's more, the return keyword is conspicuously absent from the lambda expression. The C# compiler is smart enough to know that the value created on the right side of the => symbol is the return value of the delegate.

So, to put it briefly, the value on the left side of the => is the incoming parameter, and the expression on the right is the return value.
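To see the whole history lesson in one place, here is a sketch that builds the same delegate four ways. The class ThreeWaysDemo and its Measure method are names I've made up for the illustration. The last form spells out the parameter type and the return statement that the short lambda lets the compiler fill in.

```csharp
using System;

public delegate int GetLengthDelegate(string value);

// Hypothetical class, used only for this illustration.
public static class ThreeWaysDemo
{
    static int GetLength(string value) { return value.Length; }

    public static int[] Measure(string text)
    {
        // C# 1.0: a named method wrapped in a delegate.
        GetLengthDelegate named = new GetLengthDelegate(GetLength);

        // C# 2.0: an anonymous method.
        GetLengthDelegate anonymous = delegate(string value) { return value.Length; };

        // C# 3.0: a lambda expression; parameter and return types are inferred.
        GetLengthDelegate lambda = s => s.Length;

        // C# 3.0: the same lambda with the type and return written out.
        GetLengthDelegate verbose = (string s) => { return s.Length; };

        return new int[] { named(text), anonymous(text), lambda(text), verbose(text) };
    }

    public static void Main()
    {
        foreach (int length in Measure("Hello World!"))
            Console.WriteLine(length); // prints 12 four times
    }
}
```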

Putting It All Together

In .NET 3.5, LINQ was introduced. LINQ lets you filter, search, and do all sorts of other funky things to collections of items by providing a set of extension methods that accept delegates. For instance, there's the Where method, which takes a delegate of type Func<T, bool>. Now, what is Func<T, bool>? Basically, it's a generic delegate whose second type parameter has been defined, but the first has not been. The actual definition of Func is the following:

public delegate TResult Func<T, TResult>(T arg);


So it's a generic delegate. This means that declaring it as a Func<string, bool> would mean that you're passing in a string as the incoming parameter, and expecting a bool as the return value. What does this mean for the Where method?

Well, basically, Where's return type is IEnumerable<T>, where T is the type of item held by the collection on which you call the method. So, if you have a List<string>, calling Where will return an IEnumerable<string>, and you should pass in a method with the same signature as Func<string, bool>. If you have a List<int>, the return value will be an IEnumerable<int>, and you should pass in a method with the same signature as Func<int, bool>. The Where method will call the delegate you pass in to it for every object contained in your enumerable, and will return an IEnumerable<T> that contains all the objects for which the delegate returns true. Maybe an example will help:

List<string> list = new List<string>();

list.Add("David");
list.Add("Bob");
list.Add("Michael");

foreach (string name in list.Where(s => s.Length == 5))
    Console.WriteLine(name);


In the example above, I'm creating a list and populating it with three names: "David", "Bob" and "Michael". I'm then creating an IEnumerable<T> (in this case, an IEnumerable<string>) containing only the names that are 5 characters long. In my example, that would only be "David", but if I had added "Jimmy" to the list, that would also be returned. The Where method will go through each of the items in list and evaluate the lambda expression I passed into it. In other words, it will call that method on each of the items in list. If the return value of the call to the lambda expression is true, the Where method will return that item as part of its result, thus filtering the list to include only the string values that are 5 characters long.
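If it helps to picture what is going on inside, here is a rough sketch of a filtering method written by hand. To be clear, MyWhere, WhereSketch and FiveLetterNames are names I've invented, and this is not the actual LINQ source. It's only a simplified stand-in that shows the delegate being called once per item.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical class, used only for this illustration.
public static class WhereSketch
{
    // A simplified stand-in for Where, not the real LINQ implementation.
    public static IEnumerable<T> MyWhere<T>(this IEnumerable<T> source, Func<T, bool> predicate)
    {
        foreach (T item in source)
        {
            // Call the delegate on each item and keep the ones it accepts.
            if (predicate(item))
                yield return item;
        }
    }

    public static List<string> FiveLetterNames()
    {
        List<string> list = new List<string>();
        list.Add("David");
        list.Add("Bob");
        list.Add("Michael");
        list.Add("Jimmy");

        // The lambda is converted to a Func<string, bool>.
        return new List<string>(list.MyWhere(s => s.Length == 5));
    }

    public static void Main()
    {
        foreach (string name in FiveLetterNames())
            Console.WriteLine(name); // prints David, then Jimmy
    }
}
```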

I hope this explanation has been enlightening if you've been wondering what lambda expressions are. There is a bit more information on lambda expressions in the MSDN Library, as well as elsewhere on the internet. Nevertheless, there's no substitute for learning the C# language by coding in it, so pull out your Visual Studio, load up a list, and give it a go. I think you'll find that these expressions can save you a lot of time, and simplify your code greatly.