To me, one of the most interesting aspects of LINQ query expressions is that they produce lexical closures (for a detailed look at closures in C#, see my article on the topic). To illustrate this point, consider the following code:
This query retrieves all of the public methods on System.String whose names contain the text represented by the "filter" variable (in this case "Compare"). When compiled and run, the output is what you might guess:
That behavior should be perfectly natural to any C# developer. Here's where things get a little tricky:
Can you guess what that code will output to the console?
Your answer to that question depends on your understanding of what closures are and how they work. A closure is produced when a variable whose scope extends beyond the current lexical block is bound to that block. That's a bit of a mouthful, isn't it? Allow me to clarify what I mean with a simple example.
In the code above, an anonymous delegate ("a") references a variable ("x") that is declared outside of the anonymous delegate's body. This implies a lexical closure, and the variable "x" is bound to the method body of "a." The important point is that "a" is bound to the variable "x" and not its value. In other words, the value that "a" writes to the console depends upon the value of "x" at the time of its execution. Because 1 is assigned to "x" immediately before "a" is executed, 1 is output to the console.
Precisely the same thing happens in our query expression. A closure is produced for the lambda expression of the "where" clause because it references the "filter" variable, which is declared outside of the query expression. The closure binds to the variable "filter"—not its value. So, changing the value of "filter" after the query expression is defined will change the results returned by the query. In fact, if you run that code, you'll get this:
Let's try to exploit this closure in a more practical way.
This slightly different query expression returns the names of all of the public methods on System.String that don't match the value of the variable "filter." By modifying "filter" in each iteration of the foreach loop, we are effectively filtering out all duplicate method names. This works as advertised, but there's one potential bug: it is assumed that all overloads of a method are grouped together. If there are overloads of, say, String.CompareTo that aren't adjacent in the source array, the filtering won't work properly. What we really need to do is sort the array using the "orderby" query operator.
WHOOPS! That doesn't work. When we execute that query, all of the method names are output to the console, including duplicates. Our modifications to the "filter" variable in the foreach loop are completely ignored. Why is that?
The reason is that "orderby" forces the entire query to be evaluated when the first element is requested. This behavior is unavoidable and breaks the normal delayed evaluation of a query expression. However, we can still make the closure work properly by ensuring that the sort happens before filtering.
Now we get the output that we want:
The moral here is to be careful. Exploiting closures in query expressions can be powerful but tricky to get right.
Page rendered at Wednesday, August 20, 2008 4:18:35 AM (Eastern Standard Time, UTC-05:00)