25 February 2013

Dynamic LINQ Queries with Expression Trees

It's possible to build up dynamic LINQ queries or queries with several conditional criteria. In fact there are several options for doing this, including the use of expression trees.

Like it or not, LINQ is here to stay. Personally I think this is a good thing. After some initial hesitation, I have sipped the Kool Aid and it’s not that bad.

LINQ brings a lot of benefits:

  • No magic strings (makes refactoring tools much more effective)
  • SQL-like syntax outside of the database
  • LINQ providers for various data sources (SQL, Entities, Objects, nHibernate, XML, etc)

Initially, I thought that there was a big limitation in not being able to build up dynamic queries or queries with conditional criteria. Fortunately, this turned out to be a deficiency in the samples and documentation, not LINQ itself. There are actually lots of options available.

In this article, we will review various approaches for building dynamic LINQ queries, starting with some fairly straightforward options and moving on to building the query from scratch with Expression Trees. None of this will require the System.LINQ.Dynamic namespace.

Hopefully you’ll enjoy the ride.

A Little Background

There are two flavors for expressing a LINQ query; they go by different names depending on the source you are reading, the two most common culprits are: Query (or Query Expression) and Method Chaining (or Fluent). In some circles, debates will range long and loud over which method to use as well as what to call it.

For purposes of this discussion, we will use the latter syntax and call it Method chaining. We are chaining together method calls and it’s a fluent interface. So I give props to both names. This is personal preference, but I believe that IntelliSense kicks in quicker with Method chaining. I also think this syntax makes a couple of implications more obvious. Also, I should point out that Visual Studio and C#/ VB.Net don’t really care which format you use. You can use either one, or you could use both. For the sake of your team and transitioning from section to section or project to project, I recommend that you pick an approach and then stick with it. I prefer Method chaining.

Let’s see what a simple query will look like in Query syntax and Method chaining syntax. Here we want to query a list of files to retrieve the ones that imported in the last week.

With Query syntax, it may look like this:

With Method chaining it may look like this:

These two queries look very similar and they will do exactly the same thing.

If you wanted to add additional criteria, you could simply add them to the “Where” section.

So far everything seems comfortable and if you look at most of the examples on the Internet or MSDN this is what you would see. As the query gets more complex, the “Where” section gets more complex. As we will soon see, this is not the only option.

From here on out, we will include the sample code using Method chaining syntax. Rest assured, in most cases, you can still do the same thing with either format. In fact, Resharper even has helper links to automate switching back and forth.

Simple Dynamic Query With Multiple Where Statements

In a traditional SQL statement, we are limited to a single WHERE statement and all of the criteria goes in that one WHERE statement. When you are first learning LINQ, it looks as if you have the same limitation and, in fact, you will search high and low on the Internet before finding examples that don’t reinforce this perception. Facing such restrictions, dynamic queries are tough. Fortunately, we don’t face such restrictions, we know we can have any number of WHERE methods in a LINQ query. Consider this:

Our last example could be written like this:

The two queries are identical in meaning, but note the subtle difference here:

This looks very similar to the previous query but will have dramatically different results. In this last query, the initial filter requiring that the Import date had to be in the last seven days is ignored. This is because the Queryable is immutable. The first call to the Where method on the files object does nothing to the original files object. Instead, it creates a brand new object and returns it. This is why when we call the Where method on the files object the second time we have lost what happened to the first call. This is a subtle but very important difference.

When we chain methods, we are automatically operating on the returned value from the previous call.

So the correct way to write the last example would be something like this:

Now to make the query dynamic, we change things up slightly:

Now, depending on the value of the pastOnly parameter, the second filtering criteria may or may not be included in the query.

Type Safe Dynamic Sorting

It is common to display a grid of data to a user and allow the user to sort it any way that they want. We can easily accommodate this dynamically by adding the appropriate OrderBy method call to our query.

If we know that there are only a handful of options that the user could specify, we could do something like this:

This is very intuitive and it makes it very obvious what you are doing. It can be very tedious as well. If we need to add a new property that could be sorted by, we would have to change this code. It also has the name of the property hard coded in the logic. We will also have to duplicate this logic everywhere that we want to allow dynamic sorting on the FileImport object.

One way to improve this will be to encapsulate the OrderBy logic into its own method.

We start by noting the signature of the OrderBy method. The method signature that we are interested in looks like this:

We are passing in a Lambda expression that is a function expecting TSource and returning TKey. We can encapsulate the sorting logic by defining a method that will take a string for the property name and return such a function.

By pulling the conditional logic out of the query, we limit the number of places that will need to change as new sort options become available.

Armed with such a function, our dynamic query can be rewritten like this:

Now our dynamic query looks good. There are no magic strings, and it does not have to be changed when we add new sorting options.

But we can do even more for the EvaluateOrderBy method.

Improving the EvaluateOrderBy Method

The EvaluateOrderBy method has a couple of problems:

  • It will only work with the FileImport object.
  • It will only sort by the options hard coded in.
  • It still has to be updated when we add new options.

Ideally we would like to have a method without the magic strings. We would like to have a method that will work with any TSource. Well that’s a pretty tall order, but I think we are up for it.

This is where express trees come into play. Expression Trees allow us to build up the logic for the method (lambda expression) that we need to return. Expression Trees may seem a little intimidating at first, but don’t worry, the expression that we need to build up is very simple, we need to build a property reference expression. The expression that is created will look like this:

We are going to ignore the more complex cases of:

It is possible that the property references could go on indefinitely. This is possible, but rare in actual practice. If you do find yourself having to manage such a complex object hierarchy, you can use reflection to map out the relationship and then adopt recursion to build up the property references.

For such a simple case, the GenericEvaluateOrderBybecomes this:

There is a lot going on for such a short method.

We start by determining the type for the generic argument. This will also be the type for our lambda expression.

Next we define the parameter by specifying the name and type.

Next we specify the only thing that we are going to do in this function, reference a property. We specify that it is a property of the parameter and we specify the name. If you wanted additional protection, you could use reflection to ensure that the specified property is a property of the type that you found earlier. This is also where you would need to build up a complex structure to support nested property references.

Now that we have all of the “logic” in place, we simply need to convert it to a lambda expression and compile it.

Alternately, we could make this an extension method to the Queryable itself:

With this approach, we pass in the IQueryable and return an IQueryable. The big difference is that now we will explicitly call “OrderBy” passing in the lambda expression that we created earlier.

From a usage perspective, we now have:

This will now allow us to sort any IQueryable by any property regardless of the object being referenced or who the LINQ provider is.

Using Expression Trees to Build a Query from Scratch

For this bit of magic, we are not going to be able generically add any filter criteria in place, but we can add some nice reusable logic.

When we define filter criteria, it will generally take one of just a handful of forms:

The Expression class in the Expressions namespace provides static methods to cover all of our bases.

Here we will focus our attention on three static methods:

  • Equal(Expression, Expression)
  • LessThanOrEqual(Expression, Expression)
  • GreaterThanOrEqual(Expression, Expression)

The other methods available will follow the same pattern of taking two expressions, a Left Expression, a Right Expression, and the method itself defining the binary operator.

In most cases, the two expressions passed into our comparison expression will be a property reference expression and a constant expression.

To create a simple comparison method, we might use something like this:

This will do a simple equality comparison. Simply change the Expression.Equal to any of the other comparison functions to get the different types of comparisons. Everything else stays the same.

The resulting function that is returned is suitable for passing directly into a Where method on our IQueryable.

In terms of usability, this is a giant step backwards. The original syntax is much easier to write and read. Not to mention that we now have a magic string in our query, but we are just getting started.

Now that we know how to programmatically create a function that can be included in the where clause, we are ready to explore some exciting options. Consider how we could write a method to require that all dates have passed.

We start by getting the type for the Generic Argument. Then we build up a list of the dates that are in this type. We then initialize our parameter expression as we have done in the past. Next we setup a constant expression that we will compare each date property against.

Now things get a bit more interesting. We have a couple of different possibilities in the final comparison depending on how many dates are in the object. Ultimately, we will return a lambda based on filterExpression

  • If there is only one date, filterExpression will use the property reference and thedateCap as left and right.
  • If there are two dates, filterExpression will be a “logical and” expression with one comparison being the “left” and the other comparison being the “right”
  • If there are more than two dates, filterExpression will be a “logical and” expression with one comparison being the “left” and the “right” being a new “logical and” expression. This pattern can go on indefinitely regardless of how many dates there are.

So this one method can handle everything from there being a single property to dozens of date properties. Regardless of how many date properties there are, we can call it like this:

This simple pattern can easily be extended to create methods such as:

  • AllDatesAreBlank
  • AllBooleansAreTrue
  • AllBooleansAreFalse
  • AllStringFieldsHaveValues
  • AllIntegersArePositive

Conclusion

We live in a brave new world. For years programmers have dreamed of being able to unify data access logic regardless of source. With LINQ we are closer to realizing that goal, and Expression Trees give us complete access to all the magic.

Keep up to date with Simple-Talk

For more articles like this delivered fortnightly, sign up to the Simple-Talk newsletter

This post has been viewed 100909 times – thanks for reading.

Tags: , , , , ,

  • Rate
    [Total: 91    Average: 4.1/5]
  • Share

Nick Harrison is a Software Architect and .NET advocate in Columbia, SC. Nick has over 18 years experience in software developing, starting with Unix system programming and then progressing to the DotNet platform. You can read his blog as www.geekswithblogs.net/nharrison

View all articles by Nick Harrison

  • Anonymous

    My mind is officially blown
    Definitely some strange voodoo going on here.

    I have never even heard of Expression Trees. Wow!

    Makes me want to rethink everything that I have ever done with linq.

  • TheSQLGuru

    Generated SQL?
    Could you do me a favor and provide the generated sql for those samples above that are built to hit a SQL Server database? I am wondering about what the TSQL looks like and how it might perform. Thanks in advance!

    Best,

    Kevin G. Boles

  • nick harrison

    RE: Generated SQL?
    Kevin,

    You bring up a good point. You do need to keep an eye on the SQL generated. For the examples shown in this article, the generated SQL is trivial. For other cases, the SQL could be much more complex.

    Hibernating Rhinos makes profilers specifically designed to monitor and identify problems from an ORM http://www.hibernatingrhinos.com/products.

    Also the Ants Performance Profiler has some nice features to put the SQL into context of everyhing else that is happening to the database. https://www.simple-talk.com/dotnet/performance/under-the-orm-hood-revealing-the-sql/ provides more background on this. http://www.red-gate.com/products/dotnet-development/ants-performance-profiler/

    Each LINQ provider may generate different SQL. You may see subtly different SQL from the same LINQ query by using Entity Framework or NHibernate or any other LINQ provider.

    You really need to run a good profiler against your application hitting your data using your provider.

  • nick harrison

    LINQ Providers
    Another point that may not be readily apparent, a LINQ provider does not necessarily need to target a database.

    Charlie Calvert maintains a list of LINQ providers here http://www.red-gate.com/products/dotnet-development/ants-performance-profiler/

    The individual providers take the expression tree expressed by your query and converts it into something meaningful for the platform it is targeting. In the most common case, this is converting the expression tree to a SQL Statement to run against a database, but that is not the only option.

  • Anonymous

    Code Errors
    You might want to revisit the code snippet for the extension method GenericEvaluateOrderBy since it will not compile (VS2012).

    Otherwise a very interesting article.

  • nick harrison

    RE: Code Errors
    What error do you get when you try to compile?

    It may be that you have the wrong references for Expression. Make sure that in your using section you include:

    using System.Linq.Expressions;

    Let me know if you are getting a different compile error.

  • Henry Minute

    RE: RE: Code Errors
    The following two lines are problematic
    var parameter = Expression.Parameter(type, "p");
    var propertyReference = Expression.Property(pe, propertyName);

    In the first the var is called ‘parameter’.
    In the second it is referenced as ‘pe’

    It looks as if a little cut and pasting has been going on 🙂

  • nick harrison

    RE: RE: Code Errors
    Henry,

    You are correct. Or actually an incomplete Rename Variable refactor 🙂

    I tried to convert to more meaningful names at the end and missed a couple.

    Sorry about that, and I do hope that this does not cause too much confusion.

  • JoNSmith

    Have a look at Microsoft’s System.Linq.Dynamic
    I have just been writing a server sorting and filtering class to work with JqxWidgets grids in MVC4. I looked at a number of solutions but finally used System.Linq.Dynamic written by Microsoft. It allows you to define Linq Command dynamically using text, e.g.
    var query =
    db.Customers.
    Where("City = @0 and Orders.Count >= @1", "London", 10).
    OrderBy("CompanyName").
    Select("new(CompanyName as Name, Phone)");

    The assembly is available as a download or as a NuGet package put up by King Wilder.

    To find out more about Linq.Dynamic follow this link: http://www.lcs.syr.edu/faculty/fawcett/handouts/CoreTechnologies/CSharp/samples/CSharpSamples/LinqSamples/DynamicQuery/Dynamic%20Expressions.html

  • JoNSmith

    Have a look at Microsoft’s System.Linq.Dynamic – more
    Another useful link for you:
    http://weblogs.asp.net/scottgu/archive/2008/01/07/dynamic-linq-part-1-using-the-linq-dynamic-query-library.aspx

  • nick harrison

    RE: Have a look at Microsoft’s System.Linq.Dynamic
    Depending on your needs this is a viable alternative.

    The main concern that I have with this approach is that if you rename properties on the objects involved, refactoring tools will not catch the references in these strings.

    Depending on what you are doing, this may not be a problem.

    The other concern is the need to verify the data in the strings being used. Data originating from a user would need to be scrubbed and sanitized before he could be used like this.

    I don’t think that you would be vulnerable to a traditional SQL injection attack, but still keep the validations in place. Internally the string passed to the Dynamic methods will eventually be converted to Expression Trees which should catch malicious code, but you want to catch potential attacks as early in the process as you can.

    John, I would love to hear how this project works out for you.

  • JoNSmith

    Re: Have a look at Microsoft’s System.Linq.Dynamic – more
    Hi Nick,

    Thanks for your comments. I am developing a generic jQWidgets GridBuilder class that looks at the classes, including the DataAnnotations, to build the grid. This means that any changes in the properties will be picked up automatically.

    The strings are produced by the jQWidget grid sort and filter, so they won’t be wrong (although JSON data and .NET date are fun). I worry about a malicious attack and plan to us a GET version of the MVC AntiForgeryToken (I have a link about that).

    Project is coming on well, but still early days. Impressed by jQWidgets but takes some digging to find the things you need.

  • nick harrison

    RE: Have a look at Microsoft’s System.Linq.Dynamic
    I am not familar with jQWidgets. I will have to look into them.

    Dealing with dates can be a pain.

    I use this library for my client side date manipulation. http://www.datejs.com/ It is often easier to format the date on the server side rather than sending a raw date over JSON.

    Sounds like you are taking some interesting approaches. Please post back to keep us up to date on how this goes.

  • nick harrison

    RE: Have a look at Microsoft’s System.Linq.Dynamic
    Depending on your needs this is a viable alternative.

    The main concern that I have with this approach is that if you rename properties on the objects involved, refactoring tools will not catch the references in these strings.

    Depending on what you are doing, this may not be a problem.

    The other concern is the need to verify the data in the strings being used. Data originating from a user would need to be scrubbed and sanitized before he could be used like this.

    I don’t think that you would be vulnerable to a traditional SQL injection attack, but still keep the validations in place. Internally the string passed to the Dynamic methods will eventually be converted to Expression Trees which should catch malicious code, but you want to catch potential attacks as early in the process as you can.

    John, I would love to hear how this project works out for you.

  • dyarosh

    How to implement complex object heirarchy
    I am currently using Expression Trees with my objects but now need to use them with complex objects which you ignored in the article 🙁 Can you point me in the right direction for using reflection to map out the relationship and then adopt recursion to build up the property references? Do you have another article on that topic?

  • james405

    First time with expressions and extensions
    I followed your example sending an custom object, but get the error:

    No generic method ‘OrderBy’ on type ‘System.Linq.Queryable’ is compatible with the supplied type arguments

    below is the code: Not sure what I did wrong…

    List<myNumber> myNumbers = new List<myNumber>();

    myNumbers.Add(new myNumber() { aNumber = 5, name = "Fifth" });
    myNumbers.Add(new myNumber() { aNumber = 2, name = "Second" });
    myNumbers.Add(new myNumber() { aNumber = 80, name = "80" });
    myNumbers.Add(new myNumber() { aNumber = 56, name = "Big Five Six" });

    IQueryable<myNumber> iq = myNumbers.AsQueryable<myNumber>();
    iq = ExpressionTreeSort(iq, "name");

    public static IOrderedQueryable<TSource> GenericEvaluateOrderBy<TSource> (this IQueryable<TSource> query, string propertyName) {

    try
    {
    //get the type of the source
    var type = typeof(TSource);
    //get the parameter
    var parameter = Expression.Parameter(type, "p");
    //create the property reference
    var propertyReference = Expression.Property(parameter, propertyName);

    var sortExpression = Expression.Call(typeof(Queryable),
    "OrderBy",
    new Type[] { type },
    null,
    Expression.Lambda<Func<TSource, string >>
    (propertyReference, new[] { parameter}));

    //return – note in the tutorial this cast was not present, but is needed since going from IQueryable to IOrderedQueryable.
    return (IOrderedQueryable<TSource>)query.Provider.CreateQuery<TSource>(sortExpression);
    }
    catch (Exception ex)
    {

    throw new Exception(ex.Message);
    }

    }

    Could you possibly shed some light on where I messed up..
    Thanks

  • nick harrison

    RE: First time with expressions and extensions
    Looks like something may have gotten lost in translation. Try this for the implementation of the GenericEvaluateOrderBy

    public static IQueryable<TSource>
    GenericEvaluateOrderBy<TSource>
    (this IQueryable<TSource> query,
    string propertyName)
    {
    var type = typeof (TSource);
    var property = type.GetProperty(propertyName);
    var parameter = Expression.Parameter(type, "p");
    var propertyReference = Expression.Property(parameter, propertyName);
    try
    {
    var sortExpression = Expression.Call(typeof(Queryable),"OrderBy",
    new[] {type, property.PropertyType},
    query.Expression,
    Expression.Lambda<Func<TSource, string>>
    (propertyReference, new[] { parameter }));

    return (IOrderedQueryable<TSource>)query.Provider.CreateQuery<TSource>(sortExpression);
    }
    catch (Exception e)
    {

    throw new Exception(ex.Message);
    }
    }

    Let me know if you have any further issues

  • bradb

    groupby implementation
    thank you very much for the article. I have considering how one would create an expression tree for a groupby() , any suggestions would be very appreciated.