Tuesday, January 24, 2012

Extending DynamicLINQ language: Specifying class name in "new" clause

Dynamic LINQ (1) is a library that Microsoft distributes as source code. It adds dynamic query capabilities to LINQ: you can build queries from strings instead of the type-safe language constructs used in ordinary LINQ. The library provides additional extension methods for IQueryable, such as Where and Select, that accept query strings and parse them at run time into the corresponding lambda expressions of a LINQ expression tree.

It allows you to write queries like the following:
var query = YourDB.Products.Where("CategoryID = 2 And Price < 10").OrderBy("StockCount");
The string is written in a small expression language specific to Dynamic LINQ, which is documented in the files shipped with the Dynamic LINQ package. The heart of Dynamic LINQ is a parser for this micro-language that turns the given string into an Expression object.
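
To make "parses the string into an Expression object" more concrete, here is a minimal sketch of the kind of tree the predicate above corresponds to, built by hand with System.Linq.Expressions. The Product class, its property types and the PredicateSketch name are assumptions made for illustration only; the real parser also handles details such as numeric promotion that this sketch sidesteps by using a double constant.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical entity, used only for this illustration.
public class Product
{
    public int CategoryID { get; set; }
    public double Price { get; set; }
    public int StockCount { get; set; }
}

public static class PredicateSketch
{
    // Builds p => p.CategoryID == 2 && p.Price < 10.0 node by node,
    // which is roughly what "CategoryID = 2 And Price < 10" stands for.
    public static Expression<Func<Product, bool>> Build()
    {
        ParameterExpression p = Expression.Parameter(typeof(Product), "p");

        Expression categoryTest = Expression.Equal(
            Expression.Property(p, "CategoryID"),
            Expression.Constant(2));

        Expression priceTest = Expression.LessThan(
            Expression.Property(p, "Price"),
            Expression.Constant(10.0));

        return Expression.Lambda<Func<Product, bool>>(
            Expression.AndAlso(categoryTest, priceTest), p);
    }
}
```

The resulting lambda can be passed to the ordinary Where extension method, for example products.AsQueryable().Where(PredicateSketch.Build()).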

Part of the language is the ability to instantiate objects of anonymous classes, which is especially useful in Select and SelectMany calls. For example:
cars.Select("new (name, year, engine.type as engine_type)");
This constructs objects with the properties name, year and engine_type.

In standard LINQ, for example in C#, you are not limited to anonymous ("typeless") objects: you can also write the query so that it returns objects of an already existing type. This is, however, not possible with Dynamic LINQ as provided by Microsoft.
Nevertheless, with some knowledge of how compilers (or, more precisely, parsers) work, some understanding of lambda expression trees in .NET, and a look at the source code of Dynamic.cs (the file containing the implementation of Dynamic LINQ), it is relatively easy to extend the library so that new clauses can name an existing type to be created.
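
For reference, this is a sketch of the typed projection that plain LINQ allows and that the extension described here aims to express as a string. The Car, Engine and CarInfo classes are hypothetical; only their property names are taken from the examples in this post.

```csharp
using System.Linq;

// Hypothetical model classes; property names follow the examples in this post.
public class Engine
{
    public string type { get; set; }
}

public class Car
{
    public string name { get; set; }
    public int year { get; set; }
    public Engine engine { get; set; }
}

public class CarInfo
{
    public string name { get; set; }
    public int year { get; set; }
    public string engine_type { get; set; }
}

public static class TypedProjectionSketch
{
    public static IQueryable<CarInfo> Project(IQueryable<Car> cars)
    {
        // The result type is named explicitly, so callers receive CarInfo
        // objects rather than instances of a compiler-generated class.
        return cars.Select(c => new CarInfo
        {
            name = c.name,
            year = c.year,
            engine_type = c.engine.type
        });
    }
}
```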

Currently, the grammar for the new expression in the Dynamic LINQ language is as follows:
new (expr1 as name1, expr2 as name2, ..., exprn as namen)
where name# is the name of the property in the resulting object that holds the value of expr#. If the expression is a plain property access, the "as name#" part can be omitted, and the property in the resulting object takes the name of the accessed property. Because new is currently always followed directly by an opening parenthesis, we can extend the grammar as follows without introducing ambiguity:
new Namespace.TypeName (expr1 as name1, expr2 as name2, ..., exprn as namen)
Omitting the type name still denotes the instantiation of an anonymous object. If we look at the ExpressionParser class, we quickly locate the ParseNew method, which is responsible for parsing the new expression. Currently, it works more or less as follows:
  1. Consume "new" keyword.
  2. Consume opening parenthesis
  3. Loop doing the following:
    1. Parse expression
    2. If next token is "as" consume it and the following token as an identifier
    3. Store the dynamic property definition using the obtained name and expression
    4. If the next token is comma, consume and continue; otherwise break loop.
  4. Consume closing parenthesis
  5. Synthesize and instantiate the anonymous type based on the accumulated dynamic property definitions.
  6. Return the expression tree node for type instantiation parametrized by the obtained type.
To support the new grammar, we add steps between 1 and 2: before consuming the opening parenthesis, the parser consumes as many dot-separated identifiers as possible, and together they form the name of the existing type. If at least one such identifier is present, we set a flag signalling that we are constructing an object of an existing type.
With the flag set, instead of steps 5 and 6 we resolve the existing type with Type.GetType, instantiate it, and bind the expression values to the properties it already has.
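
Independent of the parser, the "instantiate and bind" part can be sketched as a MemberInit node built by hand. The EngineInfo class and the constant values below are assumptions for illustration. The null check is also an addition of this sketch: Type.GetProperty returns null for an unknown name, and Expression.Bind then fails with an ArgumentNullException that says little about the query.

```csharp
using System;
using System.Linq.Expressions;
using System.Reflection;

// Hypothetical target type: a public parameterless constructor
// and settable properties are all that is needed.
public class EngineInfo
{
    public string type { get; set; }
    public string my_info { get; set; }
}

public static class MemberInitSketch
{
    // Builds the equivalent of:
    // () => new EngineInfo { type = "diesel", my_info = "2.0" }
    public static Expression<Func<EngineInfo>> Build()
    {
        Type target = typeof(EngineInfo);
        string[] names = { "type", "my_info" };
        Expression[] values =
        {
            Expression.Constant("diesel"),
            Expression.Constant("2.0")
        };

        MemberBinding[] bindings = new MemberBinding[names.Length];
        for (int i = 0; i < names.Length; i++)
        {
            PropertyInfo property = target.GetProperty(names[i]);
            // Fail with a readable message instead of letting
            // Expression.Bind throw on a null member.
            if (property == null)
                throw new InvalidOperationException(
                    "No property '" + names[i] + "' on type " + target.FullName);
            bindings[i] = Expression.Bind(property, values[i]);
        }

        return Expression.Lambda<Func<EngineInfo>>(
            Expression.MemberInit(Expression.New(target), bindings));
    }
}
```

Compiling and invoking the lambda, MemberInitSketch.Build().Compile()(), yields an EngineInfo whose two properties carry the bound values.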

Finally, the ParseNew method will look like the following:
Expression ParseNew() {
    NextToken();

    // anonymous stays true for the original "new (...)" form;
    // class_type is set only when a type name follows "new".
    bool anonymous = true;
    Type class_type = null;

    if (token.id == TokenId.Identifier)
    {
        anonymous = false;
        StringBuilder full_type_name = new StringBuilder(GetIdentifier());
        NextToken();

        // Collect the remaining dot-separated parts of the type name.
        while (token.id == TokenId.Dot)
        {
            NextToken();
            ValidateToken(TokenId.Identifier, Res.IdentifierExpected);
            full_type_name.Append(".");
            full_type_name.Append(GetIdentifier());
            NextToken();
        }

        // Resolve the name to an existing type; report a parse error if none is found.
        class_type = Type.GetType(full_type_name.ToString(), false);
        if (class_type == null)
            throw ParseError(Res.TypeNotFound, full_type_name.ToString());
    }

    ValidateToken(TokenId.OpenParen, Res.OpenParenExpected);
    NextToken();
    List<DynamicProperty> properties = new List<DynamicProperty>();
    List<Expression> expressions = new List<Expression>();
    while (true) {
        int exprPos = token.pos;
        Expression expr = ParseExpression();
        string propName;
        if (TokenIdentifierIs("as")) {
            NextToken();
            propName = GetIdentifier();
            NextToken();
        }
        else {
            MemberExpression me = expr as MemberExpression;
            if (me == null) throw ParseError(exprPos, Res.MissingAsClause);
            propName = me.Member.Name;
        }
        expressions.Add(expr);
        properties.Add(new DynamicProperty(propName, expr.Type));
        if (token.id != TokenId.Comma) break;
        NextToken();
    }
    ValidateToken(TokenId.CloseParen, Res.CloseParenOrCommaExpected);
    NextToken();
    // Use a generated anonymous class, or the existing type named after "new".
    Type type = anonymous ? DynamicExpression.CreateClass(properties) : class_type;
    MemberBinding[] bindings = new MemberBinding[properties.Count];
    for (int i = 0; i < bindings.Length; i++)
        bindings[i] = Expression.Bind(type.GetProperty(properties[i].Name), expressions[i]);
    return Expression.MemberInit(Expression.New(type), bindings);
}
We also have to add the message for the newly introduced error to the Res class:
public const string TypeNotFound = "Type {0} not found";
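
A note on when this error can appear unexpectedly: given a name without an assembly qualifier, Type.GetType only searches the calling assembly and the core library, so a type defined in another assembly is reported as not found. One possible fallback, sketched below under the assumption that the type lives in an assembly that is already loaded, is to ask each loaded assembly in turn. FindType is a hypothetical helper and not part of Dynamic.cs.

```csharp
using System;
using System.Reflection;

public static class TypeLookupSketch
{
    // Hypothetical helper: tries Type.GetType first, then every
    // assembly currently loaded into the application domain.
    public static Type FindType(string fullName)
    {
        Type type = Type.GetType(fullName, false);
        if (type != null) return type;

        foreach (Assembly assembly in AppDomain.CurrentDomain.GetAssemblies())
        {
            type = assembly.GetType(fullName, false);
            if (type != null) return type;
        }

        // Nothing found; the caller decides how to report this.
        return null;
    }
}
```

In ParseNew, such a helper could stand in for the direct Type.GetType call, with the existing null check left in place.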
Now we are able to construct queries like the following:
cars.Select("new MyApp.CarInfo(name as name, year as year, engine.type as engine_type)");
You can also nest new expressions:
cars.Select("new MyApp.CarInfo(name, year, new EngineInfo(engine.type as type, engine.info as my_info) as engine_info)");
Of course, you can also nest a typed object inside an anonymous one, for example:
cars.Select("new (name, year, new EngineInfo(engine.type as type, engine.info as my_info) as engine_info)");
Feel free to use the modified Dynamic LINQ; it is uploaded to Google Code (3).

(1) Dynamic Linq is a part of a package available here.

(2) Dynamic Linq is described thoroughly in this blog post.

(3) The full modified Dynamic.cs is available here. Note: The version from HEAD has further updates not described here.

(4) Original StackOverflow answer where I presented the changes: Dynamic LINQ: Specifying class name in new clause

6 comments:

  1. The DynamicLINQ language has made our life simpler; otherwise we might have ended up writing lots of code just to achieve the same thing that we can now do in a more lightweight way. It allows you to create a general component which is quite scalable with regard to adding filtering to new entities.

  2. I just tried using this code, but it doesn't work when I pass in a generic type. For instance:

    If I have something like this:

    public object Example<T>(string param)
    {
    //do some code
    //IQueryable o
    o.Select(new T(a as something, b as something2));
    }

    How could I get this to work?

  3. As a workaround, I just got the type of T and then passed its name instead.

    See here:

    Type t = typeof(T);
    o.Select("new " + t.Namespace + "." + t.Name + "(a as something, b as something2)");

  4. Hi, I tried your code for grouping, but it throws an error when I group by multiple fields. My code is below.

    var res = kk.AsQueryable()
    .GroupBy("new (TypeName as Q,Name as Y)", "it")
    .Select("new (it.Key as Key,it.Count() as Count)");

  5. You sir are a genius. Cheers!!!

  6. Generic results are not supported ? Like:

    public class Result
    {
    public TKey Field { get; set; }
    public IGrouping Employee { get; set; }
    }
