Jacob Carpenter’s Weblog

March 3, 2008

Chained generic argument dependence revisited

Filed under: csharp — Jacob @ 6:36 pm

My last post had a fatal flaw: it looked like it was about F# and pattern matching.

The bit I was most proud of was the interesting application of generic parameters, constraints, and type inference—a sort of chained generic argument dependence. But I tricked you into thinking I was solving a very specific problem (that precious few of you care about).

But just today, I encountered a more generalized case where this type of generic abuse makes perfect sense!

A brief interruption

Raise your hand if you know what “functional composition” is.

For those of you with your hands firmly planted on your mice, functional composition is a really simple concept:

Say you have a function, let’s call it f, that takes an x and returns a y. You have another function, g, which takes a y and returns a z. Using functional composition, you can create a new function (h, for instance) which takes an x and returns a z.

That is, in C#, given:

Func<TX, TY> F;
Func<TY, TZ> G;

F and G can be composed as:

Func<TX, TZ> H = x => G(F(x));

Simple enough, right? Well, pretend we think this is useful enough to abstract to a utility method.

The definition would look something like:

public static Func<T1, TResult> Compose<T1, T2, TResult>(this Func<T1, T2> func1, Func<T2, TResult> func2)
{
    return t1 => func2(func1(t1));
}

Did you notice anything about that method signature?

We’ve introduced a form of chained generic argument dependence. The second parameter’s type depends on the first. Specifically, the second parameter must be a function whose argument matches the type of the return value of the first function.

In the case of Compose, this dependence isn’t a difficult API design decision; it’s merely an outcome of what we’re trying to express. But in some situations harnessing this chained dependence (and the compiler’s type inference capabilities) can result in an interesting API.
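
For example, here’s a small usage sketch (the two functions are just placeholders I made up):

Func<string, int> length = s => s.Length;
Func<int, bool> isEven = n => n % 2 == 0;

// T1 and T2 are inferred from length; TResult is then pinned down by isEven
Func<string, bool> hasEvenLength = length.Compose(isEven);

Console.WriteLine(hasEvenLength("word")); // True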

Real-world example

I’m surprised more people haven’t included some form of a null propagating operator on their C# 4.0 feature request lists. Well, Ed blogged our take on the operator a while ago.

Let’s use that definition of IfNotNull to create a “real world” example of chained generic argument dependence. And how much more “real world” can you get than Customers and Orders?

class Customer
{
    public string Name { get; set; }
    public IList<Order> Orders { get; set; }
}

class Order
{
    public string Product { get; set; }
}

public static class IfNotNullExtension
{
    public static U IfNotNull<T, U>(this T t, Func<T, U> fn)
    {
        return t != null ? fn(t) : default(U);
    }
}

Let’s say we want to get the product name of each customer’s first order. Using the above IfNotNull extension method, the code would look something like:

static void Main(string[] args)
{
    var customers = new[] {
        new Customer { Name = "Alejandro C. Dazé", Orders = new [] { new Order { Product = "Widget" } } },
        new Customer { Name = "Brad S. Grahne", Orders = null },
        null,
    };

    foreach (Customer cust in customers)
    {
        string productName = cust.IfNotNull(c => c.Orders)
            .IfNotNull(orders => orders.FirstOrDefault())
            .IfNotNull(o => o.Product);

        Console.WriteLine(productName ?? "<null>");
    }
}

The output of which is:

Widget
<null>
<null>

Works great! But we have an opportunity here to use chained generic argument dependence to remove a couple of those calls to IfNotNull.

Chained generic argument dependence

Using our previous definition of IfNotNull, we can add the following (naively implemented) overload:

public static TResult IfNotNull<T1, T2, T3, TResult>(this T1 t,
    Func<T1, T2> fn1, Func<T2, T3> fn2, Func<T3, TResult> fnResult)
{
    return t.IfNotNull(fn1).IfNotNull(fn2).IfNotNull(fnResult);
}
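
If those intermediate IfNotNull calls bother you, a more direct body for the same overload might look like this (just a sketch; the behavior should be equivalent when the intermediate values are reference types):

public static TResult IfNotNull<T1, T2, T3, TResult>(this T1 t,
    Func<T1, T2> fn1, Func<T2, T3> fn2, Func<T3, TResult> fnResult)
{
    if (t == null)
        return default(TResult);

    T2 t2 = fn1(t);
    if (t2 == null)
        return default(TResult);

    T3 t3 = fn2(t2);
    if (t3 == null)
        return default(TResult);

    return fnResult(t3);
}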

Then, our calling code can become:

string productName = cust.IfNotNull(c => c.Orders,
    orders => orders.FirstOrDefault(), o => o.Product);

And I know I keep showing this, but because of generic type inference, intellisense helps out as you write each lambda expression:

[screenshot: IntelliSense showing the inferred parameter type for each successive lambda]

I think that’s pretty cool.

Conclusion

I hope this post held your interest a little longer than the last one did. And I hope you can start to see uses for this sort of chained generic argument dependence in your APIs.

I’d be really interested if you’re already doing something like this or if you have a better name for this pattern. Let me know in the comments.

And finally, if this post was somehow unfulfilling and you still feel like you need to learn something cool, check out how to calculate square roots by hand!

February 20, 2008

Generic abuse: Pattern matching in C#

Filed under: csharp — Jacob @ 1:59 pm

Sorry it’s been so long since a post, dear faithful aggregator. Apart from working, I’ve been busily readying my entry for the WPF in Finance contest (which, coincidentally, has inspired a potential post or two).

Yesterday, however, I encountered Dustin’s post on pattern matching in F#. Go read it; I’m not going to reiterate the concepts, and you need to be familiar with the feature for this post to possibly begin to make any sense.

His post inspired the most creative (perhaps twisted) use of generics I’ve ever written. And I have to share.

A couple of caveats though: pattern matching is built into F#. It is therefore much, much cooler than the following experiment. Please don’t comment telling me that mine is lamer than F#’s.

Also, I’m usually morally opposed to screenshots of code (rather than the actual code, itself), but I need to visually highlight some stuff to help explain it. I’ve attached the full source at the bottom of the post.

Intro

If I were writing a pattern matching API in C#, one of my first thoughts would be “fluent interface”. At its simplest, pattern matching needs to support:

  1. Some variable number of guard clauses (only at the beginning of the pattern match).
  2. Some variable number of pattern expressions (a predicate of some sort and an associated action).
  3. A single final catch-all expression (action only).

Fluent interfaces can represent this type of constrained chaining pretty nicely. In fact, LINQ sort of supports constrained chaining with OrderBy/ThenBy. OrderBy returns an IOrderedEnumerable, which supports ThenBy (via extension). You can’t call ThenBy on an ordinary IEnumerable; you have to call OrderBy first.
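
For example, assuming some sequence of people with LastName and FirstName properties:

// OrderBy returns IOrderedEnumerable<Person>, which is what ThenBy extends
var sorted = people.OrderBy(p => p.LastName).ThenBy(p => p.FirstName);

// won't compile: ThenBy isn't defined for a plain IEnumerable<Person>
// var broken = people.ThenBy(p => p.FirstName);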

The problem with fluent interfaces for solving this problem, though, is that successive method calls don’t really feel like adding expressions to a pattern match:

[code screenshot: a fluent chain of calls — .Guard(…), then .When(…).Return(…) pairs, then a final .Return(…).Compile()]

Note that it’s only whitespace (which the compiler ignores) that communicates that .When(…).Return(…) adds one expression, whereas .Guard(…) and the last .Return(…) each correspond to a single expression.

Also, we haven’t looked at the object model yet, but note that a final call to .Compile is required to transform whatever type the last .Return returns into a Func<int, int>. I hate that.

There are other options. We could write a method that took a params array of loosely typed pattern expressions. But we would sacrifice the compile-time constraint on expression order.

My crazy solution

Through numerous rewrites and after going around in circles for some time, I decided on the following pattern:

  1. Define an object model of “match contexts” that (similar to LINQ’s OrderBy/ThenBy) return strongly typed contexts defining what operations are supported from the current context.
  2. In my pattern match building method, use Funcs and generics (with type constraints) to allow the strongly typed contexts to flow from parameter to parameter.

Let’s look at the object model:

public abstract class ClosedMatchContext<T, TResult>
{
}

public abstract class MatchContext<T, TResult> : ClosedMatchContext<T, TResult>
{
    public abstract IntermediateMatchResultContext<T, TResult> When(Func<T, bool> condition);
    public abstract ClosedMatchContext<T, TResult> Return(TResult result);
    public abstract ClosedMatchContext<T, TResult> Return(Func<T, TResult> resultProjection);
}

public abstract class OpenMatchContext<T, TResult> : MatchContext<T, TResult>
{
    public abstract OpenMatchContext<T, TResult> Guard(Func<T, bool> failWhen, Func<T, TResult> failWith);
}

public abstract class IntermediateMatchResultContext<T, TResult>
{
    public abstract MatchContext<T, TResult> Return(TResult result);
    public abstract MatchContext<T, TResult> Return(Func<T, TResult> resultProjection);
}

Beginning with OpenMatchContext, we have a context that supports Guard, When, and Return operations. Moving up the hierarchy, the more general MatchContext supports only When and Return. Finally, at the top level, ClosedMatchContext doesn’t support any further operations.

When, defined on MatchContext, returns an IntermediateMatchResultContext which requires a call to Return to get back to a MatchContext.

Now it’s getting interesting…

Or perhaps, difficult to understand?

public static class Match<T, TResult>

… defines a number of On methods that take a series of Func arguments. Let’s look at the signature of the four-parameter one:

[code screenshot: the four-parameter On method signature]

Now take a deep breath.

The first parameter is a Func<OpenMatchContext, TCtx1>. An open context comes in and some MatchContext (according to the constraint) comes out.

The second parameter takes a Func<TCtx1, TCtx2>. The context returned by the first parameter is given to the second parameter, and some new MatchContext comes out. The third parameter is very similar.

Finally, the last parameter takes a Func<TCtx3, ClosedMatchContext>. That is, we expect to get some closed context as the final output.
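
In plain text, the signature looks something like this (the parameter names are my guesses, and I’m assuming On hands back the compiled Func<T, TResult> directly):

public static Func<T, TResult> On<TCtx1, TCtx2, TCtx3>(
    Func<OpenMatchContext<T, TResult>, TCtx1> first,
    Func<TCtx1, TCtx2> second,
    Func<TCtx2, TCtx3> third,
    Func<TCtx3, ClosedMatchContext<T, TResult>> last)
    where TCtx1 : MatchContext<T, TResult>
    where TCtx2 : MatchContext<T, TResult>
    where TCtx3 : MatchContext<T, TResult>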

Let’s see if a little highlighting can help?

[code screenshot: the On signature with the matching context types highlighted to show how they flow from parameter to parameter]

My graphic design skillz could use some work, but do you follow the flow of the types?

This means that when you’re using On the parameters are strongly typed:

[screenshot: IntelliSense inside an On call, with each lambda parameter strongly typed to its context]

Following a Guard expression, you can add an additional Guard. But…

[screenshot: IntelliSense after a When+Return, with Guard no longer offered]

Following a When+Return, you can no longer Guard.

There’s more interesting type stuff going on in this example, but I’ll have to discuss it in a following post.

I hope reading this has produced a pleasantly painful sensation. If not, go read Wes Dyer’s post on Monads.

Source listing

I’ve attached the source to this post, but I have to lie about the extension. Rename the attached Program.cs.REMOVE.DOC to Program.cs.

UPDATE: I’ve made the source more accessible by simply including it after the “more” link.

I’d encourage you to copy it into a ConsoleApplication’s Program.cs and play around with it. Let me know what you think.

Enjoy!


February 4, 2008

DisposeAfter

Filed under: csharp, extension methods — Jacob @ 9:17 am

My development team at work has recently started a technical blog: http://code.logos.com.

I just contributed my first (real) post: DisposeAfter.

Enjoy.

January 3, 2008

C# abuse of the day: Avoiding typecasting null

Filed under: csharp — Jacob @ 3:30 pm

With two posts in as many days, you might get the wrong idea. Let me reassure you: you cannot count on me posting regularly.

That being said, I just found another neat C# abuse and had to share.

It’s occasionally useful (or necessary, or both) to use null as a starting value or to explicitly return a null value. Aside from initializing a local variable, another (more interesting) example of the former is the Enumerable.Aggregate overload that takes a seed (a fold, for you functional programmers).

In certain circumstances, specifying null can cause difficulty for the compiler. With the previously referenced method, a null seed will break generic type inference and force you to either:

  1. cast null to the correct type (yuck!) or
  2. explicitly specify the two type arguments (double yuck!)

Casting null is also required when trying to use null as a result for the ternary operator (<condition> ? <true-result> : <false-result>). The compiler needs to know (and verify) the type of that expression, and gripes about “no implicit conversion” when it sees a null literal.

It turns out there’s another option, though.

When the .NET Framework 2.0 introduced generics, there was a need for a way to express the default value for a generic type (since generic types could be either value or reference types). So the default keyword (of switch fame) was overloaded with a new function.

What isn’t explicitly mentioned in that documentation is that you can use default with concrete types, too. And an expression using default is strongly typed.

So, now instead of saying:

(ComplexObject) null

You have the option of saying:

default(ComplexObject)
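
To make both scenarios concrete, here’s a quick sketch (Widget and its Weight property are made up for the example):

bool useDiscount = false;
var widgets = new List<Widget>();

// Ternary: a bare null can't be reconciled with the other branch's type.
// int? discount = useDiscount ? 10 : null;          // error CS0173
int? discount = useDiscount ? 10 : default(int?);    // compiles

// Aggregate: a bare null seed leaves TAccumulate uninferable, but default works.
Widget heaviest = widgets.Aggregate(default(Widget),
    (best, w) => best == null || w.Weight > best.Weight ? w : best);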

I’m not sure yet if I actually prefer this to typecasting null. It certainly looks odd, but that could just be unfamiliarity.

But I thought it had potential and wanted to share. Thoughts?

January 2, 2008

C# abuse of the day: Functional library implemented with lambdas

Filed under: csharp, extension methods, functional programming — Jacob @ 6:09 pm

With all the cool kids writing about F# and functional programming, I started thinking about a possible blog post.

One of my goals was to use lambda syntax to express the functional method implementations. To my eyes, lambdas are great at succinctly expressing higher-order functions. And using the => operator multiple times in a single line rocks! Without thinking about it too hard, I figured I could use static readonly fields to accomplish this goal.

Once I started writing the example code, though, I ran into a bit of a hitch with the generic parameters for the fields’ Func types.

Joseph Albahari (or perhaps his brother and coauthor, Ben) puts it well in C# 3.0 In a Nutshell [which, incidentally, is proving to be a great C# book] (pg. 99):

Generic parameters can be introduced in the declaration of classes, structs, interfaces, delegates [...], and methods. Other constructs such as properties [or fields] cannot introduce a generic parameter, but can use one.

Meaning, if I want to declare a field that contains a generic parameter, that generic parameter has to be declared by the containing type.

Specifically:

public static class Functional
{
    public static readonly Func<Func<X, Y, Z>, Func<X, Func<Y, Z>>> Curry =
        fn => x => y => fn(x, y);
}

Won’t compile. Instead you’ll get a few of these:

error CS0246: The type or namespace name ‘X’ could not be found (are you missing a using directive or an assembly reference?)

You’d need to modify the class definition like so:

public static class Functional<X, Y, Z>

Now we could add the parameters to our Functional class, but then the calling code would be hideous:

Func<int, int, int> add = (x, y) => x + y;
Func<int, Func<int, int>> addCurried = Functional<int, int, int>.Curry(add);

I mean, I know this is C#, but that is just way too much type decoration. Especially since the three type arguments to Functional should all be inferable.

Ideally, the calling code should be an extension method:

Func<int, int, int> add = (x, y) => x + y;
Func<int, Func<int, int>> addCurried = add.Curry();

And then it dawned on me: we can define generic extension methods on a static FunctionalEx class and delegate the implementation to a nested generic class (with generic fields).

That is, we can hide the ugly syntax of invoking a delegate field of a generic class, while utilizing the ugly syntax of implementing our functional methods using lambdas!

public static class FunctionalEx
{
    public static Func<T1, Func<T2, TResult>> Curry<T1, T2, TResult>(this Func<T1, T2, TResult> fn)
    {
        return Implementation<T1, T2, TResult>.curry(fn);
    }

    public static Func<T2, T1, TResult> Flip<T1, T2, TResult>(this Func<T1, T2, TResult> fn)
    {
        return Implementation<T1, T2, TResult>.flip(fn);
    }

    private static class Implementation<X, Y, Z>
    {
        public static readonly Func<Func<X, Y, Z>, Func<X, Func<Y, Z>>> curry = 
            fn => x => y => fn(x, y);
        
        public static readonly Func<Func<X, Y, Z>, Func<Y, X, Z>> flip =
            fn => (y, x) => fn(x, y);
    }
}

Notice how the Curry extension method is implemented by the curry field of the nested generic Implementation class. Also notice how [un-?]readable the lambda implementation is. (Seriously though, if you look at the flip lambda long enough, it should start to make sense.)
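
For instance, here’s flip again with the steps spelled out in comments:

public static readonly Func<Func<X, Y, Z>, Func<Y, X, Z>> flip =
    fn =>             // take the original (X, Y) => Z function...
        (y, x) =>     // ...produce a new function that accepts its arguments flipped...
            fn(x, y); // ...and forward them back to the original in the right order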

Here’s some (far from practical) sample calling code to help:

class Program
{
    static void Main(string[] args)
    {
        Func<int, int, int> add = (x, y) => x + y;
        Func<int, Func<int, int>> addCurried = add.Curry();
        Func<int, int> increment = addCurried(1);

        Func<int, int, int> subtract = (x, y) => x - y;
        Func<int, Func<int, int>> subtractFlipped = subtract.Flip().Curry();
        Func<int, int> decrement = subtractFlipped(1);        

        Console.WriteLine("Expected: {0}; Actual {1}", 5, add(2, 3));
        Console.WriteLine("Expected: {0}; Actual {1}", 7, increment(6));

        Console.WriteLine("Expected: {0}; Actual {1}", 6, subtract(9, 3));
        Console.WriteLine("Expected: {0}; Actual {1}", 4, decrement(5));
    }
}

The output of which is:

Expected: 5; Actual 5
Expected: 7; Actual 7
Expected: 6; Actual 6
Expected: 4; Actual 4

I never did get around to writing that post on functional programming, but I now know how I’m going to implement the library if I do.

If you want to read more on currying in C#, I’d recommend Dustin’s post. In fact I should probably skip writing any further posts on functional programming with C#, since he’s got that topic pretty thoroughly covered.

November 24, 2007

Another set of extension methods

Filed under: csharp, extension methods, Ruby — Jacob @ 3:16 pm

In addition to the interesting diversions we’ve taken, I do want to keep presenting potentially useful code samples. So here’s a fresh set of Ruby-inspired extensions:

public static IEnumerable<IndexValuePair<T>> WithIndex<T>(this IEnumerable<T> source)
{
    int position = 0;
    foreach (T value in source)
        yield return new IndexValuePair<T>(position++, value);
}    

public static void Each<T>(this IEnumerable<T> source, Action<T> action)
{
    foreach (T item in source)
        action(item);
}

public static void EachWithIndex<T>(this IEnumerable<T> source, Action<T, int> action)
{
    Each(WithIndex(source), pair => action(pair.Value, pair.Index));
}

I’ll include the mundane definition of IndexValuePair<T> (along with parameter validation) at the end of this post. But spend some time looking at these very simple methods.

[diagram: the types involved in WithIndex, Each, and EachWithIndex]

Notice how once we’ve defined WithIndex and Each, we can combine them to define EachWithIndex. When we chain Each to the result of WithIndex, the type of the Action must be converted accordingly:

[diagram: wrapping the caller’s Action<T, int> in the Action<IndexValuePair<T>> that Each expects]

This is easily accomplished by the statement:

pair => action(pair.Value, pair.Index)

So with the added definition of IndexValuePair<T> and parameter validation, we add to our collection extensions the following:

using System;
using System.Collections.Generic;

public static class CollectionEx
{
    public static void Each<T>(this IEnumerable<T> source, Action<T> action)
    {
        if (source == null)
            throw new ArgumentNullException("source");
        if (action == null)
            throw new ArgumentNullException("action");

        foreach (T item in source)
            action(item);
    }

    public static void EachWithIndex<T>(this IEnumerable<T> source, Action<T, int> action)
    {
        if (source == null)
            throw new ArgumentNullException("source");
        if (action == null)
            throw new ArgumentNullException("action");

        Each(WithIndexIterator(source), pair => action(pair.Value, pair.Index));
    }

    public static IEnumerable<IndexValuePair<T>> WithIndex<T>(this IEnumerable<T> source)
    {
        if (source == null)
            throw new ArgumentNullException("source");

        return WithIndexIterator(source);
    }

    private static IEnumerable<IndexValuePair<T>> WithIndexIterator<T>(IEnumerable<T> source)
    {
        int position = 0;
        foreach (T value in source)
            yield return new IndexValuePair<T>(position++, value);
    }
}

public struct IndexValuePair<T>
{
    public IndexValuePair(int index, T value)
    {
        m_index = index;
        m_value = value;
    }

    public int Index
    {
        get { return m_index; }
    }
    public T Value
    {
        get { return m_value; }
    }

    readonly int m_index;
    readonly T m_value;
}
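
And a quick usage sketch of the three extensions:

var fruits = new[] { "apple", "banana", "cherry" };

fruits.Each(f => Console.WriteLine(f));

fruits.EachWithIndex((f, i) => Console.WriteLine("{0}: {1}", i, f));
// 0: apple
// 1: banana
// 2: cherry

foreach (var pair in fruits.WithIndex())
    Console.WriteLine("{0} is at index {1}", pair.Value, pair.Index);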

November 19, 2007

Named parameters { Part = 2 }

Filed under: csharp — Jacob @ 7:19 am

Note: the following is presented merely as an interesting diversion. It’s not a technique I’m legitimately proposing to the C# community, though I certainly welcome any feedback you may have.

Continuing our discussion from last time, we’re going to look at what we can do with object initialization syntax to enable immutable type initialization. But first we’re going to need a contrived example immutable type:

public class CoffeeOrder 
{ 
    public string DrinkName 
    { 
        get { return m_drinkName; } 
    } 
    public bool AddSugar 
    { 
        get { return m_addSugar; } 
    } 
    public bool IncludeRoom 
    { 
        get { return m_includeRoom; } 
    } 
    public bool IsDecaf 
    { 
        get { return m_isDecaf; } 
    } 
    public int SizeOunces 
    { 
        get { return m_sizeOunces; } 
    } 

    readonly string m_drinkName; 
    readonly bool m_addSugar; 
    readonly bool m_includeRoom; 
    readonly bool m_isDecaf; 
    readonly int m_sizeOunces; 
}

This is obviously not a post on type design.

So we want to be able to create instances of these using initialization-like syntax. We also don’t want to define a number of constructors to accommodate all of the combinations of optional parameters.

The simplest way we can achieve these goals would be to create a CoffeeOrder constructor that took an object. Then, using reflection, we could examine the properties of any passed-in value to see what fields we should set.

public CoffeeOrder() { } 

public CoffeeOrder(object initializer) 
{ 
    /* reflect over "initializer" and set fields */ 
}

We could then create CoffeeOrders like so:

CoffeeOrder order = new CoffeeOrder(new 
{ 
    DrinkName = "Drip", 
    SizeOunces = 16, 
});

Unfortunately, that wouldn’t be very reusable. Every time we defined a new immutable type, we’d need to rewrite the reflection code. We also aren’t being very clear to the caller about what the passed-in object is supposed to be.

Let’s create an abstraction for the initializer parameter:

public class NamedArgs 
{ 
    public static NamedArgs Create(object initializer) 
    { 
        return new NamedArgs { m_data = initializer, m_datatype = initializer.GetType() }; 
    } 

    object m_data; 
    Type m_datatype; 
}

Our CoffeeOrder constructor signature becomes:

public CoffeeOrder(NamedArgs args)

And the calling code is:

CoffeeOrder order = new CoffeeOrder(NamedArgs.Create(new 
{ 
    DrinkName = "Drip", 
    SizeOunces = 16, 
}));

We’ve improved the CoffeeOrder constructor signature a bit. It’s certainly more descriptive of what is expected. The code to call the constructor is incrementally less beautiful now, though.

But let’s move on and see about actually getting data out of this NamedArgs object.

We’re going to use reflection APIs (System.Reflection) to do so, though we’re only going to scratch the surface of what’s possible. Right now, we’re just going to look up property values by string names. We’ll look at a more interesting way to refer to the properties before we’re done.

public class NamedArgs 
{ 
    // ... 

    public void SetFromArg<TValue>(string argumentName, ref TValue result) 
    { 
        // get the PropertyInfo for the requested argument (if available) 
        PropertyInfo infoProp = m_datatype.GetProperty(argumentName); 
        if (infoProp != null) 
        { 
            // set the result 
            result = (TValue) infoProp.GetValue(m_data, null); 
        } 
    } 
}

Rather than returning default(TValue) when the argument isn’t specified, or creating a TryGetValue sort of API, we’re only going to set the referenced value if the NamedArgs instance contains an argument matching the name. That way, the class that calls this method can set up default values before calling our method and not worry about explicitly handling any unspecified, optional arguments. So let’s update the anemic CoffeeOrder constructor:

public CoffeeOrder(NamedArgs args) 
{ 
    args.SetFromArg("DrinkName", ref m_drinkName); 
    args.SetFromArg("AddSugar", ref m_addSugar); 
    args.SetFromArg("IncludeRoom", ref m_includeRoom); 
    args.SetFromArg("IsDecaf", ref m_isDecaf); 
    args.SetFromArg("SizerOunces", ref m_sizeOunces); 
}

That’s surprisingly inoffensive. I mean, sure, there are a lot of magic strings, but it’s not the worst thing I’ve ever seen.

On the other hand, it does kind of suck. We can improve it, though, using the power of expression trees.

Property accesses are a type of expression (MemberExpression to be specific). We can pass in a lambda that performs property access on an argument and we can treat that expression as data to figure out the name of the member that’s being accessed.

So in the case of our CoffeeOrder constructor calling SetFromArg, we want an expression of type Expression<Func<CoffeeOrder, T>> where T is the target property’s value type. But CoffeeOrder will be the type parameter for all of the expressions, and we don’t want to lose our type inference on the ref parameter (if we specify one type argument to SetFromArg, we have to specify them all). So let’s create a new derivation of NamedArgs to hold that fixed type argument:

public class NamedArgs 
{ 
    /* ... */ 

    public static NamedArgs<T> For<T>(object initializer) 
    { 
        return new NamedArgs<T> { m_data = initializer, m_datatype = initializer.GetType() }; 
    } 
}

public class NamedArgs<T> : NamedArgs 
{ 
    public void SetFromArg<TValue>(Expression<Func<T, TValue>> propertyAccessor, ref TValue result) 
    { 
        // examine the expression tree for member access 
        MemberExpression accessor = propertyAccessor.Body as MemberExpression; 
        if (accessor == null) 
            throw new ArgumentException("Invalid expression type. Expecting member access.", "propertyAccessor"); 

        // set the value from the specified argument name 
        SetFromArg(accessor.Member.Name, ref result); 
    } 
}

We added the For method to the non-generic NamedArgs class just for aesthetics. NamedArgs.For<CoffeeOrder>(/*...*/) looks better than NamedArgs<CoffeeOrder>.For(/*...*/).

Now our CoffeeOrder constructor looks like:

public CoffeeOrder(NamedArgs<CoffeeOrder> args) 
{ 
    args.SetFromArg(c => c.DrinkName, ref m_drinkName); 
    args.SetFromArg(c => c.AddSugar, ref m_addSugar); 
    args.SetFromArg(c => c.IncludeRoom, ref m_includeRoom); 
    args.SetFromArg(c => c.IsDecaf, ref m_isDecaf); 
    args.SetFromArg(c => c.SizeOunces, ref m_sizeOunces); 
}

Which is nice because now we’re even more explicit about what the CoffeeOrder constructor is expecting. And, when we’re writing the calls to SetFromArg, we get intellisense for the available properties.

Our calling code becomes:

CoffeeOrder order = new CoffeeOrder(NamedArgs.For<CoffeeOrder>(new 
{ 
    DrinkName = "Drip", 
    SizeOunces = 16, 
}));

Which is certainly on the verbose side, but not abhorrent.

Closing

We’ve created a type that encapsulates the notion of named parameters with the intent of using it to construct immutable types. We’ve tailored some behaviors specifically to that intended usage, but it’s not closed from use by other methods.

Since anonymous types are immutable, they are thread-safe. The NamedArgs class with its fields that reference immutable type instances is thread-safe as well (though callers will obviously have to protect any values passed as ref parameters).

There are certainly features we could add. One commonly requested feature for immutable types is the ability to initialize one instance from another instance with only a few overridden values. Something like:

public static NamedArgs<T> From<T>(T baseValue, object overrides)

There are also certainly drawbacks to our technique. Some would argue the use of reflection alone is a drawback.

More discerning folks will point out that it’s terribly easy to give incorrectly named arguments. It’s nice that we got intellisense support while implementing the constructor by using expression trees, but we don’t have any such niceties when we’re creating a NamedArgs instance used for calling that constructor.

A more robust solution could certainly validate the property names for generic NamedArgs instances, but that would involve extra reflection.

I hope this post has been enjoyable and inspires other creative solutions. I’d love to hear any feedback you have on this.

Full example source follows:

using System; 
using System.Linq.Expressions; 
using System.Reflection; 

class Program 
{ 
    static void Main(string[] args) 
    { 
        var order1 = new CoffeeOrder(NamedArgs.For<CoffeeOrder>(new 
        { 
            DrinkName = "Drip", 
            SizeOunces = 16, 
        })); 

        var order2 = new CoffeeOrder(NamedArgs.For<CoffeeOrder>(new 
        { 
            DrinkName = "Drip", 
            SizeOunces = 12, 
            IsDecaf = true, 
            AddSugar = true, 
        })); 

        Console.WriteLine(order1); 
        Console.WriteLine(order2); 

        if (System.Diagnostics.Debugger.IsAttached) 
        { 
            Console.Write("Press any key to continue . . . "); 
            Console.ReadKey(true); 
        } 
    } 
} 

public class NamedArgs 
{ 
    public static NamedArgs Create(object initializer) 
    { 
        return new NamedArgs { m_data = initializer, m_datatype = initializer.GetType() }; 
    } 

    public static NamedArgs<T> For<T>(object initializer) 
    { 
        return new NamedArgs<T> { m_data = initializer, m_datatype = initializer.GetType() }; 
    } 

    public void SetFromArg<TValue>(string argumentName, ref TValue result) 
    { 
        // get the PropertyInfo for the requested argument (if available) 
        PropertyInfo infoProp = m_datatype.GetProperty(argumentName); 
        if (infoProp != null) 
        { 
            // set the result 
            result = (TValue) infoProp.GetValue(m_data, null); 
        } 
    } 

    object m_data; 
    Type m_datatype; 
} 

public class NamedArgs<T> : NamedArgs 
{ 
    public void SetFromArg<TValue>(Expression<Func<T, TValue>> propertyAccessor, ref TValue result) 
    { 
        // examine the expression tree for member access 
        MemberExpression accessor = propertyAccessor.Body as MemberExpression; 
        if (accessor == null) 
            throw new ArgumentException("Invalid expression type. Expecting member access.", 
                "propertyAccessor"); 

        // set the value from the specified argument name 
        SetFromArg(accessor.Member.Name, ref result); 
    } 
} 

public class CoffeeOrder 
{ 
    public CoffeeOrder() { } 

    public CoffeeOrder(NamedArgs<CoffeeOrder> args) 
    { 
        args.SetFromArg(c => c.DrinkName, ref m_drinkName); 
        args.SetFromArg(c => c.AddSugar, ref m_addSugar); 
        args.SetFromArg(c => c.IncludeRoom, ref m_includeRoom); 
        args.SetFromArg(c => c.IsDecaf, ref m_isDecaf); 
        args.SetFromArg(c => c.SizeOunces, ref m_sizeOunces); 
    } 

    public string DrinkName 
    { 
        get { return m_drinkName; } 
    } 

    public bool AddSugar 
    { 
        get { return m_addSugar; } 
    } 

    public bool IncludeRoom 
    { 
        get { return m_includeRoom; } 
    } 

    public bool IsDecaf 
    { 
        get { return m_isDecaf; } 
    } 

    public int SizeOunces 
    { 
        get { return m_sizeOunces; } 
    } 

    public override string ToString() 
    { 
        return String.Format("Coffee Order: {0} oz. {1}\n   IsDecaf     = {2}\n" + 
            "   IncludeRoom = {3}\n   AddSugar    = {4}", SizeOunces, DrinkName, 
            IsDecaf, IncludeRoom, AddSugar); 
    } 

    readonly string m_drinkName; 
    readonly bool m_addSugar; 
    readonly bool m_includeRoom; 
    readonly bool m_isDecaf; 
    readonly int m_sizeOunces; 
}

November 17, 2007

Named parameters and immutable type initialization

Filed under: csharp — Jacob @ 3:53 pm

Eric Lippert and Joe Duffy have each started blogging about immutable types (here and here, respectively). I’m a big fan of immutable types as the simplest solution to many thread-safety issues. And regardless of the number of threads your application utilizes, value types should always be immutable (and you should read the Framework Design Guidelines).

But there is a bit of a problem: C# 3.0 introduced "object initialization syntax."

Object initialization

Hopefully you’re familiar with the feature already, but here’s an example:

var settings = new XmlReaderSettings { CloseInput = true, IgnoreWhitespace = true };

XmlReaderSettings only has a default, parameterless constructor. The designers of the type expect you to individually set a number of properties on an instance before handing it off to XmlReader.Create. C# 3.0 now lets you initialize the instance in one statement.

Well, okay: you get to write one statement, but when compiled, that single statement turns back into a multi-step process of creating the instance and then setting the specified properties/fields individually. But it sure makes for nicer looking source.
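
Roughly speaking, the compiler turns that one statement into something like this (a sketch of the lowering, not the exact emitted code):

var temp = new XmlReaderSettings();
temp.CloseInput = true;
temp.IgnoreWhitespace = true;
var settings = temp;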

Unfortunately, this compiler magic also means you cannot use object initialization syntax with immutable types. Since immutable types, by definition, don’t expose property setters, object initialization can’t compile.

Strangely enough, anonymous type instances (for which object initialization was created) are immutable. This post covers some of the changes to support the immutability of anonymous types, including: "Anonymous type construction will no longer be realized as an Object initializer (since the anonymous type is created in a single call), but as a single constructor call."

It’s a shame object initialization wasn’t changed to support immutability, rather than anonymous types being modified to not use object initialization.

Named parameters

Object initialization isn’t unlike the concept of named parameters. In C# (with one caveat), all method/constructor parameters are positional parameters. That means that, while you give your method parameters descriptive names in your method declaration, the caller only cares about matching the order of their arguments to your method signature. Your method’s parameter names have no bearing on the caller.

With named parameters, parameters are allowed to be specified in any order and can even be optional. They can also make the source code more descriptive. Going back to our XmlReaderSettings example, imagine the constructor took a number of boolean parameters:

public XmlReaderSettings(bool closeInput, bool ignoreWhitespace /*, etc. */) { /*...*/ }

Our calling code would look like:

var settings = new XmlReaderSettings(true, true);

That’s obviously silly.

With named parameters, creation would look just like our object initialization example with parentheses instead of curly braces. There wouldn’t be any ambiguity about which properties were being set.

In some languages that don’t support named parameters (such as Ruby), people have used dictionaries to pass name/value pairs to methods. With C#’s collection initialization, a similar technique could be used. But .NET’s strong type system doesn’t support dictionaries with mixed value types very nicely. You could use a Dictionary<string, object>, but then value types would have to be boxed and there’d be casting to get values out… it’d be a mess.
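
Something like this sketch, which shows why that approach gets ugly fast:

var args = new Dictionary<string, object>
{
    { "CloseInput", true },
    { "IgnoreWhitespace", true },
};

// value types were boxed going in, and every read needs a cast coming back out
bool closeInput = (bool) args["CloseInput"];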

Immutable type initialization

Hopefully, the future of C# includes stronger support for immutable types (including initialization scenarios). But until then, we can have some fun abusing anonymous types to meet our immutable type initialization needs. We’ll look at how in the next post.

November 16, 2007

Unnecessary MoveNext: do we care?

Filed under: csharp — Jacob @ 4:59 pm

In the last post we looked at some counts for MoveNext, when iterating IEnumerables returned by Take/Skip. Now we’re going to attempt to answer the question: do we care?

Let’s step back a little bit and strip this code down to the bare essentials:

using System;
using System.Collections.Generic;
using System.Linq;

class Program
{
    static IEnumerable<int> NumbersOneToTen()
    {
        yield return 1;
        yield return 2;
        yield return 3;
        yield return 4;
        yield return 5;
        yield return 6;
        yield return 7;
        yield return 8;
        yield return 9;
        yield return 10;
    }

    static void Main()
    {
        using (IEnumerator<int> enumerator = NumbersOneToTen()
            .GetEnumerator())
        {
            enumerator.MoveNext();
        }
    }
}

Here, we have a console application that merely calls MoveNext on the Enumerator returned by NumbersOneToTen. We’re explicitly controlling the enumeration to avoid any ambiguity introduced by using foreach. For instance, we’ve ensured that the Current property of the enumerator is never accessed.

I’ve set a breakpoint on the line containing MoveNext with the intent to hit F11 (Step Into). [It probably bears mentioning that I'm using Visual Studio 2008 Beta 2.]

When you run that (F5), you’ll see that pressing F11 once steps into the NumbersOneToTen method, pressing it a second time moves to the line with the first yield return, and pressing it a third time returns control back to our Main method (with the enumerator positioned on the first item).

The only potentially surprising thing here is that calling NumbersOneToTen() doesn’t actually step into the method, but that’s the beauty of deferred evaluation.

So, what are some things we could do with the sequence that would make calling MoveNext more interesting?

One interesting thing we can do is project elements from the sequence as different elements. In .NET 2.0, this operation was called ConvertAll. In .NET 3.5, we use the Select extension method.

Let’s modify our Main method a bit:

using (var enumerator = NumbersOneToTen().Select(n => "Number " + n)
    .GetEnumerator())

Here, we’re projecting all of the numbers from 1..10 to the strings "Number 1".."Number 10". Now let’s step through our code again: F11 once steps into NumbersOneToTen; F11 moves to the line with the first yield return; F11 steps into the lambda passed to Select.

I’m not just emphasizing that line because it’s a miraculous feat of the debugger that it can step into embedded statements.

I’m emphasizing that statement because we haven’t accessed Current at all. We’ve only asked to advance the enumerator. Every time we call MoveNext, it appears our projection is evaluated. I was actually surprised by that fact.
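
In hindsight it makes sense: Select behaves essentially like an iterator along these lines (a sketch, not the actual framework source):

static IEnumerable<TResult> Select<TSource, TResult>(
    this IEnumerable<TSource> source, Func<TSource, TResult> selector)
{
    foreach (TSource item in source)
        yield return selector(item); // the projection runs as part of MoveNext,
                                     // whether or not Current is ever read
}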

So, given the example from the last post, if we were to naively introduce our simple projection:

foreach (var batch in NumbersOneToTen().Select(n => "Number " + n).Slice(3))
    foreach (var item in batch)
        Console.WriteLine(item);

We would end up doing 50 unnecessary string concatenations. It’s also not hard to conceive of a projection more expensive than string concatenation.

But we can improve that, right? Certainly the same result is achieved if we project the items after splitting them into batches:

foreach (var batch in NumbersOneToTen().Slice(3))
    foreach (var item in batch.Select(n => "Number " + n))
        Console.WriteLine(item);

There! Now we’re no longer unnecessarily projecting elements when splitting into batches; we only execute the projection as we consume the batch’s items.

But projection isn’t the only operation we can perform on sequences. Filtering is pretty useful, too. In .NET 2.0, we used FindAll; in 3.5, we use Where:

var evens = NumbersOneToTen().Where(n => n % 2 == 0);

Here, we’re filtering the sequence of numbers from 1..10 to a sequence containing only the even numbers.

Unfortunately, this is also an example of a case where we can’t move the sequence operation into the item-consuming loop. If we filter the outer sequence (prior to slicing), we get two slices: { 2, 4, 6 }, { 8, 10 }. However, if we filter at the item-consuming level, we get four slices, none of which are the desired size (3): { 2 }, { 4, 6 }, { 8 }, { 10 }.
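
In code, the two placements look like this (using the same Slice extension from the last post):

// filter before slicing: two slices, { 2, 4, 6 } and { 8, 10 }
foreach (var batch in NumbersOneToTen().Where(n => n % 2 == 0).Slice(3))
    foreach (var item in batch)
        Console.WriteLine(item);

// filter while consuming: four undersized slices, { 2 }, { 4, 6 }, { 8 }, { 10 }
foreach (var batch in NumbersOneToTen().Slice(3))
    foreach (var item in batch.Where(n => n % 2 == 0))
        Console.WriteLine(item);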

So, given a Take/Skip-based implementation of Slice with a filtered sequence, we’re stuck with 50 unnecessary predicate evaluations.

That’s enough to convince me that Take/Skip should be avoided in this context, despite the elegance of the proposed implementation.

Slice revisited

Filed under: csharp — Jacob @ 10:43 am

So Dustin responded to my last post in the comments of his original post with the following code snippet:

public static bool IsEmpty<T>(this IEnumerable<T> sequence)
{
    if (sequence == null)
        throw new ArgumentNullException("sequence");

    if (sequence.GetEnumerator().MoveNext())
        return false;

    return true;
}

public static IEnumerable<IEnumerable<T>> Slice<T>(this IEnumerable<T> sequence, int size)
{
    if (sequence == null)
        throw new ArgumentNullException("sequence");
    if (size <= 0)
        throw new ArgumentOutOfRangeException("size");

    while (!sequence.IsEmpty())
    {
        yield return sequence.Take(size);
        sequence = sequence.Skip(size);
    }
}

First of all: wow. That is remarkably succinct and elegant code capturing what we’re expressing with slice. (The parameter validation won’t occur until enumeration begins, but that’s easy to change.)

The trouble is that Take and Skip hide a lot of potentially expensive, unnecessary enumeration.

To see what’s happening, we can create an iterator method to represent our source sequence:

public static IEnumerable<int> NumbersOneToTen()
{
    yield return 1;
    yield return 2;
    yield return 3;
    yield return 4;
    yield return 5;
    yield return 6;
    yield return 7;
    yield return 8;
    yield return 9;
    yield return 10;
}

Now, rather than calling slice with an array, we can set a breakpoint in this method and actually step through the calls to MoveNext(). [Note: this is a little unfair because some methods that accept an IEnumerable actually internally optimize for situations where they have, say, an ICollection/IList (for example, Reverse). This does not appear to be the case with Take/Skip, though.]

Here are the tallies for MoveNext() when executing the following code:

foreach (var slice in NumbersOneToTen().Slice(3))
    foreach (var item in slice)
        Console.WriteLine(item);

                        MoveNexts   Total
IsEmpty [false]                 1       1
First slice [1, 2, 3]           3       4
IsEmpty [false]                 4       8
Second slice [4, 5, 6]          6      14
IsEmpty [false]                 7      21
Third slice [7, 8, 9]           9      30
IsEmpty [false]                10      40
Fourth slice [10]              11      51
IsEmpty [true]                 11      62

It’s of some note that all of the slices, except the last, require n calls to MoveNext where n is the last element of the slice. Returning the last slice (as well as checking IsEmpty at the end) requires n+1 MoveNexts, because Take asks for more elements than there are remaining in the sequence. The 11th MoveNext simply returns false.

Now, I certainly don’t think Take or Skip are inherently evil. There are certainly contexts in which their use is completely benign.

But, the question emerges: do we care about 62 calls to MoveNext when 11 will do? Comments?
