Thursday 30 July 2009

Enhanced Weak Events Part Two - Immutable Event Entry List

Posts in this series

- Part One: DynamicMethod vs. Expression Tree Compilation
- Part Two: Immutable Event Entry List (this post)
- Part Three: Synchronized Events (to be written)
- Part Four: Custom Events and unsolved problems (to be written)
- Dynamic Code Generation With Lambda Expressions (to be written)

C# Events - Basics

Just for a start, let me reinforce some preliminary definitions in an very factual way.

  • Delegates are tuples composed by a reference (A) to a method, and a reference (B) to a target object.
    - All delegates are made immutable;
    - The reference (B) will be null if the method is static;
    - When a delegate is invoked, the referenced method is called. If the method is an instance method, the target object reference is used;
  • Multicast Delegates encapsulate a collection of delegates.
    - When invoked, all of its delegates are invoked in an unspecified order;
    - Multicast delegates can only be created by composition or decomposition. Composition is the merging of two delegates into a new multicast delegate. Decomposition is a subtraction of one delegate from another;
    - There's no such thing as an empty multicastdelegate. If a decomposition would create one, the result is null;
    - For all purposes, a multicast delegate is a delegate;
  • C# Events are nothing more than a special type of property that have add and remove accessors instead of get and set.
    - Only events of a delegate type can be declared;
    - Due to the special accessors, only the class that owns the event can reset its value or invoke its delegate, if it's not null;
    - When add is called, the current delegate is composed with the delegate that was provided;
    - Conversely, remove does a decomposition;

C# Events - Not So Basics

Having said that, let's dive deeper. When it comes to c# events, there are two interesting problems that can be easily overseen. The simpler one is that an event is always initialized with null, and they can always be decomposed to null later on. In other words, an event can be null at any time. Since it's not possible to invoke a null delegate (without causing a NullReferenceException), any class that provides an event has be to sure that the event is not null before invoking it.

The other problem appears in multithreaded applications. A thread might be adding or removeing an event handler while the event is in the middle of an invocation. This is a race situation, and has to be correctly handled. If you want to check a complete explanation on this subject, I recommend you to read Eric's post on events and races.The generally recommended solution to both problems is the following pattern:
var handler = this.MyEvent;
if (handler != null) handler(this, eventArgs);

Now, let's move to main subject of this post.

Mutable Event Entry List

As you may already have noticed, my custom events were born from Daniel Grunwald's WeakEvents. Internally, to represent each event entry* he chose to use a List<EventEntry>. Using regular a List<T> seems very convenient. It will make add and remove operations simple and quick to execute, but there is a downside. List<T>'s are mutable.

One of the major disadvantages of mutable objects appear on multithreaded applications. Consider the problems that I've introduced above. They apply to custom events too. Alas, that invocation pattern can't be used.

One goo point is that even an empty WeakEvent will never be null, so we don't event-is-null problem. The concurrency problem, though, can't be avoided. As the event entry list is mutable, any access to it should be synchronized if we want to make the custom event thread-safe.

Looking at Grunwald's code that makes the invocation of a WeakEvent, I noticed that the entry list is copied to an array before invoking the entries. No locks are made. Is that enough? Unfortunately, the answer is no. Copying the list to an array is not an atomic operation. If the list is modified in the middle of the copy, things will probably get messy. Even if he had enclosed the copy in a lock block, there would still be the performance problem caused by the copy itself.

* An event entry is an abstraction of an event handler delegate. Weak event entries, for example, serve the same purpose of a regular delegate. The sole difference being that it does not keep a strong reference to the target object.

Immutable Event Entry List

Naturally, my suggestion is to use an immutable list to store event entries. By the way, as I mentioned in the beginning, all delegates are immutable. And regular c# events use immutable multicast delegates internally. There's probably a good reason they did it that way, and I don't see why custom events should differ. Let's consider some aspects of this decision.

  • Lower invocation overhead: There's no need to copy event entries before invoking them because the list does never change. All that's needed before the invoking is to read the current list object and store a reference to it in a local variable. If the event is changed by other thread, a new list will be created, but the invocation will be working on an untouched list object.

    The only possible problem may happen when an handler is removed, but an invocation is being made. The result is that the handler is invoked even after it has been removed from the event. Actually, this is a third interesting problem. The post from Eric Lippert (events and races) that I've suggested earlier talks about this stuff too. If you read it, you'll see that my custom events are actually imitating the exactly same behavior that regular c# events have.

    Unfortunately, there is also a trade-off involved. Add and remove operations will be inevitably slower. Each time they happen, a new list has to be created, and a partial copy of the current entries will be made. Personally, invoking performance has higher priority than adding or removing performance, so I think this trade-off is acceptable. If anyone disagrees, I'll be very happy to hear your thoughts.

  • Thread-safety: Invoking an event when its entry list is immutable completely thread-safe without any need for locks. That's a great win.

    Add and remove, on the other hand, are not thread safe. I can say as an excuse that the mutable-list-approach is not thread-safe too. At least, I did not created a problem that was not there before.

    Looking at regular events, we can see that even they are not thread-safe when it comes to add and remove. Would the C# language team let so big a hole in the language? Of course not. The solution is hidden from the programmer's eyes. Whenever you declare a regular c# event, the compiler generates the two accessors (add and remove). Both of them are them decorated with this custom attribute:[System.Runtime.CompilerServices.MethodImpl(MethodImplOptions.Synchronized)]. You can make a test assembly and decompile it to see. Or you can read this old MSDN article too.

    So, in my opinion, locking should be avoided unless there's no alternative. In a reusable component (like custom events), it's preferable to let it be not thread-safe and free of locks than forcing an overhead that is not always required. A developer that uses your component should be able to use any synchronization mechanism he wants, and only if needed. Inside the .NET Framework, almost all collection implementations are not synchronized for this reason. Following the same pattern, I won't put any locks on my Custom Events. User developers should be aware of that, and put locks around add/remove it they need it.

Conclusion

In a world where multi-core processors are everywhere, multithreading and immutability are increasingly important. If you want to read more about immutability in C#, you might want to check this out.

If you read this far, you probably are interested in events. A very extensive article you might want to read is on codeproject.

Finally, as usual, here is the download link for the latest version (v1.1) the my CustomEvents' source code. See ya!

Tuesday 14 July 2009

Enhanced Weak Events Part One - DynamicMethod vs. Expression Tree Compilation

Introduction / Background

In my last post some months ago I showed a use case for weak events*. Since then, I've ran into some other interesting event problems, and that gave me time to make some improvements over Daniel's code.
* If you're not familiar with the concept of weak events, I believe the best introduction you can find is Daniel Grunwald's article.

Weak Events Enhancements Part One: Faster Forwarder Delegates / DynamicMethod vs. Lambda Expression Compilation

The first improvement that I made is, in fact, the last that I've implemented. In Daniel's final solution (FastSmartWeakEvent), a forwarder method is compiled using System.Reflection.Emit.DynamicMethod. Although this is a very light way to generate code dynamically, it's very low level. Usually, the generating code is quite unreadable and very hard to debug, to modify or to extend. Daniel's first version, for example, had a type safety issue. I imagine that it was hard to find, and the solution is not obvious to understand (at least it wasn't obvious for me).

For some time I was wondering if there was any higher level alternative to a raw DynamicMethod. The usual way to hide low level code is by encapsulating it inside a higher level framework. I don't remember how the idea came to me at the time, but I realized that .NET Framework 3.5 already contains what I was looking for: Expression Trees.

Expressions trees can be composed dynamically, and compiled to a delegate whenever they are ready. I tried it out, and developed a new WeakEventForwarderProvider class. The implementation details have several interesting catches. In fact, I think there are enough of them that I'll cover them in another post. Until then, feel free to check out the code. For now, I'll share the results that I've found.

Maintainability: Readability is a key feature of a maintainable code, but opinions might always diverge about it. It depends on coding style, and my style is somewhat unusual. Even though I admit the new code is not easy to read, I feel that it has improved a bit from the original; and there is still room for improvement. Debugging, on the other hand, has certainly been improved. The main reason for this is that the debugger is able to show any intermediate expression as nice programmer-readable strings. They can also be converted to string. So, it's much easier to know what is being generated.

Correctness: Using DynamicMethod, it's possible to generate any sequence of IL commands. One obvious advantage of this is a very high flexibility. The trade-off is that it's too easy to generate broken code. Expression trees work as middle ground solution. Its major restriction is that statements are not allowed inside expressions. Fortunately, that can be worked around, specially when generating small pieces of code (I promise to explain how in a future post). A great advantage expressions do provide is a very good validation at dynamic-code-generation-time. The type safety bug that leaked from Daniel's first version of WeakEvents was found and fixed very easily using expressions.

Compiling Performance: I knew that expression compilation uses DynamicMethod internally, so I took for granted that it would be slower to compile. Surprisingly enough, the results that I've measured were close to a tie. Sometimes it appeared that lambda compilation was faster. This is still puzzling to me. While I don't find a good explanation, I'll let the topic open for discussion. My best guess is that the overhead that expression trees add is so much smaller than the overhead of emitting the code that, proportionately, the first can be disregarded.

Executing Performance: Code execution performance is a tricky subject. You can always try to tweak assembly code to make it faster. You can do it at IL level with DynamicMethod too. But that generally is a bad idea. To quote Donald Knuth, premature optimization is the root of all evil. I've found out that the expression tree compiler is quite good. My tests have shown that my expression-generated delegates were about 15% faster to execute than the ones that Daniel has got. That's better than I've expected.

Performance-wise, I came to a conclusion that simply reinforces one of my software beliefs. A well-built framework will not make user code slower. Actually, some remarkable cases are capable to make user code even faster. In my opinion, instead of optimizing your code, find a good library or framework to do the job for you. If you can't find it, do not optimize. At least not now. Considering all aspects, my expression tree solution was very satisfactory. And it was fun to code too.

Following Posts

These are the next posts that I'm planning to write on Weak Events:
- Events Part Two: Immutable event entry list;
- Events Part Three: Synchronized Events
- Events Part Four: Custom Events and unsolved problems
- Dynamic Code Generation With Lambda Expressions

Source Code

Download Custom Events source code here (v1.1 @ 2008.07.31). It contains stuff that I've not discussed yet. If you're eager to see it, feel free to delve into the code. See you soon (I hope :)).

Updated at 2009.08.03: Source code link updated.