Last time I discussed how two lists of financial transactions can be matched by defining a custom extension method. The used types were hard-coded, which is fine as long as you don’t plan to reuse the code.
It is not hard to generalize the extension method using generic type parameters. These would be TFirst
for elements of the first list, TSecond
for elements of the second list, and TResult
for elements of the resulting enumeration. The signature of the generic match
method would then be
public static IEnumerable<TResult> Match<TFirst, TSecond, TResult>( this IEnumerable<TFirst> first, IEnumerable<TSecond> second, Func<TFirst, TSecond, int> compareTo, Func<TFirst, TSecond, TResult> matchSelector, Func<TFirst, TResult> firstSelector, Func<TSecond, TResult> secondSelector )
The first two parameters are the lists to compare. The elements are compared with the compareTo
delegate. The remaining three parameters are delegates that yield the resulting elements. Depending on the result of compareTo
, the appropriate selector is called.
The method body is structurally identical to the non-generic version. The two specific differences are
- Instead of the hard-coded comparison of transaction amounts, the comparion delegate is invoked
- Instead of calling the constructors of the return types, the selector delegates are invoked
using (var enum1 = first.GetEnumerator()) using (var enum2 = second.GetEnumerator()) { // Move both enumerators to first element bool eoc1 = !enum1.MoveNext(); // end of first collection? bool eoc2 = !enum2.MoveNext(); // end of second collection? // Match sequences until end of one collection is reached while (!eoc1 && !eoc2) { // Compare current elements var cur1 = enum1.Current; var cur2 = enum2.Current; int cmp = compareTo(cur1, cur2); // Yield the next TResult element and advance iterator(s) if (cmp < 0) { yield return firstSelector(cur1); eoc1 = !enum1.MoveNext(); } else if (cmp > 0) { yield return secondSelector(cur2); eoc2 = !enum2.MoveNext(); } else { yield return matchSelector(cur1, cur2); eoc1 = !enum1.MoveNext(); eoc2 = !enum2.MoveNext(); } } // Yield rest of first sequence while (!eoc1) { yield return firstSelector(enum1.Current); eoc1 = !enum1.MoveNext(); } // Yield rest of second sequence while (!eoc2) { yield return secondSelector(enum2.Current); eoc2 = !enum2.MoveNext(); } } }
At first it may seem that we do not need three different selectors. Instead of firstSelector
and secondSelector
one could call matchSelector
with default(TFirst)
or default(TSecond)
, respectively, as one argument. This general approach is taken in the related ZipLongest
and ZipShortest
extensions methods in the MoreLinq LINQ extension library. However, if the type’s default values are viable elements of the input lists, there is no good way to discriminate between these values coming from the input or being created in the matching. Also, matchSelector
is called with two values that would not necessarily be compared as equal. If you rely on this fact, either in the code of matchSelector
itself, or in the use of the resulting object, you are out of luck.
However, it seems reasonable to implement an overload that discards unmatched elements. Sure enough, one selector is sufficient now.
public static IEnumerable<TResult> Match<TFirst, TSecond, TResult>( this IEnumerable<TFirst> first, IEnumerable<TSecond> second, Func<TFirst, TSecond, int> compareTo, Func<TFirst, TSecond, TResult> matchSelector ) { using (var enum1 = first.GetEnumerator()) using (var enum2 = second.GetEnumerator()) { // Move both enumerators to first element if (!enum1.MoveNext() || !enum2.MoveNext()) yield break; // Match sequences until end of one collection is reached while (true) { // Compare current elements var cur1 = enum1.Current; var cur2 = enum2.Current; int cmp = compareTo(cur1, cur2); // Yield the next TResult element and advance iterator(s) if (cmp < 0) { if (!enum1.MoveNext()) yield break; } else if (cmp > 0) { if (!enum2.MoveNext()) yield break; } else { yield return matchSelector(cur1, cur2); if (!enum1.MoveNext() || !enum2.MoveNext()) yield break; } } } }
For a final variation, imagine that the input types implement the IComparable
interface. This enables us to get rid of the comparator delegate.
public static IEnumerable<TResult> Match<TFirst, TSecond, TResult>( this IEnumerable<TFirst> first, IEnumerable<TSecond> second, Func<TFirst, TSecond, TResult> matchSelector, Func<TFirst, TResult> firstSelector, Func<TSecond, TResult> secondSelector ) where TFirst : IComparable<TSecond> { return Match(first, second, (f, s) => f.CompareTo(s), matchSelector, firstSelector, secondSelector); }
Here TFirst
is supposed to implement the comparison.
Now this variation seems less satisfactory. In order to keep the symmetry between the two input types, one would need another overload with a type constraint where TSecond : IComparable<TFirst>
. However, generic type constraints are not part of the method signature, so this is not possible. Moreover, a comparison delegate is more in the spirit of LINQ and offers the highest flexibility.