Just recently, after deploying a web application to a testing environment, I was approached by one of the testers complaining of an “Out of Memory” exception.

The application in question was a port of a legacy system (written in JSP and C++) from an unsupported Windows 2000 server to a more future-resistant .NET solution on a later, supported, version of Windows Server. Much of the original functionality was to be retained, and this was facilitated by the ability to copy/paste a large chunk of the original code (particularly the client-side JavaScript and static HTML).

However, some areas required developing from the ground up. One of these areas was the code that XML-formatted the data, and performed the XSLT transforms upon it. A particular headache was to get the whitespace recognition and treatment on the .NET platform to work the same as the older JSP system. After hours of trying different settings, in the end I was forced to reformat the underlying XSLT file just to get the screens looking the same. However, I digress…

With most of the major hurdles out of the way, and the application delivered to testing, it was quite a disappointment to find a memory leak as the cause of the “Out of Memory” exception, especially at the scale I was noticing (50K+ per web page post).

I was pretty sure I had cleaned up unmanaged resources as required, but went through my code with a fine-tooth comb to see if there was anything I had missed. Eventually I fired up perfmon and stepped through the code line-by-line, looking for where significant memory consumption occurred.

[Image: perfmon_xsl]

I happened to notice significant memory consumption when stepping over the code that loaded the XSLT and performed the transformation (the equivalent to lines 5 and 10 in the sample below):

var xslCompiledDoc = new XslCompiledTransform(true);
//....
//....get myXSLDoc from some XSL file
//....
xslCompiledDoc.Load(myXSLDoc);
//....
//....get xmlSource and instantiate an XmlWriter
//....
//Memory leak on this line!!!
xslCompiledDoc.Transform(xmlSource, xmlWriter);

This restored my ego a little. It looked like the problem might be some side-effect of XslCompiledTransform, and not the result of overly crap code. It also meant that I had pinpointed where exactly the problem lay, even if I didn’t know how to solve it.

As it turned out, no single fix solved the problem, but the investigation did lay bare an interesting problem with the System.Xml.Xsl.XslCompiledTransform .NET class. When a stylesheet is loaded, this class takes the XSLT that is passed to it and compiles it down to IL. The problem was that when the client application was run in debug mode, both the load and the transformation leaked memory.

So here are the steps that I took to seal the leak:

1. XmlWriter Cleanup

Although the leak wasn’t all my fault, I may have been partially responsible. My first action was to ensure that the XmlWriter, a wrapper around unmanaged resources, was cleaned up correctly:

using (XmlWriter xmlWriter = XmlWriter.Create(xmlStringBuilder, xmlWriterSettings))
{
    xslCompiledDoc.Transform(xmlSource, xmlWriter);
}

Wrapping the object in a using statement like this is considered all-round good practice for any class that wraps unmanaged resources and implements IDisposable.

2. Debug Option Off

One overload of the XslCompiledTransform constructor accepts a boolean parameter indicating whether or not to enable debug information, which makes it possible to debug the stylesheet itself.

You’ll note from my original code that I had this option set to true, which caused the load and transform to leak memory. The fix was to pass false explicitly, although I could have used the default parameterless constructor instead.

3. Caching in Session

The XslCompiledTransform class was designed to be initialised once (i.e. one call to the Load method), and for that instance to be used for the duration of the application. When you think about it, there is little point in compiling XSLT to IL on every occasion, unless the XSLT content itself is variable.

My code contained two static XSLT files, so I chose to cache these stylesheets in session once initialised. Technically, they could have been stored once at application level and that one instance reused for every user. I must confess that this is not an option I considered at the time, but it is something I can always revisit if needs be.
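For what it's worth, here is a minimal sketch of the application-level option, not the code I actually deployed. The names TransformCache and Get are hypothetical. It assumes .NET 4 or later (for ConcurrentDictionary and Lazy) and leans on the documented behaviour that an XslCompiledTransform can be shared between threads once it has been loaded.

```csharp
using System;
using System.Collections.Concurrent;
using System.Xml.Xsl;

// Hypothetical helper: one compiled transform per stylesheet path,
// shared by every request in the application.
public static class TransformCache
{
    private static readonly ConcurrentDictionary<string, Lazy<XslCompiledTransform>> Cache =
        new ConcurrentDictionary<string, Lazy<XslCompiledTransform>>(StringComparer.OrdinalIgnoreCase);

    public static XslCompiledTransform Get(string stylesheetPath)
    {
        // Lazy<T> ensures the stylesheet is compiled once, even if two
        // requests ask for the same path at the same moment.
        return Cache.GetOrAdd(stylesheetPath, path => new Lazy<XslCompiledTransform>(() =>
        {
            var transform = new XslCompiledTransform(false); // debug off, as in step 2
            transform.Load(path);
            return transform;
        })).Value;
    }
}
```

A page would then call TransformCache.Get with the stylesheet path and go straight to Transform, so the compilation cost is paid once per stylesheet rather than once per session.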

4. Release Mode DLL

The new version of the application was to be hosted within existing application infrastructure. Essentially, the port of the old application was to be a new web page within an existing website.

This existing website had been deployed in debug mode. Changing every module of a large website to release mode was not an option, so I moved all the supporting code out into a dedicated DLL that itself could be deployed in release mode on its own.

5. Precompiling XSL

After doing all this, however, I was still observing a significant memory leak upon each page post (between 10K and 13K). The memory leak was now confined to the transform, but it was still occurring.

The last step, which eventually fixed the issue, was to follow the advice set out at the end of this article:

http://blogs.msdn.com/b/xmlteam/archive/2011/09/26/effective-xml-part-5-something-went-really-wrong-outofmemoryexception-and-stackoverflowexception-thrown-when-using-xslcompiledtransform.aspx

Another way to resolve the problem is to compile the stylesheet “offline” (i.e. not at runtime) with the xsltc (xslt compiler, more details in MSDN http://msdn.microsoft.com/en-us/library/bb399405.aspx). This tool will create “regular” assemblies (.dll files) for the Xslt stylesheet and the embedded scripts. To be able to use them in your app you just need to add references. The memory leak problem is resolved – there is no compilation going on at runtime so no temporary assemblies are created.

At the time of writing this was required to be a manual build step. The actions that I took specifically were to compile each of my XSLT files (input.xsl and output.xsl) into DLLs with equivalent names (InputXSL.dll and OutputXSL.dll) and internal class names (InputXSL and OutputXSL). This was done with the following syntax:

xsltc.exe /class:InputXSL /out:InputXSL.dll input.xsl

Note: the xsltc.exe executable took some finding. The one I needed was located in C:/Program Files/Microsoft SDKs/Windows/v7.0A/bin; however, this will depend on the .NET framework version you are using. If you don’t have the framework SDK installed then you will need to install it.

Once the DLLs were compiled, I referenced them from the new component I created in step 4 above. This gave the code access to the internal class names just stipulated.

Finally, I updated the code itself to pass the respective stylesheet class to the XslCompiledTransform Load() method:

xslCompiledDoc.Load(typeof(InputXSL));

When this last step was completed, the memory profile of the application remained constant, and there were no observable memory leaks.

 


Ok

It’s been a little time since the last post, but I always resolved to finish this series, so here we go with Episode 7.

Intro

The subject of Variance in computer science is well documented and well described, and you’ll find scores of articles on the subject if you care to enter the term into your favourite search engine. I’ve included a selection of links at the bottom which may assist you in further reading.

So, why write another series on this when there are so many already? Well, unfortunately, I Just Didn’t Understand them. Let’s be clear on this: I found Covariance and Contravariance hard to comprehend to any level with which I was satisfied, and sometimes the best way of understanding anything is to try to explain it to someone else (1). In fact, the ‘someone else’ in question is a colleague of mine, James, who might be a little surprised when he reads this.

The term Variance in relation to computing has its root in the mathematical area of Category Theory (2). The challenge is to convey the language of this domain to those who have never studied or even heard of it.

So, this article will try to describe these topics from a novice programmer’s perspective. It’s also important to note that the explanation won’t be entirely correct, but it’ll do for now.

A ‘Relationship’

We’ve spoken at length about Casting and the notion of Assignment Compatibility. Generally speaking, the former can be achieved if the latter is met. More profoundly, Casting is an operation, whereas Assignment Compatibility describes a relationship. This is exactly what the terms Covariant and Contravariant do – they describe a Relationship.

The previous articles had hinted at Variance as a future topic, with the message “we’re not quite there yet”, however that was a lie, because Assignment Compatibility is a special kind of Variance. If we can understand our rules for the former we find it easier to grasp the latter.

But, whereas Assignment Compatibility describes the relationship between two related Types, Variance describes the relationship between two ‘Users Of’ different types. To understand what we might mean by ‘Users Of’, simply replace the word ‘Users’ with a C# collection of your choice. For example:

  • Variance describes the relationship between two ‘Lists’ of different Types (3).
  • Variance describes the relationship between two ‘Arrays’ of different Types.

Although the above implies collections, we can also talk about Delegates and Interfaces, which aren’t strictly speaking ‘Collections Of’, but can certainly be ‘Users Of’ the Types.

In short, Variance describes the relationship between an instance of a “Thing that uses a Type”, and another instance of an “identical Thing that uses a different Type.”

Whilst, in theory, Variance describes the relationship between any such ‘Users Of’ two different types, in reality C# limits the ‘Users Of’ to three particular areas:

  • Arrays
  • Delegates
  • Generic Interfaces

We’ll cover them in more detail later, but it’s enough to highlight for now that Lists, Dictionaries, and other strongly typed collections do not support Variance, and the same is true for most Return Types and Function Parameters. Again, we’ll visit this a little later.

Finally, it’s worth stating up front the different kinds of variance we can perceive. Relationships between two identical things that use different types can be described as:
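To make the three areas a little more concrete, here is a rough sketch that touches each of them once. It assumes C# 4 or later (where generic interfaces and delegates gained variance). The Feline and Lion classes and the VarianceSampler name are made up purely for illustration.

```csharp
using System;
using System.Collections.Generic;

public class Feline { }
public class Lion : Feline { }

public static class VarianceSampler
{
    public static string Run()
    {
        // Arrays: a Lion[] can be used where a Feline[] is expected.
        Feline[] felineArray = new Lion[1];

        // Generic interfaces: IEnumerable<T> is declared with 'out T',
        // so a sequence of Lions can be used as a sequence of Felines.
        IEnumerable<Feline> felines = new List<Lion>();

        // Delegates: Action<T> is declared with 'in T', so something that
        // handles any Feline can be used where a Lion handler is expected.
        string seen = null;
        Action<Feline> handleFeline = f => seen = f.GetType().Name;
        Action<Lion> handleLion = handleFeline;
        handleLion(new Lion());

        // List<T> has no variance, so this line would not compile:
        // List<Feline> felineList = new List<Lion>();

        return seen;
    }
}
```

The commented-out line is the important one: the same assignment that works for an array or an IEnumerable is rejected for a List.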

  • Covariant
  • Contravariant
  • Invariant

We’ll leave it at that for the moment, and leave the description of these to later posts. And with that…

Next up – Types of Covariance…


(1) My friend, Ian Ozsvald, reckons that presenting on a topic is the best way of refining knowledge. In this respect, I would include “writing” as a form of presentation.

(2) The term Variance is also used in Statistical and Probability Theory, which is different.

(3) There is no variance on the generic List type in C#. I just wanted to get across a concept.

 


Our next stop in this series concludes our look at Casting by examining Downcasting.

Downcasting is the process by which we cast a reference of a parent class down to one of its derived classes. This is the opposite of Upcasting, and once again we have deliberately avoided the use of the term Base Class because, frankly, it confuses matters. The whole concept of Downcasting from a Base doesn’t make sense from a lexical perspective.

So let’s get on with this and look at a Downcast:

B b = new B();
S s = b; //won't compile
S s2 = (S)b;

The first thing to observe is we have to do this explicitly. Why? Because Downcasting forgoes the compiler’s ability to enforce type safety. The compiler doesn’t like this – it makes it nervous, so we have to reassure it and promise that everything will be OK.

Unfortunately, our code above has broken that promise. The last of the above snippets S s2 = (S)b; compiles, but when we run the code, an InvalidCastException is thrown.

To perform a successful Downcast at runtime, the underlying object must be of the type we are Downcasting to. In other words we must at some point have performed an Upcast on this variable, so think of this like a ‘back-downcast’:

B b = new S();
//lines of code
S s2 = (S)b;
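If we aren’t sure what the underlying object is, we can ask before committing to the cast. The sketch below uses the standard C# ‘as’ operator, which hands back null rather than throwing. The DowncastChecks and TryDowncast names are my own invention for the example.

```csharp
using System;

public class B { }
public class S : B { }

public static class DowncastChecks
{
    // Returns true only when the object behind the B reference really is an S.
    public static bool TryDowncast(B candidate, out S result)
    {
        result = candidate as S;   // 'as' yields null instead of an InvalidCastException
        return result != null;
    }
}
```

Calling TryDowncast with a B variable that holds a new S() succeeds, whereas passing a plain new B() returns false and leaves the result null, which is the same broken promise as before but without the exception.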

But why even bother to Upcast if we are eventually going to Downcast?

Firstly, it is considered good Object-Oriented programming practice to Program To An Interface, Not An Implementation. Although all our examples have been Casting between a Parent class and its Derived classes, we could have been Casting between a class and an Interface that it implements. The Parent class could also have been some Abstract class that served a similar purpose (*).

Secondly, we might not be in control of the initial Cast. We could have sent our object into an API on a third-party library that required it to be Upcast. When extracting this object back out of the API, it might have returned the reference in terms of the Parent class (or Interface). At this point we may have to Downcast it to get any meaningful use out of it.
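As a rough illustration of that second scenario, imagine a library that only deals in the Parent type. The Registry class below is a hypothetical stand-in for such an API, not a real library, and the Name property is only there to give us something that S has and B doesn’t.

```csharp
using System.Collections.Generic;

public class B { }
public class S : B
{
    public string Name { get { return "I am an S"; } }
}

// Hypothetical stand-in for a third-party API that only knows about B.
public static class Registry
{
    private static readonly Dictionary<string, B> Items = new Dictionary<string, B>();

    public static void Store(string key, B item) { Items[key] = item; }  // caller's S is Upcast here
    public static B Fetch(string key) { return Items[key]; }             // comes back as a B
}

public static class RegistryClient
{
    public static string RoundTrip()
    {
        Registry.Store("mine", new S());
        S s = (S)Registry.Fetch("mine");   // Downcast to get at S-specific members
        return s.Name;
    }
}
```

The Downcast is safe here only because we know we put an S in to begin with.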

The restrictions upon Downcasting and the desire to ensure type safety are important principles when trying to understand Variance. By knowing that we cannot perform an implicit Downcast we have a head start as we move to the next section.

We’ve now come to the end of our discussion on Casting. To be frank I’m slightly surprised that it’s taken up six posts, but hopefully it’s all been worthwhile. Armed with all our facts we can now move on into the mystifying and baffling world of Variance….


(*) For recommendations of when to use Abstract Classes and when to use Interfaces, refer to the Microsoft guidelines on this subject: Choosing Between Classes and Interfaces.


After our merry diversion into Explicit Upcasting corner-cases we now return to make a few more observations about Upcasting in general, and how it is implemented behind the scenes.

In C#, when we create an instance of a class, its data is stored in memory, and the variable instance (object) references this memory. This is deliberately vague – it tells us roughly what is going on without the how, and is distinct from some lower level languages whose terminology and constructs give us some indication of the how.

Languages such as C or C++ make use of pointers, which are numeric values that ‘point’ to memory addresses. A pointer is much like a firearm – it has great power, but has the ability to cause great harm (and if it doesn’t solve your problem you need to use some more(*)).

In C#, this is abstracted away somewhat from the developer, so we are deliberately removed from some of this peril. We talk about references and memory rather than pointers and addresses. Eric Lippert, principal developer on the C# compiler team, prefers to think of references as “Opaque handles that are meaningful only to the garbage collector” (link).

And, keeping true to this sentiment, in the formal specification for the .NET Common Language Infrastructure (CLI), we are told what we must do, not how to do it. Refer to page 132 of the formal CLI standard (ECMA-335) for the exact text:

Objects of instantiated types shall carry sufficient information to recover at runtime their exact type (including the types and number of their generic arguments). [Rationale: This is required to correctly implement casting and instance-of testing, as well as in reflection capabilities]

Awwwww, it even mentions casting!

So, we can build up a picture of what’s going on. We have two pieces of information:

  • The object in memory with its data and meta-data.
  • A small bit of information that tells the CLI/CLR(**) where this data starts.

Here’s how Microsoft picture it (link):

So, our reference knows where it is, and our memory store knows what it is. So what happens when we do something like:

Lion l = new Lion();
Feline f = l; //Upcast lion to a Feline

?

The underlying object is a Lion, however we store it as an instance of Feline. The reference is just a hex value, and the underlying object data has not changed, so the meta-data still denotes an instance of type Lion.

But where is this information about it being a Feline stored? How does the CLR know it is a Feline?

The answer is that it isn’t needed, and that the CLR doesn’t have to know about it. This information is only available to the C# compiler via the conventions in your code and the language rules. The compiler spots an upcast, deduces that it is legal, and generates the associated Intermediate Language code (IL) for the CLR to execute. By the time it reaches the IL, it doesn’t matter, and because the overwhelming majority of code will be emitted by a compiler, we don’t need this at a lower level.

There are plenty of places where the compiler doesn’t enforce type safety quite as rigorously, and where we can end up emitting type-unsafe IL. Take the following example:
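We can check this for ourselves. The sketch below (with a made-up UpcastInspection class) asks the object what it is after the Upcast, and confirms that both variables refer to the very same object.

```csharp
using System;

public class Feline { }
public class Lion : Feline { }

public static class UpcastInspection
{
    public static string RuntimeTypeAfterUpcast()
    {
        Lion l = new Lion();
        Feline f = l;   // Upcast: no new object is created

        // Both variables reference the same object in memory.
        bool sameObject = object.ReferenceEquals(l, f);

        // GetType() reads the object's own meta-data, not the variable's declared type.
        return f.GetType().Name + (sameObject ? " (same object)" : " (different object)");
    }
}
```

The method returns "Lion (same object)": the variable is declared as a Feline, but the object itself never stopped being a Lion.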

Lion[] la = new Lion[10];
la[0] = new Tiger();  //compile time failure

Feline[] fa = la;
fa[0] = new Tiger();  //compile time success!

In the first instance, the compiler recognises that we have an array of Lions so prevents us from putting a Tiger in it. However, in the second instance we assign our array of Lions to an array of Felines, and then we are allowed to put a Tiger in because Tiger derives from Feline! It’s the same collection underneath, but the compiler now lets us do something illegal.

We only discover something is wrong when we run it, and get an ArrayTypeMismatchException:

Attempted to access an element as a type incompatible with the array.
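Here is a small, self-contained version of the same experiment that catches the exception rather than letting it escape. The ArrayCovarianceDemo class and its method name are hypothetical, and the Feline, Lion and Tiger classes are empty placeholders.

```csharp
using System;

public class Feline { }
public class Lion : Feline { }
public class Tiger : Feline { }

public static class ArrayCovarianceDemo
{
    // Returns true if the Tiger was stored, false if the runtime refused.
    public static bool StoreTigerInLionArray()
    {
        Feline[] fa = new Lion[10];   // compiles: the array is still a Lion[] underneath
        try
        {
            fa[0] = new Tiger();      // the runtime checks the real element type here
            return true;
        }
        catch (ArrayTypeMismatchException)
        {
            return false;
        }
    }
}
```

The method returns false, because the runtime check on the array store is the only thing standing between us and a Tiger in a Lion array.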

Above, when I described the assignment, I deliberately avoided using the word cast. Although this may look like casting, it’s actually covariance. We’ll cover this and its counterpart Contravariance in later posts.

So, we’ve reached the end of our Upcasting journey. Next up, we look at Downcasting and reflect upon the whole series on casting before turning our attention to the world of Variance, Covariance and Contravariance.

 


(*) If you’re in any doubt: I’m being flippant.

(**) The CLR (Common Language Runtime) is Microsoft’s implementation of the formal CLI specification. The CLR belongs to Microsoft, whereas the CLI belongs to the world.

 


At the end of Part 3 we hopefully had a good feeling as to the nature of Upcasting before preparing to delve further into the details. We then encountered a question which warranted further investigation, hence this little diversion.

To date, I’ve tried to keep the discussion on a level a beginner can understand, however this article will require a little assumed knowledge. So, let’s turn off the road and accelerate on into our diversion.

The question we raised from the last post was: In C# Do I Ever Have to Perform An Explicit Upcast?

For the most part, we concluded: No: We are Upcasting types that are Assignment Compatible. However, Stack Overflow yielded four scenarios whereby you are required to do so. These are when you want to…

  • Cast from the result of a ternary operation
  • Cast from the return value of an anonymous method
  • Call an explicitly implemented interface function in a class
  • Call a parent class function hidden in the derived class from an instance of the derived class.

Of these four, the first three yield compiler errors if the cast is omitted. The last is merely a decision regarding a behavioural feature of your code. Let’s look at these in turn.

Ternary Operation

Let’s start with some code. You can paste this straight into LINQPad (selecting the ‘C# Program’ option) and run it there should you desire.

class B { }
class S : B { }
class T : B { }

void Main()
{
    bool bln = (1==1);
    B obj_b;

    //assign by means of an if/else
    if (bln) { obj_b = new S(); }
    else { obj_b = new T(); }

    //assign by means of a ternary operator
    obj_b = bln ? new S() : new T();
}

The first assignment – a conditional Upcast to obj_b from S or T will compile happily, whereas the second one will not. The attempt at implicit casting via the ? ternary operator fails at the compile stage with the error:

Type of conditional expression cannot be determined because there is no implicit conversion between ‘S’ and ‘T’

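The line of code is a single statement containing a conditional assignment. The compiler works out the type of the conditional expression from its two branches alone, without looking at the variable being assigned to, so it expects an implicit conversion from one branch type to the other. (Newer compilers, from C# 9 onwards, also take the target type into account and accept this line.) Fortunately, it’s easy to fix, but we must use the if statement as shown above, or we must explicitly cast: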

obj_b = bln ? new S() : (B)new T(); //or
obj_b = bln ? (B)new S() : new T(); //or
obj_b = bln ? (B)new S() : (B)new T();

It is worth noting that this is merely a constraint placed upon you by the compiler. If you are using LINQPad you will be able to inspect the emitted IL, and compare the ternary and ‘if’ assignments. When compiler optimisations are turned on, the two statements yield almost identical IL (*).

Anonymous Method

I should say that about a year ago the code below would have been gibberish to me. If it’s gibberish to you I can only apologise, as you may struggle to follow this bit.

In the same way that our ternary operator can yield a result conditionally in a single statement, so can an Anonymous Method. The code below represents the method as a Lambda Expression, but I could have quite happily used a Delegate Instance instead.

bool[] arr_bln = new bool[0]; //empty boolean array 
IEnumerable<B> b5 = arr_bln.Select(b => new B()); //Assignment 1 
IEnumerable<B> b6 = arr_bln.Select(b => 
                                    { 
                                        if (b) 
                                        { 
                                            return new S(); 
                                        } 
                                        else 
                                        { 
                                            return new T(); 
                                        } 
                                    }); //Assignment 2 
IEnumerable<B> b7 = arr_bln.Select(b => b ? new S() : new T()); //Assignment3

In the above code, Assignment 1 will compile, and Assignment 2 and Assignment 3 will not. They generate the following compiler errors respectively:

The type arguments for method ‘Select<TSource,TResult>(IEnumerable<TSource>, Func<TSource,TResult>)’ cannot be inferred from the usage. Try specifying the type arguments explicitly.

Type of conditional expression cannot be determined because there is no implicit conversion between ‘S’ and ‘T’

For the first one, the compiler expects a bool instance as the input parameter (TSource) and a B instance as the output parameter (TResult). It can’t infer TResult because the two return statements yield different types (S and T), so the return type has to be made explicit. We can fix this two ways:

  • Explicitly cast one of your return values as we did for our ternary operator, or..
  • Explicitly stipulate the expected return type in the call to the Select method: IEnumerable<B> b6 = arr_bln.Select<bool, B>(

Either way, we’re back to being explicit again.
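For completeness, here is a sketch of both fixes, plus the cast applied to the ternary form. The SelectFixes class and its method names are just labels I’ve made up, and B, S and T are the same empty classes as before.

```csharp
using System.Collections.Generic;
using System.Linq;

public class B { }
public class S : B { }
public class T : B { }

public static class SelectFixes
{
    // Fix 1: cast one of the return values, so B becomes the common return type.
    public static IEnumerable<B> CastOneReturn(bool[] flags)
    {
        return flags.Select(b =>
        {
            if (b) { return (B)new S(); }
            else { return new T(); }
        });
    }

    // Fix 2: state the type arguments, so nothing needs to be inferred.
    public static IEnumerable<B> ExplicitTypeArguments(bool[] flags)
    {
        return flags.Select<bool, B>(b =>
        {
            if (b) { return new S(); }
            else { return new T(); }
        });
    }

    // The ternary version needs the same cast as the plain assignment did.
    public static IEnumerable<B> CastInTernary(bool[] flags)
    {
        return flags.Select(b => b ? (B)new S() : new T());
    }
}
```

All three hand back an IEnumerable<B> whose elements are still S or T instances underneath.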

The second compiler message is equivalent to the issue we encountered with our original assignment by ternary operator above. The fix for this is the same: An Explicit Cast.

Explicit Interface Functions

If we declare an interface and implement it explicitly in a class, we have to Upcast to that interface explicitly in order to call it. Take the following code:

interface IOne { void MyMeth(); }
class MyClass : IOne
{
    void IOne.MyMeth() {throw new NotImplementedException();}
}
//...
MyClass mc = new MyClass();
mc.MyMeth(); //won't compile
((IOne)mc).MyMeth();

The clue is in the name of the implementation: Explicit. In making the implementation explicit we restrict the access to explicit Upcasts to that interface.

Hidden Function

If we implement a class hierarchy like so:

class B2 { public void DoSomething() { Console.WriteLine("B"); } }
class S2 : B2 { public new void DoSomething() { Console.WriteLine("S"); } }

We can see that S2 hides the implementation of DoSomething() in B2 from anyone that wants to use S2. If we call DoSomething() on an instance of S2 the program will emit “S”. If we want to emit “B” we have to do an Explicit Upcast:

S2 s_2 = new S2();
s_2.DoSomething();  //emits "S"
((B2)s_2).DoSomething(); //emits "B"

Either of these might be correct, and both will be compiled without error or warning. The detail is in the implementation, and what the developer wants to achieve.

Ok, so now we’ve reached the end of our diversion. Let’s return to the main theme. Up next: Upcasting: Detail

(*) The optimised version yields slightly more IL for the if/else statement than it does for the ternary operator. Again, you can look at this in LINQPad.


So, we reach part three of our series, in which I shall talk about Upcasting. Everyone interested should alight here; the next stops are ‘Upcasting, The Details’, ‘Downcasting’, ‘Covariance’, ‘Contravariance’ and ‘Variance In The C# Language’.

So, Upcasting

If you’ve been following the series, then you’ve probably noticed that we’ve already encountered a sort of Upcasting:

    Feline f = new Lion();  //Fine: Lion is smaller and is related to Feline
    Mammal m = new Lion();  //Fine: Lion is smaller and is related to Mammal

If we refer back to our inheritance hierarchy we can see that we are casting Up to types that are above Lion:

It’s really as simple as that. Upcasting is about taking an instance of a class and expressing it in terms of another one higher Up the inheritance hierarchy.

There are some additional details. The code above has subtly introduced another casting concept without our noticing. We were able to assign across from one type to another as if they were the same. We have assumed that the compiler will let us convert from one type to another without any further information, which it will do happily.

In short, the Cast is Implicit (family-friendly). The compiler lets you assign from one object to another because it is able to deduce that the Upcasts above will always have Assignment Compatibility and will therefore always be Type Safe.

Although Microsoft refer to the convention above as Implicit Conversion (See Casting and Type Conversions on MSDN), there’s a broad acceptance that this is still Upcasting, and here (at least) we probably benefit from a consistency of terminology.

We could have written our code with an Explicit (x-rated) Cast, which would have had the same effect:

    Feline f = (Feline)(new Lion());  //Explicit Upcast of Lion to Feline
    Mammal m = (Mammal)(new Lion());  //Explicit Upcast of Lion to Mammal

Although valid, in the above code there’s little value in the explicit cast – it actually makes our code harder to read! In reality we almost never have to perform an explicit Upcast in our C# code.

Sometimes it is used as a coding convention as a means to augment readability:

    List<Mammal> mammalList = new List<Mammal>();
    //...lines of code... 
    Lion lion = new Lion(); 
    //...lines of code... 
    mammalList.Add(lion as Mammal);

But even this is a little tenuous. We only really have to do this when the compiler demands it, and those cases are rare. Out of curiosity I asked Stack Overflow, and they were able to conjure a few examples: In C# Do I Ever Have to Perform An Explicit Upcast?

Finally, a confession. I promised at the start of this post that our next stop would be ‘Upcasting, The Details’; however, the previous link put us on a tangent far too interesting to ignore. So….

Up Next: Upcasting: A Diversion


Sooooooo, what did we learn last time? Well, Up is Big is Broad is Generalised, and Down is Small is Narrow is Specialised.

And what did we unlearn? Well, because Base is as far Up as one can get, we should probably discard the term Base Class just for this series of articles.

If none of that makes sense to you, then you might like to refer to the previous article in which I cover and attempt to tackle the confusion between these terms.

So, let’s revive a diagram from the previous post, augmented a little further for clarity:

 

Now, as promised, we are going to put this in the context of Casting.

It’s difficult to mention Casting without referring back to our old friend Emohawk Polymorphism (link). Simply put, Polymorphism is the ability to create a variable, a function, or an object that has more than one form. (In C#, the very fact that we can overload operators means that they can also be Polymorphic.)

Casting has this at its very heart, as we are changing one thing into something else Related, but different. In fact, we can think of Casting as a mechanism that facilitates Polymorphism. And, while we’re at it, we can think of Variance in the same way (more on Variance later).

Underpinning Casting is the concept of Assignment Compatibility. This makes an important statement about our Source and Destination types. It says that we can only cast from a Source to a Destination if their Underlying types are directly Related (can be reached in a single direction) and Source is the same size or smaller (or Down, Narrower, More Specialised).

Let’s look at some code:
    Lion l = new Lion();      //Fine: Lion is same size and related to Lion
    Feline f = new Lion();    //Fine: Lion is smaller and is related to Feline
    Mammal m = new Lion();    //Fine: Lion is smaller and related to Mammal
    Lion l2 = new Feline();   //Not fine: Feline is bigger than Lion
    Canine c = new Feline();  //Not fine: Feline and Canine are not directly related

In the fourth example we try to cast a Feline to a Lion. Here we are trying to take everything a Feline could possibly represent (from human-killing predator to cuddly pet) and represent this in terms of a Lion. Whilst a lion might be a fierce predator, the last time I looked it wasn’t a cuddly pet. By similar reasoning, I couldn’t cast from a Feline to a Canine because, well, when did you last see a cat that howled, chased sticks and herded sheep?

Hopefully, we now have a strong feeling of What we can cast and When we can do it. We may also recognise that there is a gap in our knowledge – the above code seems like a Cast, and we can probably deduce which sort, but there are still some questions outstanding.

Up next: Upcasting: The ‘X-Rated’ and ‘Family-Friendly’ version

 


I’ve spent quite some time coming to grips with the concepts of casting and variance in C#, and I think it’s about time to refine this knowledge into a series of blog posts.

When I first set out it seemed remarkably simple, but as I sought to understand more I realised there were serious limitations in what I had actually come to understand. So, in this series, I hope to answer some of the following questions along the way (in no particular order):

  • When can we upcast?
  • When can we downcast?
  • When it comes to it, how do I remember which is which? (I keep forgetting)
  • Why is downcasting bad?
  • What the hell is variance and how does it differ from casting?
  • Why is a generic type Covariant in its outputs or Contravariant in its inputs?
  • Why can’t a type be both Covariant and Contravariant?
  • Where are Covariance and Contravariance supported in C#?

For anyone wanting to reference this series, I’ll be posting it under the tag EduCastVariance (an idea I borrowed from Jon Skeet’s blog)

For the purposes of this series, it’s worth defining a few terms up front and painting some pictures.

The first thing we need to consider is an Inheritance Hierarchy (*). This hierarchy has a class that itself derives from nothing, and from which other classes derive. Other classes may derive from those classes, and so on, and so forth. Let’s look at a picture:

When thinking about casting we refer to it in a hypothetical Vertical plane. We can cast Up, or we can cast Down. It therefore makes sense to put this plane into context. If there is an Up there must be a Top, and similarly for a Down there must be a Bottom. So, let’s adjust our diagram:

We can also add a Size dimension to our terminology. As we go Up we get Bigger, and when we go Down we get Smaller.

By Bigger we loosely mean Represents More, and by Smaller we are implying Represents Less. If you read around you will also see the terms Broader and Narrower used. They fit into the above model as you would expect.

Some people might perceive that Derived1 is bigger than Class because in implementation terms they think it is bigger. However, we are not comparing lines of code between single classes. If you must think in terms of implementation size, count the lines of code for the classes and everything that derives from them. So, given that Class is actually Class + Derived1 + Derived2, and Derived1 is actually Derived1 + Derived2, we can safely say that Class is bigger than Derived1.

Even when doing this, we must be careful that we use the terms Bigger and Smaller in the correct context, and are not too literal. Consider the following Inheritance Hierarchy:

Whilst it makes sense to say Mammal is Bigger or Represents More than Canine, it doesn’t make sense to say Canine is Bigger or Represents More than Lion. How do we know without traversing the entire structure to find out? It only makes sense to compare size in terms of things we can navigate to without changing our vertical direction. To get from Canine to Lion we have to navigate in two directions: up and then down; so we can’t compare them.
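If it helps, the ‘single direction’ rule can be sketched in code. I’m assuming a hierarchy in which Feline and Canine both derive from Mammal, and Lion derives from Feline. The HierarchyCheck class and AreComparable method are names I’ve invented for this example, built on the standard Type.IsAssignableFrom method.

```csharp
using System;

public class Mammal { }
public class Feline : Mammal { }
public class Canine : Mammal { }
public class Lion : Feline { }

public static class HierarchyCheck
{
    // True when one type can be reached from the other by moving
    // only Up or only Down the hierarchy.
    public static bool AreComparable(Type a, Type b)
    {
        return a.IsAssignableFrom(b) || b.IsAssignableFrom(a);
    }
}
```

With this, AreComparable(typeof(Mammal), typeof(Canine)) is true, whereas AreComparable(typeof(Canine), typeof(Lion)) is false, because getting from one to the other means going up and then down.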

Naturally, as we start using formal OO terms we’ll encounter the term Base Class, which denotes the most Generalised class in our Inheritance Hierarchy. This is the class that does not derive from anything and is related to every other class. Here, the term Base Class is confusing, because Base implies Bottom, which we have stipulated as the place we reach when we go Down in our hierarchy.

So, for the purposes of these articles, FORGET the term BASE CLASS. In our picture it’s confusing and detracts from our understanding of casting.

When we go Up our hierarchy we get classes that are Bigger and more Generalised. When we go Down we get Smaller and more Specialised.

Now we’re just about ready. With a good understanding of what is Up and what is Down we can start talking about it in terms of casting. So, next up…Upcasting

(*) Yes, it can also be an ‘Implements’ hierarchy if we’re talking about Interfaces, but I’m trying to keep things simple for the moment.
