Wednesday, March 20, 2013

Pitfalls of LINQ

LINQ is a great tool: it lets you abstract common operations into simple directives.

Recently I encountered a case where LINQ falls flat on performance compared with a hand-written, optimised loop.

The case in point is where data is already sorted.

In this scenario I had to pull various values out of a simulation data dump to make a meaningful analysis of the drive data. The Frame data was already sorted by duration and distance travelled.

I was using Linq to isolate subsections of the drive for analysis, for example:
       var framesBetween = from p in frames
                           where p.SceneData.DistanceTravelled >= bucketStartDistance
                              && p.SceneData.DistanceTravelled < bucketEndDistance
                           select p;


The issue with the above statement, when run against the standard off-the-shelf List and ObservableCollection types in .NET, is that LINQ evaluates the where clause against every element of the list before the result set is complete. This makes sense: .NET has no way of knowing that the data is already sorted, so it cannot stop early once the distance passes the end of the bucket and has to scan the entire list.
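To make the full scan visible, here is a minimal sketch that counts how often the predicate runs. It uses plain doubles in place of my Frame objects, and the names WhereScanDemo and CountPredicateCalls are made up for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class WhereScanDemo
{
    // Counts how many times the range predicate is evaluated by Where.
    // 'sortedDistances' stands in for the sorted frame distances.
    public static int CountPredicateCalls(List<double> sortedDistances, double start, double end)
    {
        int calls = 0;
        var matches = sortedDistances
            .Where(d => { calls++; return d >= start && d < end; })
            .ToList(); // Where is lazy, so ToList forces the enumeration
        return calls;
    }

    public static void Main()
    {
        // 1000 sorted values: 0, 1, 2, ... 999
        List<double> distances = Enumerable.Range(0, 1000).Select(i => (double)i).ToList();

        // Only 10 values fall inside [10, 20), yet every element is tested.
        Console.WriteLine(CountPredicateCalls(distances, 10, 20)); // prints 1000
    }
}
```

Even though the matching values sit near the start of the list, the predicate still runs once per element.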

I ended up having to roll my own solution, which is essentially a for loop that exits as soon as the required distance is reached.

       // Assumes masterCollection is sorted by ascending DistanceTravelled.
       FrameCollection outputCollection = new FrameCollection();
       bool inState = false;  // becomes true once distanceStart is reached
       for (; startIndex < masterCollection.Count; ++startIndex)
       {
         Frame frame = masterCollection[startIndex];
         var distance = frame.SceneData.DistanceTravelled;
         // Past the end of the range: stop before adding this frame.
         // includeFinalFrame decides whether a frame exactly at distanceEnd is kept.
         if (includeFinalFrame ? distance > distanceEnd : distance >= distanceEnd)
         {
           break;
         }
         if (distance >= distanceStart)
         {
           inState = true;
         }
         if (inState)
         {
           outputCollection.Add(frame);
         }
       }
       return outputCollection;
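For comparison, LINQ can be made to stop early on sorted data with SkipWhile and TakeWhile. This is only a sketch, not what I used: it works on plain doubles rather than Frame objects, the names SortedRangeDemo and Between are made up, and it relies on the input being in ascending order.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SortedRangeDemo
{
    // Returns the values in [start, end). Assumes 'sortedDistances' is ascending.
    public static List<double> Between(IEnumerable<double> sortedDistances, double start, double end)
    {
        return sortedDistances
            .SkipWhile(d => d < start) // skip everything before the bucket
            .TakeWhile(d => d < end)   // stop enumerating at the first value past the bucket
            .ToList();
    }

    public static void Main()
    {
        var distances = new List<double> { 0, 5, 10, 15, 20, 25 };
        Console.WriteLine(string.Join(", ", Between(distances, 10, 20))); // prints 10, 15
    }
}
```

Unlike the loop above, this starts from the beginning of the list on every call, so it does not carry a start index over from one bucket to the next.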

One thing I will investigate and put into a later article is whether LINQ will optimise for sorted data when you use a container type such as SortedList<TKey, TValue>, and so apply the above optimisations for you.

Wednesday, April 04, 2012

The XNA XML subsystem is, I find, woefully under-documented. It’s not up to Microsoft’s usual standards.

The concept of the XML system is that XNA will map the XML to an object contained inside your code project/assembly. You define a class, then inside your content project you add a new XML file whose members map one to one to the members of the class. The subsystem uses reflection to set the values.

The XML approach offers two nice features. The first is that it automatically produces compile-time errors when one of your XML elements does not match the corresponding class. This makes it easy to track down problems when you change your code and your data needs to be updated. A further benefit of these errors is that you can set up a build server that checks each new revision and automatically alerts the development team by email if any errors are introduced. This can save hours of hunting down problems.

The other advantage of the XNA XML system is that the data is converted to a binary file during the compile step. This makes the data much more compact than in its native XML format, which is much nicer for platforms such as Windows Phone 7 that have potentially limited capacity.

The first thing to know when using the system is that your classes must live in a separate project from both the main game project and the content project. The content project needs to be aware of the classes for its XML validation stage, and the main game project must be aware of them too. .NET projects cannot have circular dependencies, so a third project must be introduced. You can then set up a project reference from both the main game project and the content project to the data project.

Now to the undocumented stuff which drove me crazy.

The XML serialisation system in XNA is very sophisticated, much more so than the Microsoft documentation gives it credit for.

The first concept I want to introduce is containers of objects.

Say you want to serialise a series of sub-nodes: for example, a WorldMap class that contains a collection of WorldNode objects, each exposing a Name.

I had to tinker around with the XML format to work out how to get the sub-elements to work. In the end result, each Item element inside the container's element denotes an additional object to be created inside the array or container. As discussed before, you must then provide XML elements to match all public properties exposed by the child class. In this example, that means a Name element for each WorldNode item.

There was one scenario even more complex than this. Let's say we had a container inside a class holding references to an interface or abstract base class, and we want to define that an Item is of a derived type. XNA XML serialisation can handle this scenario too: on the Item tag you define the Type attribute containing the type you wish to instantiate. For example, if a GuiScreen holds a series of GuiElements we want to construct (such as Buttons, TextBoxes etc.), an Item whose Type attribute names TextBox creates a TextBox for that GuiElement.

This feature is not documented anywhere in the XNA documentation, and I found it purely by chance while attempting to get the above scenario to function properly.

You can then use the ContentManager provided by XNA to import the GuiScreen, for example. I hope this helps anyone using XNA out there.
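As a rough sketch of the two scenarios above, the data classes and the matching XML could look something like the following. The namespace MyGame.Data, the member names Nodes, Elements and Text, the sample values and the asset name in the last comment are all made up for illustration. The XML is shown in comments and follows the XnaContent/Asset layout the content pipeline uses, as far as I could work out.

```csharp
using System;
using System.Collections.Generic;

namespace MyGame.Data // hypothetical data project, referenced by game and content projects
{
    public class WorldNode
    {
        public string Name { get; set; }
    }

    public class WorldMap
    {
        public List<WorldNode> Nodes { get; set; } = new List<WorldNode>();
    }

    /* Container of objects: one Item per WorldNode.
    <XnaContent>
      <Asset Type="MyGame.Data.WorldMap">
        <Nodes>
          <Item><Name>Forest</Name></Item>
          <Item><Name>Castle</Name></Item>
        </Nodes>
      </Asset>
    </XnaContent>
    */

    public abstract class GuiElement
    {
        public string Name { get; set; }
    }

    public class TextBox : GuiElement
    {
        public string Text { get; set; }
    }

    public class GuiScreen
    {
        public List<GuiElement> Elements { get; set; } = new List<GuiElement>();
    }

    /* Container of an abstract type: the Type attribute picks the derived class.
    <XnaContent>
      <Asset Type="MyGame.Data.GuiScreen">
        <Elements>
          <Item Type="MyGame.Data.TextBox">
            <Name>PlayerName</Name>
            <Text>Enter name</Text>
          </Item>
        </Elements>
      </Asset>
    </XnaContent>
    */

    public static class Demo
    {
        // Builds in code the same object graph the first XML sketch describes.
        public static WorldMap BuildSampleMap()
        {
            var map = new WorldMap();
            map.Nodes.Add(new WorldNode { Name = "Forest" });
            map.Nodes.Add(new WorldNode { Name = "Castle" });
            return map;
        }

        public static void Main()
        {
            Console.WriteLine(BuildSampleMap().Nodes.Count); // prints 2
            // In the game project (needs XNA, hypothetical asset name):
            // GuiScreen screen = Content.Load<GuiScreen>("Screens/MainMenu");
        }
    }
}
```

The elements inside each Item appear in the same order as the class members, base class members first. That is what worked for me in practice rather than something I can point to in the documentation.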