dotnet

5 posts

LINQ to XML and reading large XML files

LINQ to XML makes it relatively easy to read and query XML files. For example consider the following XML file:

<xml version="1.0" encoding="utf-8" ?>
    <users>
        <user name="User1" groupid="4" />
        <user name="User2" groupid="1" />
        <user name="User3" groupid="3" />
        <user name="User4" groupid="1" />
        <user name="User5" groupid="1" />
        <user name="User6" groupid="2" />
        <user name="User7" groupid="1" />
    </users>

Suppose you would like to find all records with groupid > 2. You could be tempted to issue the following query:

XElement doc = XElement.Load("users.xml");
    var users = from u in doc.Elements("user")
                where u.Attribute("groupid") != null &&
                int.Parse(u.Attribute("groupid").Value) > 2
                select u;
    Console.WriteLine("{0} users match query", users.Count());

There’s a flaw in this method. XElement.Load method will load the whole XML file in memory and if this file is quite big, Continue reading

Dynamic methods in .NET

Using reflection to invoke methods which are not known at compile time might be problematic in performance critical applications. It is roughly 2.5-3.0 times slower than direct method calls. Here’s a sample test I’ve conducted:

class Program
    {
        [DllImport("kernel32.dll")]
        static extern void QueryPerformanceCounter(ref long ticks);

        static PropertyInfo _intProp = typeof(Foo).GetProperty("IntProp", BindingFlags.Public | BindingFlags.Instance);

        static void Main(string[] args)
        {
            Foo foo = new Foo { IntProp = 10 };
            const int COUNT = 1;
            Console.WriteLine(Measure(() => ReadPropertyWithReflection(foo), COUNT));
            Console.WriteLine(Measure(() => ReadPropertyDirectly(foo), COUNT));
        }

        static void ReadPropertyWithReflection(Foo foo)
        {
            int intProp = (int)_intProp.GetValue(foo, null);
        }

        static void ReadPropertyDirectly(Foo foo)
        {
            int intProp = foo.IntProp;
        }

        static long Measure(Action action, int count)
        {
            long startTicks = 0;
            QueryPerformanceCounter(ref startTicks);
            for (int i = 0; i < count; i++)
            {
                action();
            }
            long endTicks = 0;
            QueryPerformanceCounter(ref endTicks);
            return endTicks - startTicks;
        }

        class Foo
        {
            public int IntProp { get; set; }
        }
    }

Here are the results:

Method CPU units
Direct method invocation 796
Reflection method invocation 1986

So using reflection to perform a single property read is 2.5 slower than direct property access.

Dynamic methods can be used to generate and execute a method at run time, without having to declare a dynamic assembly and a dynamic type to contain the method. They are the most efficient way to generate and execute small amounts of code. Continue reading

Transactional NTFS (part 1/2)

Transactional NTFS and DotNet

Starting from Windows Vista and Windows Server 2008, Microsoft introduced a great new feature called Transactional NTFS (TxF).

It allows developers to write file I/O functions that are guaranteed to either succeed completely or fail completely.
Unfortunately there are no classes in the .NET framework that would allows us to perform, such operations.
We need to resort to P/Invoke to use the newly introduced functions CreateTransaction, CommitTransaction, RollbackTransaction and DeleteFileTransacted. Continue reading

Playing with FindFirstFile and FindNextFile

Enumerating files in .NET is easy. Everybody knows the GetFiles method and you may be tempted to write code like this :

var files = Directory.GetFiles(@"c:\windows\system32", "*.dll");
    foreach (var file in files)
    {
        Console.WriteLine(file);
    }

But if you look closer you will notice that this method returns an array of strings. This could be problematic if the directory you search contains lots of files. The method will block until it performs the search and once it finishes it will load all the results into memory. It would be much better if it just returned an IEnumerable. That’s exactly what the EnumerateFiles method does in .NET 4.0. Unfortunately in .NET 3.5 there’s nothing for the job.

In this post I will show how to implement this functionality using the FindFirstFile and FindNextFile functions. Continue reading

Fun with matrix multiplication and unsafe code

In this post I will compare two methods that perform matrix multiplication. We start by defining the Matrix class:

class Matrix
    {
        private readonly double[,] _matrix;
        public Matrix(int dim1, int dim2)
        {
            _matrix = new double[dim1, dim2];
        }

        public int Height { get { return _matrix.GetLength(0); } }
        public int Width { get { return _matrix.GetLength(1); } }

        public double this[int x, int y]
        {
            get { return _matrix[x, y]; }
            set { _matrix[x, y] = value; }
        }
    }

Next we add the first algorithm to the Matrix class which performs a naive multiplication:

 public static Matrix NaiveMultiplication(Matrix m1, Matrix m2)
    {
        Matrix resultMatrix = new Matrix(m1.Height, m2.Width);
        for (int i = 0; i < resultMatrix.Height; i++)
        {
            for (int j = 0; j < resultMatrix.Width; j++)
            {
                resultMatrix[i, j] = 0;
                for (int k = 0; k < m1.Width; k++)
                {
                    resultMatrix[i, j] += m1[i, k] * m2[k, j];
                }
            }
        }
        return resultMatrix;
    }

The second method uses unsafe code: Continue reading