xml

1 post

LINQ to XML and reading large XML files

LINQ to XML makes it relatively easy to read and query XML files. For example consider the following XML file:

<xml version="1.0" encoding="utf-8" ?>
    <users>
        <user name="User1" groupid="4" />
        <user name="User2" groupid="1" />
        <user name="User3" groupid="3" />
        <user name="User4" groupid="1" />
        <user name="User5" groupid="1" />
        <user name="User6" groupid="2" />
        <user name="User7" groupid="1" />
    </users>

Suppose you would like to find all records with groupid > 2. You could be tempted to issue the following query:

XElement doc = XElement.Load("users.xml");
    var users = from u in doc.Elements("user")
                where u.Attribute("groupid") != null &&
                int.Parse(u.Attribute("groupid").Value) > 2
                select u;
    Console.WriteLine("{0} users match query", users.Count());

There’s a flaw in this method. XElement.Load method will load the whole XML file in memory and if this file is quite big, Continue reading