From: Jonathan Wood on
I'm working with a substantial XML file. I need to locate a node with a
particular value.

I don't have much experience working with XML files. I found the XPath
classes. Looks like I can use them to iterate through the nodes and find
what I'm searching for. I was just wondering if there's any search function
that I can use. Is this already supported?

I'm having trouble finding this information because I'm not sure which
class(es) I need. I don't want to use LINQ (if that's an option).

Could someone point me in the right direction here? Thanks.

Jonathan


From: Peter Duniho on
Jonathan Wood wrote:
> I'm working with a substantial XML file. I need to locate a node with a
> particular value.

Define "substantial".

> I don't have much experience working with XML files. I found the XPath
> classes. Looks like I can use them to iterate through the nodes and find
> what I'm searching for. I was just wondering if there's any search
> function that I can use. Is this already supported?

Yes. If you load the XML document as an XDocument instance, you can
easily use XPath expressions to search for specific information. For
very simple searches, you can even just use the methods in the XDocument
class (e.g. Element()).
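
As a rough sketch of both approaches: the element names below (Vendor, VendorID) are borrowed from the XPath expressions that appear later in this thread, the assumption that Vendor elements sit directly under the root element is only a guess, and the class and method names are made up for illustration. The XPath support for XDocument comes from extension methods in the System.Xml.XPath namespace.

```csharp
using System.Xml.Linq;
using System.Xml.XPath; // brings in the XPathSelectElement extension method

public static class XDocumentSearch
{
    // Returns the first <Vendor> whose <VendorID> child equals vendorId, or null.
    // vendorId is pasted into the expression, so it must not contain a single quote.
    public static XElement FindVendor(XDocument doc, string vendorId)
    {
        return doc.XPathSelectElement(
            string.Format("//Vendor[VendorID='{0}']", vendorId));
    }

    // The same search without XPath, using only Elements()/Element().
    // Assumes (hypothetically) that <Vendor> elements are children of the root.
    public static XElement FindVendorNoXPath(XDocument doc, string vendorId)
    {
        foreach (XElement vendor in doc.Root.Elements("Vendor"))
        {
            // Casting a missing element to string yields null instead of throwing.
            if ((string)vendor.Element("VendorID") == vendorId)
                return vendor;
        }
        return null;
    }
}
```

Loading would be something like XDocument.Load(path) followed by XDocumentSearch.FindVendor(doc, "someId"); both methods return null when nothing matches.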

The one thing to watch out for with XDocument and XPath is when element
names use an XML namespace. Then you need to make sure you map the
namespace using the namespace manager and incorporate that into the
element name being searched for. You can Google this newsgroup for past
discussions involving the XML namespace manager for more details.
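
For illustration, here is a minimal sketch of the namespace mapping. The namespace URI "urn:example:catalog" and the prefix "c" are invented for this example, as are the class and method names; the prefix used in the XPath expression only has to match what is registered with the namespace manager, not whatever prefix (if any) the document itself uses.

```csharp
using System.Xml;
using System.Xml.Linq;
using System.Xml.XPath;

public static class NamespaceSearch
{
    // Hypothetical namespace URI, for illustration only.
    private const string CatalogNs = "urn:example:catalog";

    // XPath version: every namespaced element name needs the mapped prefix.
    public static XElement FindVendor(XDocument doc, string vendorId)
    {
        XmlNamespaceManager ns = new XmlNamespaceManager(new NameTable());
        ns.AddNamespace("c", CatalogNs);
        return doc.XPathSelectElement(
            string.Format("//c:Vendor[c:VendorID='{0}']", vendorId), ns);
    }

    // Non-XPath version: combine an XNamespace with the local name.
    public static XElement FindVendorNoXPath(XDocument doc, string vendorId)
    {
        XNamespace c = CatalogNs;
        foreach (XElement vendor in doc.Descendants(c + "Vendor"))
        {
            if ((string)vendor.Element(c + "VendorID") == vendorId)
                return vendor;
        }
        return null;
    }
}
```

Without the prefix, an expression such as "//Vendor" matches only elements that are in no namespace, which is why an unmapped search against a namespaced document quietly returns nothing.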

> I'm having trouble finding this information because I'm not sure which
> class(es) I need. I don't want to use LINQ (if that's an option).

Why _don't_ you want to use LINQ? The XDocument class is specifically
designed to work well with LINQ, and is in the System.Xml.Linq
namespace. But other than that, it's not technically part of LINQ. But
you might run into a situation where LINQ actually works well for
specific operations.

For a simple search, you should not need LINQ syntax though.
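
For comparison, a search written in LINQ query syntax might look roughly like the sketch below. The element names (Vendor, VendorID, Product, ProductID, TrialURL) come from the code posted later in this thread; that Product elements are nested somewhere inside their Vendor is an assumption, and the class and method names are hypothetical.

```csharp
using System.Linq;
using System.Xml.Linq;

public static class LinqSearch
{
    // Returns the TrialURL text of the given product under the given vendor,
    // or null when no such vendor/product pair exists.
    public static string FindTrialUrl(XDocument doc, string vendorId, string productId)
    {
        var urls =
            from vendor in doc.Descendants("Vendor")
            where (string)vendor.Element("VendorID") == vendorId
            from product in vendor.Descendants("Product")
            where (string)product.Element("ProductID") == productId
            select (string)product.Element("TrialURL");

        // The query is lazy; FirstOrDefault stops at the first match.
        return urls.FirstOrDefault();
    }
}
```

Because the values are compared as ordinary strings rather than spliced into an XPath expression, an ID containing a quote character needs no special handling here.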

Note that regarding all of the above, if your document is VERY large -
that is, large enough that storing the entire thing in-memory as an
XDocument would be prohibitive - then you will want to use the XmlReader
class instead. It's harder to use, and won't let you use XPath
expressions, but it only reads linearly through the XML, with just the
current node of the XML document in-memory at once, so it's MUCH more
efficient in its memory usage.
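
A minimal sketch of that forward-only style, assuming the value being searched for is the plain text content of a VendorID element (the element name is taken from the code later in this thread; the class and method names are made up):

```csharp
using System.IO;
using System.Xml;

public static class StreamingSearch
{
    // Returns true if some <VendorID> element contains exactly vendorId.
    // Assumes each VendorID holds a single plain text node (no CDATA, no children).
    public static bool VendorExists(TextReader input, string vendorId)
    {
        bool insideVendorId = false;
        using (XmlReader reader = XmlReader.Create(input))
        {
            // Read() advances one node at a time; nothing already read is kept.
            while (reader.Read())
            {
                switch (reader.NodeType)
                {
                    case XmlNodeType.Element:
                        insideVendorId =
                            reader.Name == "VendorID" && !reader.IsEmptyElement;
                        break;
                    case XmlNodeType.Text:
                        if (insideVendorId && reader.Value == vendorId)
                            return true;
                        break;
                    case XmlNodeType.EndElement:
                        insideVendorId = false;
                        break;
                }
            }
        }
        return false;
    }
}
```

For a file on disk, something like new StreamReader(path) can be passed as the input. The state that XPath would otherwise track (here, "am I inside a VendorID?") has to be kept by hand, which is the extra effort referred to above.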

Pete


From: Jonathan Wood on
Peter Duniho wrote:

> Yes. If you load the XML document as an XDocument instance, you can
> easily use XPath expressions to search for specific information. For very
> simple searches, you can even just use the methods in the XDocument class
> (e.g. Element()).

After playing with it for a considerable time, I came up with the following. It
seems to work and it actually performs pretty well. But there are a few
details about this code I don't understand. (Note: I realize items specific
to my app will not be clear. Suffice it to say, _xml is an instance of
XmlDocument with my XML file already loaded. I also haven't provided my XML
structure, but the XPath expressions appear to be working fine. I just
thought I'd post what I had in case anyone was interested.)

protected override bool GetLinksInternal(string vendorId, string productId)
{
    XmlNodeList vendors =
        _xml.SelectNodes(String.Format("//Vendor[VendorID='{0}']", vendorId));
    if (vendors.Count > 0)
    {
        XmlNodeList products =
            vendors[0].SelectNodes(String.Format("//Product[ProductID='{0}']",
                productId));
        if (products.Count > 0)
        {
            XmlNode product = products[0];
            _moreInfoUrl = product.SelectSingleNode("InfoURL").InnerText;
            _downloadUrl = product.SelectSingleNode("TrialURL").InnerText;
            _buyNowUrl =
                product.SelectSingleNode("DirectPurchaseURL").InnerText;
            return true;
        }
    }
    return false;
}

> Why _don't_ you want to use LINQ? The XDocument class is specifically
> designed to work well with LINQ, and is in the System.Xml.Linq namespace.
> But other than that, it's not technically part of LINQ. But you might run
> into a situation where LINQ actually works well for specific operations.
>
> For a simple search, you should not need LINQ syntax though.

I'm just not very familiar with LINQ and I have a lot on my plate right now.
If I could see an example of LINQ searching XML, I may be able to get a
better idea if it's worth delving into. But, yeah, my search is pretty
simple.

> Note that regarding all of the above, if your document is VERY large -
> that is, large enough that storing the entire thing in-memory as an
> XDocument would be prohibitive - then you will want to use the XmlReader
> class instead. It's harder to use, and won't let you use XPath
> expressions, but it only reads linearly through the XML, with just the
> current node of the XML document in-memory at once, so it's MUCH more
> efficient in its memory usage.

It looks like it should be able to fit in memory okay.

Thanks.

Jonathan


From: Martin Honnen on
Jonathan Wood wrote:

> protected override bool GetLinksInternal(string vendorId, string productId)
> {
>     XmlNodeList vendors =
>         _xml.SelectNodes(String.Format("//Vendor[VendorID='{0}']", vendorId));
>     if (vendors.Count > 0)
>     {

If you know you are only interested in the first "Vendor" then don't use
SelectNodes and check the Count, instead use SelectSingleNode e.g.

    XmlNode vendor =
        _xml.SelectSingleNode(String.Format("//Vendor[VendorID='{0}']", vendorId));
    if (vendor != null)
    {

>         XmlNodeList products =
>             vendors[0].SelectNodes(String.Format("//Product[ProductID='{0}']",
>                 productId));
>         if (products.Count > 0)
>         {

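Same here:

        // ".//" keeps the search inside this vendor; a leading "//" would
        // start again from the document root and ignore the vendor.
        XmlNode product =
            vendor.SelectSingleNode(String.Format(".//Product[ProductID='{0}']",
                productId));
        if (product != null)
        {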
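
One XPath detail is worth knowing when calling SelectSingleNode (or SelectNodes) on a node other than the document itself: an expression that begins with "//" is evaluated from the document root no matter which node it is called on, while ".//" restricts the search to descendants of that node. The sketch below shows a vendor-scoped lookup plus a small helper that avoids a NullReferenceException when an optional child element is missing. It assumes Product elements are nested somewhere inside their Vendor; the class and method names are hypothetical.

```csharp
using System.Xml;

public static class RelativeXPath
{
    // Finds a product strictly inside the matching vendor.
    // Returns null if either the vendor or the product is missing.
    public static XmlNode FindProduct(XmlDocument xml, string vendorId, string productId)
    {
        XmlNode vendor = xml.SelectSingleNode(
            string.Format("//Vendor[VendorID='{0}']", vendorId));
        if (vendor == null)
            return null;

        // ".//" means "descendants of vendor"; "//" would search the whole document.
        return vendor.SelectSingleNode(
            string.Format(".//Product[ProductID='{0}']", productId));
    }

    // Returns the text of the named child element, or null when it is absent.
    public static string ChildText(XmlNode parent, string name)
    {
        XmlNode child = parent.SelectSingleNode(name);
        return child == null ? null : child.InnerText;
    }
}
```

If product IDs are unique across the whole file, both forms find the same node, so the difference only shows up when the same ID can occur under more than one vendor.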




--

Martin Honnen --- MVP XML
http://msmvps.com/blogs/martin_honnen/


From: Jonathan Wood on
Martin Honnen wrote:

> If you know you are only interested in the first "Vendor" then don't use
> SelectNodes and check the Count, instead use SelectSingleNode e.g.
>
> XmlNode vendor =
>     _xml.SelectSingleNode(String.Format("//Vendor[VendorID='{0}']",
>         vendorId));
> if (vendor != null)

Ah, yes. That makes sense. Thanks!

Jonathan