From: Jesse Houwing on
* Martin Honnen wrote, On 10-1-2010 14:45:
> Anna wrote:
>>> If the markup is not well-formed then I don't think any of the XML
>>> APIs in the .NET framework help, they all want well-formed markup.
>>
>> I was afraid of that.
>> So any advice on what's the best approach to solve this problem,
>> writing my own code ?
>
> You will need to find out exactly which rules the markup you have
> follows, or whether it follows any rules at all. The only other
> markup language I know is SGML. It allows omitting certain tags and
> leaving certain attribute values unquoted, but there are clear rules
> for how the parser has to infer elements and how it finds where an
> attribute value ends. There is a .NET implementation of an SGML
> parser, SgmlReader (http://developer.mindtouch.com/SgmlReader), which
> can be used to convert "HTML tag soup" to XHTML. There is also an HTML
> Tidy application that does the same. Studying the code of such
> applications can help.

You might also be able to use the HTML Agility Pack. It's pretty
forgiving when it comes to tags, but I'm not sure it will parse just any
XML-like structure...

See Codeplex.com/HtmlAgilityPack for the download.
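
To give a feel for what such a forgiving parser does, here is a rough
sketch of the idea. It is written in Python with only the standard
library, not in .NET, and it is not how SgmlReader, HTML Tidy or the
HTML Agility Pack work internally. The names SoupToXml and soup_to_xml
and the VOID set are made up for this illustration. The repair rules
are my own assumptions too: unquoted attribute values get quoted, a
stray closing tag is dropped, and closing an ancestor closes everything
still open inside it. A real SGML parser would use the DTD to decide
where omitted tags belong, so its output can differ. Note also that
Python's parser lowercases tag and attribute names.

```python
from html.parser import HTMLParser
from xml.sax.saxutils import escape, quoteattr

# Elements assumed never to have content (illustrative subset only).
VOID = {"br", "hr", "img", "input", "meta", "link"}


class SoupToXml(HTMLParser):
    """Re-serialize lenient markup as well-formed XML (naive rules)."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []    # pieces of the repaired document
        self.stack = []  # names of elements that are still open

    def handle_starttag(self, tag, attrs):
        # Always quote attribute values; a bare attribute gets its own
        # name as its value (checked -> checked="checked").
        attr_text = "".join(
            f" {name}={quoteattr(value if value is not None else name)}"
            for name, value in attrs
        )
        if tag in VOID:
            self.out.append(f"<{tag}{attr_text}/>")
        else:
            self.out.append(f"<{tag}{attr_text}>")
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag not in self.stack:
            return  # stray closing tag: drop it
        # Close everything opened since the matching start tag.
        while self.stack:
            open_tag = self.stack.pop()
            self.out.append(f"</{open_tag}>")
            if open_tag == tag:
                break

    def handle_data(self, data):
        self.out.append(escape(data))  # re-escape &, < and >

    def result(self):
        self.close()  # flush any buffered text
        while self.stack:  # close whatever was never closed
            self.out.append(f"</{self.stack.pop()}>")
        return "".join(self.out)


def soup_to_xml(markup):
    parser = SoupToXml()
    parser.feed(markup)
    return parser.result()


if __name__ == "__main__":
    print(soup_to_xml("<root><item id=1>one<item id=2>two</root>"))
```

With these rules the second item ends up nested inside the first one
rather than next to it, which shows why you need to know the rules your
markup really follows before trusting any automatic repair. The sketch
also does nothing about input with more than one top-level element.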

--
Jesse Houwing
jesse.houwing at sogeti.nl