From: "Gary ." on
Why am I still getting an exception when I do this:

libxml_use_internal_errors(true);
$this->xml = new SimpleXMLElement($this->htmlString);

or this
$this->xml = new SimpleXMLElement($this->htmlString,
LIBXML_NOERROR|LIBXML_NOWARNING);

?

The exception says "Exception: String could not be parsed as XML". Not
a hint of why not, of course.

I thought the point of those things was to just stuff the content in,
and let user code handle errors? I mean, I *know* the provided HTML is
broken. I also know there's not a chance in hell of it ever being
fixed (completely out of my control).

And yes, I'd rather use DOM, but I can't.
From: Richard Quadling on
On 8 July 2010 08:07, Gary . <php-general(a)garydjones.name> wrote:
> Why am I still getting an exception when I do this:
>
> libxml_use_internal_errors(true);
> $this->xml = new SimpleXMLElement($this->htmlString);
>
> or this
> $this->xml = new SimpleXMLElement($this->htmlString,
> LIBXML_NOERROR|LIBXML_NOWARNING);
>
> ?
>
> The exception says "Exception: String could not be parsed as XML". Not
> a hint of why not, of course.
>
> I thought the point of those things was to just stuff the content in,
> and let user code handle errors? I mean, I *know* the provided HTML is
> broken. I also know there's not a chance in hell of it ever being
> fixed (completely out of my control).
>
> And yes, I'd rather use DOM, but I can't.

The XML needs to be "well formed" [1]. So, if it is junk, you can't
read it using SimpleXML as the XML is not well formed.

Try putting it through Tidy first - that is, tidy the file first.


Regards,

Richard.

[1] http://www.devx.com/projectcool/Article/19944/0/page/3
From: "Gary ." on
Richard Quadling writes:
> On 8 July 2010 08:07, Gary wrote:
>> Why am I still getting an exception when I do this:
>>
>> libxml_use_internal_errors(true);
>> $this->xml = new SimpleXMLElement($this->htmlString);
>>
>> or this
>> $this->xml = new SimpleXMLElement($this->htmlString,
>> LIBXML_NOERROR|LIBXML_NOWARNING);
>>
>> ?
>>
>> The exception says "Exception: String could not be parsed as XML".
....
> The XML needs to be "well formed" [1].

I thought so, thanks. What does libxml_use_internal_errors do then, if
it doesn't allow me to handle those problems in my own code?
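
For what it's worth, here is a minimal sketch of what libxml_use_internal_errors(true) seems to be for. It stops libxml from raising PHP warnings and buffers the errors so libxml_get_errors() can hand them back, but it does not make the parser accept malformed input. The helper name collectXmlErrors and the sample string are made up for illustration, and simplexml_load_string is used here because it returns false on failure.

```php
<?php
// Hypothetical helper: parse a string and report the libxml errors it produced.
function collectXmlErrors(string $input): array
{
    // Buffer libxml errors instead of raising warnings; remember the old setting.
    $previous = libxml_use_internal_errors(true);
    libxml_clear_errors(); // start from an empty buffer

    // Returns false when the input is not well formed.
    $xml = simplexml_load_string($input);

    $messages = [];
    foreach (libxml_get_errors() as $error) { // each item is a LibXMLError object
        $messages[] = sprintf(
            'line %d, column %d: %s',
            $error->line,
            $error->column,
            trim($error->message)
        );
    }

    libxml_clear_errors();
    libxml_use_internal_errors($previous); // restore the caller's setting

    return ['ok' => $xml !== false, 'errors' => $messages];
}

// Sample input with a mismatched tag (an assumption, not the poster's data).
$result = collectXmlErrors('<p>unclosed <b>tag</p>');
foreach ($result['errors'] as $message) {
    echo $message, PHP_EOL;
}
```

So the buffered errors tell you why the parse failed, which is the "hint of why not" the exception message lacks. They don't let you carry on with a half-parsed document.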

> So, if it is junk, you can't
> read it using SimpleXML as the XML is not well formed.

I'm trying to just use xml_parse and so on now.

This problem really should be *so* easy. In fact I've already solved it once X(

> Try putting it through Tidy first - that is, tidy the file first.

Ha ha!

Sorry.

It's almost certainly not available. I don't want to talk about it
*cries*
From: Marc Guay on
> libxml_use_internal_errors(true);
> $this->xml = new SimpleXMLElement($this->htmlString);

Hi Gary,

I have code that looks like this:

libxml_use_internal_errors(true);
$xml = simplexml_load_string($val);
$errors = libxml_get_errors();

if ($errors) {
    // handle the parse errors, then call libxml_clear_errors()
} else {
    // use $xml
}

which works fine. Not sure if that's helpful to you, but it seems
like it might.

Marc
From: "Gary ." on
Marc Guay writes:
>> libxml_use_internal_errors(true);
>> $this->xml = new SimpleXMLElement($this->htmlString);

> I have code that looks like this:
>
> libxml_use_internal_errors(true);
> $xml = simplexml_load_string($val);

Yeah. I tried simplexml_load_string and found that "worked" (in that it
didn't cause an exception - there are errors which caused the conversion
not to work). I wonder what the difference is between doing "new
SimpleXMLElement" and calling simplexml_load_string which results in the
libxml_use_internal_errors call being ineffective. Odd.
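
As far as I can tell, the difference is documented behaviour and not the libxml setting being ignored. The SimpleXMLElement constructor throws an Exception when the data cannot be parsed, while simplexml_load_string returns false. libxml_use_internal_errors(true) only controls the warnings, so the exception still has to be caught. A rough sketch, where tryParse and the sample input are hypothetical names for illustration:

```php
<?php
// Hypothetical helper: build a SimpleXMLElement, or return null plus the reasons.
function tryParse(string $input): array
{
    $previous = libxml_use_internal_errors(true); // suppress warnings, buffer errors
    libxml_clear_errors();

    $xml = null;
    try {
        $xml = new SimpleXMLElement($input); // throws if the string is not well formed
    } catch (Exception $e) {
        // The exception message is generic; the details are in the libxml buffer.
    }

    // Pull the human-readable messages out of the buffered LibXMLError objects.
    $reasons = array_map(
        function ($error) {
            return trim($error->message);
        },
        libxml_get_errors()
    );

    libxml_clear_errors();
    libxml_use_internal_errors($previous); // restore the caller's setting

    return [$xml, $reasons];
}

// Sample input with an unclosed <p> (an assumption, not the poster's data).
list($xml, $reasons) = tryParse('<div><p>broken</div>');
if ($xml === null) {
    echo implode(PHP_EOL, $reasons), PHP_EOL;
}
```

Either way the document still has to be well formed before SimpleXML will hand back an object. The try/catch just moves the failure into user code.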