From: "Stephan Rose" on 16 Feb 2007 21:28

Question regarding the wxXmlDocument.

I was trying it out earlier today and it essentially works quite well. Loaded the XML file 100% correctly, no problems there.

The only thing was... it takes several minutes to load the file, which is about 25 megs in size.

While I know that processing the data in the entire file will easily take me several minutes, having the wxXmlDocument load take that long is somewhat problematic, as the call blocks any ability to show any kind of progress.

Is there any way to get the class to load the data on the fly as I iterate through the child nodes?

Thanks,

Stephan

From: Yuri Borsky on 17 Feb 2007 03:41

"Stephan Rose" <kermos(a)somrek.net> wrote in message: 000f01c7523b$4753f0a0$6402a8c0(a)stephan...

> The only thing was... it takes several minutes to load the file which is about
> 25 megs in size.

You may want to look at TinyXML, a fast and simple XML parser. While it lacks many useful features, like checking for DTD correctness, it is actually much faster.

> Is there any way to get the class to load the data on the fly as I iterate
> through the child nodes?

Is it actually possible with XML? I mean, this is exactly one of the big problems of XML: in order to process any node you must start from the top-level one, and you won't have it until you have the entire XML tree loaded and verified according to a DTD or schema.

Sometimes it is actually much faster to process XML like normal text. Say your task is to count the occurrences of a given node to get some statistics. There you may just look for the corresponding string while processing the XML text line by line. But most of the time, yes, first you load _and_verify_ the entire XML tree and only then do you get to process it.

> Thanks,
>
> Stephan
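
The line-by-line counting idea described above can be sketched in a few lines. This is only an illustration in Python, not code from the thread: the function name `count_start_tags` and the sample data are made up, and the approach assumes you fully control the file format (no comments, CDATA sections, or entities that could contain or produce the tag).

```python
import re


def count_start_tags(lines, tag):
    """Count start tags named `tag` by scanning plain text, without parsing XML."""
    # Match "<tag" only when followed by whitespace, "/", ">" or end of line,
    # so that "<item" does not also match "<itemlist".
    pattern = re.compile(r"<" + re.escape(tag) + r"(?=[\s/>]|$)")
    return sum(len(pattern.findall(line)) for line in lines)


# Hypothetical sample document, already split into lines.
sample = [
    "<catalog>",
    "  <item id='1'/>",
    "  <item id='2'>text</item> <item/>",
    "  <itemlist/>",
    "</catalog>",
]
item_count = count_start_tags(sample, "item")
print(item_count)
```

With a real file you would pass the open file object instead of a list, since iterating over it yields one line at a time and never holds the whole document in memory.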

From: Francesco Montorsi on 17 Feb 2007 07:30

Stephan Rose wrote:
> Question regarding the wxXmlDocument.
>
> I was trying it out earlier today and it essentially works quite well.
> Loaded the XML file 100% correctly, no problems there.
>
> The only thing was... it takes several minutes to load the file which is
> about 25 megs in size.
>
> While I know that processing the data in entire file will easily take me
> several minutes, having the wxXmlDocument load take that long is
> somewhat problematic as the call blocks any ability to show any kind of
> progress.

You should load the document from a secondary thread to avoid blocking your GUI. You could then use wxGauge::Pulse to show that loading is in progress.

> Is there any way to get the class to load the data on the fly as I
> iterate through the child nodes?

You can take a look at libxml2 (http://xmlsoft.org). IIRC it does support that feature, and the wxXml2 component at wxCode wraps it for wxWidgets (even if it does not wrap the load-on-the-fly feature).

HTH,
Francesco

---------------------------------------------------------------------
To unsubscribe, e-mail: wx-users-unsubscribe(a)lists.wxwidgets.org
For additional commands, e-mail: wx-users-help(a)lists.wxwidgets.org
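
The secondary-thread suggestion can be sketched without any GUI toolkit. The following is a minimal Python illustration, not wxWidgets code: `load_in_background`, the generated sample document, and the polling loop are all assumptions made for the example, and the standard library's `xml.etree.ElementTree` stands in for the slow tree-building load.

```python
import threading
import xml.etree.ElementTree as ET


def load_in_background(xml_text):
    """Parse xml_text on a worker thread; return (result dict, done event)."""
    result = {}
    done = threading.Event()

    def worker():
        try:
            # The blocking, tree-building load happens off the main thread.
            result["root"] = ET.fromstring(xml_text)
        except ET.ParseError as exc:
            result["error"] = exc
        finally:
            done.set()

    threading.Thread(target=worker, daemon=True).start()
    return result, done


# Hypothetical document standing in for the 25 MB file.
result, done = load_in_background("<doc>" + "<row/>" * 1000 + "</doc>")

pulses = 0
while not done.wait(timeout=0.05):
    # A GUI program would pulse its progress gauge here instead of counting.
    pulses += 1

row_count = len(result["root"])
print(row_count)
```

In a real GUI the main thread would not sit in a loop like this; it would check the event from a timer or have the worker post a completion event, so that the event loop keeps running. Note that this only keeps the interface responsive. It does not make the load itself any faster, and it can only show that work is ongoing, not how far along it is.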

From: John Ralls on 17 Feb 2007 12:15

On Feb 17, 2007, at 12:41 AM, Yuri Borsky wrote:

> "Stephan Rose" <kermos(a)somrek.net> wrote in message: 000f01c7523b$4753f0a0$6402a8c0(a)stephan...
>
>> Is there any way to get the class to load the data on the fly as I
>> iterate through the child nodes?
>
> Is it actually possible with XML? I mean this is exactly one of the big
> problems of XML: in order to process any node you must start from top-level
> one and you won't have it until you have entire XML tree loaded and verified
> according to DTD or schema.
>
> Sometimes it is actually much faster to process an XML like a normal text.
> Say if your task is to calculate number of given nodes to get some
> statistics. There you may just look for corresp. string while processing XML
> text line by line. But most of the time yes, first you load _and_verify_
> entire XML tree and only then you get to process it.

There are two flavors of XML parsers available. Tree-based parsers, as you describe here, are generally based on the Document Object Model, or DOM. There are also event-based parsers, often, though not always, based on SAX; expat (which is used in wxWidgets internally and is part of the distribution) is an event-based parser which isn't based on SAX. Some of the larger XML support libraries, like Xerces and libxml2, provide both.

Event-based parsers notify the application (often via callbacks) of the beginning, value, and ending of each node as it occurs. It is the application's job to keep track of where it is in the document's tree. One oft-touted benefit is that they do not require the entire document to be in memory, so very large documents may be handled by an event-based parser where a tree-based parser would choke or swap itself to a standstill.

Unless you have tight control over the format of the incoming documents, processing XML as plain text is a bad idea. The DTD or schema may insert additional content via entities and default attributes that plain text processing will not handle correctly.

Regards,
John Ralls
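
To make the event-based model concrete, here is a minimal sketch using Python's standard-library binding to expat (`xml.parsers.expat`), chosen only because it is self-contained; expat's C API has the same shape (`XML_SetElementHandler`, `XML_Parse`). The function name `stream_parse`, the `on_progress` callback, and the sample document are assumptions for the example. The input is fed in chunks, so the caller can report progress between chunks, and a stack of open element names is how the application keeps track of where it is in the tree.

```python
import io
import xml.parsers.expat


def stream_parse(stream, total_size, on_progress, chunk_size=64 * 1024):
    """Feed a binary stream to expat chunk by chunk, counting element paths."""
    path = []    # stack of currently open element names
    counts = {}  # "a/b/c" style path -> number of times it was opened

    def start(name, attrs):
        path.append(name)
        key = "/".join(path)
        counts[key] = counts.get(key, 0) + 1

    def end(name):
        path.pop()

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    parser.EndElementHandler = end

    bytes_read = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        parser.Parse(chunk, False)  # more data will follow
        bytes_read += len(chunk)
        on_progress(bytes_read, total_size)
    parser.Parse(b"", True)  # signal end of document
    return counts


# Hypothetical in-memory document; a real program would open the file in
# binary mode and take total_size from the file size.
data = b"<catalog>" + b"<item><name>x</name></item>" * 500 + b"</catalog>"
progress = []
counts = stream_parse(
    io.BytesIO(data),
    len(data),
    lambda read, total: progress.append(read),
    chunk_size=1024,
)
print(counts["catalog/item"])
```

Only one chunk and the stack of open names are held at any time, so memory use does not grow with the document, and the progress callback gives the caller something real to display while the file is being read.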

From: Yuri Borsky on 17 Feb 2007 14:43

John Ralls <jralls(a)ceridwen.fremont.ca.us> wrote in message: 70B0189C-FEBD-4556-991E-6EF68DF18CB6(a)ceridwen.fremont.ca.us...

> There are two flavors of XML parsers available: Tree-based as you
> describe here, generally are based on the Document Object Model, or
> DOM. There are also event-based parsers, often, though not always,
> based on SAX; expat (which is used in wxWidgets internally and is
> part of the distribution) is an event based parser which isn't based
> on SAX. Some of the larger XML support libraries like Xerces and
> libxml2 provide both.

Thanks for the insight. Now I wonder why wxWidgets uses an event-based parser internally but exposes it tree-like :)

> Unless you have tight control over the format of the incoming
> documents, processing XML as plain text is a bad idea. The DTD or
> schema may insert additional content via entities and default
> attributes that plain text processing will not handle correctly.

That is true. However, sometimes the speed gain you get from raw text processing makes it worth it, and yes, only when appropriate and only when you know what you are doing :)