<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE bugzilla SYSTEM "https://bugs.webkit.org/page.cgi?id=bugzilla.dtd">

<bugzilla version="5.0.4.1"
          urlbase="https://bugs.webkit.org/"
          
          maintainer="admin@webkit.org"
>

    <bug>
          <bug_id>43085</bug_id>
          
          <creation_ts>2010-07-27 15:07:45 -0700</creation_ts>
          <short_desc>libxml2 parser has a large performance overhead</short_desc>
          <delta_ts>2011-06-29 06:16:27 -0700</delta_ts>
          <reporter_accessible>1</reporter_accessible>
          <cclist_accessible>1</cclist_accessible>
          <classification_id>1</classification_id>
          <classification>Unclassified</classification>
          <product>WebKit</product>
          <component>XML</component>
          <version>528+ (Nightly build)</version>
          <rep_platform>All</rep_platform>
          <op_sys>All</op_sys>
          <bug_status>NEW</bug_status>
          <resolution></resolution>
          
          
          <bug_file_loc></bug_file_loc>
          <status_whiteboard></status_whiteboard>
          <keywords></keywords>
          <priority>P2</priority>
          <bug_severity>Normal</bug_severity>
          <target_milestone>---</target_milestone>
          <dependson>45735</dependson>
          <dependson>52036</dependson>
          <dependson>41427</dependson>
          <dependson>45488</dependson>
          <dependson>45594</dependson>
          <dependson>45990</dependson>
          <dependson>50516</dependson>
          <dependson>50517</dependson>
          
          <everconfirmed>1</everconfirmed>
          <reporter name="Patrick R. Gansterer">paroga</reporter>
          <assigned_to name="Nobody">webkit-unassigned</assigned_to>
          <cc>annulen</cc>
          <cc>ap</cc>
          <cc>darin</cc>
          <cc>eric</cc>
          <cc>mrowe</cc>
          

      

      

      

          <comment_sort_order>oldest_to_newest</comment_sort_order>  
          <long_desc isprivate="0" >
    <commentid>256876</commentid>
    <comment_count>0</comment_count>
    <who name="Patrick R. Gansterer">paroga</who>
    <bug_when>2010-07-27 15:07:45 -0700</bug_when>
    <thetext>The current implementation of the XMLParser leaves much room for performance improvements.

An expat-based XMLParser (see bug 41427) showed up to 25% less parsing time:
            libxml2        expat     percent
 5MB SVG:  0.7183sec     0.5356sec    -25%
10MB SVG:  1.6084sec     1.2298sec    -24%
20MB SVG:  5.4084sec     4.6952sec    -13%</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>258608</commentid>
    <comment_count>1</comment_count>
    <who name="Eric Seidel (no email)">eric</who>
    <bug_when>2010-07-31 09:56:27 -0700</bug_when>
    <thetext>Long ago we used Expat. I don&apos;t remember why we switched to libxml2.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>258609</commentid>
    <comment_count>2</comment_count>
    <who name="Patrick R. Gansterer">paroga</who>
    <bug_when>2010-07-31 09:58:07 -0700</bug_when>
    <thetext>(In reply to comment #1)
&gt; Long ago we used Expat. I don&apos;t remember why we switched to libxml2.
Because expat has no XSLT support?</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>258610</commentid>
    <comment_count>3</comment_count>
    <who name="Eric Seidel (no email)">eric</who>
    <bug_when>2010-07-31 10:01:58 -0700</bug_when>
    <thetext>If you&apos;re interested in this question, I suggest reading the svn logs in the xml directory in webcore. Trac.webkit.org.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>258615</commentid>
    <comment_count>4</comment_count>
    <who name="Patrick R. Gansterer">paroga</who>
    <bug_when>2010-07-31 10:14:41 -0700</bug_when>
    <thetext>(In reply to comment #3)
&gt; If you&apos;re interested in this question, I suggest reading the svn logs in the xml directory in webcore. Trac.webkit.org.
Wow, that&apos;s really old code. ;-)
I don&apos;t think that expat will be better than libxml. IMHO only a &quot;native&quot; WebKit parser can avoid the time-consuming memcpy/strcpy that any 3rdparty parser has. My expat implementation avoids the UTF16-&gt;UTF8-&gt;UTF16 conversion of the libxml implementation, but there are unnecessary memcpy calls in the expat code anyway.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>429543</commentid>
    <comment_count>5</comment_count>
    <who name="Konstantin Tokarev">annulen</who>
    <bug_when>2011-06-29 05:42:56 -0700</bug_when>
    <thetext>&gt;IMHO only a &quot;native&quot; WebKit parser can avoid the time-consuming memcpy/strcpy that any 3rdparty parser has.

Rapidxml does not have any memcpy/strcpy calls</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>429558</commentid>
    <comment_count>6</comment_count>
    <who name="Patrick R. Gansterer">paroga</who>
    <bug_when>2011-06-29 06:16:27 -0700</bug_when>
    <thetext>(In reply to comment #5)
&gt; &gt;IMHO only a &quot;native&quot; WebKit parser can avoid the time-consuming memcpy/strcpy that any 3rdparty parser has.
&gt; 
&gt; Rapidxml does not have any memcpy/strcpy calls

Rapidxml (like expat) is missing many features, e.g. namespace support. So it&apos;s not a real alternative to libxml2.</thetext>
  </long_desc>
      
      

    </bug>

</bugzilla>