<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE bugzilla SYSTEM "https://bugs.webkit.org/page.cgi?id=bugzilla.dtd">

<bugzilla version="5.0.4.1"
          urlbase="https://bugs.webkit.org/"
          
          maintainer="admin@webkit.org"
>

    <bug>
          <bug_id>21977</bug_id>
          
          <creation_ts>2008-10-30 11:26:01 -0700</creation_ts>
          <short_desc>KURL should prohibit most escape sequences in hostnames</short_desc>
          <delta_ts>2023-05-22 03:47:21 -0700</delta_ts>
          <reporter_accessible>1</reporter_accessible>
          <cclist_accessible>1</cclist_accessible>
          <classification_id>1</classification_id>
          <classification>Unclassified</classification>
          <product>WebKit</product>
          <component>Platform</component>
          <version>528+ (Nightly build)</version>
          <rep_platform>All</rep_platform>
          <op_sys>All</op_sys>
          <bug_status>RESOLVED</bug_status>
          <resolution>INVALID</resolution>
          
          
          <bug_file_loc></bug_file_loc>
          <status_whiteboard></status_whiteboard>
          <keywords></keywords>
          <priority>P2</priority>
          <bug_severity>Normal</bug_severity>
          <target_milestone>---</target_milestone>
          
          <blocked>37641</blocked>
          <everconfirmed>1</everconfirmed>
          <reporter name="Brett Wilson (Google)">brettw</reporter>
          <assigned_to name="Nobody">webkit-unassigned</assigned_to>
          <cc>abarth</cc>
    
    <cc>annevk</cc>
          

      

      

      

          <comment_sort_order>oldest_to_newest</comment_sort_order>  
          <long_desc isprivate="0" >
    <commentid>97140</commentid>
    <comment_count>0</comment_count>
    <who name="Brett Wilson (Google)">brettw</who>
    <bug_when>2008-10-30 11:26:01 -0700</bug_when>
    <thetext>KURL allows hostnames such as &quot;hello%03world&quot; or even more scarily &quot;hello%00world&quot; or &quot;hello%2fworld&quot; (which will unescape to &quot;hello/world&quot;).

If the URL is extracted and unescaped (many of the component getters unescape by default, including host()) and passed to another system, such as the native OS&apos;s URL object, it could be treated as a completely different URL, subject to a different security policy.

Google Chrome uses the lookup table at the top of this file:
http://code.google.com/p/google-url/source/browse/trunk/src/url_canon_host.cc
Characters marked with &quot;kEsc&quot; are allowed to be escaped, while characters marked with 0 are disallowed in hostnames whether escaped or unescaped. This table prohibits control characters, characters that may change the parsing of the URL if unescaped (such as /, ?, and #), and NULL. I think KURL needs to do the same.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1956899</commentid>
    <comment_count>1</comment_count>
    <who name="Anne van Kesteren">annevk</who>
    <bug_when>2023-05-22 03:47:21 -0700</bug_when>
    <thetext>KURL is gone.</thetext>
  </long_desc>
      
      

    </bug>

</bugzilla>