<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE bugzilla SYSTEM "https://bugs.webkit.org/page.cgi?id=bugzilla.dtd">

<bugzilla version="5.0.4.1"
          urlbase="https://bugs.webkit.org/"
          
          maintainer="admin@webkit.org"
>

    <bug>
          <bug_id>4120</bug_id>
          
          <creation_ts>2005-07-24 02:00:32 -0700</creation_ts>
          <short_desc>Servers that need encoding sniffing to be rendered properly</short_desc>
          <delta_ts>2024-01-22 20:56:33 -0800</delta_ts>
          <reporter_accessible>1</reporter_accessible>
          <cclist_accessible>1</cclist_accessible>
          <classification_id>1</classification_id>
          <classification>Unclassified</classification>
          <product>WebKit</product>
          <component>Layout and Rendering</component>
          <version>312.x</version>
          <rep_platform>Mac</rep_platform>
          <op_sys>OS X 10.3</op_sys>
          <bug_status>RESOLVED</bug_status>
          <resolution>CONFIGURATION CHANGED</resolution>
          
          
          <bug_file_loc>http://www.museum.ru/museum/Ostankino/5.htm</bug_file_loc>
          <status_whiteboard></status_whiteboard>
          <keywords></keywords>
          <priority>P2</priority>
          <bug_severity>Normal</bug_severity>
          <target_milestone>---</target_milestone>
          <dependson>245305</dependson>
          
          <everconfirmed>1</everconfirmed>
          <reporter name="Alexey Proskuryakov">ap</reporter>
          <assigned_to name="Dave Hyatt">hyatt</assigned_to>
          <cc>gavin.sharp</cc>
    
    <cc>ian</cc>
    
    <cc>karlcow</cc>
          

      

      

      

          <comment_sort_order>oldest_to_newest</comment_sort_order>  
          <long_desc isprivate="0" >
    <commentid>15127</commentid>
    <comment_count>0</comment_count>
    <who name="Alexey Proskuryakov">ap</who>
    <bug_when>2005-07-24 02:00:32 -0700</bug_when>
    <thetext>This server (running Microsoft-IIS/5.0) auto-guesses the encoding and sends Mac Cyrillic to Safari. For whatever reason, the charset it declares is broken: &quot;mac&quot; is ambiguous and thus unsupported by WebKit.

Still, it should be possible to disambiguate &quot;mac&quot; by using the Mac encoding of the system&apos;s primary language.
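
A minimal sketch of that disambiguation idea, in Python rather than WebKit code. The function name, the language table, and the MacRoman fallback are all assumptions made for illustration; only the codec names are real Python codecs.

```python
# Hypothetical table: primary language code to a Python Mac codec name.
# The choice of languages here is an assumption, not WebKit behavior.
MAC_CODEC_BY_LANGUAGE = {
    "ru": "mac_cyrillic",
    "bg": "mac_cyrillic",
    "el": "mac_greek",
    "tr": "mac_turkish",
    "is": "mac_iceland",
}


def resolve_mac_charset(label, primary_language):
    """Map a charset label to a codec name, disambiguating a bare "mac"."""
    normalized = label.strip().lower()
    if normalized != "mac":
        # Any other label is passed through unchanged (lowercased).
        return normalized
    # Reduce a locale such as "ru-ru" or "ru_RU" to its language part.
    language = primary_language.replace("_", "-").split("-")[0].lower()
    # Fall back to MacRoman when the language has no entry (assumed default).
    return MAC_CODEC_BY_LANGUAGE.get(language, "mac_roman")


if __name__ == "__main__":
    codec = resolve_mac_charset("mac", "ru-ru")
    print(codec)
```

With a Russian primary language, the charset=mac response shown in the headers here would then be decoded with the returned codec, for example body.decode(resolve_mac_charset("mac", "ru-ru")).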

% curl -I --header &quot;User-Agent: Mozilla/5.0 (Macintosh; U; PPC Mac OS X; ru-ru) Apple WebKit/312.1 (KHTML, like Gecko) Safari/312&quot; http://www.museum.ru/museum/Ostankino/5.htm
HTTP/1.1 200 OK
Server: Microsoft-IIS/5.0
Date: Sun, 24 Jul 2005 08:58:33 GMT
Accept-Ranges: bytes
Last-Modified: Wed, 15 Jun 2005 12:58:15 GMT
ETag: &quot;ed49fce4a971c51:804&quot;
Content-Length: 7318
Set-Cookie: charset=mac; path=/; expires=Mon, 10 May 2032 23:12:40 GMT
Content-Type: text/html; charset=mac</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>15141</commentid>
    <comment_count>1</comment_count>
    <who name="Alexey Proskuryakov">ap</who>
    <bug_when>2005-07-24 13:36:42 -0700</bug_when>
    <thetext>Oops, in fact the &quot;mac&quot; and &quot;macintosh&quot; charsets are defined in RFC 1345 (as MacRoman), and WebKit explicitly supports them.

So the implementation is correct and probably shouldn&apos;t be changed. However, this example may need to be considered in a future encoding sniffer, since museum.ru is a rather important server.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>21334</commentid>
    <comment_count>2</comment_count>
    <who name="Alexey Proskuryakov">ap</who>
    <bug_when>2005-10-04 09:58:50 -0700</bug_when>
    <thetext>I propose using this bug to track servers whose encoding cannot be determined from HTTP or HTML headers, so content sniffing is required. Two more:

http://stats.distributed.net/team/tmsummary.php?project_id=8&amp;team=11269
http://www.mdf.ru</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>27905</commentid>
    <comment_count>3</comment_count>
    <who name="Alexey Proskuryakov">ap</who>
    <bug_when>2006-01-07 01:08:12 -0800</bug_when>
    <thetext>http://www.zoo.ru (also sends charset=mac instead of x-mac-cyrillic).</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>71029</commentid>
    <comment_count>4</comment_count>
    <who name="Alexey Proskuryakov">ap</who>
    <bug_when>2008-02-18 02:44:11 -0800</bug_when>
    <thetext>Bug 17405: http://tianya.cn - no charset information; encoded as Simplified Chinese.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>2006811</commentid>
    <comment_count>5</comment_count>
    <who name="Karl Dubost">karlcow</who>
    <bug_when>2024-01-22 20:56:33 -0800</bug_when>
    <thetext>Status of the sites listed in this bug:

PASS http://www.museum.ru/museum/Ostankino/5.htm
PASS https://stats.distributed.net/team/tmsummary.php?project_id=8&amp;team=11269
PASS http://www.mdf.ru after redirect to https://www.mamm-mdf.ru
ERR  http://www.zoo.ru Domain is for sale.
ERR  http://tianya.cn  Domain not available anymore.

Let&apos;s close this bug, since Bug 245305 covers the Content Sniffing requirements.</thetext>
  </long_desc>
      
      

    </bug>

</bugzilla>