<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE bugzilla SYSTEM "https://bugs.webkit.org/page.cgi?id=bugzilla.dtd">

<bugzilla version="5.0.4.1"
          urlbase="https://bugs.webkit.org/"
          
          maintainer="admin@webkit.org"
>

    <bug>
          <bug_id>194934</bug_id>
          
          <creation_ts>2019-02-21 23:02:46 -0800</creation_ts>
          <short_desc>Spurious find results on many apple.com pages</short_desc>
          <delta_ts>2022-02-09 10:13:49 -0800</delta_ts>
          <reporter_accessible>1</reporter_accessible>
          <cclist_accessible>1</cclist_accessible>
          <classification_id>1</classification_id>
          <classification>Unclassified</classification>
          <product>WebKit</product>
          <component>WebCore Misc.</component>
          <version>WebKit Nightly Build</version>
          <rep_platform>Unspecified</rep_platform>
          <op_sys>Unspecified</op_sys>
          <bug_status>NEW</bug_status>
          <resolution></resolution>
          
          
          <bug_file_loc></bug_file_loc>
          <status_whiteboard></status_whiteboard>
          <keywords>InRadar</keywords>
          <priority>P2</priority>
          <bug_severity>Normal</bug_severity>
          <target_milestone>---</target_milestone>
          
          
          <everconfirmed>1</everconfirmed>
          <reporter name="Tim Horton">thorton</reporter>
          <assigned_to name="Tim Horton">thorton</assigned_to>
          <cc>ap</cc>
          <cc>bdakin</cc>
          <cc>darin</cc>
          <cc>jonlee</cc>
          <cc>koivisto</cc>
          <cc>rniwa</cc>
          <cc>simon.fraser</cc>
          <cc>webkit-bug-importer</cc>
          <cc>wenson_hsieh</cc>
          <cc>zalan</cc>
          <comment_sort_order>oldest_to_newest</comment_sort_order>  
          <long_desc isprivate="0" >
    <commentid>1508903</commentid>
    <comment_count>0</comment_count>
    <who name="Tim Horton">thorton</who>
    <bug_when>2019-02-21 23:02:46 -0800</bug_when>
    <thetext>Apple.com seems to have adopted a technique where they hide accessibility text all over the page in spans with this style:

    clip: rect(1px, 1px, 1px, 1px);
    -webkit-clip-path: inset(0px 0px 99.9% 99.9%);
    clip-path: inset(0px 0px 99.9% 99.9%);
    overflow: hidden;
    height: 1px;
    width: 1px;

If you use find-in-page to search for text inside of one of these elements, we happily return it as a potential result, though the text is not visible, and the find hole and highlight end up being 1x1 px.

I&apos;m not sure of the best way or place to detect this, but making TextIterator&apos;s `fullyClipsContents` return true if the contentSize&apos;s area is &lt;= 1px certainly fixes it (and makes find on these pages feel much more sensible).</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1508904</commentid>
    <comment_count>1</comment_count>
    <who name="Tim Horton">thorton</who>
    <bug_when>2019-02-21 23:03:08 -0800</bug_when>
    <thetext>Steps to Reproduce:

1. Search for &apos;apple&apos; or &apos;iphone&apos; on apple.com</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1509023</commentid>
    <comment_count>2</comment_count>
    <who name="Darin Adler">darin</who>
    <bug_when>2019-02-22 09:52:52 -0800</bug_when>
    <thetext>The heuristic about very small sizes makes sense to me and seems worth doing rather than a lot of soul searching. I think we could consider a threshold even larger than 1px, based on some assumptions about the smallest size for readable text.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1509024</commentid>
    <comment_count>3</comment_count>
    <who name="Darin Adler">darin</who>
    <bug_when>2019-02-22 09:53:43 -0800</bug_when>
    <thetext>What is not so clear to me is which part of this is the key: the small size or the clipping. It would be nice to have the heuristic rely on the minimum clue necessary, so that it works on the widest range of pages.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1509156</commentid>
    <comment_count>4</comment_count>
    <who name="Ryosuke Niwa">rniwa</who>
    <bug_when>2019-02-22 13:43:06 -0800</bug_when>
    <thetext>Hm... maybe a 2-3px box would be too small? I think with anything bigger than that, we run the risk of things being visible when zoomed, or of the author intentionally showing really tiny text.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1544785</commentid>
    <comment_count>5</comment_count>
    <who name="Tim Horton">thorton</who>
    <bug_when>2019-06-14 11:27:18 -0700</bug_when>
    <thetext>&lt;rdar://problem/51739857&gt;</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1544810</commentid>
    <comment_count>6</comment_count>
      <attachid>372136</attachid>
    <who name="Tim Horton">thorton</who>
    <bug_when>2019-06-14 13:00:31 -0700</bug_when>
    <thetext>Created attachment 372136
Patch</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1544821</commentid>
    <comment_count>7</comment_count>
    <who name="Tim Horton">thorton</who>
    <bug_when>2019-06-14 13:28:25 -0700</bug_when>
    <thetext>Hilariously, the overlay test failures are legit, because that test puts its log content in a &quot;position: absolute; height: 1px; width: 1px; overflow: hidden;&quot; div (for some reason), and now TextIterator decides not to dump that text when dumpAsText-ing the layout test.

Not really sure we can change global TextIterator behavior that much (Darin? Ryosuke? smfr is not on board anymore)</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1544823</commentid>
    <comment_count>8</comment_count>
    <who name="Tim Horton">thorton</who>
    <bug_when>2019-06-14 13:29:51 -0700</bug_when>
    <thetext>The other option is to add a TextIteratorOption that find uses, but it seems weiiiird for plainText() and find to include different text.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1544868</commentid>
    <comment_count>9</comment_count>
    <who name="Darin Adler">darin</who>
    <bug_when>2019-06-14 15:27:46 -0700</bug_when>
    <thetext>(In reply to Tim Horton from comment #7)
&gt; Hilariously, the overlay test failures are legit, because that test puts its
&gt; log content in a &quot;position: absolute; height: 1px; width: 1px; overflow:
&gt; hidden;&quot; div (for some reason), and now TextIterator decides not to dump
&gt; that text when dumpAsText-ing the layout test.
&gt; 
&gt; Not really sure we can change global TextIterator behavior that much (Darin?
&gt; Ryosuke? smfr is not on board anymore)

I think this kind of change is OK. I can’t think of any concrete use of TextIterator where iterating this invisible text is a plus. We should change those tests to not depend on this.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1544871</commentid>
    <comment_count>10</comment_count>
    <who name="Darin Adler">darin</who>
    <bug_when>2019-06-14 15:31:47 -0700</bug_when>
    <thetext>(In reply to Tim Horton from comment #8)
&gt; The other option is to add a TextIteratorOption that find uses, but it seems
&gt; weiiiird for plainText() and find to include different text.

We need to go through the different TextIterator use types. There are probably still fewer than 10. I doubt we can find any where it’s a benefit to include this kind of invisible text.

The use of TextIterator in dumpAsText is probably the least &quot;legit&quot; use of it. It&apos;s basically designed for use in Copy and Find, and then we found many uses for that in text editing as well.

If there’s someone arguing that such invisible text *should* continue to be made visible when you copy and paste, I suggest that’s what we should discuss, not the abstract &quot;what should a text iterator do&quot;.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1544889</commentid>
    <comment_count>11</comment_count>
    <who name="Ryosuke Niwa">rniwa</who>
    <bug_when>2019-06-14 16:15:01 -0700</bug_when>
    <thetext>TextIterator is also used to implement innerText for now, and that API&apos;s behavior shouldn&apos;t change.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1544891</commentid>
    <comment_count>12</comment_count>
    <who name="Tim Horton">thorton</who>
    <bug_when>2019-06-14 16:20:53 -0700</bug_when>
    <thetext>(In reply to Ryosuke Niwa from comment #11)
&gt; TextIterator is also used to implement innerText for now, and that API&apos;s
&gt; behavior shouldn&apos;t change.

Hmm, I assumed that innerText set TextIteratorIgnoresStyleVisibility, but it does not.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1544962</commentid>
    <comment_count>13</comment_count>
    <who name="Antti Koivisto">koivisto</who>
    <bug_when>2019-06-14 23:08:21 -0700</bug_when>
    <thetext>Time for another flag!</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1545045</commentid>
    <comment_count>14</comment_count>
    <who name="Darin Adler">darin</who>
    <bug_when>2019-06-15 13:02:32 -0700</bug_when>
    <thetext>Seems risky that innerText is possibly important for website compatibility but it’s not precisely specified.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1545075</commentid>
    <comment_count>15</comment_count>
    <who name="Darin Adler">darin</who>
    <bug_when>2019-06-15 18:18:12 -0700</bug_when>
    <thetext>OK, so if innerText does not set TextIteratorIgnoresStyleVisibility:

1) I think it should.
2) I think then that changing to treat 1x1 text as invisible is a good idea anyway.
3) But why would any of these need to be web-visible?

I do not want to hold off on fixing this bug forever just because of innerText!</thetext>
  </long_desc>
      
          <attachment
              isobsolete="0"
              ispatch="1"
              isprivate="0"
          >
            <attachid>372136</attachid>
            <date>2019-06-14 13:00:31 -0700</date>
            <delta_ts>2019-06-14 13:29:59 -0700</delta_ts>
            <desc>Patch</desc>
            <filename>bug-194934-20190614130030.patch</filename>
            <type>text/plain</type>
            <size>3105</size>
            <attacher name="Tim Horton">thorton</attacher>
            
              <data encoding="base64">U3VidmVyc2lvbiBSZXZpc2lvbjogMjQ2NDM5CmRpZmYgLS1naXQgYS9Tb3VyY2UvV2ViQ29yZS9D
aGFuZ2VMb2cgYi9Tb3VyY2UvV2ViQ29yZS9DaGFuZ2VMb2cKaW5kZXggMGFhNmQwODdjMjY4ODMz
NGJjN2VhMmE0NjM5ODA0ZmFmNTUyN2U5Zi4uMjY4N2FlYmQyYWI0MWEwZGY0NGM2ZDAwZmIwM2Y0
NTJjMzExZGEzOSAxMDA2NDQKLS0tIGEvU291cmNlL1dlYkNvcmUvQ2hhbmdlTG9nCisrKyBiL1Nv
dXJjZS9XZWJDb3JlL0NoYW5nZUxvZwpAQCAtMSwzICsxLDE3IEBACisyMDE5LTA2LTE0ICBUaW0g
SG9ydG9uICA8dGltb3RoeV9ob3J0b25AYXBwbGUuY29tPgorCisgICAgICAgIFNwdXJpb3VzIGZp
bmQgcmVzdWx0cyBvbiBtYW55IGFwcGxlLmNvbSBwYWdlcworICAgICAgICBodHRwczovL2J1Z3Mu
d2Via2l0Lm9yZy9zaG93X2J1Zy5jZ2k/aWQ9MTk0OTM0CisgICAgICAgIDxyZGFyOi8vcHJvYmxl
bS81MTczOTg1Nz4KKworICAgICAgICBSZXZpZXdlZCBieSBOT0JPRFkgKE9PUFMhKS4KKworICAg
ICAgICBOZXcgQVBJIHRlc3Q6IFdlYktpdC5Eb05vdEZpbmRNb3N0bHlDbGlwcGVkT3V0VGV4dC4K
KworICAgICAgICAqIGVkaXRpbmcvVGV4dEl0ZXJhdG9yLmNwcDoKKyAgICAgICAgKFdlYkNvcmU6
OmZ1bGx5Q2xpcHNDb250ZW50cyk6CisgICAgICAgIEhhdmUgVGV4dEl0ZXJhdG9yIGNvbnNpZGVy
IGEgMXgxIGNsaXAgYXMgImZ1bGx5IGNsaXBwZWQgb3V0IiB0ZXh0LgorCiAyMDE5LTA2LTE0ICBT
YWFtIEJhcmF0aSAgPHNiYXJhdGlAYXBwbGUuY29tPgogCiAgICAgICAgIFVucmV2aWV3ZWQuIEZv
bGxvdyB1cCB0byByMjQ2NDM4LiBUaGlzIHJlbW92ZXMgYSBkZWJ1ZyBhc3NlcnQgdW50aWwKZGlm
ZiAtLWdpdCBhL1NvdXJjZS9XZWJDb3JlL2VkaXRpbmcvVGV4dEl0ZXJhdG9yLmNwcCBiL1NvdXJj
ZS9XZWJDb3JlL2VkaXRpbmcvVGV4dEl0ZXJhdG9yLmNwcAppbmRleCAyMzQxNGEyOTU2NDQwMGU4
NDdjNDUxNWJhMjlhOTY5ZWU4ZGIyYTgxLi43YTk4MzM5YTUzNzcxYjliMzcxMzRmOGFkNTQwYTNk
ZDUzNDNlNGFiIDEwMDY0NAotLS0gYS9Tb3VyY2UvV2ViQ29yZS9lZGl0aW5nL1RleHRJdGVyYXRv
ci5jcHAKKysrIGIvU291cmNlL1dlYkNvcmUvZWRpdGluZy9UZXh0SXRlcmF0b3IuY3BwCkBAIC0y
MjAsNyArMjIwLDcgQEAgc3RhdGljIGlubGluZSBib29sIGZ1bGx5Q2xpcHNDb250ZW50cyhOb2Rl
JiBub2RlKQogICAgIGlmIChpczxIVE1MVGV4dEFyZWFFbGVtZW50Pihub2RlKSkKICAgICAgICAg
cmV0dXJuIGJveC5zaXplKCkuaXNFbXB0eSgpOwogCi0gICAgcmV0dXJuIGJveC5jb250ZW50U2l6
ZSgpLmlzRW1wdHkoKTsKKyAgICByZXR1cm4gRmxvYXRTaXplKGJveC5jb250ZW50U2l6ZSgpKS5h
cmVhKCkgPD0gMTsKIH0KIAogc3RhdGljIGlubGluZSBib29sIGlnbm9yZXNDb250YWluZXJDbGlw
KE5vZGUmIG5vZGUpCmRpZmYgLS1naXQgYS9Ub29scy9DaGFuZ2VMb2cgYi9Ub29scy9DaGFuZ2VM
b2cKaW5kZXggZTNhYTRlOTM3OWI3MGI3NGI3MjZjYTk5MzBjZjM1NGFhZWY5YjdjZS4uMzE2NWE0
MzQ5MmU3MWZjY2Q1NDBlNTI1NzUyMmQ5ZDAzN2IzMGQ1OSAxMDA2NDQKLS0tIGEvVG9vbHMvQ2hh
bmdlTG9nCisrKyBiL1Rvb2xzL0NoYW5nZUxvZwpAQCAtMSwzICsxLDE0IEBACisyMDE5LTA2LTE0
ICBUaW0gSG9ydG9uICA8dGltb3RoeV9ob3J0b25AYXBwbGUuY29tPgorCisgICAgICAgIFNwdXJp
b3VzIGZpbmQgcmVzdWx0cyBvbiBtYW55IGFwcGxlLmNvbSBwYWdlcworICAgICAgICBodHRwczov
L2J1Z3Mud2Via2l0Lm9yZy9zaG93X2J1Zy5jZ2k/aWQ9MTk0OTM0CisgICAgICAgIDxyZGFyOi8v
cHJvYmxlbS81MTczOTg1Nz4KKworICAgICAgICBSZXZpZXdlZCBieSBOT0JPRFkgKE9PUFMhKS4K
KworICAgICAgICAqIFRlc3RXZWJLaXRBUEkvVGVzdHMvV2ViS2l0Q29jb2EvRmluZEluUGFnZS5t
bToKKyAgICAgICAgKFRFU1QpOgorCiAyMDE5LTA2LTE0ICBZb3Vlbm4gRmFibGV0ICA8eW91ZW5u
QGFwcGxlLmNvbT4KIAogICAgICAgICBpbXBvcnQtdzNjLXRlc3RzIHNob3VsZCByZXNwZWN0IFdF
QktJVF9PVVRQVVRESVIKZGlmZiAtLWdpdCBhL1Rvb2xzL1Rlc3RXZWJLaXRBUEkvVGVzdHMvV2Vi
S2l0Q29jb2EvRmluZEluUGFnZS5tbSBiL1Rvb2xzL1Rlc3RXZWJLaXRBUEkvVGVzdHMvV2ViS2l0
Q29jb2EvRmluZEluUGFnZS5tbQppbmRleCA0MmQwNzZjNDUzODFjNTc3MTFkN2M1MWY3ODAxMmY2
NDlhZWY4NzVmLi5iY2QxZTczZThmYjZjZWEyNTZiZDZmMTEzOTZlMDM5MThiMDdlMjY5IDEwMDY0
NAotLS0gYS9Ub29scy9UZXN0V2ViS2l0QVBJL1Rlc3RzL1dlYktpdENvY29hL0ZpbmRJblBhZ2Uu
bW0KKysrIGIvVG9vbHMvVGVzdFdlYktpdEFQSS9UZXN0cy9XZWJLaXRDb2NvYS9GaW5kSW5QYWdl
Lm1tCkBAIC0yNDgsNCArMjQ4LDE1IEBAIFRFU1QoV2ViS2l0LCBGaW5kQW5kUmVwbGFjZSkKICAg
ICBFWFBFQ1RfV0tfU1RSRVEoImhpIGhpIiwgW3dlYlZpZXcgc3RyaW5nQnlFdmFsdWF0aW5nSmF2
YVNjcmlwdDpAImRvY3VtZW50LmJvZHkudGV4dENvbnRlbnQiXSk7CiB9CiAKK1RFU1QoV2ViS2l0
LCBEb05vdEZpbmRNb3N0bHlDbGlwcGVkT3V0VGV4dCkKK3sKKyAgICBSZXRhaW5QdHI8VGVzdFdL
V2ViVmlldz4gd2ViVmlldyA9IGFkb3B0TlMoW1tUZXN0V0tXZWJWaWV3IGFsbG9jXSBpbml0V2l0
aEZyYW1lOk5TTWFrZVJlY3QoMCwgMCwgMjAwLCAyMDApXSk7CisKKyAgICBbd2ViVmlldyBzeW5j
aHJvbm91c2x5TG9hZEhUTUxTdHJpbmc6QCI8ZGl2IHN0eWxlPSd3aWR0aDogMXB4OyBoZWlnaHQ6
IDFweDsgb3ZlcmZsb3c6IGhpZGRlbjsnPkJpcnRoZGF5PC9kaXY+PGRpdj5CaXJ0aGRheTwvZGl2
PiJdOworCisgICAgLy8gU2hvdWxkIG9ubHkgZmluZCBvbmUgbWF0Y2g7IHRoZSBmaXJzdCBvbmUg
aXMgY2xpcHBlZCBvdXQgZW5vdWdoIHRoYXQgaXQgZG9lc24ndCBjb3VudC4KKyAgICBhdXRvIHJl
c3VsdCA9IGZpbmRNYXRjaGVzKHdlYlZpZXcuZ2V0KCksIEAiQmlydGhkYXkiKTsKKyAgICBFWFBF
Q1RfRVEoKE5TVUludGVnZXIpMSwgW3Jlc3VsdC5tYXRjaGVzIGNvdW50XSk7Cit9CisKICNlbmRp
ZiAvLyAhUExBVEZPUk0oSU9TX0ZBTUlMWSkK
</data>
<flag name="review"
          id="387900"
          type_id="1"
          status="-"
          setter="simon.fraser"
    />
          </attachment>
      

    </bug>

</bugzilla>