| Summary: | WTF::StringView::split should have an allowEmptyEntries flag | ||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Product: | WebKit | Reporter: | Ross Kirsling <ross.kirsling> | ||||||||||
| Component: | Web Template Framework | Assignee: | Ross Kirsling <ross.kirsling> | ||||||||||
| Status: | RESOLVED FIXED | ||||||||||||
| Severity: | Normal | CC: | annulen, benjamin, cdumez, cmarcelo, commit-queue, darin, dbates, ews-watchlist, Hironori.Fujii, webkit-bug-importer | ||||||||||
| Priority: | P2 | Keywords: | InRadar | ||||||||||
| Version: | WebKit Nightly Build | ||||||||||||
| Hardware: | Unspecified | ||||||||||||
| OS: | Unspecified | ||||||||||||
| See Also: | https://bugs.webkit.org/show_bug.cgi?id=187963 | ||||||||||||
| Bug Depends on: | |||||||||||||
| Bug Blocks: | 187859 | ||||||||||||
| Attachments: |
|
||||||||||||
|
Description
Ross Kirsling
2018-07-20 10:39:20 PDT
Created attachment 345461 [details]
Patch
I think this is not the best API, because by looking at call site like
a.split(' ', true)
you won't have a slightest idea what does it mean until you look up implementation. In Qt we call this "boolean parameter trap" [1]. It would be better to use enum value, or other kind of named and typed flag
[1] https://wiki.qt.io/API_Design_Principles#The_Boolean_Parameter_Trap
(In reply to Konstantin Tokarev from comment #2) > I think this is not the best API, because by looking at call site like > > a.split(' ', true) > > you won't have a slightest idea what does it mean until you look up > implementation. In Qt we call this "boolean parameter trap" [1]. It would be > better to use enum value, or other kind of named and typed flag > > [1] https://wiki.qt.io/API_Design_Principles#The_Boolean_Parameter_Trap That's a fair point, but I was assuming we would want the API to align with String::split: https://github.com/WebKit/webkit/blob/master/Source/WTF/wtf/text/WTFString.cpp#L705 I think it would be better to fix String::split as well, though the latter is a bit more readable: split(separator, result) vs split(separator, true, result) Created attachment 345478 [details]
Patch
Created attachment 345504 [details]
Patch
Assertion fails in Debug build.
> ASSERTION FAILED: m_position < m_result.m_string.length()
> c:\webkit\ga\webkitbuild\debug\derivedsources\forwardingheaders\wtf\text\stringview.h(946) : WTF::StringView::SplitResult::Iterator::operator *
Created attachment 345579 [details]
Patch
(In reply to Fujii Hironori from comment #7) > Assertion fails in Debug build. > > > ASSERTION FAILED: m_position < m_result.m_string.length() > > c:\webkit\ga\webkitbuild\debug\derivedsources\forwardingheaders\wtf\text\stringview.h(946) : WTF::StringView::SplitResult::Iterator::operator * Whoops, thanks! Assertion updated. Comment on attachment 345579 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=345579&action=review > Source/WTF/wtf/text/StringView.h:953 > + return m_position == other.m_position && m_isDone == other.m_isDone; This looks suspicious to me. 2 iterators may have same position but differ in m_isDone value only if they have result.m_allowEmptyEntries Shouldn't we compare that value instead of m_isDone? May iterators with different allowEmptyEntries be ever considered equal? What is the usage context of this operator? Comment on attachment 345579 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=345579&action=review >> Source/WTF/wtf/text/StringView.h:953 >> + return m_position == other.m_position && m_isDone == other.m_isDone; > > This looks suspicious to me. 2 iterators may have same position but differ in m_isDone value only if they have result.m_allowEmptyEntries > > Shouldn't we compare that value instead of m_isDone? May iterators with different allowEmptyEntries be ever considered equal? What is the usage context of this operator? Actually, it seems that m_isDone affects control flow only when assers are enabled, this means that it should be guarded with #ifndef NDEBUG, and it should be excluded from operator== (In reply to Konstantin Tokarev from comment #11) > Comment on attachment 345579 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=345579&action=review > > >> Source/WTF/wtf/text/StringView.h:953 > >> + return m_position == other.m_position && m_isDone == other.m_isDone; > > > > This looks suspicious to me. 2 iterators may have same position but differ in m_isDone value only if they have result.m_allowEmptyEntries > > > > Shouldn't we compare that value instead of m_isDone? May iterators with different allowEmptyEntries be ever considered equal? What is the usage context of this operator? > > Actually, it seems that m_isDone affects control flow only when assers are > enabled, this means that it should be guarded with #ifndef NDEBUG, and it > should be excluded from operator== No, the asserts are just asserts; I simply forgot to update them, as I was in Release... m_isDone exists to prevent having begin() == end() one iteration too early when the input string is either empty or ends in a separator. That is, when we reach the point where m_position == length(), we still have one more empty entry that must be returned. Thus m_position alone cannot suffice as an end condition. Comment on attachment 345579 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=345579&action=review >>>> Source/WTF/wtf/text/StringView.h:953 >>>> + return m_position == other.m_position && m_isDone == other.m_isDone; >>> >>> This looks suspicious to me. 2 iterators may have same position but differ in m_isDone value only if they have result.m_allowEmptyEntries >>> >>> Shouldn't we compare that value instead of m_isDone? May iterators with different allowEmptyEntries be ever considered equal? What is the usage context of this operator? >> >> Actually, it seems that m_isDone affects control flow only when assers are enabled, this means that it should be guarded with #ifndef NDEBUG, and it should be excluded from operator== > > No, the asserts are just asserts; I simply forgot to update them, as I was in Release... > > m_isDone exists to prevent having begin() == end() one iteration too early when the input string is either empty or ends in a separator. That is, when we reach the point where m_position == length(), we still have one more empty entry that must be returned. Thus m_position alone cannot suffice as an end condition. Thanks for explanation, now I have better idea of what's going on here. However, I don't get why empty string (corresponding to result.m_string.isEmpty()) should produce any entries at all - it doesn't have any separators. (In reply to Konstantin Tokarev from comment #13) > Comment on attachment 345579 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=345579&action=review > > >>>> Source/WTF/wtf/text/StringView.h:953 > >>>> + return m_position == other.m_position && m_isDone == other.m_isDone; > >>> > >>> This looks suspicious to me. 2 iterators may have same position but differ in m_isDone value only if they have result.m_allowEmptyEntries > >>> > >>> Shouldn't we compare that value instead of m_isDone? May iterators with different allowEmptyEntries be ever considered equal? What is the usage context of this operator? > >> > >> Actually, it seems that m_isDone affects control flow only when assers are enabled, this means that it should be guarded with #ifndef NDEBUG, and it should be excluded from operator== > > > > No, the asserts are just asserts; I simply forgot to update them, as I was in Release... > > > > m_isDone exists to prevent having begin() == end() one iteration too early when the input string is either empty or ends in a separator. That is, when we reach the point where m_position == length(), we still have one more empty entry that must be returned. Thus m_position alone cannot suffice as an end condition. > > Thanks for explanation, now I have better idea of what's going on here. > However, I don't get why empty string (corresponding to > result.m_string.isEmpty()) should produce any entries at all - it doesn't > have any separators. I'm just following the standard behavior of string split in every language I am aware of (including WTF::String::split and boost::algorithm::split). The rationale is that we want "abc".split(',') to return { "abc" } (just the original string), so that sort of implies that "".split(',') should return { "" }. Comment on attachment 345579 [details] Patch Clearing flags on attachment: 345579 Committed r234122: <https://trac.webkit.org/changeset/234122> All reviewed patches have been landed. Closing bug. I agree that using a boolean here is not good style and I am unhappy to see this pattern repeated from WTF::String rather than fixing it in both places. (In reply to Darin Adler from comment #18) > I agree that using a boolean here is not good style and I am unhappy to see > this pattern repeated from WTF::String rather than fixing it in both places. Oh, I see, we used a named argument instead of a boolean -- better. However, I think we should simply have used a new function named splitAllowingEmptyEntries. (In reply to Darin Adler from comment #19) > (In reply to Darin Adler from comment #18) > > I agree that using a boolean here is not good style and I am unhappy to see > > this pattern repeated from WTF::String rather than fixing it in both places. > > Oh, I see, we used a named argument instead of a boolean -- better. However, > I think we should simply have used a new function named > splitAllowingEmptyEntries. I'd be happy to address this for String and StringView alike in a new ticket -- would that be okay? (In reply to Ross Kirsling from comment #20) > I'd be happy to address this for String and StringView alike in a new ticket > -- would that be okay? Created https://bugs.webkit.org/show_bug.cgi?id=187963 for this purpose. |