WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
94320
Share WebKit-Gtk's Enchant implementation with others WebKit ports
https://bugs.webkit.org/show_bug.cgi?id=94320
Summary
Share WebKit-Gtk's Enchant implementation with others WebKit ports
Grzegorz Czajkowski
Reported
2012-08-17 02:29:29 PDT
WebKit-Gtk ensures spellchecking with the Enchant library support, mostly implemented at
https://bugs.webkit.org/show_bug.cgi?id=90269
. This code is used in both WebKit-Gtk. The WebKit-Efl's implemention of spellchecking is based on Enchant too. It doesn't have sense to duplicate implementation from text/gtk/TextCheckerEnchant* files for WebKit's Efl. To share WebKit-Gtk's implemnetation with others WebKit, this patch proposes: 1. Move TextCheckerEnchant{h.cpp} files form WebCore/platform/text/gtk/ to WebCore/platform/text/enchant/. 2. Limit Glib's calls as possible for example, GOwnPtr -> WTF::OwnPtr etc.
Attachments
proposed patch
(12.27 KB, patch)
2012-08-21 07:55 PDT
,
Grzegorz Czajkowski
no flags
Details
Formatted Diff
Diff
updated patch
(12.36 KB, patch)
2012-08-22 03:50 PDT
,
Grzegorz Czajkowski
no flags
Details
Formatted Diff
Diff
patch v3
(12.34 KB, patch)
2012-08-23 00:27 PDT
,
Grzegorz Czajkowski
no flags
Details
Formatted Diff
Diff
passing cstring to enchant
(12.26 KB, patch)
2012-08-23 00:42 PDT
,
Grzegorz Czajkowski
no flags
Details
Formatted Diff
Diff
Show Obsolete
(3)
View All
Add attachment
proposed patch, testcase, etc.
Grzegorz Czajkowski
Comment 1
2012-08-21 07:55:49 PDT
Created
attachment 159689
[details]
proposed patch
Carlos Garcia Campos
Comment 2
2012-08-21 09:53:53 PDT
Comment on
attachment 159689
[details]
proposed patch Since the GTK+ port already uses ICU (at least by default) maybe we could use the same code based on ICU and avoid the #ifdefs? And for the language maybe we could use WebCore::defaultLanguage() or any other method in Language.h. What you guys think?
Martin Robinson
Comment 3
2012-08-21 10:01:33 PDT
I think #ifdefs should be avoided here if possible and it seems like it's very possible.
Mario Sanchez Prada
Comment 4
2012-08-21 10:37:47 PDT
Comment on
attachment 159689
[details]
proposed patch View in context:
https://bugs.webkit.org/attachment.cgi?id=159689&action=review
I agree with Martin and Carlos's comments. Additionally, you're missing the fix for
bug 94202
here, thus re-introducing the issue with this patch. See my comment below...
> Source/WebCore/platform/text/enchant/TextCheckerEnchant.cpp:137 > + if (enchant_dict_check(*dictIter, word.utf8().data(), wordLength)) {
You are probably missing the fix for
bug 94202
, committed last week, which is about not calling this function passing wordLength because it will fail for strings with special UTF8 characters, such as "דעפ", for instance. You should pass 'bytes' instead which is what enchant_dict_check actually expects. See
bug 94202
for more details.
> Source/WebCore/platform/text/enchant/TextCheckerEnchant.cpp:203 > + // No dictionaries selected, we get first from the list.
"we get first from the list" -> "we get the first one from the list"
Grzegorz Czajkowski
Comment 5
2012-08-22 03:50:14 PDT
Created
attachment 159895
[details]
updated patch
Grzegorz Czajkowski
Comment 6
2012-08-22 04:03:04 PDT
(In reply to
comment #2
)
> (From update of
attachment 159689
[details]
) > Since the GTK+ port already uses ICU (at least by default) maybe we could use the same code based on ICU and avoid the #ifdefs?
Ok, I used ICU instead of Pango at the patch so any Platform specif code doesn't exist.
> And for the language maybe we could use WebCore::defaultLanguage() or any other method in Language.h. What you guys think?
Good idea. For WebKit-Gtk the method returns "en-US". There is no problem for the Enchant backed, its 'normalize' function is called to properly find the desired dictionary.
Grzegorz Czajkowski
Comment 7
2012-08-22 04:14:52 PDT
Comment on
attachment 159689
[details]
proposed patch View in context:
https://bugs.webkit.org/attachment.cgi?id=159689&action=review
>> Source/WebCore/platform/text/enchant/TextCheckerEnchant.cpp:137 >> + if (enchant_dict_check(*dictIter, word.utf8().data(), wordLength)) { > > You are probably missing the fix for
bug 94202
, committed last week, which is about not calling this function passing wordLength because it will fail for strings with special UTF8 characters, such as "דעפ", for instance. You should pass 'bytes' instead which is what enchant_dict_check actually expects. > > See
bug 94202
for more details.
You're right. This bug shows up that the String::fromUTF8() also requires the number of bytes instead of number of characters. To properly compute the number of bytes from utf8 string I had to use glib's API - g_utf8_offset_to_pointer() as it was done before. Unfortunately it makes TextCheckerEnchant implementation glib independent. Of course for Gtk, Efl it is no problem - we already use it.
>> Source/WebCore/platform/text/enchant/TextCheckerEnchant.cpp:203 >> + // No dictionaries selected, we get first from the list. > > "we get first from the list" -> "we get the first one from the list"
Fixed thanks.
Mario Sanchez Prada
Comment 8
2012-08-22 06:56:35 PDT
Comment on
attachment 159895
[details]
updated patch View in context:
https://bugs.webkit.org/attachment.cgi?id=159895&action=review
Thanks for the update. However, I'm afraid you will need to provide yet another one mainly because a new patch has recently (today!) been committed that affects the base your patch :(. See
bug 94683
for more details. Also, I commented a couple of things on this patch too. Hope you find them interesting / useful...
> Source/WebCore/platform/text/enchant/TextCheckerEnchant.cpp:27 > +#include <glib.h> // For utf8 support.
I don't think this comment is actually needed (also, it's not very common to have inline comments, even less for includes, AFAIK)
> Source/WebCore/platform/text/enchant/TextCheckerEnchant.cpp:101 > + const char* cstart = cString + start;
Shouldn't you use g_utf8_offset_to_pointer() here too?
> Source/WebCore/platform/text/enchant/TextCheckerEnchant.cpp:106 > + if (enchant_dict_check(*dictIter, word.utf8().data(), numberOfBytes)) {
I'm not completely sure, but I think it's not safe to call enchant_dict_check passing directly word.utf8().data() as parameter, since the CString returned by .utf8() will be disposed early and you'll have a gchar* (as returned by .data()) pointing to invalid memory. Thus, it's probably safer to do something like this in line 103: CString word = String::fromUTF8(cstart, numberOfBytes).utf8(); ... and then just call to word.data() from that point on where you need it (lines 106 and 130 only, I think)
> Source/WebCore/platform/text/enchant/TextCheckerEnchant.cpp:130 > + char** suggestions = enchant_dict_suggest(*iter, word.utf8().data(), -1, &numberOfSuggestions);
This line would also benefit of the CString declaration I mentioned before.
Martin Robinson
Comment 9
2012-08-22 08:22:10 PDT
Comment on
attachment 159895
[details]
updated patch View in context:
https://bugs.webkit.org/attachment.cgi?id=159895&action=review
>> Source/WebCore/platform/text/enchant/TextCheckerEnchant.cpp:106 >> + if (enchant_dict_check(*dictIter, word.utf8().data(), numberOfBytes)) { > > I'm not completely sure, but I think it's not safe to call enchant_dict_check passing directly word.utf8().data() as parameter, since the CString returned by .utf8() will be disposed early and you'll have a gchar* (as returned by .data()) pointing to invalid memory. > > Thus, it's probably safer to do something like this in line 103: > > CString word = String::fromUTF8(cstart, numberOfBytes).utf8(); > > ... and then just call to word.data() from that point on where you need it (lines 106 and 130 only, I think)
It's safe to use .data(), but not to store it and use it on a later line.
Grzegorz Czajkowski
Comment 10
2012-08-23 00:10:46 PDT
(In reply to
comment #8
)
> (From update of
attachment 159895
[details]
) > View in context:
https://bugs.webkit.org/attachment.cgi?id=159895&action=review
> > Thanks for the update. However, I'm afraid you will need to provide yet another one mainly because a new patch has recently (today!) been committed that affects the base your patch :(. See
bug 94683
for more details.
Thanks for the information. This change is good for us too :) Our internal implementation also bases on vector to pass/get languages. Frankly speaking I wanted to change in the another patch:)
> > Source/WebCore/platform/text/enchant/TextCheckerEnchant.cpp:27 > > +#include <glib.h> // For utf8 support. > > I don't think this comment is actually needed (also, it's not very common to have inline comments, even less for includes, AFAIK)
You're right. I removed this useless comment.
> > Source/WebCore/platform/text/enchant/TextCheckerEnchant.cpp:101 > > + const char* cstart = cString + start; > > Shouldn't you use g_utf8_offset_to_pointer() here too?
Ok, I wanted to limit GLib's invocation as possible. The beginning of the word can be determined based on raw pointers - it doesn't matter whether string contains UTF8 data or not. I couldn't find alternative way how to determine the number of bytes of the string that is passes in UTF8 (only GLib API works) we can restore origin code. We have to use GLib anyway :)
> > Source/WebCore/platform/text/enchant/TextCheckerEnchant.cpp:106 > > + if (enchant_dict_check(*dictIter, word.utf8().data(), numberOfBytes)) { > > I'm not completely sure, but I think it's not safe to call enchant_dict_check passing directly word.utf8().data() as parameter, since the CString returned by .utf8() will be disposed early and you'll have a gchar* (as returned by .data()) pointing to invalid memory. > > Thus, it's probably safer to do something like this in line 103: > > CString word = String::fromUTF8(cstart, numberOfBytes).utf8(); > > ... and then just call to word.data() from that point on where you need it (lines 106 and 130 only, I think)
I understand your idea. I tested this patch and I didn't notice any issues with that. As Martin said using word().utf8().data() is safety. Anyway 'ignoreWord', 'learnWord', 'getGuessesForWord' also use word.utf8().data() directly.
> > > Source/WebCore/platform/text/enchant/TextCheckerEnchant.cpp:130 > > + char** suggestions = enchant_dict_suggest(*iter, word.utf8().data(), -1, &numberOfSuggestions); > > This line would also benefit of the CString declaration I mentioned before.
Here only gchar -> char is being changed. So I leave word.utf8().data() as it is.
Carlos Garcia Campos
Comment 11
2012-08-23 00:18:25 PDT
Comment on
attachment 159895
[details]
updated patch View in context:
https://bugs.webkit.org/attachment.cgi?id=159895&action=review
> Source/WebCore/platform/text/enchant/TextCheckerEnchant.cpp:103 > + String word = String::fromUTF8(cstart, numberOfBytes);
Why converting it to String here? the string is then converted to utf8 before it's passed to enchant, so I think it would be better to use a CString in this case and pass word.data() to enchant, or even use cstart directly since the number of bytes is passed to enchant too.
Grzegorz Czajkowski
Comment 12
2012-08-23 00:27:30 PDT
Created
attachment 160101
[details]
patch v3
Grzegorz Czajkowski
Comment 13
2012-08-23 00:42:45 PDT
Created
attachment 160104
[details]
passing cstring to enchant Thanks Carlos for suggestion - applied.
Carlos Garcia Campos
Comment 14
2012-08-24 06:05:35 PDT
Comment on
attachment 160104
[details]
passing cstring to enchant Looks good, thanks!
Grzegorz Czajkowski
Comment 15
2012-08-24 06:16:42 PDT
(In reply to
comment #14
)
> (From update of
attachment 160104
[details]
) > Looks good, thanks!
Thanks! This patch changes base code of spelling for WebKit-Gtk. I will set cq+ if unit and layout tests pass.
Grzegorz Czajkowski
Comment 16
2012-08-24 06:38:02 PDT
(In reply to
comment #15
)
> (In reply to
comment #14
) > > (From update of
attachment 160104
[details]
[details]) > > Looks good, thanks! > > Thanks! > This patch changes base code of spelling for WebKit-Gtk. I will set cq+ if unit and layout tests pass.
Unit test result: TestWebKitWebContext which tests spelling passes. TEST: ./Tools/gtk/../Scripts/../../WebKitBuild/Release/Programs/WebKit2APITests/TestWebKitWebContext... (pid=12321) PASS: ./Tools/gtk/../Scripts/../../WebKitBuild/Release/Programs/WebKit2APITests/TestWebKitWebContext Layout test result: ./Tools/Scripts/run-webkit-tests --gtk editing/spelling Found 27 tests; running 25, skipping 2. Running 1 DumpRenderTree over 1 shard. [1/25] editing/spelling/context-menu-suggestions.html failed [2/25] editing/spelling/design-mode-spellcheck-off.html passed [3/25] editing/spelling/grammar-edit-word.html failed [4/25] editing/spelling/grammar-markers.html passed [5/25] editing/spelling/grammar.html failed [6/25] editing/spelling/inline_spelling_markers.html passed [7/25] editing/spelling/spellcheck-api-crash.html passed [8/25] editing/spelling/spellcheck-async-mutation.html failed [9/25] editing/spelling/spellcheck-async-remove-frame.html passed [10/25] editing/spelling/spellcheck-async.html failed [11/25] editing/spelling/spellcheck-attribute.html passed [12/25] editing/spelling/spellcheck-input-search-crash.html passed [13/25] editing/spelling/spellcheck-paste-disabled.html passed [14/25] editing/spelling/spellcheck-paste.html passed [15/25] editing/spelling/spellcheck-queue.html failed [16/25] editing/spelling/spellcheck-sequencenum.html failed [17/25] editing/spelling/spelling-attribute-at-child.html passed [18/25] editing/spelling/spelling-attribute-change.html passed [19/25] editing/spelling/spelling-backspace-between-lines.html passed [20/25] editing/spelling/spelling-hasspellingmarker.html passed [21/25] editing/spelling/spelling-insert-html.html passed [22/25] editing/spelling/spelling-linebreak.html passed [23/25] editing/spelling/spelling-marker-description.html failed [24/25] editing/spelling/spelling-unified-emulation.html failed [25/25] editing/spelling/spelling.html passed All 25 tests ran as expected.
Grzegorz Czajkowski
Comment 17
2012-08-24 06:43:37 PDT
Comment on
attachment 160104
[details]
passing cstring to enchant No regression, cq+.
WebKit Review Bot
Comment 18
2012-08-24 07:01:38 PDT
Comment on
attachment 160104
[details]
passing cstring to enchant Clearing flags on attachment: 160104 Committed
r126583
: <
http://trac.webkit.org/changeset/126583
>
WebKit Review Bot
Comment 19
2012-08-24 07:01:43 PDT
All reviewed patches have been landed. Closing bug.
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug