RESOLVED FIXED51272
commit-queue will report constant failures as flaky if other tests flake
https://bugs.webkit.org/show_bug.cgi?id=51272
Summary commit-queue will report constant failures as flaky if other tests flake
Eric Seidel (no email)
Reported 2010-12-17 12:56:05 PST
commit-queue will report constant failures as flaky if other tests flake See https://bugs.webkit.org/show_bug.cgi?id=51236#c9 first_failing_tests = self._failing_tests_from_last_run() if self._test(): self._report_flaky_tests(first_failing_tests) return True second_failing_tests = self._failing_tests_from_last_run() if first_failing_tests != second_failing_tests: self._report_flaky_tests(first_failing_tests + second_failing_tests) return False Notice how if second_failing_tests is different from first_failing_tests, all of them get reported. What should happen is only the differences between first and second should be reported.
Attachments
Patch (7.72 KB, patch)
2010-12-20 20:14 PST, Eric Seidel (no email)
no flags
Patch (8.44 KB, patch)
2010-12-20 22:19 PST, Eric Seidel (no email)
no flags
Patch for landing (9.00 KB, patch)
2010-12-21 00:21 PST, Eric Seidel (no email)
no flags
Eric Seidel (no email)
Comment 1 2010-12-17 12:56:58 PST
Fixing this may also fix bug 50263
Adam Barth
Comment 2 2010-12-17 13:02:28 PST
We can look at the test ordering and only report a flaky if we get further the second time.
Eric Seidel (no email)
Comment 3 2010-12-17 13:11:30 PST
Oh, I wasn't even considering that. I figured we'd just intersect the failure lists. But you're right, it's possible we'd still report a "late" constant failure as flaky if an earlier test flaked the second time.
Eric Seidel (no email)
Comment 4 2010-12-17 23:49:03 PST
How would you suggest we look at the test ordering? I'm not sure we have a class to do that.
Eric Seidel (no email)
Comment 5 2010-12-17 23:49:46 PST
Eh, we can write an approximate sort function. I'll take a whack at this.
Eric Seidel (no email)
Comment 6 2010-12-20 20:14:10 PST
Eric Seidel (no email)
Comment 7 2010-12-20 20:35:00 PST
*** Bug 50263 has been marked as a duplicate of this bug. ***
Adam Barth
Comment 8 2010-12-20 21:00:28 PST
Comment on attachment 77078 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=77078&action=review The test ordering code is too fragile. As the project evolves, its going to be subtly wrong. > Tools/Scripts/webkitpy/common/net/layouttestresults.py:99 > + # This is intended to match run-webkit-tests behavior. > + @classmethod > + def test_order_compare(cls, test1, test2): Really? I don't think this function is right. What about WebSocket tests? > Tools/Scripts/webkitpy/tool/bot/commitqueuetask.py:181 > + compare_result = LayoutTestResults.test_order_compare(first_failures[-1], second_failures[-1]) > + if compare_result < 0: # First run was shorter > + return different_failures.difference(second_failures) > + elif compare_result > 0: > + return different_failures.difference(first_failures) I don't know about this design. You didn't like the idea of only reporting a flak if the second run was all-success?
Eric Seidel (no email)
Comment 9 2010-12-20 21:26:30 PST
(In reply to comment #8) > (From update of attachment 77078 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=77078&action=review > > The test ordering code is too fragile. As the project evolves, its going to be subtly wrong. I agree, that's a risk. However, if we shared this code with NRWT it would be fine. :) > > Tools/Scripts/webkitpy/common/net/layouttestresults.py:99 > > + # This is intended to match run-webkit-tests behavior. > > + @classmethod > > + def test_order_compare(cls, test1, test2): > > Really? I don't think this function is right. What about WebSocket tests? Again, should be shared if we're going to go this way. > > Tools/Scripts/webkitpy/tool/bot/commitqueuetask.py:181 > > + compare_result = LayoutTestResults.test_order_compare(first_failures[-1], second_failures[-1]) > > + if compare_result < 0: # First run was shorter > > + return different_failures.difference(second_failures) > > + elif compare_result > 0: > > + return different_failures.difference(first_failures) > > I don't know about this design. You didn't like the idea of only reporting a flak if the second run was all-success? Oh, that's fine. Certainly simpler. It just produces this state where we retry w/o reporting any flaky tests which is OK just less than idea.
Adam Barth
Comment 10 2010-12-20 21:30:38 PST
> Oh, that's fine. Certainly simpler. It just produces this state where we retry w/o reporting any flaky tests which is OK just less than idea. It generates this strange response curve where if things are really really flaky, we never file bugs, but if things are just a bit flaky, then we'll be good at detecting them and filing bugs.
Eric Seidel (no email)
Comment 11 2010-12-20 21:34:39 PST
Basically, I ended up writing the test sorting function first, and then fleshed out the other details. The other details ended up rather complicated. :) This is certainly not a simple solution. Then again, flaky tests are not simple to deal with. :) But I think I'll write a new (much much simpler) patch which just removes the attempt flaky tests when we have a double-flake.
Eric Seidel (no email)
Comment 12 2010-12-20 22:19:07 PST
Adam Barth
Comment 13 2010-12-20 22:20:32 PST
Comment on attachment 77084 [details] Patch Thanks.
WebKit Commit Bot
Comment 14 2010-12-21 00:06:31 PST
Comment on attachment 77084 [details] Patch Rejecting attachment 77084 [details] from commit-queue. Failed to run "['./Tools/Scripts/webkit-patch', '--status-host=queues.webkit.org', '--bot-id=cr-jail-4', 'build-and-test', '--no-clean', '--no-update', '--test', '--non-interactive']" exit_code: 2 Last 500 characters of output: t/webkit-commit-queue/Tools/Scripts/webkitpy/tool/bot/commitqueuetask_unittest.py", line 114, in test_update_failure self._run_through_task(commit_queue, expected_stderr) File "/mnt/git/webkit-commit-queue/Tools/Scripts/webkitpy/tool/bot/commitqueuetask_unittest.py", line 76, in _run_through_task self.assertEqual(success, not expect_retry) AssertionError: False != True ---------------------------------------------------------------------- Ran 755 tests in 19.168s FAILED (failures=2) Full output: http://queues.webkit.org/results/7304078
Eric Seidel (no email)
Comment 15 2010-12-21 00:21:24 PST
Created attachment 77090 [details] Patch for landing
WebKit Commit Bot
Comment 16 2010-12-21 01:38:20 PST
The commit-queue encountered the following flaky tests while processing attachment 77090 [details]: fast/preloader/script.html bug 50879 (author: abarth@webkit.org) animations/play-state-suspend.html bug 50959 (author: cmarrin@apple.com) The commit-queue is continuing to process your patch.
WebKit Commit Bot
Comment 17 2010-12-21 04:56:00 PST
Comment on attachment 77090 [details] Patch for landing Clearing flags on attachment: 77090 Committed r74408: <http://trac.webkit.org/changeset/74408>
WebKit Commit Bot
Comment 18 2010-12-21 04:56:09 PST
All reviewed patches have been landed. Closing bug.
Note You need to log in before you can comment on or make changes to this bug.