Spurious test failure: test_create_notification

Mon Jun 22 07:21:08 EDT 2015

On 06/22/2015 12:19 PM, Thomas De Schampheleire wrote:
> Hi,
>
>
> On Mon, Jun 22, 2015 at 9:59 AM, Thomas De Schampheleire
> <patrickdepinguin at gmail.com> wrote:
>> Hi,
>>
>> I'm seeing a spurious test failure on test_create_notification.
>> Repeating the test makes it succeed.
>>
>> Is anyone seeing the same? Any ideas on the cause?
>>
>> I'm only starting to see this recently, so it must be caused by
>> a. a test change
>> b. switch to py.test (didn't try with nosetest yet)
>>
>>
>> kallithea/tests/models/test_notifications.py:48: in test_create_notification
>>      self.assertEqual([], Notification.query().all())
>> E   AssertionError: Lists differ: [] != [<DB:Notification>]
>> E
>> E   Second list contains 1 additional elements.
>> E   First extra element 0:
>> E   <DB:Notification>
>> E
>> E   - []
>> E   + [<DB:Notification>]
>>
> Upon deeper investigation, it turns out that this issue is not so
> spurious at all. It looked like that because it really is a bleeding
> of state from one test to the other, and repeating the failing test on
> its own does not have this bleed.
>
> The issue was introduced with my commit
> 9a23b444a7fefe5b67e57a42d26343787d992874 pullrequests: detect invalid
> reviewers and raise HTTPBadRequest
>
> that added more tests in test_pullrequests.py, and apparently these
> tests cause a new notification for the test user.
>
> My idea would be that this is not really a mistake of my tests, but
> rather there should never be impact from one test to another. The
> database before and after each test should be identical.
>
> Am I naive in assuming this?
> How should we fix this?

Yeah. Currently we have the database as shared state among all the 
tests. That means that a failing test can make following tests fail - 
there is not much we can do about that without fixing the real problem 
of having tests use and modify the same real database. We should however 
make sure that all passing tests leave the database in a state where all 
other tests work. It is kind of undefined, but in some cases it means 
removing/undoing what is changed, in other cases it is ok to add or 
modify records but no tests should look at absolute values. (It would 
also be interesting to run the tests in a random order and verify that 
we don't have any dependencies.)

In this case I guess it would be enough if your test ended up reading 
notifications. It seems to me like the failure also has some randomness 
in it - I don't know how that can be.

/Mads