what is the issue with database changes?

Tue Mar 17 15:58:43 EDT 2015

Hi,

On Tue, Mar 17, 2015 at 8:40 PM, Matt Mackall <mpm at selenic.com> wrote:
> On Tue, 2015-03-17 at 20:15 +0100, Jan Heylen wrote:
>> On Tue, Mar 3, 2015 at 7:55 PM, Matt Mackall <mpm at selenic.com> wrote:
>> > On Tue, 2015-03-03 at 15:24 +0100, Thomas De Schampheleire wrote:
>> >> Hi,
>> >>
>> >> Regularly I hear that we don't want to change the model yet to stay
>> >> backwards compatible.
>> >>
>> >> However, I do see several 'dbmigrate' scripts in the source base,
>> >> which hint at it being possible to migrate across database changes.
>> >>
>> >> Can someone explain in more detail why we do not want such database
>> >> changes (yet)?
>> >
>> > On-disk format migrations are an anti-pattern of software development.
>> > They're fragile and one-way and present a large barrier to user
>> > acceptance of new versions. And there's no excuse for them when you have
>> > a database rather than a file format: you can always add new tables
>> > without changing the structure of the existing tables.
>>
>> So, given this anti-pattern, say you want to introduce a new type of
>> changeset-comment, I see 3 options:
>
> ...
>
>> 3. add a new class (and so a new database table), e.g.
>> ChangesetOtherCommentType, use it only for 'the other type', and keep
>> using ChangesetComment for regular comments.
>>
>> No issue with downgrading. The new table is just not seen by the old model.
>
> That's the ticket. Append to the schema, but don't change the semantics
> of any existing elements of the schema.
>
>> For all three, I also have the question: Is there a migrate (script)
>> needed?
>
> For #3, you can detect/ignore that the table doesn't exist on read and
> create it on write, at run-time. Or simply check at start-up that every
> table in a list L exists and create any that are missing. Then the user
> need never even be aware that things are changing.
>
> This is a bit of a headache for developers to save a significant
> headache for users. If you have more users than developers, it's a huge
> win.
>
>> And how does that work? As far as I can see, the 'old'
>> model db.py is copied to another class, and the new db.py gets an
>> increased version number.
>
> Version numbers are what you do when you can't do the above. Given that
> databases are basically the ultimate free-form extensible file format,
> there's really no good reason to have version numbers when you can
> literally say to the storage "hey, do you know about feature X? ok, now
> you do."
>
> This is mostly about "there's something missing in the schema" rather
> than "the schema is fundamentally broken" but I would say that even in
> that case, it's much better for your installed base for you to try to
> work around your mistakes rather than reaching for an incompatible
> change first.
>
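
To check my understanding of the run-time approach for #3, here is a
minimal sketch. It uses plain sqlite3 rather than the actual model layer,
and the table name, columns and function names are all hypothetical,
chosen only for illustration. A missing table is treated as "no data yet"
on read, and the table is created on first write, so no existing table is
altered and no version number is involved.

```python
import sqlite3

# Hypothetical table for the new comment type; name and columns are
# illustrative only and not taken from the real schema.
NEW_TABLE = "changeset_other_comments"


def table_exists(conn, name):
    """Return True if a table with this name exists (SQLite-specific check)."""
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
        (name,),
    ).fetchone()
    return row is not None


def read_other_comments(conn, changeset_id):
    """On read: a missing table simply means there is no data of the new type."""
    if not table_exists(conn, NEW_TABLE):
        return []
    rows = conn.execute(
        "SELECT text FROM changeset_other_comments"
        " WHERE changeset_id = ? ORDER BY id",
        (changeset_id,),
    ).fetchall()
    return [r[0] for r in rows]


def add_other_comment(conn, changeset_id, text):
    """On write: create the table on first use; existing tables stay untouched."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS changeset_other_comments ("
        "id INTEGER PRIMARY KEY,"
        " changeset_id INTEGER NOT NULL,"
        " text TEXT NOT NULL)"
    )
    conn.execute(
        "INSERT INTO changeset_other_comments (changeset_id, text)"
        " VALUES (?, ?)",
        (changeset_id, text),
    )
    conn.commit()
```

An older version of the code never queries the new table, so downgrading
keeps working; the extra table is just ignored.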

On several occasions I have seen feedback from Mads that the Kallithea
database schema is fundamentally broken. Given that there is not yet a
huge install base, doesn't it make sense to make one big incompatible
change now, and use the incremental approach described above after that
initial cleanup?

Best regards,
Thomas