[PATCH] ChangesetCommentsModel: refactor duplicate code

Thu Apr 30 00:52:33 EDT 2015

On Thu, Apr 30, 2015 at 12:39 AM, Mads Kiilerich <mads at kiilerich.com> wrote:
> On 04/29/2015 10:09 PM, Jan Heylen wrote:
>>
>> # HG changeset patch
>> # User Jan Heylen <heyleke at gmail.com>
>> # Date 1430323563 -7200
>> #      Wed Apr 29 18:06:03 2015 +0200
>> # Node ID 8ead3310a073d6453b926b891f1a59a0fcf6c6b5
>> # Parent  71140b7c9abfcb34db70792c39e09a4e4796efb1
>> ChangesetCommentsModel: refactor duplicate code
>>
>> KALLITHEA_WHOOSH_TEST_DISABLE=1 KALLITHEA_NO_TMP_PATH=1 nosetests
>> kallithea/tests/functional/test_changeset_comments.py
>> still passes all 4 tests
>>
>> diff -r 71140b7c9abf -r 8ead3310a073 kallithea/model/comment.py
>> --- a/kallithea/model/comment.py        Wed Apr 29 04:25:53 2015 +0200
>> +++ b/kallithea/model/comment.py        Wed Apr 29 18:06:03 2015 +0200
>> @@ -257,27 +257,36 @@
>>           :param revision:
>>           :param pull_request:
>>           """
>> -
>> -        q = ChangesetComment.query()\
>> -                .filter(ChangesetComment.repo_id == repo_id)\
>> -                .filter(ChangesetComment.line_no == None)\
>> -                .filter(ChangesetComment.f_path == None)
>> -        if revision:
>> -            q = q.filter(ChangesetComment.revision == revision)
>> -        elif pull_request:
>> -            pull_request = self.__get_pull_request(pull_request)
>> -            q = q.filter(ChangesetComment.pull_request == pull_request)
>> -        else:
>> -            raise Exception('Please specify revision or pull_request')
>> -        q = q.order_by(ChangesetComment.created_on)
>> -        return q.all()
>> +        return self._get_comments(repo_id, revision=revision,
>> pull_request=pull_request, inline=False)
>>         def get_inline_comments(self, repo_id, revision=None,
>> pull_request=None):
>> +        """
>> +        Gets inline comments based on revision or pull_request_id
>> +
>> +        :param repo_id:
>> +        :param revision:
>> +        :param pull_request:
>
>
> I haven't been trained to embrace this kind of doxygen-ish docstrings. I can
> see how they can be very valuable if it not is obvious from the general
> description of the function and its name and the name of the parameters what
> they do ... but I think it in most cases would be more important and perhaps
> sufficient to do some renamings and make it self explanatory. I can also see
> how such comments can be nice _if_ we can get and maintain full and correct
> coverage of meaningful documentation ... but I don't think that is
> realistic.
>
> So: in this case I would prefer to not have these stubs (which always can be
> auto generated if they should be needed, but right now only adds an
> maintenance cost without adding value). In this case I think it would be
> enough (and preferred) if the docstring just pointed out that it returned
> comments for _either_ a revision or a pull_request.
>
> Also, it seems like it is non-trivial what this function returns (especially
> compared to get_comments). Python duck typing might make it hard to say much
> about the types of input parameters but there is often something relevant to
> say about the returned value.
>
> I realize most of the docstring I am commenting on here came from copying
> get_comments. My comments here should thus instead be directed to the
> previous author. Still, I would prefer if we could reduce the amount of such
> technical debt.

I wasn't aware that I should remove these, can you give an example
what it should be in such a case? Only the docstring description, or
in some other format?

>
>> +        """
>> +        return self._get_comments(repo_id, revision=revision,
>> pull_request=pull_request, inline=True)
>> +
>> +    def _get_comments(self, repo_id, revision=None, pull_request=None,
>> inline=False):
>> +        """
>> +        Gets comments based on revision or pull_request_id
>> +
>> +        :param repo_id:
>> +        :param revision:
>> +        :param pull_request:
>> +        :param inline: boolean, get normal comments or inline comments
>
>
> IMO, it is obvious that inline is a boolean considering the default value of
> False. It is also obvious that the default is "not-so-much inline" and
> setting inline=True means "yes to inline". All we need to know is if this
> function is able to return both types at once or if it returns either one or
> the other. The implementation hints that it returns either one or the other.
> IMO, _that_ would be relevant to have in the general description, leaving no
> need for description of the individual parameters.

Ok, I just copy pasted the style of docstring and added my parameter,
but I agree, a normal description is enough.

>
>> +        """
>>           q = Session().query(ChangesetComment)\
>>               .filter(ChangesetComment.repo_id == repo_id)\
>> -            .filter(ChangesetComment.line_no != None)\
>> -            .filter(ChangesetComment.f_path != None)\
>> -            .order_by(ChangesetComment.comment_id.asc())\
>> +
>> +        if inline:
>> +            q = q.filter(ChangesetComment.line_no != None)\
>> +                .filter(ChangesetComment.f_path != None)
>> +        else:
>> +            q = q.filter(ChangesetComment.line_no == None)\
>> +                .filter(ChangesetComment.f_path == None)
>>             if revision:
>>               q = q.filter(ChangesetComment.revision == revision)
>> @@ -287,7 +296,11 @@
>>           else:
>>               raise Exception('Please specify revision or
>> pull_request_id')
>>   +        q = q.order_by(ChangesetComment.created_on)
>> +
>>           comments = q.all()
>> +        if not inline:
>> +            return comments
>>             paths = defaultdict(lambda: defaultdict(list))
>
>
> It seems like this function returns completely different types of data,
> depending on the input parameter. That can confuse any type checker -
> including my mental type checker. I don't like that.

Yes, but I didn't introduce it, I was confused in the first look at it
too. But in the first step, I didn't want to change the interface at
all.

>
> How about letting _get_comments always return an "iterable" of comments and
> leave it to the new get_inline_comments to do the conversion to what seems
> to be a weird randomly ordered list of tuples of filenames and
> linenumber-to-list-of-comments. (Without looking at the context, I guess it
> would be cleaner if it returned the 'paths' dict and left it to the caller
> to convert to list or use as dict.

It is my belief that a next iteration of refactoring here is that
get_comments always return the same datastructure, and in the case of
normal comments that do not have a line number (or 0? so path[0]?)
also stored in that structure somehow, but again, I like to take
things step by step, I can't change whole kallithea in one commit.
This patch reduces the code duplication drastically, next step in
refactoring could be to fix the interface. Your initial comment on my
draft comments change was too much code duplication in the comment
model, that is addressed with this patch, right?

please let me know that changing the docstring style cleanup is enough
for the time being,

J.

>
> /Mads