[PATCH 3 of 3 v2] hooks: fix potentially invalid interpreter in git hooks (Issue #333)

Sun Apr 7 07:23:18 UTC 2019

On Sun, Apr 7, 2019, 00:25 Mads Kiilerich <mads at kiilerich.com> wrote:

> On 4/6/19 9:22 PM, Thomas De Schampheleire wrote:
>
>
>
> El jue., 4 abr. 2019 a las 0:41, Mads Kiilerich (<mads at kiilerich.com>)
> escribió:
>
>> > And pushing to such repo would result in following client errors:
>> >
>> >      $ git push
>> >      Password for 'http://user@localhost:5050':
>> >      Enumerating objects: 3, done.
>> >      Counting objects: 100% (3/3), done.
>> >      Writing objects: 100% (3/3), 241 bytes | 241.00 KiB/s, done.
>> >      Total 3 (delta 0), reused 0 (delta 0)
>> >      remote: unable to load configuration from hooks/pre-receive
>> >      To http://localhost:5050/gitrepo-new
>> >       ! [remote rejected] master -> master (pre-receive hook declined)
>> >      error: failed to push some refs to '
>> http://user@localhost:5050/gitrepo-new'
>>
>>
>> Can we somehow make it show a more useful error message in this case? I
>> guess not - none of our code is executed ...
>>
>
> Actually, there is some of our code executed: SimpleGit. But, at that
> point you don't know that the git hook call is going to fail.
>
>
> First: Since we install Git hooks in the file system, they will also be
> invoked if someone push directly. That may or may not be a feature, but we
> have to handle it.
>

I don't think this is a valid argument: for mercurial repos the hooks are
not installed on the file system. Direct pushing is not supported.

But yes, usually, SimpleGit will invoke the `git` command which then invoke
> the hooks.
>
> Also, I have significant refactorings in this area lined up, preparing for
> clean hook handling, also for ssh. Our first goal on the stable branch
> should be to fix regressions - perhaps by recommending workarounds.
>
>
>
>
>>
>> > The approach taken by this commit is:
>>
>>
>> FWIW, this:
>>
>> > - Introduce a new configuration setting: hook_python_path, and use its
>> value
>> >    as the Python interpreter to use in git hooks.
>> > - If the value is absent or empty, fall back to the current logic
>> >    'sys.executable' or if that is also invalid, 'python2'.
>>
>>
>> and this:
>>
>>
>> > - Let `kallithea-cli config-create` fill in its own interpreter as
>> default
>> >    value, which should be valid as long as the virtual environment is
>> not
>> >    moved/replaced.
>>
>>
>> *could* be separate "milestones" and separate changesets. But no big
>> deal. Split or keep together as you like.
>>
>
>
> Yes, I considered it, but got a bit stuck with the comments referring to
> code that does not yet exist.
> I'll think about it a second time.
>
>
> Yes, the first changeset will need comments about the potential, and the
> second changeset might update these comments. But the the phrasing and
> change of the comments might help to be aware what actually is changed and
> why ... and how it ideally should be done.
>
>
>
>
>> > The main downside of this approach is its fragility in case a new
>> virtualenv
>> > is used. Previously, the configuration file was the same regardless of
>> > virtualenv, but this is no longer true after this patch.
>>
>>
>> But also, it is crucial that the *right* virtualenv is used, also if Git
>> should be invoked directly, outside any virtual env. It is thus inherent
>> that the new virtualenv *has* to be written to the hooks somehow. The
>> previous use of same 'python2' was wrong and fragile. I thus don't think
>> the new way is more fragile. Explicit expectations and failing early is
>> *less* fragile ;-)
>>
>
> In fact, my understanding was not fully complete before: when users push
> to a git repo, they actually communicate with our Git middleware, which
> itself uses dulwich. All this should normally happen inside the virtualenv
> (if used). Only when the hooks get called, there is a decoupling into
> another process. (I am theorizing here, not explicitly checked)
>
> So if that is correct, then I wonder if we could not avoid the decoupling.
> We are already handling pull hooks in
> kallithea.lib.middleware.simplegit._handle_githooks .
> Can we not handle push hooks pre-receive/post-receive in the same place?
> In this case, we are definitely in the right environment, we are and stay
> within Python. I guess it will also simplify the situation for Windows.
>
> Am I overlooking something?
>
>
> We kind of *have* to run something inside the hook. That is the only way
> to know exactly what is pushed (or pulled). If we want branch access
> control, it will(!?) also have to run inside hooks.
>

Knowing what was pushed and what not is a more convincing argument. But how
do we know this in mercurial? Does mercurial tell us, or do we find out? Is
it impossible/fragile to do the same in git?

We can perhaps revisit after landing my cleanups and see if we can find a
> fundamentally new way to do it.
>
>
>
> Thanks for these comments, I generally agree.
>
> But before going further, I'd like to hear your opinion on the idea of
> calling the hooks directly from our own code (see above).
>
>
> Summarizing: I don't think that is feasible. But if it is, it will not be
> simple. Not suitable for the stable branch.
>
>
Ok. But when deciding what to do on the stable branch, I think it makes
sense to have an idea what we'll do for default. I think we should avoid
introducing user visible changes like ini settings if they will not apply
on default.

/Thomas
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.sfconservancy.org/pipermail/kallithea-general/attachments/20190407/a8a382c7/attachment.html>