[PATCH 3 of 3 v2] hooks: fix potentially invalid interpreter in git hooks (Issue #333)

Sat Apr 6 22:25:18 UTC 2019

On 4/6/19 9:22 PM, Thomas De Schampheleire wrote:
>
>
> El jue., 4 abr. 2019 a las 0:41, Mads Kiilerich (<mads at kiilerich.com 
> <mailto:mads at kiilerich.com>>) escribió:
>
>     > And pushing to such repo would result in following client errors:
>     >
>     >      $ git push
>     >      Password for 'http://user@localhost:5050':
>     >      Enumerating objects: 3, done.
>     >      Counting objects: 100% (3/3), done.
>     >      Writing objects: 100% (3/3), 241 bytes | 241.00 KiB/s, done.
>     >      Total 3 (delta 0), reused 0 (delta 0)
>     >      remote: unable to load configuration from hooks/pre-receive
>     >      To http://localhost:5050/gitrepo-new
>     >       ! [remote rejected] master -> master (pre-receive hook
>     declined)
>     >      error: failed to push some refs to
>     'http://user@localhost:5050/gitrepo-new'
>
>
>     Can we somehow make it show a more useful error message in this
>     case? I
>     guess not - none of our code is executed ...
>
>
> Actually, there is some of our code executed: SimpleGit. But, at that 
> point you don't know that the git hook call is going to fail.

First: Since we install Git hooks in the file system, they will also be 
invoked if someone push directly. That may or may not be a feature, but 
we have to handle it.

But yes, usually, SimpleGit will invoke the `git` command which then 
invoke the hooks.

Also, I have significant refactorings in this area lined up, preparing 
for clean hook handling, also for ssh. Our first goal on the stable 
branch should be to fix regressions - perhaps by recommending workarounds.

>
>     > The approach taken by this commit is:
>
>
>     FWIW, this:
>
>     > - Introduce a new configuration setting: hook_python_path, and
>     use its value
>     >    as the Python interpreter to use in git hooks.
>     > - If the value is absent or empty, fall back to the current logic
>     >    'sys.executable' or if that is also invalid, 'python2'.
>
>
>     and this:
>
>
>     > - Let `kallithea-cli config-create` fill in its own interpreter
>     as default
>     >    value, which should be valid as long as the virtual
>     environment is not
>     >    moved/replaced.
>
>
>     *could* be separate "milestones" and separate changesets. But no big
>     deal. Split or keep together as you like.
>
>
>
> Yes, I considered it, but got a bit stuck with the comments referring 
> to code that does not yet exist.
> I'll think about it a second time.

Yes, the first changeset will need comments about the potential, and the 
second changeset might update these comments. But the the phrasing and 
change of the comments might help to be aware what actually is changed 
and why ... and how it ideally should be done.

>     > The main downside of this approach is its fragility in case a
>     new virtualenv
>     > is used. Previously, the configuration file was the same
>     regardless of
>     > virtualenv, but this is no longer true after this patch.
>
>
>     But also, it is crucial that the *right* virtualenv is used, also
>     if Git
>     should be invoked directly, outside any virtual env. It is thus
>     inherent
>     that the new virtualenv *has* to be written to the hooks somehow. The
>     previous use of same 'python2' was wrong and fragile. I thus don't
>     think
>     the new way is more fragile. Explicit expectations and failing
>     early is
>     *less* fragile ;-)
>
>
> In fact, my understanding was not fully complete before: when users 
> push to a git repo, they actually communicate with our Git middleware, 
> which itself uses dulwich. All this should normally happen inside the 
> virtualenv (if used). Only when the hooks get called, there is a 
> decoupling into another process. (I am theorizing here, not explicitly 
> checked)
> So if that is correct, then I wonder if we could not avoid the decoupling.
> We are already handling pull hooks in 
> kallithea.lib.middleware.simplegit._handle_githooks .
> Can we not handle push hooks pre-receive/post-receive in the same 
> place? In this case, we are definitely in the right environment, we 
> are and stay within Python. I guess it will also simplify the 
> situation for Windows.
>
> Am I overlooking something?

We kind of *have* to run something inside the hook. That is the only way 
to know exactly what is pushed (or pulled). If we want branch access 
control, it will(!?) also have to run inside hooks.

We can perhaps revisit after landing my cleanups and see if we can find 
a fundamentally new way to do it.

>
> Thanks for these comments, I generally agree.
>
> But before going further, I'd like to hear your opinion on the idea of 
> calling the hooks directly from our own code (see above).

Summarizing: I don't think that is feasible. But if it is, it will not 
be simple. Not suitable for the stable branch.

/Mads

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.sfconservancy.org/pipermail/kallithea-general/attachments/20190407/f34ac959/attachment.html>