[PATCH] pullrequests: add support for custom pull request id prefix

Wed Apr 22 06:19:45 EDT 2015

On 04/22/2015 09:58 AM, Thomas De Schampheleire wrote:
> On Tue, Apr 21, 2015 at 4:37 PM, Mads Kiilerich <mads at kiilerich.com> wrote:
>> On 04/21/2015 10:19 AM, Thomas De Schampheleire wrote:
>>> On Tue, Apr 21, 2015 at 3:21 PM, Mads Kiilerich <mads at kiilerich.com>
>>> wrote:
>>>> On 04/21/2015 06:20 AM, Thomas De Schampheleire wrote:
>>>>> On Mon, Apr 20, 2015 at 11:35 PM, Mads Kiilerich <mads at kiilerich.com>
>>>>> wrote:
>>>>>>
>>>>>>>> It is a bit weird that Kallithea pull request numbers are global.
>>>>>>>> Especially in a site that is hosting repos for multiple independent
>>>>>>>> users, it would make sense to have per repo numbering. Would that
>>>>>>>> solve your case? Will your repos in the different instances be named
>>>>>>>> differently?
>>>>>>> No, the different instances would operate on the same repositories
>>>>>>> with the same names (note that we're not using Kallithea for repo
>>>>>>> hosting, it is a mirror).
>>>>>>
>>>>>> Using it as a mirror is fine ... but having multiple independent
>>>>>> instances does not seem like something I can recommend. It would make
>>>>>> more sense to have multiple servers on the same database in some
>>>>>> failover loadbalancing setup.
>>>>> The reason we planned doing such a setup is that the network
>>>>> latency/bandwidth between sites is not always very good. If there is
>>>>> one single Kallithea instance in a given site, the developers from
>>>>> that site get a good experience, while the developers from a remote
>>>>> site may suffer high latencies. With a local database + instance this
>>>>> would be mitigated.
>>>>
>>>> We have local mirrors for the actual cloning - using
>>>> https://bitbucket.org/Unity-Technologies/hgwebcachingproxy/commits/all
>>>> and
>>>> https://bitbucket.org/Unity-Technologies/dynapath/commits/branch/default
>>>> .
>>>>
>>>> Are you sure you need locally hosted Kallithea instances for the web UI?
>>>> Depending on the size of your changes and your workflow, the requirements
>>>> for bandwidth and latency might not be that high. Especially not to
>>>> justify the added complexity for users and admins for managing multiple
>>>> instances.
>>>>
>>>>> Your suggestion of the same database and multiple Kallithea instances:
>>>>> how exactly does this work? Is all locking in place? And since the
>>>>> database is in one place: don't you suffer from the same network
>>>>> latency issue?
>>>>
>>>> The database could perhaps be distributed, with one master for writing
>>>> and local mirrors for reading. The database access pattern might however
>>>> not be good for that; read only operations have too many writes.
>>>>
>>> What you mean here is that Kallithea is not yet fit for this model?
>>
>> Not really. I mean that there are a lot of things to consider and test with
>> your latency and bandwidth and workload.
>>
> Let me restate my question: does Kallithea fully support having one
> database with multiple frontends?

Yes. No problems there. That is exactly like having multiple worker 
instances on the same machine.

The only caveat is that in a load balancing setup, you will want to
make sure all the worker processes use the same cache store ... just
as they of course all have to see the same file system with the repos.

> Second question is: how do you suggest testing this? Do you mean just
> clicking about, pulling/pushing, creating pull requests etc. to see
> how responsive things are? Or is there a more objective way to test
> things?

Well ... hard to tell. More objective ways of testing would still have
to prove that they were realistic. The best "test" is when the actual
users are using it and are happy. Profiling and analyzing the actual
performance can then help point out where the real bottlenecks are
and thus suggest changes that have a real impact. Up front sizing is
like sizing of other systems: any rule of thumb will only give a very
rough estimate and will end up under- or over-sizing for the actual
workload.

One big problem with creating a fake load is figuring out what that load
should be. In our case the size (and thus slowness) of the repo and the
size of the PRs would matter ... and it would be important to have most
of the users on slow connections. Still, I guess it could be nice to
have such a tool, similar to Apache "ab".
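
To illustrate what such a tool could look like, here is a minimal
sketch of an "ab"-like load generator in Python. It is not part of
Kallithea; the function names (fetch, run_load, summarize) and the
default concurrency are made up for this example, and it only measures
plain HTTP GET round trips. It does not simulate slow client
connections, pushes or pulls, or realistic PR sizes.

```python
import statistics
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def fetch(url, timeout=30.0):
    """Fetch one URL and return the elapsed wall-clock time in seconds."""
    start = time.perf_counter()
    with urllib.request.urlopen(url, timeout=timeout) as response:
        response.read()  # drain the body so transfer time is included
    return time.perf_counter() - start


def run_load(urls, concurrency=4, fetch_func=fetch):
    """Request every URL once, using `concurrency` worker threads.

    fetch_func is injectable so the timing logic can be exercised
    without a real server (hypothetical hook, for illustration only).
    """
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        return list(pool.map(fetch_func, urls))


def summarize(timings):
    """Reduce a list of per-request timings to a few summary numbers."""
    ordered = sorted(timings)
    return {
        "requests": len(ordered),
        "mean": statistics.mean(ordered),
        "median": statistics.median(ordered),
        "max": ordered[-1],
    }
```

It could be pointed at a test instance with something like
summarize(run_load([url] * 100, concurrency=10)), where url is the
address of a page on that instance (a placeholder here). The numbers
only become meaningful once the URL mix resembles what real users
actually do.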

/Mads