Kallithea internals overview

Mads Kiilerich mads at kiilerich.com
Wed Feb 18 22:21:35 EST 2015


On 02/03/2015 09:31 PM, Thomas De Schampheleire wrote:
> Hi,
>
> On Thu, Jan 29, 2015 at 11:50 AM, Thomas De Schampheleire
> <patrickdepinguin at gmail.com> wrote:
> ...
>>>> What would be useful to me (and probably to other new contributors) is
>>>> a technical overview of Kallithea's internals.

Thanks for starting this. You collected a lot of valuable information 
and I learned a lot.

To continue what you started, I will comment inline and answer
questions. Will you try to rework it into something like a
docs/internals.rst which we can keep up to date?

Some of my comments are opinions, ideas or hopes - they might not fit 
into such a document ...

> Since my first attempt, I've been reading documentation of pylons and
> mako, which was very enlightening to understanding Kallithea.
>
> http://docs.pylonsproject.org/projects/pylons-webframework/en/latest/
> http://docs.makotemplates.org/en/latest/
>
> Below I am updating parts of this overview with the gathered
> knowledge. I removed the things that are no longer relevant or
> accurate.
>
> Framework:
> ----------
> The pylons framework forms the core of the Kallithea application. It
> defines a lot more than I had imagined, including the directory tree
> and the set of additional Python packages being used. The concepts
> models, views, controllers also stem from this framework.
>
> When creating a pylons application, the skeleton structure is:
>
> ├── config
> │   ├── deployment.ini_tmpl
> │   ├── environment.py
> │   ├── __init__.py
> │   ├── middleware.py
> │   └── routing.py
> ├── controllers
> │   ├── error.py
> │   └── __init__.py
> ├── __init__.py
> ├── lib
> │   ├── app_globals.py
> │   ├── base.py
> │   ├── helpers.py
> │   └── __init__.py
> ├── model
> │   └── __init__.py
> ├── public
> │   ├── bg.png
> │   ├── favicon.ico
> │   ├── index.html
> │   └── pylons-logo.gif
> ├── templates
> ├── tests
> │   ├── functional
> │   │   └── __init__.py
> │   ├── __init__.py
> │   └── test_models.py
> └── websetup.py
>
> Which is precisely what Kallithea has for structure. See
> http://docs.pylonsproject.org/projects/pylons-webframework/en/latest/gettingstarted.html#creating-a-pylons-project
> for details.
>
> The principle is as follows:
> - URL paths are coupled to 'controllers' through the routes packages.
> A controller is basically a python class with some methods. Based on
> these routes, a specific method of a certain controller will be called
> to handle a request.
> - The controller may need some info from the db to fulfill the
> request. Access to the database is handled through the 'model', a
> python representation of the tables in the database (one class per
> table).
> - The controller then renders the page (the 'view') by using a
> template. Template handling is done through the mako package.  These
> templates can refer to python variables, and additionally can contain
> typical web stuff like CSS and Javascript / jQuery.
> - In addition to requests for a complete page, parts of a page can be
> updated in a similar way through AJAX calls.
>
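
The flow described above can be sketched in plain Python. This is only
an illustration of the routes -> controller -> model -> template chain,
not Pylons or Kallithea code: string.Template stands in for mako, a
dict stands in for the database, and all names (RepoModel,
SummaryController, ROUTES, dispatch) are made up for the example.

```python
from string import Template

# Hypothetical stand-in for a database table; a real model would use an ORM.
REPOSITORIES = {"kallithea": {"description": "Source code management system"}}


class RepoModel:
    """'Model': the only place that knows how the data is stored."""

    def get(self, name):
        return REPOSITORIES.get(name)


# 'View': a template that only displays the values handed to it.
SUMMARY_TEMPLATE = Template("<h1>$name</h1><p>$description</p>")


class SummaryController:
    """'Controller': fetches data from the model and renders the template."""

    def index(self, repo_name):
        repo = RepoModel().get(repo_name)
        if repo is None:
            return "404 Not Found"
        return SUMMARY_TEMPLATE.substitute(
            name=repo_name, description=repo["description"])


# 'Routes': map the action part of a URL path to a controller method.
ROUTES = {"summary": SummaryController().index}


def dispatch(path):
    """Split '/<repo_name>/<action>' and call the matching controller method."""
    repo_name, action = path.strip("/").split("/")
    return ROUTES[action](repo_name)
```

Calling dispatch("/kallithea/summary") walks through all three layers
and returns the rendered HTML string.
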
>> Frontend:
>> ---------
>>
> [..]
>> Question: what is the reasoning to both use jQuery and YUI? Can't we
>> select one framework and do everything in that?
>> Answer: according to the wiki 'future' list, YUI should be killed in
>> favor of jQuery.

Yes, we are trying to move away from YUI. We are using an old version,
and even the latest version is no longer supported.

>> Related question: a bird told me that it may be better to use a
>> higher-level framework like AngularJS instead of directly scripting
>> stuff either in plain Javascript or jQuery. Has there been any thought
>> about this before?
> After reading further, I have the impression that AngularJS is not
> really necessary as we already have the mako template library, and
> most of the rendering of a page should be done at that (server-side)
> level rather than doing it in (client-side) Javascript.

Mako itself gives us a normal web page based application. That is "web
1.0". On top of that we can add javascript and DOM manipulation that
brings the page alive without reloading all of it as html from a url,
as is done in modern "rich web applications". With html, data and
javascript we can implement anything - even something like Angular
or similar frameworks. So yes, angular is not "necessary".

It might have advantages to generate all of the page content on the 
server side, but it also has advantages to have a "rich" application 
where most of the view/presentation code is running on the client side 
as javascript in the browser. One example is the difference between old 
web mail systems and what gmail introduced. I would prefer to have most 
of the user interaction as a "rich" application but still have a 
structure where the "resources" have stable URLs.

Right now the application is mostly web page based and I don't know
Angular & friends. Those are two reasons why I don't feel like making
a big change and would prefer incremental improvements.

>> On the 'future' list, I see "move code from templates to controllers
>> and from controllers to libs or models". Can you clarify this a bit
>> more?
> I assume the first aspect (templates->controllers) refers to the fact
> that the templates are currently containing a lot of stuff that is not
> just about displaying the data (the main purpose of the template) and
> that most of this logic should really be in controllers.
> This has the nice side effect of making things more easy to test.
>
> Not sure what type of code should move from controllers to libs or
> models, though...

I see a lot of code in db.py or the more high level models that
could apparently just as well have lived in controllers ... or the
other way around.

>
>> Question: how is the relation between the REST API (controllers/api)
>> and the AJAX calls? Is the API used internally in some way, or only
>> provided for external usage?

I think they are completely separate.

The ajax calls all use the custom X-PARTIAL-XHR / HTTP_X_PARTIAL_XHR
header, and the data is often tailored to one specific template.
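
A minimal sketch of how such a header can be detected on the server
side, assuming a plain WSGI environ dict. A WSGI server exposes the
HTTP header X-PARTIAL-XHR as the environ key HTTP_X_PARTIAL_XHR. The
function names are hypothetical and this is not the actual Kallithea
code.

```python
def is_partial_request(environ):
    """Return True when the request carries the X-PARTIAL-XHR header."""
    return bool(environ.get("HTTP_X_PARTIAL_XHR"))


def render(environ, full_page, fragment):
    """Pick the page fragment for ajax calls and the full page otherwise."""
    return fragment if is_partial_request(environ) else full_page
```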

I think it could be nice to move everything to the same REST API ... but 
generic low level APIs would also require more logic on the client side.



(Some minimal info on bundled code can be found in our LICENSE.md.)

>> The codemirror JS library (http://codemirror.net/) is used to display
>> code files.

AFAIK, we mainly use pygments for 'just' showing code files (with
syntax highlighting).

codemirror is more for editing (and is used by mergely).

>> The mergely JS library (http://mergely.com) is used to display diffs
>> of code. Mergely is based on codemirror.

"based on" might be too strong. It is more like "uses".

>> fontello is a library/service to create a font of symbols (instead of
>> using icons in image files). Did we create our own font or are we
>> using an existing one?
>> 'font awesome' is a specific symbol font. How does it relate to fontello?

kallithea/public/fontello/config.json defines our custom font. It picks 
some symbols from fontawesome and GitHub Octicons and adds or customizes 
some glyphs. The kallithea font is generated by fontello, based on that 
input.

Sean is the expert in this area and can say more.

>> pygments: syntax highlighting

(It could perhaps make sense to replace pygments on the server side with 
more use of codemirror on the client side.)

>> formencode: form validation  (also recommended by Pylons)

It is not just validation. "form handling" might be more correct.

>> excanvas?

(See changeset 531ab818cc3d.) Canvas support for old IE versions. We 
will probably drop this very soon.

>> mousetrap?

Keyboard shortcuts. They broke in the forking process and are currently
disabled - see 38a4035426ac. We should either kill the last traces of
them or revive them.

>> native.history.js?

Some fancy back/forward handling used by the repo content file 
navigation system. This part of the system is very "rich" and not like 
the rest of the application. I don't like it very much. I guess there 
must be a better solution.

>> Backend:
>> --------
>>
>> Revision information is obtained from the VCS library (lib/vcs), which
>> in case of Mercurial interacts directly with the Mercurial python
>> classes, and in case of git uses the dulwich python module.

It also invokes git directly for some operations - I don't know why.

>> paste: the deployment tool helping in starting up, configuring, etc. the application

Yes, http://pythonpaste.org/script/ provides the command line
interface 'paster' and a small web server. It can seem a bit convoluted
and overly generic but might be convenient. I think it is a "part of"
Pylons.

>> bcrypt: password hashing

This area is weird. On Windows we don't use bcrypt but just hash the
password. It is weird that it uses different and incompatible methods
on different platforms, and bad that Windows uses an insecure method
(it can be attacked with rainbow tables).

I don't know if there is some PBKDF2 implementation that would be better 
than bcrypt. But we should at least use bcrypt everywhere (while staying 
backward compatible).
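
To illustrate the difference from a plain hash: a per-user random salt
plus a deliberately slow key derivation function defeats precomputed
rainbow tables. The sketch below uses hashlib.pbkdf2_hmac from the
Python standard library. It is not what Kallithea does, and the
iteration count and function names are assumptions made for the
example.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # illustrative work factor, not a recommendation


def hash_password(password, salt=None):
    """Return (salt, digest) using PBKDF2-HMAC-SHA256 with a random salt."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def check_password(password, salt, expected_digest):
    """Recompute the digest with the stored salt; compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected_digest)
```

The salt is stored next to the digest, so the same password gives a
different stored value for every user.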

>> celery: distributed task queue  (how is this used?)

I think it is more important that it provides worker processes for big 
tasks and gives an async decoupling from the web server.
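
The decoupling idea can be sketched without Celery. Here a thread pool
from the standard library stands in for the worker processes (Celery
would hand the job to workers through a message broker instead), and
slow_task and handle_request are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor

# Stand-in for a pool of workers; Celery would use a message broker instead.
workers = ThreadPoolExecutor(max_workers=2)


def slow_task(numbers):
    """Hypothetical expensive job that should not block a web request."""
    return sum(n * n for n in numbers)


def handle_request(numbers):
    """Queue the job and return at once with a handle to the pending result."""
    return workers.submit(slow_task, numbers)
```

The request handler returns immediately; whoever needs the result can
call .result() on the returned future later.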

It uses the rabbitmq message bus.

The general setup instructions that can be found on the net and in
docs/setup.rst are "enough" - there is not much more to it. rabbitmq
can however be very tricky to get up and running.

>> whoosh: code indexing/search
>>
>> Question: what is all this WSGI stuff? If you start Kallithea
>> according to the base instructions, it's hosting its own web
>> interface (without WSGI?) How does WSGI work, what are the
>> advantages/disadvantages? Related to this, I see people running
>> Kallithea under Apache, advantages/disadvantages?

WSGI is the modern Python alternative to cgi, fcgi and isapi, defined in 
https://www.python.org/dev/peps/pep-0333/ .

Kallithea is a WSGI application. It has to be run by a WSGI server that
exposes the application over http.
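
For reference, a complete WSGI application is just a callable with the
signature defined in PEP 333. The sketch below is a toy, not Kallithea.
call_app is a hypothetical helper that invokes the application directly,
the way a WSGI server would.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """A WSGI application: a callable taking environ and start_response."""
    body = ("Hello from " + environ["PATH_INFO"]).encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def call_app(path):
    """Invoke the application directly and capture status and body."""
    environ = {"PATH_INFO": path}
    setup_testing_defaults(environ)  # fills in the other required keys
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    return captured, b"".join(application(environ, start_response))
```

To actually serve it over http, the standard library's
wsgiref.simple_server.make_server("", 8000, application).serve_forever()
is enough for experiments.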

'paster serve' can launch a small python web server: either a built-in
one, waitress or gunicorn. These web servers do not have as many
features as 'real' web servers. Some like to use them in production,
but will then need tooling around them to handle restarts, and will
put a proxy server (often nginx) in front. These two layers of web
servers can make it tricky to pass all the right information to the
wsgi environment (original client ip&port, host name, protocol
(http/https), possibly authentication).
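
A sketch of the kind of glue this requires, assuming the proxy sets the
commonly used X-Forwarded-For and X-Forwarded-Proto headers. Which
headers are available depends on the proxy configuration, and the
class name is hypothetical.

```python
class ProxyHeadersMiddleware:
    """Copy proxy-supplied headers into the standard WSGI environ keys.

    Only safe when the proxy is trusted and always overwrites these headers.
    """

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        forwarded_for = environ.get("HTTP_X_FORWARDED_FOR")
        if forwarded_for:
            # The first entry is the client as seen by the first proxy.
            environ["REMOTE_ADDR"] = forwarded_for.split(",")[0].strip()
        proto = environ.get("HTTP_X_FORWARDED_PROTO")
        if proto in ("http", "https"):
            environ["wsgi.url_scheme"] = proto
        return self.app(environ, start_response)
```

It wraps any WSGI application, e.g. ProxyHeadersMiddleware(application),
so the inner application sees the original client address and protocol.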

I prefer to use apache with mod_wsgi but consider trying uWSGI.

The web server performance doesn't matter with Python applications like 
Kallithea. Use one that has the features you need and that you are 
familiar with.

/Mads

