shared API for double-entry accounting, treating it as a 'math' library.

Chris Travers chris.travers at gmail.com
Thu Nov 14 23:56:50 EST 2013


Having had the benefit of a night's sleep on the issues Bradley mentioned
with my feedback, I think this rabbit hole goes a bit further than may be
apparent, and there are more issues that need to be discussed.

I want to discuss the performance issues I have seen with volume
transactions and where the likely solutions are.  I don't actually think
that the interface format is a big problem (I favor JSON because there are
fewer ambiguity issues than there are with XML on the parsing side), but
there are several other bottlenecks that need to be considered also.

The biggest issue is the question of a stateful vs. stateless interface.
If this is done over HTTP as something like a web service, then certain
kinds of workflows become very problematic performance-wise because
tracking state across requests imposes significant overhead.

Typically what we found in LedgerSMB is that most workflows are perfectly
manageable in terms of response times, assuming a reasonable platform
underneath that can handle concurrency, but that bulk payment operations
are the big problem.  Inventory, which we discussed, is interesting but
probably a niche issue for NPOs; bulk payments may be more common.
The performance issues we have seen here are worth sharing.

One of our users of LedgerSMB pays about 20000 invoices through the system
every week.  The largest workflows run about 5000 invoices through the
system at a time, payable to about 200 payees.  Obviously this poses some
performance challenges and we have done a significant amount of profiling
about where the time and effort is spent.  This is all done through a web
interface.  Because of concurrency issues we write to a lock table and
locks are expired with session logouts.  This is to prevent invoices from
being double-paid.
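
To make the lock-table idea concrete, here is a minimal sketch in Python.
It is not LedgerSMB's actual code or schema: the table names, column names
and functions are hypothetical, and it uses an in-memory SQLite database
rather than PostgreSQL so that it runs on its own.  Each session claims the
open invoices nobody else holds, and releasing the session's locks stands
in for expiring them at logout.

```python
import sqlite3


def make_db():
    """Create a throwaway database with hypothetical invoice and lock tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE invoice (
            id INTEGER PRIMARY KEY,
            payee TEXT NOT NULL,
            amount_cents INTEGER NOT NULL,
            paid INTEGER NOT NULL DEFAULT 0
        );
        -- One row per locked invoice; the primary key guarantees that at
        -- most one session can hold a given invoice at a time.
        CREATE TABLE invoice_lock (
            invoice_id INTEGER PRIMARY KEY REFERENCES invoice(id),
            session_id TEXT NOT NULL
        );
        """
    )
    return conn


def lock_open_invoices(conn, session_id):
    """Claim every open invoice not already locked; return the ids we hold."""
    with conn:  # one transaction: the read has now become a read plus writes
        conn.execute(
            "INSERT OR IGNORE INTO invoice_lock (invoice_id, session_id) "
            "SELECT id, ? FROM invoice WHERE paid = 0",
            (session_id,),
        )
        rows = conn.execute(
            "SELECT invoice_id FROM invoice_lock "
            "WHERE session_id = ? ORDER BY invoice_id",
            (session_id,),
        ).fetchall()
    return [row[0] for row in rows]


def release_locks(conn, session_id):
    """Drop all locks held by a session, e.g. when that session logs out."""
    with conn:
        conn.execute(
            "DELETE FROM invoice_lock WHERE session_id = ?", (session_id,)
        )
```

A second session calling lock_open_invoices() while the first still holds
its locks gets none of the same invoices back, which is the double-payment
guard.  The price is that a pure read turns into a read plus one write per
invoice.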

As for performance, what we found (on PostgreSQL) when running this
selection against a table with around 10 million line items was roughly
the following breakdown:

1.  A database query to just pull the open invoices runs in about 2 sec.

2.  Adding the locking logic slows it from 2 sec. to 49 sec., in part
because one goes from pure reads to a mixture of reads and writes.

3.  Generating a user interface (using Template Toolkit) takes about 4
minutes.

So there you have it going from seconds to minutes, which I presume is the
issue mentioned.  In this case it isn't a deal-killer because it is about
5 minutes in the middle of a longish workflow, but it could be optimized a
lot.

Here are the lessons I draw from this.  Other interpretations are welcome.

The first is that a stateless API imposes very significant costs on
payment workflows.  If we are worried about performance at volume, that's
a big thing to consider.

The second thing is that at volume there are huge limits to what a
web-based UI can accomplish.  I have been building a point of sale system
based on wxPerl with most of the common modules, and compared to FastCGI
(with cached code, for most of it), the wxWidgets-based client is vastly
more responsive (guessing 2 orders of magnitude faster than a comparable
web interface).

So if we are worried about performance at scale, I think the API needs to
assume the possibility of a stateful interface with a "pure data" exchange.
JSON or XML would work for that, and there are good parsers for them in
many languages.  My concern over XML, though, is that while the schema
documents tighten up some checks for strongly typed languages, there are
ambiguities in semantics that can make it difficult to ensure compatibility
between implementations.  With JSON, we'd have to dispense with those
checks (or rather push them to the client), but there are fewer traps
regarding inconsistent implementations.  If there are established
specifications, output can be tested against them.  This doesn't solve
everything, though.  But hey, at least it isn't X12...
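
As an illustration of pushing those checks to the client, here is a rough
Python sketch of parsing a "pure data" JSON payment batch.  The field names
("payments", "invoice_id", "amount") are hypothetical and not part of any
agreed specification.  Sending amounts as decimal strings is also my
assumption, made to avoid floating-point rounding.

```python
import json
from decimal import Decimal, InvalidOperation


def parse_payment_batch(text):
    """Parse a JSON payment batch, doing by hand the checks a schema would do.

    Returns a list of (invoice_id, amount) tuples, or raises ValueError.
    """
    data = json.loads(text)
    payments = data.get("payments") if isinstance(data, dict) else None
    if not isinstance(payments, list):
        raise ValueError("'payments' must be a list")

    batch = []
    for index, item in enumerate(payments):
        if not isinstance(item, dict):
            raise ValueError(f"payment {index}: expected an object")

        invoice_id = item.get("invoice_id")
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(invoice_id, bool) or not isinstance(invoice_id, int):
            raise ValueError(f"payment {index}: invoice_id must be an integer")

        amount = item.get("amount")
        if not isinstance(amount, str):
            raise ValueError(f"payment {index}: amount must be a decimal string")
        try:
            value = Decimal(amount)
        except InvalidOperation:
            raise ValueError(f"payment {index}: amount is not a valid decimal")
        # Check is_finite() first: ordering comparisons on NaN would raise.
        if not value.is_finite() or value <= 0:
            raise ValueError(f"payment {index}: amount must be positive")

        batch.append((invoice_id, value))
    return batch
```

For example, a batch such as
{"payments": [{"invoice_id": 7, "amount": "12.50"}]} parses to a single
(7, Decimal("12.50")) entry, while a bare number for the amount is
rejected.  A shared set of test documents like this is one way output could
be checked across implementations.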

The other option is this:  Decide that these workflows aren't necessarily
within the scope of this API, without precluding another API that would
handle these issues differently.

Anyway, hope this is helpful.

-- 
Best Wishes,
Chris Travers

Efficito:  Hosted Accounting and ERP.  Robust and Flexible.  No vendor
lock-in.
http://www.efficito.com/learn_more.shtml