Moving to github.io

I’m in the process of moving this blog to my github.io pages. The reason is not that I dislike WordPress, nor that I think github.io is superior in any way. In fact, I haven’t given WordPress up yet.

So this move is still in beta. While I haven’t decided whether I will permanently move out of WordPress, I will try to maintain both sites.

My current blogging workflow is like this:

  1. Writing my posts in reStructuredText (rst), mostly offline.
  2. Using a custom rst2wp.py to convert the files to WordPress’ HTML (see the sketch after this list).
  3. Copying and pasting the HTML into the WordPress Dashboard (sending posts via email is also possible, but it does not allow proofreading).
  4. Reviewing online, painfully, and correcting mistakes (I usually type the article “an” before words beginning with “s”). Then porting those corrections back into my original file so I keep a copy, and so on.
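
The conversion step is essentially what docutils gives you out of the box. Here is a minimal sketch of the core call (my actual rst2wp.py does more than this; the snippet is just an illustration):

# A minimal sketch: turn an rst source string into an HTML fragment
# suitable for pasting into the WordPress editor.
from docutils.core import publish_parts

def rst_to_wp_html(source):
    parts = publish_parts(source=source, writer_name='html')
    return parts['body']  # only the body fragment, not a full document

print(rst_to_wp_html('Hello, *world*!'))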

This, you may think, is really tedious; and you’d be right.

I figured that having something that simply allows me to publish in rst would simplify things a lot. So I started looking around and found two candidates:

  1. A package called Pelican.
  2. A Sphinx based one called Tinkerer.

Since I know Sphinx already, I’m trying Tinkerer first. For the first couple of days I’ll be porting some of my old posts and customizing the theme to almost fit my current style in WordPress.

Integration v. Automation

Note. This post is frankly overdue. I started writing it about 5 weeks ago, and while I’m confident about its content, I’m still learning much about the subject. Nevertheless I think it’s good to go, so here it is.

Being an OpenERP bee these days has led me to think a bit about integration: how it happens inside OpenERP, its architecture, assumptions and other stuff. At first I thought about integration in terms of integrating two (or more) OpenERP addons, for instance, integrating Sales with Accounting. At a very programming-oriented level this reduces, in the current state of OpenERP, to integrating their modules. But later I found out that this approach is fundamentally flawed and causes many quirks.

This post starts with a tale about one of the many issues we have encountered when deploying OpenERP in a small enterprise. Then I will argue about what integration means. Lastly, I will turn to automation as one means of integration, not the only way to do it.

The Integration tale

I want to start with a simple and very real [1] use case: Bob is a salesperson processing a CRM lead.

The story begins when Bob opens the lead, clicks on the Edit button, and the first thing he tries to do is fill in the client’s contact info, because it’s a new client. He clicks on the edit button beside the “Client” field, which opens a window. He fills in the data and clicks “save”, but suddenly it fails, saying that Bob (a salesperson) should have filled in the “Account payable” and “Account receivable” fields. So Bob says:

Oh, boy! What did I do wrong? Let’s try again… Nope… Uh. What should I do? Hey! Peter. Do you know what’s happening here? This won’t let me enter this client’s contact data; it keeps telling me that these accounts are wrong. What the heck are these accounts anyway?

And Peter has no idea, so he calls Sally, who’s a very pretty girl that probably does not have a clue either about what’s happening, but it’s so nice to have a reason to approach her… Surprisingly, Sally seems to be the only one who actually paid attention to the information meeting that morning, and she says:

Ah! This probably has to do with the Accounting module being installed. They say they will do that today… Let’s ask Stephanie.

And Stephanie realizes what’s happening and fixes it somehow. An hour later Bob can resume his work and finish processing his CRM lead.

The end.

So, what is wrong with this story?

Computer Problems. Taken from xkcd.

Depiction of wrongness

In my opinion there are two things that are clearly wrong with this:

  1. Requiring data when it’s not required. When you’re in a CRM use case, requiring accounting-related data really seems like overstepping.
  2. Not being able to issue a request for completion. Bob should have been able to keep working, and after the new client was saved a completion request should have been sent to those able to complete the data (see the sketch after this list).

    That completion request might just get ignored, but other use cases (like submitting an invoice to that client) would actually require the data, and thus the cycle would be closed after all.
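
Here is a hedged sketch of that second idea; everything in it is hypothetical and none of it is OpenERP API. Saving always succeeds, and the missing data becomes somebody else’s task:

# Hypothetical sketch: never block the salesperson; record a completion
# request for whoever can actually fill in the accounting data.
REQUIRED_BY_ACCOUNTING = ('account_receivable', 'account_payable')

completion_requests = []

def save_partner(partner):
    missing = [f for f in REQUIRED_BY_ACCOUNTING if not partner.get(f)]
    if missing:
        # Instead of failing, ask those able to complete the data.
        completion_requests.append({'partner': partner['name'],
                                    'missing': missing})
    return partner  # the save itself always succeeds

save_partner({'name': 'New client'})
print(completion_requests)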

Some would think that this is just a “configuration problem”, as we will discuss below. But finding that solution took me several dives into the code and re-readings of the documentation, so I’ll try to walk you through the process.

The issue. The making of

Warning

Technical stuff ahead. Wearing hardhats is mandatory.

By installing the “Accounting & Finance” addon [2] you touch many parts of your system besides installing new stuff. In fact, the module forces the installation of many other modules, including product [3].

It also modifies the "res.partner" model from the base addon. You may think of the "res.partner" model as an attempt to merge the contact information for clients, suppliers and employees into a single entity [4]. It has many fields comprising name, job position, contact information, etc.

When you install the account addon, the "res.partner" model gets a whole bunch of other fields appended, including the said “account payable” and “account receivable” fields, which are also marked as mandatory:

'property_account_payable': fields.property(
    'account.account',
    type='many2one',
    relation='account.account',
    string="Account Payable",
    view_load=True,
    domain="[('type', '=', 'payable')]",
    help="This account will be used instead of the default one as the payable account for the current partner",
    required=True),
'property_account_receivable': fields.property(
    'account.account',
    type='many2one',
    relation='account.account',
    string="Account Receivable",
    view_load=True,
    domain="[('type', '=', 'receivable')]",
    help="This account will be used instead of the default one as the receivable account for the current partner",
    required=True),

These are not actual fields of the model "res.partner", but properties, which are “special”.

On using properties

Properties are of course related to the “solution” for the problem described above. But the solution is well hidden under the title of Database setup in the OpenERP Book. That’s the reason I’m using this case to open the OpenERP corner. If you deploy CRM before accounting [1] you’d probably find no interest in reading a topic called “Database setup”… you have set your database up already, haven’t you?

You should notice that both Account receivable and Account payable are, in fact, properties (i.e. defined via fields.property). This actually means that those fields take their default values from a global configuration.

Those values were not properly set in our case because there was no localized account chart that applied to our enterprise. We had to create all the accounts by hand and, yes, we missed (didn’t know) that we had to create those properties.

Our problem is solved by defining those properties in the configuration menu.
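
What that configuration menu does under the hood is, if I read the code correctly, to create company-wide ir.property records with no res_id, which makes them the defaults for every partner. Hedging on the exact field names, it amounts to something like this (receivable_account_id is a hypothetical variable holding the account you chose):

# A sketch, not OpenERP's exact code: the default for a property field
# is an ir.property record that points to the field but to no record.
fields_obj = self.pool['ir.model.fields']
prop_obj = self.pool['ir.property']
field_id = fields_obj.search(
    cr, uid, [('model', '=', 'res.partner'),
              ('name', '=', 'property_account_receivable')])[0]
prop_obj.create(cr, uid, {
    'name': 'property_account_receivable',
    'fields_id': field_id,
    # No res_id here: that's what makes this the global default.
    'value_reference': 'account.account,%d' % receivable_account_id,
})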

However this workaround is very unsatisfying:

  • It involves the administrator of the system because he’s the one that has access to the “Configuration Parameters”. AFAIK, the accountant himself/herself cannot change the defaults, unless you bestow all the powers on him/her.
  • It does not resolve the integration problems others might present in the future. Integration is harder than having some default values. For instance, Cuban accounting norms establish more than 10 accounts for receivables, with empty slots for more if needed. They require different accounts (receivable and/or payable) for bills to/from people (B2C), separated from those for bills to/from other enterprises (B2B), and also different accounts for long-term and short-term bills. The last case cannot be decided by just looking at the client or supplier; more information about the economic fact is needed.
  • It does not actually resolve the current integration problem, since the accountant needs to make sure the “Account Receivable” is the correct one for the client, and he’s not notified when salesman Bob creates a new partner. So what really happens is that the accountant needs to review journal entries before posting them, and change the account if needed.

Integration v. Automation

An ERP should simplify things by integrating business areas, shouldn’t it? That’s the main driver behind the feature of automatically generating journal entries. Under this principle, when an invoice is sent to a client a journal entry should be made recording that we should get paid for it, i.e. the client’s account receivable gets increased [5]. Likewise, when we get a supplier invoice, an entry should record that we must pay that bill, i.e. the supplier’s account payable gets increased.

You see now how the “Account Receivable” and “Account Payable” fields of the partner play their part in the automation of the accounting processes. This is deeply woven into the account module’s source code, meaning that there’s the assumption that partners have those properties we’re talking about. And that’s true, because you have injected them and, if you configured everything as expected, they have their default values.

Notice the difference between the expectation of integration of business areas and how the integration happens in this case via a very specific kind of automation.

I’ll argue that the current state of this design is flawed. When standards change and/or are not applicable, this kind of automation does more harm than good.

This is the reason the module that implements “Anglo-Saxon accounting” [6] is very difficult to understand and the result artificial: they need an “interim” account to keep track of the different stages. In the standard (for OpenERP) accounting, the event that produces journal items in the debtor/creditor account is the creation of the invoice. In the Anglo-Saxon scheme, the journal entry should be created at shipping time.

I argue that another framework, one that clearly separates every actor and function, would improve how this pattern could be implemented. I think that this framework must have:

  1. Signals and events.
  2. Actors like the accountant, and probably an automated agent for the accountant that could do the same the models do right now. But being responsive (i.e. they respond to signals), they could be easily bypassed.

Of course there are more things needed. I’m thinking about those two plus the ones OpenERP already has.

I think that recognizing actors is the major improvement. Actors are abstractions about intelligence. If a person should be making some kind of intelligent decision (like accountants do), then you should encode (in your design) that decision as being taken by an actor.

Having artificial agents that could take over when the task is standard or programmable is also an option in this case. Anywhere in your design an actor does something, an agent could replace the human. The agents could be as dumb as the couple of rules we have now: create a journal entry each time an invoice goes to the valid state, and do it this way. But agents could also be provided with machine learning techniques, so they could observe how the human accountant proceeds when something happens. Of course, this would require the human to proceed in a case-by-case fashion, and that’s not always true.
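
Here is a minimal sketch of the kind of design I mean; every name in it is hypothetical, and none of this is OpenERP API:

# Hypothetical sketch: a signal dispatcher plus a dumb accountant agent.
from collections import defaultdict

class Dispatcher(object):
    def __init__(self):
        self.handlers = defaultdict(list)

    def subscribe(self, signal, handler):
        self.handlers[signal].append(handler)

    def emit(self, signal, **payload):
        for handler in self.handlers[signal]:
            handler(**payload)

class AccountantAgent(object):
    # As dumb as today's rule: create a journal entry each time an invoice
    # is validated.  Being signal-driven, it can be unplugged (or replaced
    # by the human accountant) without touching the invoicing code.
    def __init__(self, dispatcher):
        dispatcher.subscribe('invoice.validated', self.on_invoice_validated)

    def on_invoice_validated(self, invoice=None, **kwargs):
        print('Creating the journal entry for %s' % invoice)

dispatcher = Dispatcher()
agent = AccountantAgent(dispatcher)
dispatcher.emit('invoice.validated', invoice='INV/001')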

Whether or not the machine learning part is ever done, I argue that designing with actors and agents will lead to a better implementation: easier to understand, maintain and evolve.

Notes

[1] (1, 2) This is not hypothetical at all. We’re actually deploying Accounting after having deployed Project Management and CRM. This has come with many surprises, and that’s what this post is about.

[2] The word addon here is important. There are actually two OpenERP addons named “Accounting & Finance”: the account addon and account_accountant. The second one is flagged as an application and thus takes a more prominent place in the listing of available applications. Installing the application forces the installation of the account module anyway.
[3] That is why I still do my personal (home) accounting with GNU Cash.
[4] This merge has problems of its own, but that’s a matter for another post.
[5] Though an account either gets credited or debited, I will avoid those accounting-related terms because they’re not needed for the argument in this post. If you need to know, start by learning that receivables have a debit normal balance and go from there.
[6] I’m not quite sure if this “Anglo-Saxon accounting” refers to a different basis of accounting.

Baby steps towards reverse engineering in the Pythonic Query Language

Last week I took a recess from my OpenERP crusade and spent some time trying to figure out the problem of extracting an Abstract Syntax Tree for a query expression out of its compiled byte-code.

A query expression like the following:

>>> this = iter('')
>>> query = (parent for parent in this
...          if parent.age > 40 and parent.children
...          if all(child.age < 5 for child in parent.children))

Is compiled into byte-code like this (in Python 2.7):

>>> import dis
>>> dis.dis(query.gi_code)
  1           0 LOAD_FAST                0 (.0)
        >>    3 FOR_ITER                60 (to 66)
              6 STORE_FAST               1 (parent)

  2           9 LOAD_FAST                1 (parent)
             12 LOAD_ATTR                0 (age)
             15 LOAD_CONST               0 (40)
             18 COMPARE_OP               4 (>)
             21 POP_JUMP_IF_FALSE        3
             24 LOAD_FAST                1 (parent)
             27 LOAD_ATTR                1 (children)
             30 POP_JUMP_IF_FALSE        3

  3          33 LOAD_GLOBAL              2 (all)
             36 LOAD_CONST               1 (<code object  at 0x1a21030, file "", line 3>)
             39 MAKE_FUNCTION            0
             42 LOAD_FAST                1 (parent)
             45 LOAD_ATTR                1 (children)
             48 GET_ITER
             49 CALL_FUNCTION            1
             52 CALL_FUNCTION            1
             55 POP_JUMP_IF_FALSE        3
             58 LOAD_FAST                1 (parent)
             61 YIELD_VALUE
             62 POP_TOP
             63 JUMP_ABSOLUTE            3
        >>   66 LOAD_CONST               2 (None)
             69 RETURN_VALUE

Extracting the original query expression out of the compiled byte-code is sometimes referred to as “decompilation” or “uncompilation”. Others prefer calling it “reverse engineering”. Whatever you call it, it is a hard task. And initially we simply avoided it.

When I met PonyORM I found that our idea of having queries expressed via comprehensions was already implemented. Despite my initial enthusiasm, I was forced to put the project on pause.

Last week I revisited the problem, but trying to decouple PonyORM’s decompiler from Python 2.7 is not an easy task. It depends on modules that no longer exist in Python 3, and their APIs are not easy to replicate. I decided to stop trying.

First, I thought that deriving a dynamic algorithm based on Petri Nets would be easy to do in a couple of days. My first draft solved the issue of decompiling the byte-code for chained comparison expressions like a < b < c. Here is one of the drawings:

Handwritten Petri Net for Python byte-code

However, I found myself struggling with the Petri Net when it was not a DAG due to the absolute jumps in the byte-code for generator expressions.

Before proceeding to find a solution, I went back to search mode and looked for related articles and/or software. I stumbled upon an “old” package, uncompyle2. This package has the nice property that it is very easy to understand and very easy to adapt. Moreover, it’s based on published papers and a thesis you can download and read.

The whole idea is to apply compiler theory to the problem. This kind of idea is very appealing to me. They have a context-free grammar that serves the purpose of building a recognizer for the Python 2.7 byte-code.

So you can see that there are four productions for a generator expression:

#  Generator Expressions are expression
expr ::= genexpr

# This is the outlook of generator expression as an argument of a
# function.
genexpr ::= LOAD_GENEXPR MAKE_FUNCTION_0 expr GET_ITER CALL_FUNCTION_1

#  This one I don't know why: is a generator expression a statement?
stmt ::= genexpr_func

# The outlook of a bare generator expression.
genexpr_func ::= LOAD_FAST FOR_ITER designator comp_iter JUMP_BACK

If you try to apply this to the byte-code shown above, you will fail to see LOAD_GENEXPR in the original byte-code. This is because it does not actually exist. It is produced by uncompyle2’s tokenizer when the argument of the byte-code is itself a code object named “<genexpr>”. This is done simply to simplify the grammar. Also, MAKE_FUNCTION_0 is produced by the tokenizer to mean the actual byte-code MAKE_FUNCTION with argument 0. Same goes for CALL_FUNCTION_1 and JUMP_BACK. These are called “customizations” and must be dealt with in the parser, but they are easy to understand.
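
In case that sounds too abstract, here is a hedged sketch of what such a customization amounts to; the real tokenizer in uncompyle2 is organized differently, this only illustrates the idea:

# A sketch: rename generic opcodes into the specialized tokens the grammar
# expects.  Not uncompyle2's actual code.
import types

def customize(opname, arg):
    if opname == 'LOAD_CONST' and isinstance(arg, types.CodeType):
        if arg.co_name == '<genexpr>':
            return 'LOAD_GENEXPR'  # a pseudo-token; no such opcode exists
    if opname in ('MAKE_FUNCTION', 'CALL_FUNCTION'):
        return '%s_%d' % (opname, arg)  # e.g. MAKE_FUNCTION_0, CALL_FUNCTION_1
    return opname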

For example I’ve modified the package (still untested beyond an IPython shell) so that byte-code for the generator expression in Python 3.3 can be parsed [1].

Since Python 3.3 the MAKE_FUNCTION byte-code is always preceded by two LOAD_CONSTs: the first one loads the code object and the other loads the name. So I simply changed the grammar to meet those expectations:

import sys  # needed for the version check; `override` picks the variant whose condition holds

@override(sys.version_info < (3, 3))
def p_genexpr(self, args):
    '''
    expr ::= genexpr
    genexpr ::= LOAD_GENEXPR MAKE_FUNCTION_0 expr GET_ITER CALL_FUNCTION_1
    stmt ::= genexpr_func
    genexpr_func ::= LOAD_FAST FOR_ITER designator comp_iter JUMP_BACK
    '''

@p_genexpr.override(sys.version_info >= (3, 3))
def p_genexpr(self, args):
    '''
    expr ::= genexpr
    genexpr ::= LOAD_GENEXPR LOAD_CONST MAKE_FUNCTION_0 expr GET_ITER CALL_FUNCTION_1
    stmt ::= genexpr_func
    genexpr_func ::= LOAD_FAST FOR_ITER designator comp_iter JUMP_BACK
    '''

So, that’s settled: xotl.ql will use this for the decompilation module.

I have already started to port the needed parts and I expect to have this done by the end of August (I must return to OpenERP).

The plan is:

  1. Port and test the uncompyle2 toolkit. That will be release 0.3.0.
  2. Revise and publish the AST. Since the AST will be the thing coupling xotl.ql with translators, it must have a high degree of stability. Moreover, the AST must remain the same across Python versions. This will probably span several releases, up to 1.0, at which time the AST will be declared stable and only changed in incompatible ways on major version jumps.
  3. Rewrite the translator in xotl.ql.translation.py to target the AST. This will probably be synchronized with the changes in the AST itself.

Footnotes

[1] I have no interest in source code reconstruction, so I have not tested (and I won’t) anything beyond the extracted AST.

Announcing xoeuf or “OpenERP for humans”

Yesterday, I pushed to github our toolkit that helps us ease some tasks when programming with OpenERP. Its name: xoeuf (pronounced like “hef”; just try to make it sound French). The name comes from the French for egg, “œuf”, and our usual “x”. The egg comes from Python’s tradition of binary distribution eggs. It’s too late to change to “wheels”.

Since documentation is still lacking, here are some noteworthy things about xoeuf:

  • Unfold the powers of the shell. xoeuf allows you to test code in a normal Python (or, better, IPython) shell:

    >>> from xoeuf.pool import some_database_name as db
    >>> db.salt_shell(_='res.users')
    >>> self
    >>> len(self.search(cr, uid, []))
    48

    This feature works by directly opening a connection to your configured PostgreSQL server. So be sure to either have the OPENERP_SERVER environment variable set or to have the standard configuration file in your home directory.

  • Model extensions for common programming patterns (xoeuf.osv.model_extensions). Those “methods” are automatically woven into models when salting the shell (the salt_shell we saw above):
    >>> self.search_read(cr, uid, [], ('login', ))
    [{'login': ...}, ...]
    

    But you can use them as functions in your code:

    from xoeuf.osv.model_extensions import search_read
    res_users = self.pool['res.users']
    res = search_read(res_users, cr, uid, [], ('login', ))
    
  • Get sane: spell things by name when writing. I have already mentioned that writing things in OpenERP requires good eyes to see the meaning of something like “[(6, 0, [1, 4])]”. The xoeuf.osv.writers allow you to simply say that you want to “replace all currently related objects with these new ones”:
    from xoeuf.osv.model_extensions import get_writer
    with get_writer(some_modelobj, cr, uid, ids) as writer:
        writer.replace('related_attr_ids', 1, 4)
    

    This will simply invoke the normal write method with the right magical numbers.
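
    In other words, and hedging on the exact call xoeuf emits, the writer above should boil down to the plain:

    # The write call the writer should produce (a sketch, not traced output):
    some_modelobj.write(cr, uid, ids,
                        {'related_attr_ids': [(6, 0, [1, 4])]})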

So go ahead and try it, and tell us what you think.

Announcing the OpenERP corner.

I’m starting a new “column” in this blog. I call it the “OpenERP corner”. It’s going to be about anything I think is lacking in the OpenERP Book, or somewhat misleading in its technical documentation as well.

I think this column might help others like me seeking orientation. It’s not intended to be accurate and will probably start some debate. That’s what I want.

I cannot aim for accuracy because OpenERP is a moving target. It changes on a daily basis. Also, my blog time has been reduced to less than an hour per week, and covering any OpenERP topic takes several hours. Anyway, the “most accurate” place to seek current information would be the help forum; and since I’m offline 99% of the time (guess why), I cannot participate much there.

The debate stuff is more like a hope. Engaging in a debate (if it does not lead to a flame war of taste) is always enlightening. I might as well be wrong when I say something, and that leaves space to be corrected (and taught).

So let’s stop this introduction now and start writing my first post for the “OpenERP corner”.

See you in a couple of weeks.

The productivity of tomorrow trumps that of today

That’s probably harsh, but I think it is absolutely right. Doing crappy software today to be more productive today will make you less productive tomorrow. It’s that simple. And it’s cumulative, too; meaning that if you neglect your future productivity, it will slowly diminish until you reach a point of competitive disadvantage where you’ll find yourself struggling to keep up instead of innovating.

And it’s so damn exhausting to explain why…

Software complexity does not come from the tools, but from the mental framework required (and imposed at times) to understand it. So don’t ever think that measuring Cyclomatic Complexity (CC) and other similar metrics will yield something close to the true measure of the quality of your code.

There are only two hard things in Computer Science: cache invalidation and naming things.

—Phil Karlton

def _check(l):
    if len(l) <= 1:
        return l
    l1, l2 = [], []
    m = l[0]
    for x in l[1:]:
        if x <= m:
            l1.append(x)
        else:
            l2.append(x)
    return _check(l1) + [m] + _check(l2)

This code has a nice CC of 4, which is very good; yet it will take you at least a minute to figure out what it does. If only I had chosen to name the function quicksort…

>>> _check([84, 95, 89, 4, 77, 24, 95, 86, 70, 16])
[4, 16, 24, 70, 77, 84, 86, 89, 95, 95]

A quiz, in less than 5 seconds: what does the following line of code mean in OpenERP’s ORM?

group.write({'implied_ids': [(3, implied_group.id)]})

This line of code has a CC of 1: as good as it gets, isn’t it? But it’s so darn difficult to read, unless you have your brain wired up to see “forget the link between this group and the implied_group”… To be fair, there is someone out there in the OpenERP universe that cared a bit about this:

# file addons/base/ir/ir_fields.py

CREATE = lambda values: (0, False, values)
UPDATE = lambda id, values: (1, id, values)
DELETE = lambda id: (2, id, False)
FORGET = lambda id: (3, id, False)
LINK_TO = lambda id: (4, id, False)
DELETE_ALL = lambda: (5, False, False)
REPLACE_WITH = lambda ids: (6, False, ids)

But no one else is using it!
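
With those helpers the quiz line could have read like this (the import path is my guess from the file’s location; treat it as a sketch):

from openerp.addons.base.ir.ir_fields import FORGET

group.write({'implied_ids': [FORGET(implied_group.id)]})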

And, yes, it’s exhausting going over this.

Itches I have with OpenERP’s code base

These notes have been in my undecided-outbox since late January. I think it’s time to publish them, even unpolished as they are.

OpenERP has such a wonderfully active coding community that its documentation is ages behind. So from time to time (perhaps too often) you need to look at the source to fully grasp what’s troubling you. But the source is, well, a little bit under-cared for, if I may say.

Here are some itches I have with code:

  1. Some methods are so long (several hundred lines) that it takes about half an hour to grok all the details.

    And these are not core-framework methods I’m talking (complaining) about, but application-level methods. See for instance the reconcile method in addons/account/account_move_line.py (137 lines of code as of OpenERP’s 2014-02-26 code base).

  2. Many of them have neither a useful docstring nor comments to guide your reading.
  3. Too much code duplication. Probably because generalization opportunities are being disregarded or missed.

    The piece of code I’m going to dissect below is duplicated even in the same file, in two different methods. And it’s kind of a business rule.

  4. Methods that are too hard to read or, worse, pieces of them.

Let’s illustrate some of these itches in 5 lines of code. In the reconcile method, the lines to be reconciled must belong to the same company. This is the piece of that method that does that:

company_list = []
for line in self.browse(cr, uid, ids, context=context):
    if company_list and not line.company_id.id in company_list:
        raise osv.except_osv(_('Warning!'), _('To reconcile the entries company should be the same for all entries.'))
    company_list.append(line.company_id.id)

This is hardly readable. In fact, it took me a backwards reading of the code before I could enunciate its intention. To make things worse, the same piece of code is found in the same file inside the method reconcile_partial of the same class! So the cognitive load is instantly duplicated.

A more readable option is, IMO, the following:

lines = self.browse(cr, uid, ids, context=context)
if lines and len({line.company_id for line in lines}) > 1:
   raise ...

If you want to be more efficient and yet more readable (or maybe you need to support Python 2.6, which lacks the set comprehension) you can do:

if lines:
    first_line, last_lines = lines[0], lines[1:]
    if any(line.company_id != first_line.company_id
           for line in last_lines):
        raise ...

If you read that, it clearly says “if any line’s company is not the same as the first line’s company” then raise an error.

And even better:

def ensure_same_company(lines):
    if lines:
        first_line, last_lines = lines[0], lines[1:]
        if any(line.company_id != first_line.company_id
               for line in last_lines):
            raise ...

And then reuse the function in every place where it’s needed. The name “ensure” has a pre-condition formulation that is simple to understand by reading not the code but the name alone.

Why I prefer comparing the company_id attribute directly is more debatable, but I have two good reasons:

  • Generality. If the company_id were actually an id (it is not in the original code) this code would work unchanged.

    The browse_record object implements the __ne__ protocol and does it right. The cost of calling __ne__ should not be a performance sink given the normal use of the application (enforce these rules at the UI level as well).

  • Respect the principle of least astonishment. What’s the id of a company’s id?

The return inside the for statement for “performance gain” is a false economy. The any built-in function is way faster. See it for yourselves:

>>> import random
>>> sample = random.sample(range(100000000), 900)

>>> def unique(sample):
...    x = []
...    for y in sample:
...        if x and y in x:
...            return False
...        else:
...            x.append(y)
...    return True

>>> def unique2(sample):
...     if sample:
...         first, rest = sample[0], sample[1:]
...         if any(y != first for y in rest):
...             return False
...         else:
...             return True
...     else:
...         return True


>>> %timeit unique(sample)
100 loops, best of 3: 10.2 ms per loop


>>> %timeit unique2(sample)
100000 loops, best of 3: 17.2 µs per loop

Notice that the any implementation is orders of magnitude faster than the for-loop implementation: from milliseconds down to microseconds. Yes, I admit it: the gain is probably not from for v. any but from the y in x membership test. So let’s try a for loop that avoids the in test:

>>> def unique3(sample):
...     if sample:
...         first, rest = sample[0], sample[1:]
...         for y in rest:
...             if y != first:
...                 return False
...     return True

>>> %timeit unique3(sample)
100000 loops, best of 3: 16 µs per loop

Now, this implementation is faster, but the two are very, very close to each other… Anyway, the clarity of any beats, IMO, the 1.2 µs the for loop saves.

“Software made for Belgians”

The title of this post is a (rather innocent) joke-like phrase that our team has coined. The phrase has its origins in a couple of events:

First, we’ve heard that Belgium has declared Internet broadband connections a human right. Probably not exactly true. However, according to this, several countries have promoted similar laws or statements.

Second, we are now mainly using OpenERP (mostly made in Belgium) and we suffer from a 128 kbits per second Internet connection… Yes, you have read correctly and I made no mistake: 128 kilobits per second… And sometimes we need to access our OpenERP server over that connection. The fact is that OpenERP has a big fat upfront-loaded JS and makes lots of Ajax requests that, summed up, amount to more than 1400 KB.

If you’re quick on math, you have guessed by now that it takes ages, eons and wasted human lives to load this application on a clean cache: 1400 KB is over 11000 kilobits, which at 128 kbit/s is almost a minute and a half of pure transfer, before counting latency.

So, “Software made for Belgians” is any software that is built with the assumption (either consciously or not) that Internet access is available and is reasonably fast.

There are many pieces of software that have this property. For instance, npm opens many connections to download dependencies, and this fails a lot under unreliable/slow connections. I’ve had npm sessions where I needed to look for required packages and then do the whole job of tracing requirements myself in order to install a single package.

You could think I’m against this kind of software. You’d be wrong. I’m against slow, expensive [1] connections.

Software is built for a given set of requirements and following standard guidelines and assumptions. These days, a fast Internet connection is practically a must. When you are a freelance developer and you charge by the hour you should not need to waste 10 minutes waiting for www.google.com to load.

Notes

[1] Our country’s (sole) ISP has announced that it will (drastically) reduce Internet connection prices. Our ADSL 128 kbps connection, which currently costs more than $ 900 (again, no mistake; the invoice for a shitty connection is that big), will cost about $ 110. Seriously…! Of course, that’s kind of a relief, and we’ll switch to a better connection (still less than 1 Mbps) for the same amount we’re paying right now… But that’s just insane.

Ah, these prices are in CUC (Cuban Convertible Peso), but you may think of them as US dollars.

Also these are “enterprise” prices anyway… There are no prices for “natural persons” beyond $ 4.50 per hour in a public room…

Composing deferreds — Building UI patterns

Note: This post was mainly written before New year, new projects. Nevertheless, I keep the original wording, and thus some references are made as if the “News…” post had not been published (nor even thought of).

In my last post I talked about “Backboning” my current web application project. We have already deployed our first version (the one without Backbone). We have discovered errors, we have fixed them, and we have learned how our current AJAX calls behave under high latency (our server is far, far away and our office has very low bandwidth).

Our clients have hardly noticed the latency issues, because most of them consume our web app over a better connection; but we must be prepared. Especially because one of the future goals is to be accessible from mobile devices.

We have advanced some in our refactoring; roughly one feature has been completely refactored, and several other features are partially done.

At the same time we have created a branch for introducing some patterns in our application. That branch should serve well for both our current state and for our refactored version. This post is about those patterns and how we are approaching them.

High latency and feedback patterns

Some other programmers have promoted the idea that Fast Web Apps should not wait for server calls to finish before updating the client’s state, letting the user go ahead and do more stuff. The benefits of such a technique are out there, visible and true, so I will not cover them.

Under high latency, network operations take longer, right? If we disregard this truth, our apps will feel broken in such cases. Either we’ll lie about the operation being completed (when actually it might just fail), or the “refresh” will take too long and the user might click the button several times before quitting.

If you go down the first route you will need to find a good way to say sorry when the AJAX call fails. You can’t simply revert to the previous state because that will confuse your users.

The second road is easier to fix, by providing instant feedback about an in-progress operation. Though this might create the illusion of slowness for your users, and this is what Asynchronous UIs attempt to overcome.

Since most of the AJAX calls in our project are not asynchronous, i.e. they produce their visual effects after the call either succeeds or fails, it’s easier for us to take an approach similar to the latter. Backbone also triggers the sync event after the AJAX call is done. So, the second road it is…

But actually, since most of our clients have high-speed connections, they should never see the in-progress feedback. Our UI pattern goes like this:

  1. Avoid double click (probably by using _.debounce, more on that later).
  2. Proceed with the AJAX call, but also start a timer for ~300ms.
  3. If the timer ticks before the AJAX call finishes, update the UI with an in-progress feedback.
  4. In any case, when the AJAX call finishes, update the UI (and remove the in-progress feedback if needed).

Update about debouncing – 2014-01-26

After our site went public we found out that, under high latency, debouncing alone is not enough. It only avoids a rapid double-click, not several spaced clicks. So you need to do a bit of asynchronous UI changes: you need to change your button, or whatever the user clicks on.

Writing deferreds

The pattern above may be translated into a couple of deferreds plus the AJAX one. jQuery.ajax returns a promise object, which is the outer side of a deferred; that is, promises are consumers of deferreds (those who want to react upon completion/rejection of the task the deferred represents), while the deferred object itself has methods for signalling those events.

Our “slow tasks feedback” pattern might be expressed like so:

var request = make_ajax_call();
var timer = elapsed(300);
$.when(whichever(timer, request)).done(function() {
   // state() is "pending" while the AJAX call has not finished yet.
   if (request.state() === "pending") {
      show_ui_feedback();
      request.done(function() {
         remove_ui_feedback();
      });
   }
});

In this code there are, however, two deferreds that are not provided by jQuery.

The elapsed deferred is quite simple:

// A deferred that resolves after a given `time` in milliseconds.
var elapsed = function(time) {
    var res = $.Deferred();
    var id = window.setTimeout(function(){
        clearTimeout(id);
        res.resolve();
    }, time);
    return res.promise();
}

The only thing I’d like to point out is that you should not return the deferred object, but its promise object. If you return the deferred, anyone might call its resolve or reject methods and break your contract.

The $.when method is provided by jQuery; it may take several promises and returns a master deferred that will be resolved when all its delegated promises are resolved. But what we need is a master deferred that will be resolved when any of its arguments is. That’s why I created the whichever function:

var whichever = function() {
    var res = $.Deferred();
    var defs = Array.prototype.slice.apply(arguments);
    defs.forEach(function (fn) {
      fn.done(function () { res.resolve(fn); });
      fn.fail(function () { res.reject(fn); });
    });
    return res.promise();
}

Applying these patterns to current “blocking” applications is quite simple. In our experience, most of our users will never see the “loading…” feedback, and those using slow connections (they are always conscious of that fact) won’t get desperate waiting for something to happen on their screens.

What about Spine or Backbone?

Note: This post was mainly written before New year, new projects. Nevertheless, I keep the original wording, and thus some references are made as if the “News…” post had not been published (nor even thought of).

After my last post I didn’t rest idle: I went to download and read some of the noted libraries/frameworks. I wanted to learn as much as possible, and was even considering introducing some of them in my current project.

So, this post is about a work in progress: me studying some JS libraries/frameworks, and also me with a couple of priorities for my current project that help me evaluate them. So let’s start with my project, in order to provide some context for my evaluation.
