A few days ago, I discovered Pony ORM: a project that shares our vision of writing queries in Python with a natural (pythonic) syntax.
Pony is much older than xotl.ql, and is deemed much more stable. We have drafted a comparison page in our documentation.
This post is just a summary (expansion) of that comparison page, and of what’s currently on my mind about the future of xotl.ql.
A tour around Pony’s features
Pony ORM uses generator expressions as its way to express queries. That’s why it is a “natural elder brother” of xotl.ql. A Pony query looks like:
persons = select(p for p in Person if 'o' in p.name)
Queries are expressed using the most natural Python constructions available for this task. This is quite the same goal xotl.ql pursues. However, there are two main differences between the projects:
- Pony ORM is an ORM and xotl.ql is not. Being an ORM, Pony concerns itself with accessing relational databases, and thus its scope is a bit wider than ours.
- Pony ORM allows you to use the true Python’s logical operators like
This has to do with the way they deal with the generator object. They cleverly inspect the bytecode that Python creates for the generator object and then recreate a semantically equivalent syntax tree for the query. After that they do what we call, in xotl.ql’s terms, query translation: they translate the query to the SQL dialect that matches your database.
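To give an idea of the raw material such an inspector works with, here is a minimal sketch using only the standard `dis` module. It is not Pony’s decompiler; the `Person` class and the `people` list are hypothetical stand-ins, and `dis.get_instructions` assumes Python 3.4 or later.

```python
import dis


class Person(object):
    """Hypothetical stand-in for an entity class; not part of Pony or xotl.ql."""
    def __init__(self, name):
        self.name = name


people = [Person("John"), Person("Mary")]

# Building the generator does not run it; its compiled code is already there.
query = (p for p in people if 'o' in p.name)
code = query.gi_code

# Names, constants and locals referenced by the generator expression.
print(code.co_names)     # attribute names such as 'name'
print(code.co_consts)    # constants such as 'o'
print(code.co_varnames)  # locals, including the loop variable 'p'

# The instruction stream a decompiler would walk to rebuild a syntax tree.
opnames = [ins.opname for ins in dis.get_instructions(code)]
print(opnames)
```

The exact opcode names differ between Python versions, which is why an inspector has to be written with each supported version in mind.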
Brief update about xotl.ql status
Since I haven’t blogged about xotl.ql for a while, I think an update is in order.
Since my last post, which announced the release of xotl.ql 0.1.7, we have made three more releases. The most important is the 0.2.0 release, which introduced an implementation of a translator that was further improved in release 0.2.1.
This translator allows you to find any live object in the Python process’s memory (see the current translation test suite on GitHub).
Further improvements include a protocol for sub-queries, as in:
query = these(parent for parent in Person if parent.children if sum_(child.age for child in parent.children) > 60)
Here the argument to the sum_ function is a kind of sub-query. The protocol allows operators to resolve their arguments at query construction time; it is explained in our documentation.
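To make the intended meaning of that query concrete, here is the same filter written in plain Python over in-memory objects. This is only a sketch of the semantics, not of xotl.ql’s sub-query protocol; the `Person` class and the sample data are hypothetical, and the built-in `sum` stands in for `sum_`.

```python
class Person(object):
    """Hypothetical in-memory model used only for this illustration."""
    def __init__(self, name, age, children=()):
        self.name = name
        self.age = age
        self.children = list(children)


rosa = Person("Rosa", 62, [Person("Ana", 35), Person("Luis", 30)])
pedro = Person("Pedro", 40, [Person("Eva", 10)])
marta = Person("Marta", 28)  # no children
people = [rosa, pedro, marta]

# Parents that have children whose ages add up to more than 60.
# The inner generator plays the role of the sub-query.
matches = [
    parent.name
    for parent in people
    if parent.children
    if sum(child.age for child in parent.children) > 60
]
print(matches)
```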
How Pony might influence our future work
Perhaps the feature we would most love to have is writing our queries with Python’s true operators, and not having
not, etc. It’s too early for me to predict exactly how this will change, but I’ll make my best guess here:
- We like our current division between a language component (which is the sole purpose of xotl.ql) and the translator components that “compile” a query to a target object model and storage.
So, xotl.ql will remain focused on just the language part, and won’t include any production-quality translator.
Other packages will include translators. We’re even thinking of a translator to query the OpenERP database.
- We’d like to keep supporting Python 2.7 and Python 3.2+, so if we implement a bytecode inspector we’ll have to do it for both versions. (I’m already studying both bytecodes to make a better-informed decision. So far the differences seem to be rather few and manageable.)
- We’d like to keep our licensing scheme. I’m not quite sure about where Pony’s license stands. It’s distributed under the GNU Affero GPL version 3, but they also say that it is “free for non-commercial use”. Since every GNU GPL allows for commercial use of the software, this seems to be an amendment. But I’m not totally sure (see the several tweets I have exchanged with Pony’s authors).
- Update. After several tweets with Pony’s team, they told me I could use their code. So I will probably use the decompiling module as a starting point.
There will probably be differences, both to keep xotl.ql database-agnostic and to port the code to Python 3.2, but this would help me a lot. Thanks to the Pony ORM team.
So, after all, getting to know Pony was a lucky event: it has shown us that translation is feasible, that we’re not the only ones pursuing a better query language for Python, and that reconstructing expressions from Python’s bytecode is not as bad as we thought.
In the following weeks, after other pressing deadlines are met, I’ll be doing the following:
- Create a branch for experimenting with bytecode inspection in both Python 2.7 and 3.2.
I’m already studying Python 2.7’s bytecode definitions. I’ve also recalled that the PEAK project includes a BytecodeAssembler that might be a good source for peeking (since they assemble instead of disassemble).
- Rewrite some tests with the expected new query language.
- Keep coding and writing the docs.
Syntactical equivalence might not be possible this way, since Python uses the same bytecode for different syntactical constructions.
For example, the following generators, which are semantically (but not syntactically) equivalent, generate the same bytecode:
this = iter([])
g1 = (parent for parent in this if parent.age > 1 if parent.children)
g2 = (parent for parent in this if parent.age > 1 and parent.children)
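This can be checked on whatever interpreter you run. The following is a minimal sketch using only the standard `dis` module; it assumes Python 3.4 or later for `dis.get_instructions`, and the helper `opnames` is a name introduced here for illustration. Since the outcome may depend on the Python version, the comparison results are printed rather than assumed.

```python
import dis

# The generators are never iterated, so an empty iterator is enough.
this = iter([])
g1 = (parent for parent in this if parent.age > 1 if parent.children)
g2 = (parent for parent in this if parent.age > 1 and parent.children)


def opnames(gen):
    """Return the sequence of opcode names in a generator's code object."""
    return [ins.opname for ins in dis.get_instructions(gen.gi_code)]


# Compare the raw bytecode and the sequence of opcode names.
same_raw = g1.gi_code.co_code == g2.gi_code.co_code
same_ops = opnames(g1) == opnames(g2)
print(same_raw, same_ops)

# Both forms refer to the same attribute names either way.
print(g1.gi_code.co_names, g2.gi_code.co_names)
```

If both comparisons print True, the two spellings cannot be told apart from the bytecode alone, so a decompiler can recover only one of them.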