I’ve been reading about and considering language design choices (for my new pet project), and one thing I really like (though I rarely get to use it in practice) is Clojure’s transducers. I couldn’t find it in the talk introducing them, but I vaguely recall someone vaguely recalling that Rich Hickey said Clojure would have much less laziness if he’d found the idea of transducers sooner.
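
For the uninitiated: a transducer captures a transformation recipe without committing to where the data comes from or where it goes. A minimal example, straight from the standard library:

```clojure
;; One recipe: keep the evens, double them.
(def xf (comp (filter even?) (map #(* 2 %))))

;; The same recipe runs against a vector...
(into [] xf (range 10))       ;=> [0 4 8 12 16]

;; ...or drives a reduction directly, with no intermediate sequences.
(transduce xf + 0 (range 10)) ;=> 40
```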

Then, in a completely different thought process (maybe there could be transducers, process transformations, for thought processes as well?), I found myself thinking about databases: the ones I’ve used so far, the things I tried to achieve with them, the difficulties and the nice parts.

I’ve used relational, SQL-driven databases. SQL is that language everyone eventually learns to some degree, but very few seem to want to write it, and even fewer know what any of the vocabulary (BTree? filesort?) actually means. They perform okay for most tasks, but then the very complex conditions, seven-table joins and subqueries come up, and things don’t look so bright anymore.

I’ve used key-value stores, which are a natural fit if you have single-item caches: think bigger, complex documents rendered into Redis instead of running 702 queries on every pageload. But then very soon you’ll need fifty of those caches “at once” (or cleared at once), leading to weird scans and funny key structures (I’m looking at you, etcd).
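
A sketch of what I mean, assuming the carmine Redis client; the key layout and helper names are made up for illustration:

```clojure
(require '[taoensso.carmine :as car])

(def conn {:pool {} :spec {:uri "redis://localhost:6379"}})

;; One cache entry per rendered document: the key encodes the "index"
;; you wish you had.
(defn cache-render! [user page html]
  (car/wcar conn (car/set (str "render:" user ":" page) html)))

;; Clearing "all of user 42's pages" means pattern-matching key names.
;; KEYS is the blunt instrument (SCAN is the production-safe spelling),
;; but either way it's strings standing in for structure.
(defn clear-user! [user]
  (let [ks (car/wcar conn (car/keys (str "render:" user ":*")))]
    (when (seq ks)
      (car/wcar conn (apply car/del ks)))))
```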

I’ll have to admit I haven’t used document stores or EAVT databases under load. I used Mongo in an effort to track what I read online (which was actually my very first bit of Clojure code online, now that I think about it), but one user opening the page once a month isn’t exactly “under load”. Similarly, I’ve played around with the Crux (sorry, xtdb) tutorial, but once again that’s nothing but an introduction and a first impression.

I noticed that using pretty much any database is a conscious effort. Learn Datalog. Learn SQL. Learn whatever arbitrary DSL the “NoSQL” ones throw at you – though many will eventually succumb and build their definitely-not-SQL something-else-QL. Is this really all necessary? I guess a thought like this brought about NoSQL in the first place. I wonder if it’s possible to build a database that you can use like any other data aggregate in your language of choice.

If you can say my_data[12] to get the item with ID 12 from a map, why can’t you just say my_database[12] to achieve the same? If you keep mapping and reducing over your React/Redux state, should there really be such a difference in mapping and reducing over a cloud-based big data source? Sure, there may be considerations about whether these should be blocking operations, but that’s more of an implementation detail.
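
Nothing in principle stops a database handle from answering get like a map does. A minimal sketch, where fetch-by-id stands in for whatever actually talks to the backend (both names are my inventions):

```clojure
;; Wrap a connection in a type that implements clojure.lang.ILookup,
;; so (get my-database 12) reads exactly like map access.
(deftype DbMap [fetch-by-id]
  clojure.lang.ILookup
  (valAt [_ k] (fetch-by-id k))
  (valAt [_ k not-found] (or (fetch-by-id k) not-found)))

;; fetch-by-id is hypothetical: any (fn [id] -> record) will do.
(def my-database (->DbMap (fn [id] {:id id :name "stub record"})))

(get my-database 12) ;=> {:id 12, :name "stub record"}
```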

My point is that, in the end, I just want to use the database like any other data structure. I don’t want to write a SELECT ... WHERE query, I want to filter the data. I don’t want to compose some arcane JSON schema to satisfy Elasticsearch, I just want to filter the data. I honestly don’t care about Rails counter_cache columns, I just want to know how many I have of some stuff.
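
The same question in three dialects; what I actually want is only the last spelling (items stands for whatever sequence the store hands back):

```clojure
;; SELECT count(*) FROM items WHERE price > 100;   -- SQL
;; {"query": {"range": {"price": {"gt": 100}}}}    -- Elasticsearch
(count (filter #(> (:price %) 100) items))         ; just data
```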

I’ll provide my transducers (filter, map and their kin) to choose which bits of the data I want, and the system should deal with the rest. Some may start mentioning certain “stored procedures” as landmines here, but I’d rather first consider how much one can achieve with even a very limited vocabulary. Consider how many of the built-in SQL functions you’ve used in the past year – I’d bet it’s a very manageable number. I’ve written some quite baroque queries, and I’d say it’s still fewer than ten.
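
Something like this hypothetical API, in other words. Here q is a toy stand-in that just streams everything and applies the transducer, but a real engine would be free to push the filters down to indexes; the transducer says what, the engine decides how:

```clojure
;; Toy engine entry point: `store` is any seqable source.
(defn q [store xf] (into [] xf store))

(def store
  [{:type :book :title "SICP" :price 15}
   {:type :book :title "TAPL" :price 45}
   {:type :mug  :title "Mug"  :price 5}])

;; Only the selection recipe is mine.
(def cheap-book-titles
  (comp (filter #(= :book (:type %)))
        (filter #(< (:price %) 20))
        (map :title)))

(q store cheap-book-titles)                   ;=> ["SICP"]
(q store (comp cheap-book-titles (take 10)))  ; pagination is just comp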

The question, once again, is reusability and composability while maintaining efficiency. What does a seven-table SQL JOIN look like in this world? What’s the equivalent of a UNION? Transducers are all about abstracting the essence of a process away from the nitty-gritty. Joins and unions are rather about what it means to combine such processes of selection in various ways (and how to do it with as little work as possible). It might take some blood magick and eldritch contracts, but I’m sure it’s doable!
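
I don’t have the blood magick yet, but here’s a naive sketch of what the combinators might look like: union really is just feeding multiple sources through one recipe, and a join can be spelled as a transducer that closes over a lookup table built from the smaller side (a hand-rolled, hypothetical hash join):

```clojure
;; Union: one recipe over the concatenation of sources, deduplicated.
(defn union-q [xf & sources]
  (into #{} xf (apply concat sources)))

;; Join: a transducer that carries an index of the smaller side.
;; Inner-join semantics: rows without a match simply produce nothing.
(defn hash-join [index-key probe-key indexed-rows]
  (let [table (group-by index-key indexed-rows)]
    (mapcat (fn [row]
              (for [match (table (probe-key row))]
                (merge match row))))))

(def users  [{:id 1 :name "Ada"} {:id 2 :name "Grace"}])
(def orders [{:user-id 1 :item "pen"} {:user-id 1 :item "ink"}])

(into [] (hash-join :id :user-id users) orders)
;=> [{:id 1, :name "Ada", :user-id 1, :item "pen"}
;    {:id 1, :name "Ada", :user-id 1, :item "ink"}]
```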