Moving away from ORMs

Introduction

I believe that the chance of a backend nowadays not having to persist data somewhere to be very slim. The monologue below is about applications that need to store data.

In order to persist data to a database, an application in X programming language requires a way to connect and send queries to the database. The interface for the database interaction is almost always provided by some package in X's standard library (most likely). Java has jdbc, Golang has sql, Python has pyodbc (which is not in its standard library) but the idea is the same.

The problem with the packages mentioned above is that for real life usage, they can become troublesome to work with, unless you consider string concatenation and interpolation a good way to build your queries. This is what makes us to look for alternatives and one of them are Object Relation Mappers which i began to avoid for the reasons i will mention in this post.

This post is mostly me listing things that i don't like about ORMs and how a more native SQL approach is what i will prefer any day of the week. Also, this is focused for the Java world mostly.

ORMs

ORMs are very popular these days, especially if your backend is built with a framework (which it should) where they are bundled with. The chances of using ORMs with frameworks like Spring Boot, Ruby on Rails, Laravel, Django for example are very high. Some more minimal frameworks do exist but even there you can see people using ORMs. The idea sounds nice, you have your tables represented as a class in your application layer and since we are using object oriented languages, that sounds very convenient.

Spring Boot is usually set up with Spring Data JPA which uses Hibernate, Laravel has Eloquent, Golang has Gorm and so on.

Most people will use whatever the framework provides (and this is normal) and start coding. Why would they search for alternatives if this is what the framework maintainers and community decided to ship already? Probably there are valid reasons.

There is a group of people though, that does not like the idea of ORMs, for a couple of reasons. Every time i was watching people discussing this debate (ORMs vs alternatives), there was always the following comment:

"ORMs are good for trivial queries but very bad when you need something more than a simple query."

And this unfortunately is the truth. Do you want to fetch the user using his id? Cool, ORMs can do that. Do you want to update a row? They can do that too. Do you want to execute a complex query? They can also do that. But this is where the problems start. The moment that you reach for an inner join with more than 2 tables, the problems start appearing.

Hibernate

The backend services at my workplace are mostly Kotlin services, which means that we are also using Hibernate. Our problems did not even start from complex queries, actually the query was very simple. We needed to fetch from a service a list of ids. That's it. We needed to fetch some ids of rows that a specific column was equal to a specific value. We realised that we had a problem, when the matching rows were 563.

We started by using the common "findByXIn" method in the repository interface, which returns a list of entities. That returned 563 objects which we had to keep in memory and then map over them to get their ids, something not really efficient. We also didn't need the entities themselves, we just needed their ids, so that's why we moved to use a @Query annotation and writing some HQL which essentially tried to select one field of the class, since HQL is working in OOP manner. But even that was not enough, 563 items needed 500ms. I could not accept that this was normal for a database. Companies have put all their business logic in the database layer but we needed 500ms to fetch 600 ids? Something was wrong.

After spending some time with the unreadable logs that hibernate was producing, i realised that the issue was a 1-1 relationship that our entity had with an other entity, which was eagerly loaded. My colleague and i tried different solutions to get rid of loading the related entity, but nothing worked. Also making it a lazy relationship was not an option.

hibernate_svc

This is a screenshot we took from our Datadog metrics. Each blue line is a query to the database. (Since quality is not that good, total time is 510ms)

Time was pressing and we had to move on to finish other things too, so we decided to create a custom repository where we used directly the entity manager. People familiar with Spring Boot and how it discovers repositories may empathise when they hear that we had to create a custom class with an absurd name that imlpements a custom repository, where we have to use the entity manager so we can fetch a bunch of ids.

Fetching now had more reasonable timings and we moved on. But i wasn't satisfied. It felt weird having to do these things to accomplish something that simple. I wanted to improve this. That's why i made a research for alternatives, and ended up with having to choose between Exposed, Ktorm, JOOQ and SQLDelight. The first two are ORMs so probably we would face the same issues - if not more - with Hibernate with less content available on internet for our issues, so i just skipped them. I had already seen good things about JOOQ before that, so when comparing it with SQLDelight i decided that it would be better to go with JOOQ since it was longer around, had a lot of features (many of them that we won't even need) and it had very good documentation. So i chose JOOQ as my alternative to Hibernate and that's what i proposed to my team.

JOOQ

JOOQ is a library created by Lukas Eder that allows you to write type safe SQL as if you were writing Java. We are using Kotlin, but this is not a problem. JOOQ interopts with Kotlin without any issues, and there are cases that you can express things in a more elegant way because of Kotlin's expressivness. The way it works essentially is that it uses your connection credentials to the database, and based on your schema it can generate the classes that represent your tables and their columns. After doing that, you just need to use it's very powerful DSL to write queries and execute them. That's it. You tell it what to do through code, and it converts it to SQL. It does not matter which database you are using, since JOOQ abstracts that part. Actually, it matters because for specific databases (e.g SQL Server, Oracle Express) you need to use the paid edition depending on your team size.

In the beginning my proposal was not very welcomed. We had overcome the issue with the 563 ids and it was working, why change it? I had hard time convincing them that JOOQ would solve a lot of issues for us but probably my arguments were not enough. We would have to learn something different than what we used in all our services and i could not convince them that it would be worth it. After being annoying for some time, they agreed that we can introduce it to a new service that we had to build. That way we could evaluate if it works for us and if it's indeed a better solution than hibernate. Convincing after that didn't take long, they just saw the truth. JOOQ was straight forward, was doing it's job very well, it was doing what we were telling it to do and it was pretty fast too. Auto generated records and POJOS directly from our database were also a nice feature.

This was the beginning for us to start replacing gradually Hibernate with JOOQ in our services, which we mostly have finished. I was excited about it, but we should also deliver features so that's why we decided to do the replacement in steps and it paid off. This is the result after rewriting the part that i mentioned above:

hibernate_svc

(Since quality is not that good, total time is 83.7ms)

JOOQ pitfalls

There is no silver bullet and of course JOOQ is not a panacea to our problems. It comes with it's quirks but personally i don't find them big quirks.

Build configuration: One very powerful feature of JOOQ is it's ability to generate classes, but this comes with a cost. Firstly you need to change your build flow so you generate what is required at build time. Fortunately There is a very good documentation about it. This of course, depending on your schemas size adds time to your build time, but can reduced by not generating classes if your schema has not changed.

Kotlin Dataclasses: Since we are using Kotlin, we would like to use it's null safety. Autogenerated classes from JOOQ if we use the Kotlin generator, currently produce dataclasses where all the fields are nullable. This is not very useful for us since we will have to use "?." in every field access. Worth mentioning is that this is not entirely JOOQ's issue, since NULL has a different meaning in programming languages and SQL so there is no a straight forward way to fix this. For now we are using Java Records instead of Dataclasses for POJOs.

Learning curve: It's a fact that everything new has it's own learning curve and this of course applies to JOOQ. You have to learn new things but this is not the core issue. The core issue in this specific case in my opinion is that you have to make a shift in the mentality about how you approach problems that interact with the database. ORMs made us think less about our SQL and let them do the job even if there are many cases where they do a very bad job. With a tool like JOOQ you are supposed to model your queries effectively since you are the one who is going to do the job, and JOOQ is just going to translate your code to SQL.

Conclusion

This is the end of this post. If you have reached this spot, first of all thanks. I just want to mention that this is just my personal opinion about ORMs. Obviously ORMs are very powerful and people have valid reasons to use them.

It's just that i feel that the reasons that i would have to use them have been eliminated. Having a tool that is so powerful and expressive like JOOQ, where it just does what you tell it to do without having to bypass it in order to achieve my goals is pretty much the main reason that i prefer it over ORMs, plus it's extensive documentation. Obviously, this is focused about Java world and you may not be able to find a similar library in other ecosystems.

Lastly, if your questions can not be answered using the documentation, Lukas will have probably either answered a similar question somewhere or you can ask a question on Stackoverflow and he will pop up there giving you a solution.