Over the last week or so we’ve been focused on bug fixes and the next round of performance improvements – with the assumption that we’ll be bypassing Hibernate for the persistent objects we use most frequently.
We started by taking one of our high-volume power users and profiling the system with their data. Right away, the profile homed in on two major bottlenecks. The first concerned how the relationship between Serialized Containers and their Non Serialized Contents is stored (using an association class called SerialNumberContent). The other concerned traversing the object model to resolve things like status and date inheritance.
We started by focusing on the contents issue, since it was chewing up the most CPU cycles – and was the more puzzling of the two, because there’s no obvious reason why this special relationship would cause performance problems.
We knew from profiling the system that the bottleneck for the contents issue was somewhere inside Hibernate. So we decided to use this simple association class as a test case for migrating objects out of Hibernate to a lower-level JDBC persistence layer (we’re using Spring’s JdbcTemplate).
The first step was to snip all the connections between SerialNumberContent and the rest of the system. This meant removing one-to-many references from SerialNumber and InventoryItem and removing object references from the association class, replacing them with plain id fields.
Then we moved the code for fetching relationships out of the domain objects and into the service layer, where the new service methods invoke the underlying DAO.
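A minimal sketch of what that isolation might look like, using only the JDK. The field names (containerId, itemId), the service class, and the in-memory DAO are assumptions made for illustration, not our actual code; the only names taken from the real system are SerialNumberContent and findByContainerId().

```java
import java.util.ArrayList;
import java.util.List;

// Association row that holds plain ids instead of object references to
// SerialNumber and InventoryItem. Field names are illustrative assumptions.
final class SerialNumberContent {
    final long id;
    final long containerId; // id of the serialized container
    final long itemId;      // id of the non-serialized inventory item

    SerialNumberContent(long id, long containerId, long itemId) {
        this.id = id;
        this.containerId = containerId;
        this.itemId = itemId;
    }
}

// Hypothetical DAO contract; a real implementation would run SQL.
interface SerialNumberContentDao {
    List<SerialNumberContent> findByContainerId(long containerId);
}

// The relationship lookup now lives in the service layer and delegates
// to the DAO, rather than being a one-to-many collection on the domain object.
final class SerialNumberContentService {
    private final SerialNumberContentDao dao;

    SerialNumberContentService(SerialNumberContentDao dao) {
        this.dao = dao;
    }

    List<SerialNumberContent> contentsOf(long containerId) {
        return dao.findByContainerId(containerId);
    }

    List<Long> itemIdsIn(long containerId) {
        List<Long> ids = new ArrayList<>();
        for (SerialNumberContent row : dao.findByContainerId(containerId)) {
            ids.add(row.itemId);
        }
        return ids;
    }
}

// In-memory stand-in for the database so the sketch runs on its own.
final class InMemorySerialNumberContentDao implements SerialNumberContentDao {
    private final List<SerialNumberContent> rows = new ArrayList<>();

    void add(SerialNumberContent row) {
        rows.add(row);
    }

    @Override
    public List<SerialNumberContent> findByContainerId(long containerId) {
        List<SerialNumberContent> result = new ArrayList<>();
        for (SerialNumberContent row : rows) {
            if (row.containerId == containerId) {
                result.add(row);
            }
        }
        return result;
    }
}
```

Because the association class no longer points at other persistent objects, the DAO behind the service can be swapped without touching SerialNumber or InventoryItem.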
Once we had successfully isolated the association class – but before removing it from Hibernate – we fixed all the ripple effects and retested. We noticed a roughly 35% speed boost just from isolating the object. This was good, of course, but important only from an academic perspective, because we needed a 1,000% or 10,000% speed boost.
What we learned was that Hibernate does not cache the result when a query or cached value is null. If a query returns no results, that “null” result is never cached, so the same query goes back to the database every time. In our use case, this meant that the fewer relationships there are between serialized containers and non-serialized items, the slower the system runs. It seems counterintuitive that NOT using a feature would cause that feature to introduce performance bottlenecks, but that’s exactly what happened.
Next, we developed a new base Data Access Object built on Spring’s JdbcTemplate and created a new DAO for SerialNumberContent. We swapped it in, debugged it, and retested. Lo and behold, we got a speed boost of well over 1,000% on that portion of the code. The serial number content portion of the availability calculation dropped way down the list of CPU consumers in the profiler: something that had previously taken 220 seconds to run now took 1.5 seconds, roughly 150 times faster.
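To illustrate the general idea of caching “nothing found”, here is a small JDK-only sketch of a lookup that remembers empty results as well as populated ones. It is not Hibernate code and says nothing about Hibernate’s internals; the class name, the loader function, and the call counter are all hypothetical.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.LongFunction;

// Remembers the result of every lookup, including "no contents",
// so that a repeated miss does not trigger another database query.
final class NegativeCachingLookup {
    private final Map<Long, List<Long>> cache = new HashMap<>();
    private final LongFunction<List<Long>> loader; // stands in for the real query
    private int loaderCalls = 0;

    NegativeCachingLookup(LongFunction<List<Long>> loader) {
        this.loader = loader;
    }

    List<Long> itemIdsFor(long containerId) {
        List<Long> cached = cache.get(containerId);
        if (cached != null) {
            return cached; // hit, which includes a cached empty list
        }
        loaderCalls++;
        List<Long> loaded = loader.apply(containerId);
        // Store an empty list in place of null so the miss itself is cached.
        List<Long> value = (loaded == null) ? Collections.<Long>emptyList() : loaded;
        cache.put(containerId, value);
        return value;
    }

    int loaderCalls() {
        return loaderCalls;
    }
}
```

With a loader that always returns null (a container with no contents), calling itemIdsFor() repeatedly for the same id invokes the loader only once; without the empty-list placeholder, every call would go back to the loader.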
This was a great result, but we knew it was still inefficient because we were running a live database query every time we looked up a relationship. There was no caching involved at all. Depending on the situation, this isn’t a bad thing, but we knew in this case that we were running the same queries over and over again. The information that this association class represents doesn’t change all that often – it needed to be cached.
We added some basic cache support to our base DAO class (using Ehcache) so that our basic save(), delete() and findById() operations all use the cache. Then, in the new concrete DAO, we modified the most frequently called methods (findByContainer() and findByContainerId()) to use a custom cache.
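The shape of that base class can be sketched roughly as follows. To keep the example self-contained it uses a ConcurrentHashMap as a stand-in for Ehcache, and an in-memory map as a stand-in for the database; the Identifiable interface, the abstract method names, and the demo classes are assumptions for illustration, not our real code.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical minimal contract for anything the base DAO can cache.
interface Identifiable {
    long getId();
}

// Base DAO whose save(), delete() and findById() all go through a cache.
// A ConcurrentHashMap stands in for the real cache library here.
abstract class CachingBaseDao<T extends Identifiable> {
    private final Map<Long, T> cache = new ConcurrentHashMap<>();

    // Subclasses supply the actual database access.
    protected abstract T loadById(long id);
    protected abstract void store(T entity);
    protected abstract void remove(long id);

    public T findById(long id) {
        T cached = cache.get(id);
        if (cached != null) {
            return cached;            // served from cache, no query
        }
        T loaded = loadById(id);      // cache miss: hit the database
        if (loaded != null) {
            cache.put(id, loaded);
        }
        return loaded;
    }

    public void save(T entity) {
        store(entity);
        cache.put(entity.getId(), entity); // keep cache in step with the database
    }

    public void delete(long id) {
        remove(id);
        cache.remove(id);                  // evict so stale rows are never served
    }
}

// Demo entity and in-memory DAO so the sketch can be exercised directly.
final class CachedContent implements Identifiable {
    private final long id;
    CachedContent(long id) { this.id = id; }
    @Override public long getId() { return id; }
}

final class InMemoryCachedContentDao extends CachingBaseDao<CachedContent> {
    final Map<Long, CachedContent> table = new HashMap<>();
    int loads = 0; // counts simulated database reads

    @Override protected CachedContent loadById(long id) {
        loads++;
        return table.get(id);
    }
    @Override protected void store(CachedContent entity) { table.put(entity.getId(), entity); }
    @Override protected void remove(long id) { table.remove(id); }
}
```

In this sketch, a row that already exists in the table costs one simulated read on the first findById() and none afterwards, and delete() evicts the entry so a later lookup goes back to the database.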
Once this was debugged and plugged into the profiler, that 1.5 seconds dropped to 60 milliseconds, which makes this portion of the code more than 3,000 times faster than it was before we started (220 seconds down to 60 milliseconds).
This is good news because it proves that a “theory” about how to get big performance gains works in practice. The next step is to apply this same technique to other performance bottlenecks in Flex – of which there are still many, but one fewer than there used to be.