Tuesday, September 17, 2013

The Database Revisited

I've been working on Weapon M again. It feels great to work on my own code, even if it's only for a few hours a week. The difference between programming as a hobby and programming as a job is like the difference between painting a canvas and painting a house.

I've made two decisions regarding the database. First, as much as I'd like an excuse to learn how to use a graph database, Weapon M isn't it. I can't see any benefit to it when the entire graph of POJOs is already accessible to scripts, and serialization works just fine for persistence. And second, I've decided to simplify the synchronization to a single lock shared by the entire database.

Crazy, you say? A huge performance bottleneck? That's what I initially assumed. But the simplicity of a single lock was so alluring that I had to write a simulation to test my assumption. It consisted of two small programs, each of which starts and then joins with two threads. One thread writes random values to the fields of mutable objects in an array, while the other reads them. In one program, the threads contend for a single lock. In the other, they synchronize on the individual object they are reading or writing.

Each program performed one million reads and one million writes on one hundred objects. The program in which the threads were contending for a single lock consistently runs in 140-170 ms; the other runs in 90-100 ms. That's a pretty significant difference, percentage-wise. But remember, we're talking about two million field accesses in a fraction of a second. Things do not happen that fast in Trade Wars! And this is on a G555 Celeron, a fairly low-end processor by today's standards. When the programs are modified to perform only ten thousand reads and writes, the performance difference is completely lost in the overhead of setting up the threads. And increasing the number of objects being manipulated to decrease the lock contention and give the program using individual locks more of an advantage does not significantly affect the results.

My conclusion is that there would be no significant performance penalty to using a single lock. The benefit would be significantly simpler code, eliminating the possibility of any lock-order bugs that could lead to deadlock.

No comments:

Post a Comment