In my January post, I focused on implementing a singleton correctly. This time I want to add performance into the mix and show you the best way to implement your singleton... or give you guidance to pick your best way.
Setting the scene
I'm using a display manager as an example, like GDM, LightDM, or others in the Linux world. Here is the motivating implementation for today:
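A minimal sketch of such a display manager, assuming the enumerator names and the exact member layout (both are illustrative). The labels A to E mark the parts discussed in this post:

```cpp
// A: the available resolutions (enumerator names are assumed)
enum class Resolution {
  r640x480,
  r800x600,
  // ... more resolutions
};

// B: the singleton itself
class DisplayManager {
  Resolution mResolution{};

  // C: copy and move operations are private (and deleted as well in this sketch)
  DisplayManager(const DisplayManager&)            = delete;
  DisplayManager(DisplayManager&&)                 = delete;
  DisplayManager& operator=(const DisplayManager&) = delete;
  DisplayManager& operator=(DisplayManager&&)      = delete;

  // D: private default constructor, defaulted here
  DisplayManager() = default;

public:
  static DisplayManager& Instance()
  {
    static DisplayManager dspm{};  // E: block local static
    return dspm;
  }

  void       SetResolution(Resolution r) { mResolution = r; }
  Resolution GetResolution() const { return mResolution; }
};
```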
Let me quickly go through the various parts. In A, you see the data type Resolution, which lists two resolutions; you can imagine the rest. Next, in B, you find the DisplayManager implementation. Diving into it, you can see that I followed my own advice from my last post and made the copy and move operations private in C. This is all just setup for today's focus.
To complete the picture, here is how I use the object:
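A short usage sketch, assuming a condensed version of the class. The helper UseDisplayManager is hypothetical and stands in for whatever code needs the display manager:

```cpp
enum class Resolution { r640x480, r800x600 };

// Condensed singleton, just enough to show the call sites
class DisplayManager {
  Resolution mResolution{};

  DisplayManager()                                 = default;
  DisplayManager(const DisplayManager&)            = delete;
  DisplayManager& operator=(const DisplayManager&) = delete;

public:
  static DisplayManager& Instance()
  {
    static DisplayManager dspm{};
    return dspm;
  }

  void       SetResolution(Resolution r) { mResolution = r; }
  Resolution GetResolution() const { return mResolution; }
};

// Hypothetical helper standing in for the code that uses the singleton
Resolution UseDisplayManager()
{
  // Bind by reference; copying is not possible
  auto& dspm = DisplayManager::Instance();
  dspm.SetResolution(Resolution::r800x600);

  // A second call to Instance() refers to the very same object
  return DisplayManager::Instance().GetResolution();
}
```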
Let's talk performance
Going back to the DisplayManager implementation, the interesting part starts with D, the default constructor, which of course must be private in a singleton. More on that in a moment. As a last item, you see E, where I use a block local static for the variable dspm.
With D and E we have two places where different implementations influence the performance of DisplayManager objects or, more precisely, of accessing them. But you might not always have the full freedom to pick every option.
In my DisplayManager implementation, I present you with a simple case. The default constructor can be defaulted since DisplayManager only holds an object of type Resolution, an enum class, which boils down to an integer type. I don't need any code inside the constructor's body. There are cases where this doesn't apply and you need to write code for the constructor body. With that, we can distinguish two cases here:
- defaultable default constructor (user-declared constructor)
- a constructor with implementation (user-defined constructor)
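The two cases can be sketched side by side. The class names DisplayManagerDefaulted and DisplayManagerWithBody, as well as the run counter, are made up for this illustration; the counter only exists to give the constructor body something to do:

```cpp
enum class Resolution { r640x480, r800x600 };

// Case 1: user-declared constructor, defaulted on its first declaration
class DisplayManagerDefaulted {
  Resolution mResolution{};

  DisplayManagerDefaulted()                                          = default;
  DisplayManagerDefaulted(const DisplayManagerDefaulted&)            = delete;
  DisplayManagerDefaulted& operator=(const DisplayManagerDefaulted&) = delete;

public:
  static DisplayManagerDefaulted& Instance()
  {
    static DisplayManagerDefaulted dspm{};  // no code has to run to create this object
    return dspm;
  }

  Resolution GetResolution() const { return mResolution; }
};

// Case 2: user-defined constructor with code in its body
class DisplayManagerWithBody {
  Resolution        mResolution;
  static inline int sConstructorRuns{};  // illustrative counter (C++17 inline variable)

  DisplayManagerWithBody()
  : mResolution{Resolution::r800x600}
  {
    ++sConstructorRuns;  // stands in for real setup work
  }
  DisplayManagerWithBody(const DisplayManagerWithBody&)            = delete;
  DisplayManagerWithBody& operator=(const DisplayManagerWithBody&) = delete;

public:
  static DisplayManagerWithBody& Instance()
  {
    static DisplayManagerWithBody dspm{};  // constructor body runs once, on first use
    return dspm;
  }

  Resolution GetResolution() const { return mResolution; }
  static int ConstructorRuns() { return sConstructorRuns; }
};
```

In the second case the constructor body must run exactly once, on the first call to Instance, which is what forces the compiler to track whether the object has already been initialized.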
If you look at the generated assembly for DisplayManager with a user-declared constructor, you'll see this:
For now, let's say that's good.
Once you look at the generated code for an implementation with a user-defined constructor you'll get this:
Now you can see why I called the user-declared version good. Once the compiler has to run a user-defined default constructor, it must insert a guard variable and check its state each time you access Instance, which adds up to a good amount of code. Please note that at this point you're looking at code generated by GCC 15 at -O3, and I did not even call SetResolution or GetResolution.
Another thing to consider is that __cxa_guard_acquire and __cxa_guard_release are calls into the C++ runtime that make the initialization thread-safe, and as such they introduce slight delays to your program.
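To get a feeling for what the guard does, here is a hand-written approximation using an atomic pointer and a mutex. This is only a sketch of the idea, not what GCC emits: the names sInstance and sMutex are made up, and the object is heap-allocated and never destroyed to keep the example short.

```cpp
#include <atomic>
#include <mutex>

enum class Resolution { r640x480, r800x600 };

class DisplayManager {
  Resolution mResolution;

  // User-defined constructor: must run exactly once, at first use
  DisplayManager()
  : mResolution{Resolution::r640x480}
  {}
  DisplayManager(const DisplayManager&)            = delete;
  DisplayManager& operator=(const DisplayManager&) = delete;

  // Plays the role of the guard variable in this sketch
  static inline std::atomic<DisplayManager*> sInstance{nullptr};
  static inline std::mutex                   sMutex;

public:
  static DisplayManager& Instance()
  {
    // Fast path: the guard is checked on every single call
    DisplayManager* p = sInstance.load(std::memory_order_acquire);

    if(nullptr == p) {
      // Slow path: only one thread may initialize (compare __cxa_guard_acquire)
      std::lock_guard<std::mutex> lock{sMutex};

      p = sInstance.load(std::memory_order_relaxed);
      if(nullptr == p) {
        p = new DisplayManager{};  // intentionally never deleted in this sketch
        // Publish the fully constructed object (compare __cxa_guard_release)
        sInstance.store(p, std::memory_order_release);
      }
    }

    return *p;
  }

  Resolution GetResolution() const { return mResolution; }
};
```

Even when the object already exists, every call pays for the load and the branch at the top of Instance.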
Here is a Compiler Explorer link that shows the two options.
All right, what else can we do? Right, you can use a different approach in E.
Using a static data member
Instead of implementing the singleton pattern using a block local static variable, you can go for a private static data member. Time to see how this implementation behaves. Here is my implementation where I kept the labels stable:
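A minimal sketch of this variant, assuming the member name mDspm and the same illustrative Resolution enumerators as before. The labels E1, E2, and E3 mark the parts that differ from the block local version:

```cpp
enum class Resolution { r640x480, r800x600 };

class DisplayManager {
  Resolution mResolution{};

  // C: copy and move operations are private (and deleted as well in this sketch)
  DisplayManager(const DisplayManager&)            = delete;
  DisplayManager(DisplayManager&&)                 = delete;
  DisplayManager& operator=(const DisplayManager&) = delete;
  DisplayManager& operator=(DisplayManager&&)      = delete;

  // D: private default constructor; replace with a constructor body for the
  // user-defined case
  DisplayManager() = default;

  static DisplayManager mDspm;  // E1: declaration of the private static data member

public:
  static DisplayManager& Instance()
  {
    return mDspm;  // E2: no block local static, hence nothing to guard here
  }

  void       SetResolution(Resolution r) { mResolution = r; }
  Resolution GetResolution() const { return mResolution; }
};

// E3: the definition; in a real project it belongs in exactly one implementation file
DisplayManager DisplayManager::mDspm{};
```

One thing to keep in mind with a constructor that has a body: the object is then initialized during program startup, and the order of such initializations across different implementation files is not specified, so other global objects should not rely on Instance during their own construction.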
You can see that the changes are only in E1, E2, and E3. The latter one is required just for completeness. The interesting change is in E2 where I no longer use a block local static but the static data member from E1. You still have the two options: user-declared and user-defined constructor.
For a user-declared constructor, my code results in:
This is exactly the same code as for the previous implementation. Things get interesting when you start looking at the user-defined constructor case:
That code looks much better than the one before. No locks are required this time, which leads not only to less assembly code but also to faster code.
You'll find the two versions here on Compiler Explorer.
Summary
If you want to have good performance for your singleton implementation and you need to provide a constructor, you should go for the static data member implementation. In case you can default the default constructor, the two implementation strategies are equivalent performance-wise. I would suggest using the block local approach, as it saves you from having to define and initialize the singleton object in an implementation file.
Andreas