Tuesday, April 1, 2008

Static variables and singletons

I planned on putting this online Saturday night, but I got lazy. The laziness lasted longer than expected, but at long last, here we go.

This is just a paste of a Word document. If you want a copy of the doc, let me know and I'll email it to you.

Next task: Figure out how to use Blogger more effectively to post this sort of thing.



Jay’s Current Stance on Singletons and Static Variables
3/29/2008

"I hate being quoted" - Jay Allard



Overview
Singletons are really convenient. I’ve used them for many things over the years. As I’ve evolved, I’ve changed my stance on them several times. The changes range from minor tweaks to “why in god’s name did I ever think that was a good idea?”

This rant describes where I currently stand on singletons and static variables, and how I came to my current opinion.

Mixing Static and Instance State
Just a quick blurb based on some objects I’ve seen. An object should not, in most circumstances, have any static data.

I ran into a situation on a side project a while back. I had to add new functionality to an existing object. I created unit tests for the object to define the new functionality.

In each test, I instantiated the object and tested whatever I wanted to test. The results looked random: sometimes a test passed, sometimes the same test failed.

It turns out that even though I was instantiating the object, it was driven by a lot of static data. So, instantiating a new instance gave you a new instance, but everything driving the object was static. Once the first instance was created, the static data was populated, so the subsequent tests didn’t behave as expected. Instantiating the object didn’t actually do anything. The instance was just exposing static data.

If you instantiate an object, you should get a fresh new object. If that object loads a bunch of stuff and you only want to load it once, then use another pattern. Either use a singleton, so that it’s clear that the data is shared, or don’t bother giving it a constructor at all and make it a static object.

I only advise the creation of a static object, or singleton, as a better alternative to an instance that is really just a front for static data. The better solution follows.
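
Here is a minimal sketch of the problem, not the actual object from that side project. The class names (LeakyRegistry, FreshRegistry, StaticStateDemo) are made up for illustration. The first class keeps its data in a static field, so a brand-new instance still sees what an earlier instance did. The second keeps its data on the instance.

```csharp
using System.Collections.Generic;

// Hypothetical class: every instance shares the same static list.
public class LeakyRegistry
{
    private static readonly List<string> _items = new List<string>();

    public void Add(string item) { _items.Add(item); }
    public int Count { get { return _items.Count; } }
}

// Hypothetical fix: the data lives on the instance, so "new" really means new.
public class FreshRegistry
{
    private readonly List<string> _items = new List<string>();

    public void Add(string item) { _items.Add(item); }
    public int Count { get { return _items.Count; } }
}

public static class StaticStateDemo
{
    // Count seen by a brand-new instance after an earlier instance added an item.
    public static int LeakyCount()
    {
        new LeakyRegistry().Add("first test");
        return new LeakyRegistry().Count;   // 1 or more: the static state leaked
    }

    public static int FreshCount()
    {
        new FreshRegistry().Add("first test");
        return new FreshRegistry().Count;   // 0: nothing carried over
    }
}
```

Calling StaticStateDemo.LeakyCount() repeatedly keeps returning bigger numbers, which is exactly the kind of thing that makes unit tests pass or fail depending on what ran before them.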

What is a Singleton?
A singleton is an instance of an object that is shared in the app domain. Rather than recreating the same type of object multiple times, just create it once, and allow everyone to use it.

Let’s use a new cache object as an example. We’re going to ignore the fact that there are plenty of caches available, and build our own.

Cache cache = new Cache();
cache.Add("test", new object());
cache.Add("test 2", new object());

Now, you can retrieve things from the cache by key.

object hello = cache["test"];
object world = cache["test 2"];

Simple, right? Realistically, though, you only want to create the cache once, then make it globally available. If you converted the cache object to a singleton, in its simplest form you would get:

Cache.GetCache().Add("test 3", new object());
Cache.GetCache().Add("test 4", new object());

You could alternatively change GetCache() to be a property. Technically, that would work, but I disapprove. The method approach is, logically, more accurate. You’re telling the object to do something. (It can be argued either way, but I’ll always use a method.)

You may also see that we can achieve the same thing by exposing a public static variable. That’s smart. Statics are the basis of singletons, and a lot of times that may be the way you go, though I won’t.

By the time you get to the end of this document, you’ll learn that I no longer do singletons as described above. I do it with a slight design-driven variation. But the concepts are the same, so let’s work through it.

Candidates for a Singleton
No State
A singleton is basically a server object. It can’t maintain state between calls because it’s getting hit by multiple threads at once, all with their own agenda.

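Pay pay = new Pay();
pay.Hours = 3;
pay.PayPerHour = 375.00m;
decimal myMoney = pay.TotalPay;

The Pay object cannot be a singleton. Multiple threads would overwrite the properties. By the time you got to “pay.TotalPay”, the state would be completely unpredictable. (The property is called TotalPay rather than Pay because C# doesn’t allow a member to have the same name as its class.)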

If you converted Pay to be method driven, you could then make a singleton of it.

decimal myMoney = pay.CalculatePay(3, 375.00m);
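
A minimal sketch of what the method-driven Pay might look like. The body of CalculatePay is my assumption (hours times rate); the point is only that nothing is stored on the object between calls, so any number of threads can share one instance.

```csharp
// Hypothetical stateless version of Pay: the inputs arrive as arguments,
// and nothing is kept on the object between calls.
public class Pay
{
    public decimal CalculatePay(decimal hours, decimal payPerHour)
    {
        return hours * payPerHour;
    }
}
```

With that in place, new Pay().CalculatePay(3, 375.00m) gives 1125.00, and it gives the same answer no matter what any other thread is doing with the same object.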

Thread Safety
Thread safety becomes very important. If your object supports reads and writes, then you have to manage the threads to make sure they don’t step all over each other.

For example, suppose the object checks a cache for a value. If the value is there, it returns it. Otherwise, it does a db lookup, populates the cache, then returns it. What if that happens twice simultaneously? Does the object end up in the cache twice?
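
One way to answer that question is to do the check and the populate under the same lock. This is just a sketch: LookupCache, GetOrLoad, and the load delegate (standing in for the db lookup) are names I made up for the example.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical cache: the check and the populate happen under one lock,
// so two simultaneous callers cannot both load the same key.
public class LookupCache
{
    private readonly object _lock = new object();
    private readonly Dictionary<string, object> _items = new Dictionary<string, object>();

    // Returns the cached value for key, calling load (a stand-in for the
    // db lookup) only if the key is not in the cache yet.
    public object GetOrLoad(string key, Func<string, object> load)
    {
        lock (_lock)
        {
            object value;
            if (_items.TryGetValue(key, out value))
            {
                return value;
            }
            value = load(key);
            _items[key] = value;
            return value;
        }
    }
}
```

The trade-off is that the lock is held while the lookup runs, so a slow lookup makes every other caller wait. Whether that is acceptable depends on the application.
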

Creating a Singleton
A singleton is an instance of an object assigned to a static variable.

Initialization
Option 1 – Inline Initialization – Jay’s Preference

In most cases, this is how I go about it.

public class Cache
{
    private static Cache _cache = new Cache();
}

That will get compiled into the static constructor anyway, so it’s not much different from Option 2 below. I just like the syntax better.

In Option 2, you’ll see that I add some initialization statements right after creating the cache. How would you do that here?

public class Cache
{
    private static Cache _cache = CreateCache();

    private static Cache CreateCache()
    {
        Cache cache = new Cache();
        // Add directly to the new instance; _cache isn't assigned yet at this point.
        cache.Add("test 7", new object());
        cache.Add("test 8", new object());
        return cache;
    }
}

Note: This is only prudent when the object doesn’t have anything else to do. It’s just a cache; it doesn’t have any other functionality. That should usually be the case, but if you have a helper class with lots of stuff in it, then you may inadvertently create the cache when you do something irrelevant.

The static constructor is thread safe, so you don’t have to worry about thread management. It will only get hit once.

Option 2 – Static Constructor
For some reason, I used to like using the static constructor. For some reason, I don’t like it anymore.

static Cache()
{
    _cache = new Cache();
}

This gives you the opportunity to do some initialization right after you create it.

_cache = new Cache();
_cache.Add("test 5", new object());
_cache.Add("test 6", new object());

When an object has a static constructor, a flag is checked each time the object is hit to make sure that it has already been called. Regardless of how we do this, something is going to have to be checked on each visit anyway, so don’t get hung up on that.

Repeat Note: This is only prudent when the object doesn’t have anything else to do. It’s just a cache; it doesn’t have any other functionality. That should usually be the case, but if you have a helper class with lots of stuff in it, then you may inadvertently create the cache when you do something irrelevant.

The static constructor is thread safe, so you don’t have to worry about thread management. It will only get hit once.

Option 3 – Lazy Load
Back when I was a boy, this was my preferred approach. I did it this way to make sure that it only got created as needed, not when the class was used for any other reason. Since objects should only do one thing, that shouldn’t be a concern, so I don’t do this anymore. But, here you go.

private static object _singletonLock = new object();
public static Cache GetCache()
{
    if (_cache != null)
    {
        return _cache;
    }
    lock (_singletonLock)
    {
        if (_cache != null)
        {
            return _cache;
        }
        _cache = new Cache();
        return _cache;
    }
}

Since this is a method, we’re responsible for the thread safety. First, check to see if _cache already exists. If so, return it. If not, lock the next chunk of code. We want to make sure only one thread hits this at any given time.

Check _cache again. It may have been created, by another thread, since the last time we checked. If it exists, then return it. Otherwise, create it then return it.

This Check/Lock/Check approach (double-checked locking) didn’t work in Java. I read a bunch of stuff about it a few years ago. It was a hot topic, and it was shown that, regardless of how logical the code looks, it didn’t get compiled and executed the way you’d expect, so it wouldn’t work.

I only know it works in .NET based on experience. I should compile it and look at the IL to make sure it’s doing what I think it’s doing. It would be a great exercise. But, thus far, I have no reason to suspect that it’s not working.

A lesser approach is to lock the entire method. I don’t like that at all, though, because it only needs to be locked once, yet you’re locking it every time. Crazy, right?
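
For what it’s worth, on .NET 4 and later the framework’s Lazy<T> class can do the create-once-on-first-use part for you. This is a swapped-in alternative to the hand-written locking above, not the approach this document was written around, so treat it as a sketch.

```csharp
using System;

public class Cache
{
    // Lazy<T> runs the factory at most once, on the first access to Value.
    // With this constructor it uses its default, thread-safe mode.
    private static readonly Lazy<Cache> _cache = new Lazy<Cache>(() => new Cache());

    public static Cache GetCache()
    {
        return _cache.Value;
    }
}
```

Every call to Cache.GetCache() returns the same instance, and nothing is created until the first call.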

Constructors
Once you’ve created an instance of your singleton object and assigned it to a static variable, are you done?

Possible Answers:
A: No
B: No

The correct answer is B: No. You are not done; at least, not if you want to enforce the SINGLE INSTANCE ONLY part of the singleton.

public class Cache
{
    private static Cache _cache = CreateCache();

    // Must be public so that other classes can reach the singleton.
    public static Cache GetCache()
    {
        return _cache;
    }
    // ...
}

Now, you can quite easily use the singleton.

Cache.GetCache().Add("Test 9", new object());
Cache.GetCache().Add("Test 10", new object());

Smashing. But, what’s to stop you from doing this?

Cache myLocalCache = new Cache();
myLocalCache.Add("Test 11", new object());
myLocalCache.Add("Test 12", new object());

Nothing is stopping you. By the definition of Singleton, you should stop it. You stop it by adding a private constructor. The private constructor will prevent any other object from instantiating Cache.


Final Product
public class Cache
{
    private static Cache _cache = CreateCache();

    public static Cache GetCache()
    {
        return _cache;
    }

    private static Cache CreateCache()
    {
        return new Cache();
    }

    private Cache()
    {
    }

    public void Add(string key, object value)
    {
        // do something
    }

    public object this[string key]
    {
        get
        {
            // do lookup
            return null;
        }
    }
}

Now we’re sitting pretty. The only way anyone can use your cache object is by calling Cache.GetCache(). In order for this to work properly, your implementation must be thread safe. Your methods may get hit multiple times simultaneously.
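
To make the thread-safety point concrete, here is one way the stubbed-out Add and indexer could be filled in. The Dictionary, the single lock, and the overwrite-on-duplicate behavior are all my assumptions for the sake of a runnable sketch.

```csharp
using System.Collections.Generic;

public class Cache
{
    private static readonly Cache _cache = new Cache();

    private readonly object _lock = new object();
    private readonly Dictionary<string, object> _items = new Dictionary<string, object>();

    public static Cache GetCache()
    {
        return _cache;
    }

    private Cache()
    {
    }

    public void Add(string key, object value)
    {
        lock (_lock)
        {
            // Assumption: adding an existing key overwrites the old value.
            _items[key] = value;
        }
    }

    public object this[string key]
    {
        get
        {
            lock (_lock)
            {
                object value;
                // Assumption: a missing key returns null rather than throwing.
                return _items.TryGetValue(key, out value) ? value : null;
            }
        }
    }
}
```

Usage is the same as before: Cache.GetCache().Add("test", "hello") followed by Cache.GetCache()["test"] hands back "hello", no matter which thread asks.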

That’s All Great, But Don’t Do It!
Did I just waste 5 pages? No, it’s all valid information. And I’m being overdramatic when I say “don’t do it”. That’s your choice; go for it.

None of my business
I’ve come to the determination that it should be up to the application to decide what is a singleton and what is not. Who am I to say that there should be one and exactly one Cache object? Maybe the application would like to create 5 cache objects for different things. (Please keep in mind that this cache object is an ambiguous example. The question applies to anything that you want to make a singleton.)

If you come to the conclusion that you, as an object developer, have specific reasons to make sure that there is only one instance in any app domain, then it is your prerogative to make it a singleton yourself. What are some examples?

- Maybe your object opens up a specific TCP port and listens on it. You can’t do that more than once. (Though, if the port was a parameter, then you could.)
- The Windows Workflow engine can only be started once per app domain. Something like that is a good candidate. (For the record, I don’t think WF should be a singleton for the reasons described in this section. But, it could be.)
- I’m out of examples, but just 2 bullets is a waste of bullets. A list needs at least 3 bullets to be respectable.

I’m sure there are plenty of other examples, but in my experience, I’ve used singletons for convenience rather than necessity. I wanted my application to share one cache in an easily accessible way.

It comes down to this: Build objects. Let the consumer of the objects decide what they want to do with them. It’s none of your business.

Testing
TDD is one of the primary forces that drove me to stop doing singletons. In practice, if you decided that you wanted something to be a singleton, then for most purposes it’s fine as a singleton. But, when you add testing to the equation, it becomes more problematic.

For example: As I develop my cache object, I’m going to write a ton of unit tests. Each of those tests should stand alone (i.e., not be a victim of anything that happened before it, and not influence anything that happens after it). How do you do that with a singleton?

For example, the test may be to add three items to the cache, then make sure the cache has three items.

The next test may be to make sure the cache initializes to empty. If the cache exposes a Clear() or Remove() method, then you can do it. If it doesn’t, then you’re stuck. The singleton instance is already populated. Test #2 will fail.

That’s only one minor example. We can invent plenty more.
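
Here is that example as a sketch. SingletonCache, its Count property, and the two test methods are hypothetical names, and the tests are plain methods rather than anything tied to a particular test framework.

```csharp
using System.Collections.Generic;

// Hypothetical singleton cache with a Count property, for illustration only.
public class SingletonCache
{
    private static readonly SingletonCache _cache = new SingletonCache();
    private readonly Dictionary<string, object> _items = new Dictionary<string, object>();

    private SingletonCache() { }

    public static SingletonCache GetCache() { return _cache; }
    public void Add(string key, object value) { _items[key] = value; }
    public int Count { get { return _items.Count; } }
}

public static class CacheTests
{
    // Test 1: add three items, expect three items.
    public static bool AddsThreeItems()
    {
        SingletonCache cache = SingletonCache.GetCache();
        cache.Add("a", new object());
        cache.Add("b", new object());
        cache.Add("c", new object());
        return cache.Count == 3;
    }

    // Test 2: a cache should start out empty. This only passes if it runs
    // before AddsThreeItems, because both tests share the one instance.
    public static bool StartsEmpty()
    {
        return SingletonCache.GetCache().Count == 0;
    }
}
```

Run AddsThreeItems and then StartsEmpty, and the second one fails even though nothing is wrong with the cache. If the tests could each do new on their own cache, the order wouldn’t matter.
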

Evolution Step 1: Default Singleton and Instantiate a Class

The previous conclusions were formed over time. Along the way, I took an incremental step towards supporting them. This worked out well since we already had existing singletons, but I wouldn’t do it for anything new.

One of the key parts of the singleton is the private constructor. The private constructor prevents anyone from instantiating the object. It remains entirely in your control, not the consumer’s.

Step 1 of the evolution was to eliminate that constructor, and change the getter to GetDefaultInstance.

public class Cache
{
    private static Cache _cache = CreateCache();

    public static Cache GetDefaultInstance()
    {
        return _cache;
    }

    private static Cache CreateCache()
    {
        return new Cache();
    }

    //private Cache()
    //{
    //}

    public void Add(string key, object value)
    {
        // do something
    }

    public object this[string key]
    {
        get
        {
            // do lookup
            return null;
        }
    }
}

Now, you can either use the default instance (the singleton), or create your own. Technically, though, this is no longer a singleton. A problem with this approach is that the default instance is created even if you never use it. This could be a good place to use the Lazy Load approach.

Evolution Step 2: Have your application do it.
You have created a cache object and made it available to your applications. It’s up to your application how to use it. If your application decides that there should be one and only one instance of the cache object for the entire application, then it can manage it.

public static class ApplicationCache
{
    private static Cache _cache = new Cache();

    public static Cache GetCache()
    {
        return _cache;
    }
}

Now you can use ApplicationCache.GetCache() to get to your single instance of the cache object. (Of course, that doesn’t stop anyone from creating their own Cache if they’d like to).

If you have a few things like that, you could have one static class to expose them all to your application.
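
A sketch of what that one static class might look like. Cache here is an empty stand-in, and Settings and ApplicationServices are names I invented for the example.

```csharp
// Hypothetical plain classes: neither one knows or cares how many instances exist.
public class Cache { }
public class Settings { }

// The application, not the classes, decides that it wants one of each.
public static class ApplicationServices
{
    private static readonly Cache _cache = new Cache();
    private static readonly Settings _settings = new Settings();

    public static Cache GetCache() { return _cache; }
    public static Settings GetSettings() { return _settings; }
}
```

The application calls ApplicationServices.GetCache() wherever it wants the shared instance, and a unit test can still do new Cache() to get a clean one.
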

Conclusion
Going forward, I don’t see myself creating any objects that are automatically singletons. If I need a singleton, it will be at the application level.

The singleton is a very basic pattern that’s probably not worth 9 pages of information. But I did it anyway. I hope you find it useful.
