Object Ownership

Link: kdab.com/object-ownership
Author: Ilya Doroshenko
Publication date: July 5, 2023

Last time we touched upon object lifetime, and today we wrap up the basics with the somewhat spicy topic of object ownership. We covered the lifetime quirks and found out that manual memory management can be a nightmare, even if we call new and delete in the correct order. There must be something better than that. Well, there is, but it comes with its own can of worms.

Structured cleanup

We know the rules of new and delete, namely that new allocates and delete destroys, but so far we never really cared about who is responsible for the object. This has caused a lot of confusion in the past. For instance, some Win32 API functions return buffers that the caller must release with a specific function: a message allocated by FormatMessage (with FORMAT_MESSAGE_ALLOCATE_BUFFER) has to be LocalFree()d, and the block returned by GetEnvironmentStrings has to be passed to FreeEnvironmentStrings. POSIX, on the other hand, has strdup as a common example of “you should free it yourself.” This model is error-prone because a function may have many return statements, and before each of them you must remember to call free or delete, depending on how the memory was allocated. However, C++ has had RAII, built on constructors and destructors, since its very beginning. So, in 1998 resourceful people decided to add auto_ptr to the standard.

The premise was simple:

a simple explicit constructor that takes a raw pointer from new
the object is destroyed either by an explicit reset or by the destructor at the end of the block (release only gives up ownership and returns the raw pointer)
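The premise can be illustrated with a hand-written owner. This is only a minimal sketch, not the real auto_ptr: ScopedPtr, Tracked and scoped_cleanup_demo are hypothetical names, and unlike auto_ptr the sketch simply disables copying.

```cpp
#include <cassert>

// Hypothetical type used only to observe construction and destruction.
struct Tracked {
    static int alive;            // number of live Tracked objects
    Tracked()  { ++alive; }
    ~Tracked() { --alive; }
};
int Tracked::alive = 0;

// Hypothetical, simplified owner in the spirit of auto_ptr (not the real class).
template <class T>
class ScopedPtr {
public:
    explicit ScopedPtr(T* p = 0) : ptr_(p) {}   // takes a raw pointer from new
    ~ScopedPtr() { delete ptr_; }               // cleanup at the end of the block

    T* get() const { return ptr_; }
    // Destroys the current object and optionally adopts a new one.
    void reset(T* p = 0) { if (p != ptr_) { delete ptr_; ptr_ = p; } }
    // Gives up ownership without destroying; the caller must delete the result.
    T* release() { T* p = ptr_; ptr_ = 0; return p; }

private:
    ScopedPtr(const ScopedPtr&);                // copying disabled, C++98 style
    ScopedPtr& operator=(const ScopedPtr&);
    T* ptr_;
};

// Returns the number of live objects after all scenarios have finished.
int scoped_cleanup_demo() {
    {
        ScopedPtr<Tracked> p(new Tracked);
        assert(Tracked::alive == 1);
    }                                           // destructor deletes the object here
    assert(Tracked::alive == 0);

    ScopedPtr<Tracked> q(new Tracked);
    q.reset();                                  // explicit, early destruction
    assert(Tracked::alive == 0);

    ScopedPtr<Tracked> r(new Tracked);
    Tracked* raw = r.release();                 // ownership handed back to the caller
    assert(Tracked::alive == 1);
    delete raw;                                 // so the caller cleans up manually
    return Tracked::alive;
}
```

Calling scoped_cleanup_demo() from main leaves no live objects behind, whichever of the three paths released them.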

This was the first attempt at a structured cleanup. As time passed, the initial point began to crumble. More issues arose and the question came up:

Who owns the data?

Of course, the data is owned by the pointer class. But what if the data is needed elsewhere? In C++98 the only options were a reference and a raw pointer. If you copied an auto_ptr to another place, it moved itself to that location, essentially transferring the ownership. That made auto_ptr unsuitable for standard containers, since copying the object did not adhere to copy semantics at all: after the copy, the source and the destination were not equal.

Simple example:

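// C++98/03 only: std::auto_ptr was removed in C++17
std::auto_ptr<CPClass> a(new CPClass);
std::auto_ptr<CPClass> b(a); // "copy" a: ownership moves to b

printf("%p\n", (void*)a.get()); // prints a null pointer
printf("%p\n", (void*)b.get()); // prints a valid pointer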

Another set of problems was the inability to express custom deletion: for allocated arrays, for malloc allocations, or for any specific cleanup at all. Quite problematic, one might say. C++11 introduced move semantics and, along with them, new types of smart pointers; unique_ptr was one of them. It disabled copying altogether, which forced containers like vector to move the contents instead. Furthermore, its template parameters gained a custom deleter for custom cleanup, and there is an array specialization. C++14 added the fancy make_unique<>, a replacement for an explicit new call, though it does not accept a custom deleter; for that usage you still have to use the explicit constructor. auto_ptr was deprecated in C++11 and removed completely in C++17.
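The features listed above can be put side by side in a short sketch. It assumes a C++14 compiler; FreeDeleter, CString, duplicate and unique_ptr_demo are hypothetical names made up for this illustration.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Deleter for memory obtained from the malloc family of functions.
struct FreeDeleter {
    void operator()(void* p) const { std::free(p); }
};
using CString = std::unique_ptr<char, FreeDeleter>;

// Hypothetical helper: duplicates a string with malloc, the way strdup would.
CString duplicate(const char* text) {
    std::size_t size = std::strlen(text) + 1;
    char* copy = static_cast<char*>(std::malloc(size));
    if (copy != nullptr) std::memcpy(copy, text, size);
    return CString(copy);   // explicit constructor: make_unique cannot set a deleter
}

std::size_t unique_ptr_demo() {
    CString s = duplicate("owned");             // released with free, not delete
    assert(s && std::strcmp(s.get(), "owned") == 0);

    // Array specialization: delete[] is called automatically.
    std::unique_ptr<int[]> numbers(new int[3]{1, 2, 3});
    assert(numbers[2] == 3);

    // make_unique (C++14) replaces the explicit new for the default deleter.
    auto name = std::make_unique<std::string>("printer");

    // unique_ptr cannot be copied, so it has to be moved into the container.
    std::vector<std::unique_ptr<std::string>> names;
    names.push_back(std::move(name));
    assert(name == nullptr);                    // the source gave up ownership
    names.push_back(std::make_unique<std::string>("scanner"));
    return names.size();
}
```

Every resource here is released on any exit path from unique_ptr_demo, each with the cleanup function that matches its allocation.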

Now for some precise wording: object 1 owns object 2 if 2’s lifetime does not exceed 1’s and object 1 is responsible for destroying object 2. For fellow mathematicians, it is a weak ordering of lifetimes.

The owned object may be destroyed earlier than its holder, but not the other way around. auto_ptr uniquely owns its memory, so you could say that it is a unique pointer. Well, kind of: the implicit ownership transfer made it too fragile. Then unique_ptr came out and said: “I own the memory! And if you want it, then you will have to take it.”
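In code, “taking it” means an explicit std::move. The following is a minimal sketch assuming C++14; Document, consume and transfer_demo are hypothetical names.

```cpp
#include <cassert>
#include <memory>
#include <utility>

struct Document { int pages; };   // hypothetical payload

// A "sink": taking unique_ptr by value means the function takes ownership.
int consume(std::unique_ptr<Document> doc) {
    return doc ? doc->pages : 0;
}   // doc is destroyed here, and the Document with it

int transfer_demo() {
    auto doc = std::make_unique<Document>(Document{12});
    // consume(doc);                         // would not compile: copying is deleted
    int pages = consume(std::move(doc));     // explicit transfer of ownership
    assert(doc == nullptr);                  // the previous owner is left empty
    return pages;
}
```

Unlike the auto_ptr copy, the transfer is visible at the call site, so a reader can tell exactly where the old owner stops being responsible.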

Two more pointer types came along with unique_ptr, and this is where we should dive a bit deeper.

Shared ownership

Let’s imagine a case where several objects communicate with one particular object. The easiest real-world example is several devices communicating with one printer.

If we project the same logic onto code, we could expect an object that represents the printer inside the objects that represent the devices. Pretty simple, isn’t it? Now we impose a restriction: if the printer goes out of scope, it shuts down.

Suddenly, the task becomes complex, since we can’t say explicitly who owns the printer, and we need to keep it alive while any device keeps working with it.

Component Object Model

The solution requires sharing the ownership between the consumers. How can we do that? Microsoft pondered this question and invented COM. Of course, I am oversimplifying, because COM solves a lot more than sharing: it also hides implementation details and provides uniform representations, ABI control, cross-process communication, etc. But one thing it does along the way is so-called reference counting. It counts how many consumers hold the COM interface and performs the cleanup when all the consumers are done and the refcount is 0.

Here the object itself, and not the consumer, is responsible for deleting itself. The same function that releases a reference, called Release, also cleans up the underlying memory once the last reference is gone. Such a model is a bit confusing at first, but bearing in mind that it was invented even before C++98, in 1993 to be exact, everything falls into place. Does your complex object have some very involved destruction process aside from just delete? Maybe it is part of a memory pool and the memory should just go back to it? Fear not, interface->Release() has got you covered. (The class still has to implement the Release function, so no magic here.) The model is quite robust and is still in use by Microsoft to this day. Later Microsoft libraries added RAII wrappers, such as ATL’s CComPtr and CComQIPtr, WRL’s Microsoft::WRL::ComPtr, and finally C++/WinRT’s winrt::com_ptr. Each serves its own needs, but all share one purpose.
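The idea of an object that deletes itself can be sketched in portable C++. This is not real COM: the actual IUnknown also has QueryInterface and uses different signatures, and IRefCounted, Printer, Create and refcount_demo are hypothetical names for this illustration.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical COM-like interface (not the real IUnknown).
struct IRefCounted {
    virtual unsigned long AddRef() = 0;
    virtual unsigned long Release() = 0;
protected:
    ~IRefCounted() = default;   // consumers can never call delete directly
};

class Printer final : public IRefCounted {
public:
    // The flag lets the caller observe destruction (for demonstration only).
    static Printer* Create(bool* destroyed) { return new Printer(destroyed); }

    unsigned long AddRef() override { return ++refs_; }
    unsigned long Release() override {
        unsigned long left = --refs_;
        if (left == 0) delete this;          // the object cleans itself up
        return left;
    }

private:
    explicit Printer(bool* destroyed) : destroyed_(destroyed) {}
    ~Printer() { *destroyed_ = true; }       // custom teardown would live here

    std::atomic<unsigned long> refs_{1};     // the creator holds the first reference
    bool* destroyed_;
};

bool refcount_demo() {
    bool destroyed = false;
    IRefCounted* first = Printer::Create(&destroyed);
    IRefCounted* second = first;
    second->AddRef();                        // a second consumer shares the object
    first->Release();                        // the first consumer is done
    assert(!destroyed);                      // still alive: one reference left
    second->Release();                       // last reference gone, object deletes itself
    return destroyed;
}
```

The RAII wrappers mentioned above exist precisely to call AddRef and Release automatically, so that a forgotten Release does not leak the object.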

Explicit sharing

Well, of course, COM looked like a miracle back in the day. While its implementation was a bit clumsy, it did its job. But what if we are not on Windows? You can still emulate COM, and it will in fact work just as well, but implementing it is a nightmare for a regular programmer. C++11 added shared_ptr, which does the same job of sharing data using atomic reference counting. It does not put the responsibility for destruction on the object, but performs the destruction itself. It also comes with custom deleters (array support followed in C++17), which were mostly missing in COM, and with a helper function that feels the same as make_unique. make_shared provides a way to construct a simple shared_ptr, but it typically stores the reference counter together with the object, whereas the constructor ends up with two blocks of memory: one for the reference counter and one for the object itself.
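The printer scenario from earlier maps onto shared_ptr roughly as follows. This is a minimal sketch; Printer, Device, shared_printer_demo and custom_deleter_demo are hypothetical names.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct Printer {                       // hypothetical shared resource
    std::vector<std::string> jobs;
};

class Device {                         // hypothetical consumer
public:
    explicit Device(std::shared_ptr<Printer> printer) : printer_(std::move(printer)) {}
    void print(const std::string& job) { printer_->jobs.push_back(job); }
private:
    std::shared_ptr<Printer> printer_; // each device co-owns the printer
};

// Returns the number of owners while both devices are alive.
long shared_printer_demo() {
    auto printer = std::make_shared<Printer>();  // object and counter created together
    long owners_at_peak = 0;
    {
        Device laptop(printer), phone(printer);
        laptop.print("report");
        phone.print("photo");
        owners_at_peak = printer.use_count();    // local handle + two devices
    }
    assert(printer.use_count() == 1);            // the devices released their share
    assert(printer->jobs.size() == 2);
    return owners_at_peak;
}

// A custom deleter requires the explicit constructor instead of make_shared.
bool custom_deleter_demo() {
    bool released = false;
    {
        std::shared_ptr<Printer> p(new Printer, [&released](Printer* raw) {
            delete raw;
            released = true;                     // runs once, when the last owner goes away
        });
        std::shared_ptr<Printer> q = p;          // copying only bumps the counter
    }
    return released;
}
```

The printer stays alive exactly as long as at least one handle to it exists, which is the restriction the scenario imposed.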

Curse of sharing

So far, we have discussed only the strong points, but shared_ptr came with a big problem. The COM model enforced strict rules for marshalling and modification of the internal state, as well as for concurrent access. COM also always exported only interfaces, so the stability of the state was the responsibility of the implementation’s developer. shared_ptr, however, does nothing in that regard, leaving room for a lot of bugs and exploits.

What is the problem? As we already know from Value Semantics, sharing a reference is bad and often leads to entangled and fragile code. There is one particular case that I have seen in practice:

struct B;
struct A {
    ...
    std::shared_ptr<B> b; // A co-owns B
};
struct B {
    ...
    std::shared_ptr<A> a; // B co-owns A
};

I can construct either of those, but let’s choose A:

int main() {
    std::shared_ptr<A> a = std::make_shared<A>();
    std::shared_ptr<B> b = std::make_shared<B>();
    a->b = b; // A now keeps B alive
    b->a = a; // B now keeps A alive
}

Now, tell me, who owns who? What will happen after the main()?

The memory will obviously leak, since we have created a quantum entanglement: each object keeps the other one alive, so neither reference count ever drops to zero. This case is obvious, but a real one may not be. What if A and B are connected through several other classes? Debugging such a leak is borderline impossible! Of course, there is weak_ptr, which solves the problem, but it is still hard to wrap your mind around it. To fix the problem, we need to make one of the two links weak:

struct B;
struct A {
    ...
    std::shared_ptr<B> b; // A co-owns B
};
struct B {
    ...
    std::weak_ptr<A> a;   // B only observes A
};
int main() {
    std::shared_ptr<A> a = std::make_shared<A>();
    std::shared_ptr<B> b = std::make_shared<B>();
    a->b = b;
    b->a = a; // does not increase A's strong count
}

Now everything is deallocated when main ends.
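The difference between the two variants can be checked by counting destructor calls. The sketch below is an illustration only: the namespaces cyclic and fixed, the counter destroyed, and both demo functions are hypothetical names. The cyclic demo breaks the cycle by hand at the end so that the demonstration itself does not leak.

```cpp
#include <cassert>
#include <memory>

int destroyed = 0;   // counts destructor calls (demonstration only)

namespace cyclic {   // both links are strong
struct B;
struct A { std::shared_ptr<B> b; ~A() { ++destroyed; } };
struct B { std::shared_ptr<A> a; ~B() { ++destroyed; } };
}

namespace fixed {    // the back link is weak
struct B;
struct A { std::shared_ptr<B> b; ~A() { ++destroyed; } };
struct B { std::weak_ptr<A> a;   ~B() { ++destroyed; } };
}

// Returns how many destructors ran when the local handles went out of scope.
int cyclic_destructors() {
    destroyed = 0;
    std::weak_ptr<cyclic::A> observer;
    {
        auto a = std::make_shared<cyclic::A>();
        auto b = std::make_shared<cyclic::B>();
        a->b = b;
        b->a = a;
        observer = a;
    }
    int count = destroyed;                        // both objects are still alive
    if (auto a = observer.lock()) a->b.reset();   // break the cycle by hand
    return count;
}

int fixed_destructors() {
    destroyed = 0;
    {
        auto a = std::make_shared<fixed::A>();
        auto b = std::make_shared<fixed::B>();
        a->b = b;
        b->a = a;                                 // weak: A's strong count stays at 1
    }
    return destroyed;
}
```

With two strong links no destructor runs at the end of the scope, while with the weak back link both objects are destroyed.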

Conclusion

unique_ptr is a solid tool to ensure an object is owned in only one place at a time. It is great for value semantics, it does not break anything, and it is robust. shared_ptr, on the other hand, is bad… Just kidding! Although the menace is lurking around, it is fine if you are sure that the state of the underlying object is immutable, for example, a pool of shared textures for a game. Sharing mutable state is bad design, and it must be done sparingly and only in cases where it is absolutely necessary! weak_ptr does not share ownership and is a great helper for breaking strong bonds, although I should mention the cost of checking whether the object is still alive on every access. This may hurt performance, hence, still no magic solution.
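The liveness check mentioned above is the call to lock(). A minimal sketch, with Texture, describe and weak_ptr_demo as hypothetical names:

```cpp
#include <cassert>
#include <memory>
#include <string>

struct Texture { std::string name; };   // hypothetical shared, immutable resource

// Returns the texture name, or "<gone>" if the owner has already released it.
std::string describe(const std::weak_ptr<const Texture>& handle) {
    // lock() checks the control block atomically and, on success, returns a
    // shared_ptr that keeps the object alive for the duration of the use.
    if (std::shared_ptr<const Texture> texture = handle.lock()) {
        return texture->name;
    }
    return "<gone>";
}

std::string weak_ptr_demo() {
    std::shared_ptr<const Texture> owner = std::make_shared<Texture>(Texture{"grass"});
    std::weak_ptr<const Texture> handle = owner;   // observes, does not own
    std::string before = describe(handle);         // the owner is still alive
    owner.reset();                                 // the only owner lets go
    std::string after = describe(handle);          // lock() now fails
    return before + "," + after;
}
```

Every access through a weak_ptr pays for this check, which is the cost to keep in mind in hot paths.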

Now that we have made our way as close to the coroutine world as possible, we are ready to use this knowledge to our advantage and build predictable code. And remember: unique is good; shared + immutable is also good.

Enjoy advanced object lifetime!
