Memory system – Part 1 | Molecular Musings

Before we can really delve into the inner workings of the Molecule Engine’s memory system, we need to cover some base ground first. Today, we’re taking a very thorough look at new, delete, and all their friends. There are some surprising subleties involved, and judging from the interviews I conducted, sometimes even senior level staff messes up questions regarding the inner workings of new and delete.

To keep things simpler, we don’t deal with per-class new and delete, and we don’t want to mess with exceptions either.

new operator / operator new

Let’s look at a simple statement involving new:

T* i = new T;

This is the simplest form of the new operator. What does it really do behind the scenes?

First, a call to operator new is made to allocate storage for a single T.
Second, the constructor for T is called which constructs the new instance of T at the memory address returned by the previous call to operator new.

If T is a fundamental type (e.g. int, float, etc.), or does not have a constructor, no constructor will be called. The above statement will call the simplest form of operator new:

void* operator new(size_t bytes);

The compiler will automatically call operator new with the correct size for a given type, which is sizeof(T) in this case.

So far, so good. But we’re not finished with the new operator yet. There’s a second version of the new operator, with so called placement-syntax:

void* memoryAddress = (void*)0x100;
T* i = new (memoryAddress) T; // placement new

This can be used to construct instances of classes at a certain place in memory, which in essence is the only way of calling constructors “directly”. No memory allocation is involved here – the above calls a different overload of operator new, which is the following:

void* operator new(size_t bytes, void* ptr);

This form of operator new does not allocate any memory, and just returns the pointer.

The placement-syntax of the new operator is very powerful, because we can add our own overloads of operator new. The only rule is that the first argument to every operator new must always be of type size_t, which will automatically be passed to it by the compiler.

Let’s look at an example:

void* operator new(size_t bytes, const char* file, int line)
{
  // allocate bytes
}

// calls operator new(sizeof(T), __FILE__, __LINE__) to allocate memory
T* i = new (__FILE__, __LINE__) T;

Leaving differences between global operator new and class operator new out of the equation, every use of the placement form of the new operator boils down to the following internally:

// calls operator new(sizeof(T), a, b, c, d) to allocate memory
T* i = new (a, b, c, d) T;

Which is equivalent to:

T* i = new (operator new(sizeof(T), a, b, c, d)) T;

The magic of calling operator new is simply done by the compiler. Furthermore, every overload of operator new can be called directly (if we want to do so), and we can do whatever we want with the different overloads. We can even use templates, if we want:

template <class ALLOCATOR>
void* operator new(size_t bytes, ALLOCATOR& allocator, const char* file, int line)
{
  return allocator.Allocate(bytes);
}

This comes in handy later when we’re about to use different allocators. With the placement-syntax, memory using different allocators can be allocated with e.g. the following statement:

T* i = new (allocator, __FILE__, __LINE__) T;

delete operator / operator delete

Calling the delete operator on a previously new‘ed instance will first call the destructor, and then operator delete. There’s a difference to new, however: Regardless of which form of new we used to create the instance, the same version of operator delete will always be called:

// calls operator new(sizeof(T), a, b, c, d)
// calls T::T()
T* i = new (a, b, c, d) T;

// calls T::~T()
// calls operator delete(void*)
delete i;

The only time the corresponding operator delete is called by the compiler is when an exception is thrown inside operator new, so the memory can correctly be freed before the exception is propagated to the calling site. This is also the reason why every overload of operator new must always have a corresponding version of operator delete. But let’s not digress.

Like operator new, operator delete can be called directly (if we want to do so):

template <class ALLOCATOR>
void operator delete(void* ptr, ALLOCATOR& allocator, const char* file, int line)
{
  allocator.Free(ptr);
}

// call operator delete directly
operator delete(i, allocator, __FILE__, __LINE__);

However, do not forget that the destructor is called by the delete operator, not operator delete. Hence, in the above example, the destructor needs to be called manually:

// call the destructor
i->~T();

// call operator delete directly
operator delete(i, allocator, __FILE__, __LINE__);

If instances are created wih the simple placement-form of new, the destructor must always be called manually. Using delete on such an instance would invoke undefined behaviour (because the memory was never allocated with a call to new).

So far, we’ve only dealt with the non-array versions of new and delete.

new[] / delete[]

Ah, this is where the fun begins! Most people don’t realize it, but in something so fundamental such as new[] and delete[], there’s already compiler magic involved. The C++ standard just mandates what new[] and delete[] should do, but not how.

Let’s start with a simple example:

int* i = new int [3];

The above allocates storage for 3 ints by calling operator new[], and since int is an integral type, there’s no constructors to call. Like with operator new, we can overload operator new[] and use placement-syntax as well:

// our own version of operator new[]
void* operator new[](size_t bytes, const char* file, int line);

// calls the above operator new[]
int* i = new (__FILE__, __LINE__) int [3];

The behaviour of delete[] and operator delete[] is the same as with delete and operator delete. We can call operator delete[] directly, but must make sure to call the destructors manually.

But what happens with non-POD types?

struct Test
{
  Test(void)
  {
    // do something
  }

  ~Test(void)
  {
    // do something
  }

  int a;
};

Test* i = new (__FILE__, __LINE__) Test [3];

Even though sizeof(Test) == 4, our version of operator new[] will get called with an argument of 16 bytes. Why? Think about how the array needs to be deleted:

delete[] i;

The compiler must somehow know how many instances of type Test are to be deleted – otherwise it can’t call the instances’ destructors. So what almost every compiler does upon a call to new[] is the following:

For N instances of type T, request an allocation for sizeof(T)*N + 4 bytes from operator new[].
Store N in the first 4 bytes.
Construct N instances using placement new, starting at ptr + 4
Return ptr + 4 to the user.

The last bullet point is especially important: If your overload of operator new[] returns the memory address 0x100, the instance Test* i will point to 0x104! The memory layout of the 16 bytes would then be:

0x100: 03 00 00 00    -> number of instances stored by the compiler-generated code
0x104: ?? ?? ?? ??    -> i[0], Test* i
0x108: ?? ?? ?? ??    -> i[1]
0x10c: ?? ?? ?? ??    -> i[2]

When delete[] is used later on, the compiler inserts code which reads the number of instances N by going back 4 bytes from the given pointer, and calls the destructors in reverse order – if the type to be deleted is non-POD. Otherwise, there’s no 4 byte overhead added because no destructors need to be called (like in the new int[3] example above).

Unfortunately, this compiler-defined behaviour causes problems when using our own overloads for operator new, operator new[], operator delete, and operator delete[]. Even though we can call operator delete[] directly, we somehow need to figure out how many destructors to call (if any).

Which we can’t.

The reason is that we can never be sure whether the compiler inserted some extra 4 bytes in the allocation or not. This is totally compiler-dependent. It might work, but it could also horribly break with some user-defined types. And other compilers could do it differently altogether.

This is also the reason why using delete on instances allocated with new[] will most likely crash your code, and vice versa. The compiler-generated code simply tries to access memory which doesn’t belong to him (using delete[] for allocations via new), or not all instances of an array are correctly destructed (using delete for allocations via new[]), or else.

However, with the knowledge of what happens behind the scenes with calls to new, new[], delete and delete[], we can build our own allocation functions which correctly handle simple and array allocations for all types, can use our custom allocators, provide additional information like file name and line number, and more. The next post in this series will show how.

In the meantime, make sure to read the following articles as well, which also explain the concept of global operator new and class operator new:

new (C++)
delete (C++)

Molecular Musings

Development blog of the Molecule Engine

Memory system – Part 1

4 thoughts on “Memory system – Part 1”

Leave a comment Cancel reply

Share this:

Related

4 thoughts on “Memory system – Part 1”

Leave a comment Cancel reply