Memory allocation strategies: a linear allocator

During the next few weeks, I’d like to detail how the memory allocators inside Molecule work, starting with a simple, non-growing linear allocator today in order to cover some base first.

If you haven’t done so already, now would be a good time to read up on Molecule’s memory system and its inner workings: Part 1, Part 2, Part 3, Part 4, Part 5.

In case you are not interested in all the details, here’s how the memory system works in a nutshell:

  • Instances are always allocated/deleted via ME_NEW/ME_DELETE, respectively.
  • A so-called memory arena takes care of allocating/freeing memory and properly constructing instances or arrays thereof.
  • The memory arenas are also responsible for handling debugging facilities like bounds checking and memory tagging which can be configured using policies.
  • Internally, a memory arena will always ask one of the allocators when allocating/freeing raw memory.

This means that any of the different allocators follows a very simple interface, and never needs to know about whether bounds checking, memory tracking, or something similar is enabled or not. Its only job is to allocate/free raw memory.

That being said, the interface of the linear allocator is the following:

class LinearAllocator
  explicit LinearAllocator(size_t size);
  LinearAllocator(void* start, void* end);

  void* Allocate(size_t size, size_t alignment, size_t offset);

  inline void Free(void* ptr);

  inline void Reset(void);

  char* m_start;
  char* m_end;
  char* m_current;

There’s a few things to notice here:

  • Allocate() and Free() are not virtual methods. Allocators can either be used as part of a memory arena, or stand-alone for allocating raw, “untyped” memory, e.g. for per-frame GPU command-buffers. In the former case, memory arenas simply take allocators as one of their policy-based template parameters. In the latter case, there’s no virtual function overhead for an operation as simple as bumping a pointer.
  • One of the constructors takes a memory range [start…end) in which it will allocate memory. This can either be stack-memory (fixed-size or allocated via alloca()), or heap-memory allocated via e.g. VirtualAlloc(). This allows us to use containers and other data structures with both heap memory as well as scratchpad memory on the stack, which is a very useful feature to have.
  • The other constructor internally uses a page allocator when allocating physical memory for the given size. We will talk about virtual memory and physical page allocators in one of the next posts, discussing API functions like VirtualAlloc().
  • The memory region handled by this type of allocator can never grow.
  • A linear allocator cannot free individual allocations, but rather frees them all at once upon calling Reset(). Free() is essentially an empty function, and Reset() just resets the internal pointer member to the start.

The only thing really left to discuss is the Allocate() method. As can be seen, it takes a third argument called offset which might leave you wondering what exactly that is for.

The reason for this is simple: of course we want to be able to allocate memory of any given size with arbitrary alignments, but sometimes we don’t want the memory returned by the allocator to be aligned, but rather aligned to some offset because we internally need to add book-keeping data or other things like border bytes when bounds checking.

Let’s take bounds checking as an example to illustrate the situation. Suppose we want to allocate 120 bytes, aligned to a 16-byte boundary. Without bounds checking, a memory arena would simply just call linearAllocator.Allocate(120, 16, 0). With bounds checking, we want to add a border of 4 bytes both to the left and right of the allocation, which increases the size to 128. Furthermore, we want to ensure that the memory handed to the user is aligned to 16 bytes. Without the additional offset, we would get 128 bytes aligned to a 16-byte boundary, add our 4 bytes of bounds checking border, and return the resulting pointer to the user, destroying the alignment in the process. That’s what the additional offset parameter is for: in such a case, we can simply call linearAllocator.Allocate(128, 16, 4) which is the same as saying “allocate 128 bytes, and make sure that the returned pointer + 4 is aligned to a 16-byte boundary”.

Eventhough such alignment restrictions could have been handled inside the memory arena in a generic way, I decided to let the allocators take care of that, because it leads to less wasted memory because of alignment, and can be implemented rather easily in most cases.

void* LinearAllocator::Allocate(size_t size, size_t alignment, size_t offset)
  // offset pointer first, align it, and offset it back
  m_current = pointerUtil::AlignTop(m_current + offset, alignment) - offset;

  void* userPtr = m_current;
  m_current += size;

  if (m_current >= m_end)
    // out of memory
    return nullptr;

  return userPtr;

All we have to do is add the offset first, align the resulting pointer, and subtract the offset again. This ensures proper alignment with the least amount of wasted space.

That’s basically all there is to implementing a simple linear allocator.

In the next posts, we will take a look at things like a simple stack-based (LIFO) allocator, how to implement growing allocators, virtual memory/page allocators, and pool allocators.

28 thoughts on “Memory allocation strategies: a linear allocator

    • Thanks!
      I’ve been a bit swamped with work lately, but can now completely focus on Molecule again. There’s lots of things I want to write about, especially memory stuff and a bunch of other low-levelly things.

  1. Pingback: Memory allocation strategies: a stack-like (LIFO) allocator | Molecular Musings

  2. Great blog, I must say. It’s really helped my understanding of the lower-level side of things.

    With regards to your memory arena and allocator coverage, I’m curious about some of the implementation details. To be specific, it’s regarding some of the allocator class methods and construction you have described for your policy-based memory arena design.

    In this article you mentioned the ‘Linear Allocator’ can only call ‘Reset’ to clear memory. How does this fit in when the ‘Linear Allocator’ is used as the allocation policy for an arena, where you have just the singular arena interface and inaccessible private members? I’m pondering the same thing of the ‘Purge’ method from your more recent growing allocators. Similarly, the constructor in the arena only takes an ‘area’ start and finish (“m_allocator(area.GetStart(), area.GetEnd())”), whereas the ‘free list’ ‘Pool Allocator’ seemingly requires additional constructor arguments (‘object size’, ‘maximum alignment’) for free list construction. The same goes for the growing stack allocator with differing constructor arguments again.

    Unless I’m taking the code examples all too literally, I’m finding myself in a state of minor confusion. Perhaps I’ve missed something quite obvious. Thanks in advance.

    • No, you are not missing anything obvious. I have changed the memory arena implementation slightly, so that it simply holds a pointer to an allocator now:

      template (WordPress ate my template params)
      class MemoryArena : public MemoryArenaBase
      MemoryArena(AllocationPolicy* allocator, const char* name);
      AllocationPolicy* m_allocator;

      This makes it easier to use different allocators with vastly different constructor arguments in conjunction with memory arenas. Furthermore, it still keeps allocators and arenas separate, so you are free to call any method offered by the allocator as you wish. As an example, a LinearAllocator is currently used like this:

      core::HeapArea heap(someHeapSize);
      core::LinearAllocator linearAllocator(heap.GetStart(), heap.GetEnd());
      LinearArena sampleArena(&linearAllocator, “A descriptive name for the arena”);

      Whenever you want to reset the allocator, simply call linearAllocator.Reset(). Similarly, you can call growingStackAllocator.Purge() whenever you want it to purge its freed allocations.
      Does that make things more clear?

      • Thanks for the clarification, this makes it more clear. However, I can see a minor drawback with this method – are your debugging facilities in the arena still valid if you simply reset the allocator? I suppose, that these would be presented as memory leaks, so tracking allocations using such combination would be a little more inconvenient (if possible at all). Did you try to resolve this in any way?

      • Yes, they are still valid. The point of using arenas is that they handle construction/destruction as well (using ME_NEW and ME_DELETE, respectively). So if you were to allocate something using ME_NEW, call Reset() on the allocator, but never call ME_DELETE (or Free()), the arena would report unfreed allocations. Even though the actual raw memory might be usable again, this means that the destructor of certain objects was never called, hence it is reported.

        The standard modus operandi for the linear allocator is the same as for every other allocator used with an arena: Each call to ME_NEW needs to have a matching call to ME_DELETE somewhere.
        If you use a linear allocator without an arena instead, you can call Allocate(), Free() and Reset() freely, but need to make sure it is done correctly. I could add a simple mechanism to the linear allocator that ensures that Reset() can only be called if each allocation has been freed, similar to what is implemented for the StackAllocator (check ME_ENABLE_STACK_ALLOCATOR_CHECK in the SDK docs).

  3. Hi,
    first off, thank you for such an amazingly detailed series in such a important topic. It’s been a world of help.

    Now to my question.
    In the MemoryArena::Free that you mentioned in your earlier post ( memory-system-part-5 ) there is a call to m_allocator.GetAllocationSize(originalMemory) which assumes there is some internal bookeeping to keep track of the size of each allocation done in any allocator.
    I haven’t seen this function in any of the allocator interfaces that you’ve uploaded so far, and it would be highly interesting to have some insight on what you did there, as half the workings of the Free function depend on what the GetAllocationSize returns.

    Thank you in advance for your help Stefan, and keep up the awesome work.

    • Thanks Jaime!

      Well, how GetAllocationSize() is implemented is up to the allocator. Here’s a quick breakdown of what individual allocators do internally:
      – Linear and Stack Allocator: Allocate an extra 4 bytes for each allocation, and store the size in the first 4 bytes, right before the pointer handed to the user.
      – Pool Allocator: The size is trivially known and doesn’t need to be stored somewhere.
      – Micro Allocator: Forwards the call to the correct pool that was used for the allocation. I’ll talk about Micro Allocators in a future blog post.
      – General Purpose Allocator: Stores the size in the allocation header.

      In code, it looks like this:

      inline size_t LinearAllocator::GetAllocationSize(void* allocation) const
      void* as_void;
      char* as_char;
      size_t* as_size_t;

      // grab the allocation's size from the first N bytes before the user data
      as_void = allocation;
      as_char -= sizeof(size_t);
      return (*as_size_t);

      Hope that answers the questions you had!

      • Yes. Thank you so much.
        I actually implemented it last night for the linear allocator in a very similar way to the one you described.

        Thanks again!

      • Hi.

        Does this function presume allocation pointer is 4B aligned?

        linearAllocator.Allocate(16, 2, 0) will return a pointer that is 2B aligned and if query
        GetAllocationSize is made with this returned pointer, would not this try to dereference size_t variable on 2B aligned address and crash on some platforms?

      • The function itself does not assume any alignment, and you can deal with alignment in several ways.
        One way would be to enforce a minimal alignment (e.g. 8 or 16) on platforms that do not allow reads/writes from/to arbitrarily aligned addresses, and take whatever alignment the user wants on all other platforms.

      • Sorry one last question regarding alignment. I won’t spam anymore, atleast not in this topic 😉

        Lets say that user for whatever reason, wants to allocate 128 bytes, with the offset 6B. So in this case the returned ptr from the function + 6B will represent 4B aligned address, e.g linearAllocator.Allocate(128, 4, 6). So far so good. Allocator also need to store allocation size of 4B right in front of the memory handed to the user. In this case pointer returned to the user is not 4B aligned(+6B is aligned), and with this pointer GetAllocationSize is called, to query the allocation size, which can create problems due to alignment.

        I’m probably missing something here.

      • You’re not missing anything, I’d simply chalk this up to ‘user-error’ :).
        The offset is mostly there to account for debug features such as bounds-checking, to ensure that as little memory as possible is wasted due to alignment.
        If a user comes and asks for an allocation that is to be aligned at ‘address + 6 bytes’, then so be it.

  4. Hi Stefan,
    Thanks for the amazing series. I’m following it an learning a lot. It’s really useful as I am making a small engine to improve my skills and I found your design choices really interesting and clever.
    However, I have encountered a problem while writing the allocator. For memory alignment I’ve seen you are using a particular function: alingTop. I have done some research about writing a function like yours but I’ve found little information. Could you be so kind to explain how to implement one?

    Thank you again,

    • Hi Marco,

      You just need to align a memory address to an upper power-of-two boundary. It’s the same as rounding up an integer, but applied to arbitrary pointers.

      Rounding integers up/down can easily be accomplished using bitwise operators when rounding to powers-of-two. I’m sure with that you can figure it out – watch for corner cases, though :).

      • Hi Stefan,
        thanks for the answer 🙂 I wrote a routine now that seems to work quite well! Thanks for the hint again and please keep posting about the development of you engine!
        All the best,

  5. Pingback: Translating a human-readable JSON-like data format into C++ structs | Molecular Musings

  6. Pingback: Stateless, layered, multi-threaded rendering – Part 4: Memory Management & Synchronization | Molecular Musings

  7. Pingback: Temporary allocations in C++ | Prog stuff

  8. Pingback: Coherent Labs » Temporary allocations in C++

  9. Once again a fantastic post 🙂 Stupid question; why is the [start… end) interval exclusive at end ?

    • Because it’s coherent with how zero-based addressing works.
      As an example, if you have an array of size 10, access to elements [0]-[9] is valid, but not [10].

      Similarly, if you allocate space for 100 bytes and your pointers sit at (start) and (start+100), the byte at (start+100) can *not* be addressed.

      • Ah thx; so when specifying an area’s size for an integer, you would write for example:
        HeapArea heapArea(sizeof(int) + 1);
        MemoryArena area(heapArena);
        int* data = ME_NEW(int, arena);

        Right? If I set the heap’s size to sizeof(int), msvc throws an exception.

      • Without knowing how your HeapArea and MemoryArena is implemented, there is not much I can help you with?
        Passing sizeof(int) + 1 is suspicious, though. What do you need the extra byte for? Doesn’t make sense to me.

    • I don’t want to sound harsh, but I’m not going to look for bugs in your code, given the fact that most of it is a 1:1 verbatim copy of code posted on my blog, without giving any attribution, credit, link, or similar in your Comet Engine code hosted on Github.

      That being said, if you’re storing the allocation size somewhere, there won’t be space for an int if you allocate only sizeof(int) + 1 bytes – where should the allocation size go?
      Or your alignment might be off. Or your pointer arithmetic.
      There are many things that can go wrong, but I’m sure you can figure it out by using a few printfs or the debugger. The linear allocator should be very easy to debug.

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.