In today’s post, we will finally take a look at the last remaining piece of the new job system: adding dependencies between jobs.
I recently had the need to retrieve the type of a template argument as a human-readable string for debugging purposes, but without using RTTI – so typeid and type_info were out of the question.
Continuing from where we left off last time, today we are going to discuss how to build high-level algorithms such as parallel_for using our job system.
This week, we will finally tackle the heart of the job system: the implementation of the lock-free work-stealing queue. Read on for a foray into low-level programming.
As promised in the last post, today we will be looking at how to get rid of new and delete when allocating jobs in our job system. Allocations can be dealt with in a much more efficient way, as long as we are willing to sacrifice some memory for that. The resulting performance improvement is huge, and certainly worth it..
Back in 2012, I wrote about the task scheduler implementation in Molecule. Three years have passed since then, and now it’s time to give the old system a long deserved lifting.
The last post of this series basically concluded with the following questions: how do we efficiently allocate memory for individual command packets in the case of multiple threads adding commands to the same bucket? How can we ensure good cache utilization throughout the whole process of storing and submitting command packets?
This is what we are going to tackle today. I want to show how bad allocation behavior for command packets can affect the performance of the whole multi-threaded rendering process, and what our alternatives are.