In performance-sensitive applications like games it is crucial to access data in a cache-friendly manner. Especially when dealing with a large number of objects of the same type, e.g. individual components in an entity-component-architecture, we should make sure to read as little data as possible. However, simple arrays-of-structures are often not suited for this, with structures-of-arrays yielding better performance. But the latter are not natively supported by the C++ language.
I’m proud to announce that the first evaluation SDK for our input technology is now available! A new version of the core technology has also been released, with some minor additions and improvements.
Check out www.molecular-matters.com for more information on the input library. Further SDKs will follow during the next few months.
One task that is pretty common in game development is to transform data according to some sort of hierarchical layout. Today, we want to take a look at probably the most well-known example of such a task: transforming joints according to a skeleton hierarchy.
After lots of work I’m proud to finally announce that the first evaluation SDK for our core technology is now available!
Check out www.molecular-matters.com for more information on the core library. Further SDKs will follow during the next few months.
Even though a task scheduler can help with alleviating the burden of having to distribute small pieces of work to different threads, it cannot help preventing a few issues common in multi-threaded programming, especially in multi-processor environments.
Continuing from where we left off last time, this post explains how parent-child relationships are handled inside the task scheduler, and how streaming tasks can be split automatically by the scheduler.
In this part of the series, we will discuss Molecule’s task model in detail, and have a look at the underlying C++ code and some subleties we need to watch out for, as well as some unique optimization opportunities.
With multicore hardware becoming the norm in both PC/console-based gaming as well as on mobile platforms, it is crucial to take advantage of every processor core and thread being thrown at us developers. Therefore, it is important to build technology alleviating the task of writing correct, multi-threaded code.
In order to achieve that, this series will try to explain how to build a load-balancing, work-stealing task scheduler.
Everybody knows that writing correct multithreaded code is hard, even when using proper synchronization primitives like mutexes, critical sections, and the likes. (Ab)using the volatile keyword for synchronization purposes makes a programmer’s life even harder – read on if you care to know why, and help spreading the word.
A fast content pipeline is crucial for today’s engines, allowing the user to quickly iterate on features, assets, etc. Molecule’s content pipeline enables hot-reloading of every asset the engine understands (textures, models, shaders, …), while retaining blazingly fast loading times.
Whenever you’re baking signals into texture maps, there’s quite some issues to watch out for, regardless of the signal you’re baking. This post will try to explain issues I ran into while implementing a baking framework.
I’ve recently had the need for a simple ray-tracer since I started working on a precomputed radiance transfer (PRT) baking tool. There’s multitudes of sample code, implementations and SDKs available, so I wanted to share my findings after a bit of research.
Eventhough gamma-correct rendering is absolutely crucial in order to get good and realistic lighting results, many programmers (sadly) still don’t care about it. Today, I want to show how different the visual results of lighting in gamma-space vs. linear-space actually are, and what you can do about it.
There’s a horrible disease out there, and a lot of programmers – both junior and senior – are affected by it: Singletonitis. Please, let’s help stopping the plague from spreading further by reading and understanding this post.
Let’s face it, performance on modern processors (be it PCs, consoles or mobiles) is mostly governed by memory access patterns. Still, data-oriented design is considered something new and novel, and only slowly creeps into programmers’ brains, and this really needs to change. Having co-workers fix your code and improving its performance really is no excuse for writing crappy code (from a performance point-of-view) in the first place.
Nowadays, with each platform (even mobiles) supporting some sort of SIMD registers and operations, it is crucial for every codebase/engine to fully utilize the underlying SIMD functionality. However, SIMD instruction sets, registers and types vary for each platform, and sometimes between compilers as well.
This blog post explains how a platform-agnostic SIMD implementation can be built, without having to port large portions of the codebase for each new supported platform.