Using runtime-compiled C++ code as a scripting language: under the hood

Some time ago, I announced that the Molecule Engine uses C++ as a scripting language. Today, I can share implementation details and a few additional tricks that were used to keep compilation times and executable sizes down.

Prerequisites

Why use C++ for scripting when there are many readily available scripting languages out there? Scripting languages boast a few advantages compared to compiled languages:

  • They are mostly dynamically typed. Scripters don’t need to worry about whether something is a string, an integer, a boolean, or something entirely different.
  • They are mostly interpreted, making it easier to reload changes on-the-fly.
  • Scripters don’t have to deal with pointers, references, and other language details.

Scripting languages also exhibit some disadvantages compared to compiled languages:

  • They are not as fast as native code, even when JIT-compilation is used. Additionally, JIT-code is not an option on several (console) platforms for security reasons.
  • Proper debugging needs a 3rd-party debugger, or you have to roll your own debugging tools.
  • Calling C++-code from script code (or vice versa) always needs some kind of wrapper/translation layer in between, which has to be implemented either manually or automatically (using e.g. SWIG). I find both methods tedious.

For scripting in Molecule, I wanted to have the best of both worlds:

  • Scripting code should be as fast as native code.
  • Scripters should never have to deal with low-level language details directly. They shouldn’t need to worry about pointers, references, ownership, memory management, etc.
  • It should be possible to reload script code in a fraction of a second, while the engine is running. I love short iteration times.
  • Debugging should be supported out-of-the-box, preferably by using the platform’s “native” debugger such as Visual Studio, gdb, SN Systems debugger on PS3, etc.
  • The language should be statically typed. In my experience, dynamically typed languages are great for writing code initially, but absolutely suck when you have to find out why some code in a 50kb script file won’t work.
  • Script code should have direct access to all engine code (if desired) without any translation layers in-between.

With a few tricks we can make C++ fulfill all of the above requirements.

How it works

Conceptually, the idea is simple: each script consists of a single C++ file that is compiled into its own dynamic library (e.g. .dll on Windows, .prx on PS3). Whenever the file contents change, we store the script state into memory, unload the library, compile the script, load the new library, and restore the script state from memory. Done.
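The "store the script state, then restore it" part of that cycle could be sketched as follows. This is only an illustration under my own assumptions: MemorySerializer, DoorScriptState and ReloadScript are hypothetical names, not Molecule's actual serializer, and the unload/compile/load steps are reduced to a comment.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical in-memory serializer: one Serialize() function per script
// handles both saving and restoring, depending on the current mode.
class MemorySerializer
{
public:
  // Rewinds the buffer and switches from saving to restoring.
  void BeginRestore() { m_restoring = true; m_cursor = 0; }

  // Saves or restores a single trivially copyable value.
  template <typename T>
  void Value(T& value)
  {
    static_assert(std::is_trivially_copyable<T>::value, "raw byte copy only");
    if (!m_restoring)
    {
      const unsigned char* bytes = reinterpret_cast<const unsigned char*>(&value);
      m_buffer.insert(m_buffer.end(), bytes, bytes + sizeof(T));
    }
    else
    {
      assert(m_cursor + sizeof(T) <= m_buffer.size());
      std::memcpy(&value, m_buffer.data() + m_cursor, sizeof(T));
      m_cursor += sizeof(T);
    }
  }

private:
  bool m_restoring = false;
  std::size_t m_cursor = 0;
  std::vector<unsigned char> m_buffer;
};

// Stand-in for the state a script wants to keep across a reload.
struct DoorScriptState
{
  float openAmount = 0.0f;
  int timesOpened = 0;

  void Serialize(MemorySerializer& serializer)
  {
    serializer.Value(openAmount);
    serializer.Value(timesOpened);
  }
};

// Simulates one reload: save from the old instance, restore into a new one.
inline DoorScriptState ReloadScript(DoorScriptState& oldInstance)
{
  MemorySerializer serializer;
  oldInstance.Serialize(serializer);   // store the script state in memory

  // Unloading the library, compiling the script and loading the new
  // library would happen here.

  DoorScriptState newInstance;
  serializer.BeginRestore();
  newInstance.Serialize(serializer);   // restore the state from memory
  return newInstance;
}
```

Because values are read back in the order they were written, a scheme this simple only survives reloads that do not reorder or remove serialized members.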

On the engine side, a script is a class that implements an interface with a few virtual functions. The script system is responsible for updating all registered scripts each frame. The dynamic library exports two C functions responsible for creating and destroying a script.

The script interface is defined as follows:

class ME_NO_VTABLE_INIT ScriptBase
{
public:
  void SetupSharedLibrary(void);

  void SetEnvironment(const ScriptEnvironment& environment);

  virtual void Startup(void) ME_ABSTRACT;
  virtual void Shutdown(void) ME_ABSTRACT;	
  virtual void Serialize(ScriptSerializerBase* serializer) ME_ABSTRACT;	
  virtual void Update(float deltaTime) ME_ABSTRACT;

protected:
  gameScripting::LogBase* m_log;
  gameScripting::DebugDrawBase* m_debugDraw;
  gameScripting::WorldBase* m_world;
};

Whenever a script is created, the creation function initializes the shared library (more on that later), and then the script system sets up the environment – this initializes the protected members which are used by each script to access exposed engine functionality.

This gives us a statically typed language, native debugging using our favorite debugger, the performance of native code, and fast iteration times if we manage to keep the compile times down. Additionally, we can directly access the full C++ codebase and all engine code by linking with the engine libraries.

As you may have guessed already though, there are quite a few things to watch out for in order to make the system work efficiently.

Avoiding low-level language details

Depending on how your engine is set up, this can either be trivial to accomplish, or result in a lot of work. Because Molecule only uses IDs and handles to identify components, entities, and the like, I did not have to come up with another solution for hiding raw pointers from users. Scripts use the exact same types as regular C++ engine code does, e.g.:

graphics::MeshComponentId mesh = m_world->AddMeshComponent(m_meshEntity[i], math::MatrixIdentity(), "cube", "floortiles");

In case the engine code still identifies assets using raw pointers, I would highly recommend implementing a system that only ever returns opaque data to the user instead of pointers.
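As a rough idea of what "opaque data instead of pointers" can look like, here is a minimal sketch of an index-plus-generation handle. The generation counter is a common technique that I am assuming for illustration; MeshId, Mesh and MeshStorage are hypothetical names and not the engine's real types.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Opaque ID handed to users: a slot index plus a generation counter.
struct MeshId
{
  std::uint32_t index;
  std::uint32_t generation;
};

// Stand-in for whatever the engine stores internally.
struct Mesh
{
  int vertexCount;
};

class MeshStorage
{
public:
  // Stores a mesh and returns an ID, reusing a free slot if one exists.
  MeshId Add(const Mesh& mesh)
  {
    std::uint32_t index;
    if (!m_freeList.empty())
    {
      index = m_freeList.back();
      m_freeList.pop_back();
      m_slots[index].mesh = mesh;
      m_slots[index].alive = true;
    }
    else
    {
      index = static_cast<std::uint32_t>(m_slots.size());
      m_slots.push_back(Slot{mesh, 0u, true});
    }
    return MeshId{index, m_slots[index].generation};
  }

  // Frees the slot and bumps its generation, so old IDs become stale.
  void Remove(MeshId id)
  {
    if (!IsValid(id))
      return;
    Slot& slot = m_slots[id.index];
    slot.alive = false;
    ++slot.generation;
    m_freeList.push_back(id.index);
  }

  bool IsValid(MeshId id) const
  {
    return id.index < m_slots.size()
        && m_slots[id.index].alive
        && m_slots[id.index].generation == id.generation;
  }

  // Returns nullptr for stale or unknown IDs instead of a dangling pointer.
  const Mesh* Lookup(MeshId id) const
  {
    return IsValid(id) ? &m_slots[id.index].mesh : nullptr;
  }

private:
  struct Slot
  {
    Mesh mesh;
    std::uint32_t generation;
    bool alive;
  };

  std::vector<Slot> m_slots;
  std::vector<std::uint32_t> m_freeList;
};
```

A script that keeps an ID around after the object has been removed gets a failed lookup rather than a crash, which is the property that matters most for scripters.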

Compilation

Scripts are compiled using the platform’s native toolchain, e.g. on Windows I use cl.exe for compiling the C++ code. Molecule’s content pipeline uses a directory watcher to be notified whenever a file changes. As soon as a script modification has been detected, the content pipeline spawns cl.exe in a new process, and compiles the single .cpp file with the correct command-line options. Both the compiler to use and its options can be set up per script, similar to any asset consumed by the engine.

Instead of using the batch-file that ships with Visual Studio for setting up the compilation environment, I wrote my own that only does the minimal amount of setup work needed in order to use the compiler, vastly improving compile times. Additionally, each script includes only one file which makes all required engine parts available to the script. This include file uses a pre-compiled header, which further improves compile times.

A simple script such as the one used in the video now takes 0.3 seconds instead of more than 1 second. That figure includes spawning a new process, compiling the script, and reloading the script into the engine.
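To make the per-script compiler options a bit more concrete, the following sketch assembles a cl.exe command line from a small options struct. The struct, its field names and the file names are hypothetical; the switches themselves (/nologo, /LD, /Zi, /Yu, /Fp, /Fe) are standard cl.exe options, but this particular combination is my assumption and not necessarily what Molecule passes.

```cpp
#include <cassert>
#include <string>

// Hypothetical per-script build settings, as they might be stored
// alongside other asset options in a content pipeline.
struct ScriptCompileOptions
{
  std::string compiler = "cl.exe";
  std::string sourceFile;   // the single .cpp file of the script
  std::string outputDll;    // name of the resulting dynamic library
  std::string pchHeader;    // header the pre-compiled header was built from
  std::string pchFile;      // the .pch file to use
  bool debugInfo = true;    // emit a .pdb for the native debugger
};

inline std::string Quote(const std::string& text)
{
  return "\"" + text + "\"";
}

// Builds the command line that would be handed to the process-spawning API.
inline std::string BuildCompileCommand(const ScriptCompileOptions& options)
{
  std::string command = options.compiler;
  command += " /nologo /LD";                       // quiet, build a DLL
  if (options.debugInfo)
    command += " /Zi";                             // debug information
  if (!options.pchHeader.empty())
  {
    command += " /Yu" + Quote(options.pchHeader);  // use the PCH
    command += " /Fp" + Quote(options.pchFile);    // PCH file to read
  }
  command += " /Fe" + Quote(options.outputDll);    // output file name
  command += " " + Quote(options.sourceFile);
  return command;
}
```

Actually spawning the process is left out here; the string would simply be passed to whatever mechanism the pipeline uses for launching tools.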

Exception handling

Scripters make mistakes. Programmers make mistakes. Therefore, it would be nice to have at least some kind of protection against mistakes that could bring the whole engine down. Molecule generally uses a custom exception filter that is used to catch things such as access violations, page faults, etc.

The idea is to install a custom exception filter when running the scripts, handle any exception inside the filter, and notify the system that the script “crashed” without having to close the main application. There is one caveat though: simply returning from the exception filter would take us back to the faulty script, which is not something we want to do. Instead, we throw a C++ exception inside the exception filter which is caught by the surrounding code in the script system, as in the following code:

const LPTOP_LEVEL_EXCEPTION_FILTER oldExceptionFilter = SetUnhandledExceptionFilter(&ScriptExceptionFilter); 

bool success = true;
try
{
  script->Update(deltaTime);
}
catch (const ScriptException&)
{
  ME_ERROR("ScriptSupport", "An exception has been caught, forcing the script to be disabled. See the log output for more details.");
  success = false;
}
catch (...)
{
  ME_ERROR("ScriptSupport", "An unknown error occurred, forcing the script to be disabled.");
  success = false;
}

SetUnhandledExceptionFilter(oldExceptionFilter);

The ScriptExceptionFilter captures the call stack and logs some info about the exception that occurred, and then simply executes throw ScriptException();.

Note that in order for this to work we need to compile this translation unit with C++ exceptions turned on. In shipping builds, we can simply call the script’s Update() function, not using any exception mechanisms at all.
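On the script system side, "forcing the script to be disabled" might look roughly like the sketch below. It is portable on purpose: the Windows exception filter is not shown, and FaultyScript simply throws ScriptException itself to stand in for what the filter would do. All names other than ScriptException and Update are hypothetical.

```cpp
#include <cassert>
#include <vector>

// Thrown by the exception filter in the real system; thrown directly
// by FaultyScript in this sketch.
struct ScriptException {};

class Script
{
public:
  virtual void Update(float deltaTime) = 0;

protected:
  ~Script() {}
};

// A script plus the flag the script system uses to skip crashed scripts.
struct ScriptSlot
{
  Script* script;
  bool enabled;
};

// Updates all enabled scripts; returns how many were disabled this frame.
inline int UpdateAllScripts(std::vector<ScriptSlot>& slots, float deltaTime)
{
  int disabledCount = 0;
  for (ScriptSlot& slot : slots)
  {
    if (!slot.enabled)
      continue;

    try
    {
      slot.script->Update(deltaTime);
    }
    catch (const ScriptException&)
    {
      slot.enabled = false;   // the script "crashed", keep the engine running
      ++disabledCount;
    }
    catch (...)
    {
      slot.enabled = false;   // unknown error, same treatment
      ++disabledCount;
    }
  }
  return disabledCount;
}

// Two test scripts: one well-behaved, one that always fails.
class HealthyScript : public Script
{
public:
  void Update(float) override { ++updateCount; }
  int updateCount = 0;
};

class FaultyScript : public Script
{
public:
  void Update(float) override { throw ScriptException(); }
};
```

A disabled script stays registered, so it can be re-enabled once a fixed version has been compiled and reloaded.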

Debugging

Capturing the call stack inside the exception filter only works when the correct symbols have been loaded, which can be achieved using SymLoadModuleEx. This works as long as we use StackWalk64 for capturing the call stack.

However, the debugger also needs to load the symbols, which is done internally only when calling LoadLibrary. From what I gathered, LoadLibrary internally sends an event to the debugger, which is kind of an undocumented feature that could be simulated by calling RaiseException with the proper parameters.

In Molecule, I copy both the .dll and the .pdb to a temporary location, and load them using LoadLibrary in development builds. In shipping builds, the .dlls are embedded into the resource packages, and loaded from memory because the .pdbs are not needed.

Note that /PDBALTPATH is a linker option, so it has to be passed to the linker when building the C++ scripts if you want the correct .pdb file to be found once the library is loaded. Once everything is set up correctly and the debugger can find the .pdb file, you can set breakpoints and debug your scripts just like you would any other C++ code.

Global state

As you probably know, dynamic libraries can be a bit of a hassle if you have lots of global state and things like singletons, because those “live” inside the main application as well as inside each dynamic library.

Fortunately, Molecule does not have even one singleton, and only a few global variables. Among those globals are things such as the head and tail pointers to the intrusive linked list of loggers, the string hash database, and the global D3D device. And that’s about it.

Nevertheless, we need to make sure that both the main application and all dynamic libraries access the same objects.

On Windows, we can use Named Shared Memory to store all global state from within the main application, which is later retrieved by each dynamic library upon startup in SetupSharedLibrary().

Optimizing script code size

During development, script code can access all exposed engine functionality via the interfaces stored as protected members. By using interfaces that are set up by the main application, the script does not need to link with all the engine code, but rather only with the C and C++ libraries. Calling a function from the interface is just a regular virtual function call, with the v-table being contained in the main executable already. Not having to link all libraries decreases the code size, and speeds up the compilation process.
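The reason no linking is needed can be shown with a small sketch. The script only ever sees an abstract class, so the compiler emits an indirect call through the v-table and never references a symbol from the engine. LogBase mirrors the name used in the script interface above, but its Info() function, EngineLog and HelloScript are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Shared header: the only thing the script is compiled against.
class LogBase
{
public:
  virtual void Info(const char* message) = 0;

protected:
  ~LogBase() {}   // scripts never destroy engine interfaces
};

// Engine side: the concrete implementation lives in the main executable.
class EngineLog : public LogBase
{
public:
  void Info(const char* message) override { lines.push_back(message); }
  std::vector<std::string> lines;
};

// Script side: knows LogBase only, so there is nothing to resolve at link time.
class HelloScript
{
public:
  void SetEnvironment(LogBase* log) { m_log = log; }

  void Update(float /*deltaTime*/)
  {
    m_log->Info("hello from script");   // plain virtual call
  }

private:
  LogBase* m_log = nullptr;
};
```

In the real setup, EngineLog would be compiled into the executable and HelloScript into the script library; they are shown in one file here only so the example is self-contained.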

But we can do better! Using dumpbin on the generated object file tells us that the script .dll still contains a lot of things we don’t need.

Earlier we said that each script .dll exports two C functions: one for creating the script, and one for destroying the script. We could use ordinary new and delete for that purpose, but that would pull in operator new and operator delete from the runtime libraries.

Instead, we can use a static buffer inside the .dll that is large enough to hold the script, and use placement new to initialize it. This gets rid of requiring an implementation for operator new and operator delete.
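A minimal sketch of that idea follows, assuming one script instance per library. CreateScript, DestroyScript and MyScript are hypothetical names, the ScriptBase shown is a cut-down stand-in for the real interface, and the platform-specific export decoration (e.g. __declspec(dllexport) on Windows) is omitted to keep the example portable.

```cpp
#include <cassert>
#include <new>   // placement new; a CRT-free build would declare its own

// Cut-down stand-in for the script interface.
class ScriptBase
{
public:
  virtual void Update(float deltaTime) = 0;

protected:
  ~ScriptBase() {}
};

class MyScript : public ScriptBase
{
public:
  void Update(float deltaTime) override { elapsed += deltaTime; }
  float elapsed = 0.0f;
};

namespace
{
  // Static storage inside the library, large and aligned enough for the script.
  alignas(MyScript) unsigned char g_scriptStorage[sizeof(MyScript)];
  bool g_scriptAlive = false;
}

// Constructs the script in the static buffer; no operator new involved.
extern "C" ScriptBase* CreateScript()
{
  if (g_scriptAlive)
    return nullptr;   // only one instance fits into the buffer

  g_scriptAlive = true;
  return new (g_scriptStorage) MyScript();
}

// Runs the destructor explicitly; no operator delete involved.
extern "C" void DestroyScript(ScriptBase* script)
{
  if (!g_scriptAlive || script == nullptr)
    return;

  static_cast<MyScript*>(script)->~MyScript();
  g_scriptAlive = false;
}
```

The protected, non-virtual destructor in the base class is a deliberate choice in this sketch: a virtual destructor would make the compiler reference operator delete again.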

Additionally, Molecule scripts do not need static C++ instances, initterm(), atexit(), and a lot of other things supported by the C/C++ runtime. In fact, you do not need to link the C/C++ runtime at all if you are careful: by providing your own _DllMainCRTStartup function and some dummy symbols in order to keep the linker quiet, we only need to link with kernel32.lib – and nothing else.

After those optimizations, the .dll for the script shown in the video is about 3kb, which is less than the 6kb of script (C++ source) code.

It is important to note that you can still link with the engine libraries and use the engine directly, just as you would from the rest of your code. This is especially useful when programmers need to help scripters or need to aid in debugging, and would like to use the engine’s visualization or debugging features. Simply add the C++ code, link the engine libraries (either by changing the compilation options or using #pragma comment(lib)), and presto: all the functionality is available without having to restart the application!

Remarks and conclusion

With the above optimizations employed, C++ can truly be used as a scripting language, offering all the benefits normally only available to scripting languages.

A student of mine has successfully implemented a similar system on the PS3 using .prx libraries (Thanks Niki!).

33 thoughts on “Using runtime-compiled C++ code as a scripting language: under the hood”

  1. Why don’t you use __try/__except form of SEH? That way you can control the execution, but don’t have to use C++ EH.

    DWORD handler(DWORD code) {
    log(…);
    return EXCEPTION_EXECUTE_HANDLER;
    }

    success = true;
    __try {
    Update(dt);
    } __except(handler(GetExceptionCode())) {
    success = false;
    }

    • Good point, that would also work.
      The reason why I use a custom exception filter is that I already had the implementation lying around (dealing with EXCEPTION_POINTERS), so I could just reuse 99% of the code, including stack tracing among other things.

  2. Hi, what would you do if you had to support a platform, where dynamic libraries are not supported (iOS)?

    • I would assume it doesn’t really matter since the benefits of hot-loading script code are a development consideration not a shipping one…

      You would simply link the script code into the executable statically for the shipping builds and only support shipping builds on that target platform (iOS). I assume this is exactly what Unreal Engine 4 is doing..

      • Yes, this is also what I would do.
        Granted, I didn’t give it that much thought for platforms like iOS because it’s not one of my target platforms.

  3. Pingback: Recommended Reading: RunTime C++ Scripting | Anthony's Scribbles

  4. What happens when the script is doing something that crashes the engine ? :), I guess the engine is not fully protected by the script’s bad doings. Is the script using only IDs when controlling stuff in the engine ?

    • Yes, the scripts only use IDs to access things in the engine, just like regular C++ code.
      As noted under the paragraph “Exception Handling”, the engine has a protection mechanism against bugs such as access violations caused by the script. Those will be caught, and the script will be disabled, without crashing the engine. The only way to crash the engine so that it needs to be restarted is if you somehow stomp the engine’s data structures.

      Instead of using an unhandled exception filter, you could also implement a system where the scripts are all started in a separate process, using RPCs to talk to the engine. This would completely protect the engine from scripts, and could be used in development builds.

  5. Hello,

    Nice stuff! I have a question though: if multiple scripts include the same C++ header and the user changes code in that header, how do you know which scripts to recompile?

    Cheers !

    • Hi,

      The content pipeline has a dependency system in place: each asset uses certain options to be built, and can have an unlimited number of dependencies. Those dependencies are gathered each time an asset is compiled, and are stored in the asset build database. In addition to the dependencies, this build database also stores the timestamp of the file that was last compiled, and the version of the tool that compiled it. This database is used for fast incremental builds that only build what needs to be built. The dependencies themselves are gathered from the respective compilation tool, e.g. the tool that builds the scripts is responsible for parsing the include files. Similarly, the tool that compiles shaders is responsible for parsing shader include files.
      For each incremental build, we can now simply consult the database, check all dependencies (+ their timestamp) of all assets and gather a list of assets that need to be compiled again, and only compile those.

  6. Hi there,

    I’m very enthusiastic about the current interest in runtime-compiled C++ and looking to play around with it a little myself. There is one thing that I can’t fathom, which is how to best deal with dangling pointers to objects that have been “recompiled” and constructed.

    I’ve seen one example where the old objects are not updated and only new objects contain the changes. This seems elegant enough and avoids the dangling pointers problem, but you don’t get to see the changes update immediately.

    Another where the objects do appear to update immediately. I guess the simple way would be object handles instead of pointers but that gives me performance concerns.

    How are you approaching this specific problem?

    Thanks
    DB

    • Hi Dominic,

      Using handles or IDs for referring to objects is not a problem performance-wise, as long as you make sure that system-wide updates can access all those objects contiguously. I wrote about this in detail in the following posts:
      Ownership
      Internal References
      External References

      Basically, all the user ever gets is an ID to an entity/component/object. The user can use this ID to manipulate that object, or get a raw pointer to the underlying instance which is guaranteed to be valid for one frame. The data referenced by the IDs is always stored contiguously, and used by the various systems internally.

      The Bitsquid Blog also has more info on this.

      Hope that helps,
      Stefan

  7. This is awesome stuff, thank you. I’m actually working on a real time sketching tool and have been searching for a scripting solution that meets requirements of speed, ease of use, iteration time, debugging, etc. The only issue is that I may need to sandbox the scripts to run untrusted code, and it seems I’d ultimately need to modify the compiler somehow to avoid pointers (I have opaque handles too, so I don’t need pointers.). For UI responsiveness I don’t really want to run the scripts in a separate process, but may have to. If you have any thoughts on this I’d appreciate hearing them. Would also be great if you eventually released a standalone component for this.

    • Can you elaborate a bit on what you mean by “untrusted code” in this context? Is this code your customers would write? Or is this some kind of 3rd-party code?
      If you really wanted to completely shield your executable from possible wrong-doings inside any C++ script code, you would need to run the scripts in a separate process, and possibly use IPC for talking to the main process.

      Releasing a stand-alone product for something like this is pretty hard, I would say. Why? Because it is a very intrusive process, and depends a lot on how the engine/codebase using it is structured. I know of a few engines which would give you nightmares if you ever tried to do something like this, because they rely so heavily on global state, singletons, and suchlike. For turning this into a product, I think it is much more feasible to do something along the lines of Indefiant’s Recode.

  8. Very interesting topic. There was already a question about iOS and runtime-compilation of C++ scripts. As I understand, on systems like android/iOS where we cannot start runtime compilation we should provide already compiled binaries (compile it with all source code) and put it to assets folder, so they will behave as usual script file (lua, Python, etc.). Is this correct?

    Thanks for the article.

    • As I understand, on systems like android/iOS where we cannot start runtime compilation

      Compilation is never started on the target system itself.
      The asset pipeline compiles the code into a shared library (e.g. a DLL on Windows), which is then loaded by the runtime.

      and put it to assets folder, so they will behave as usual script file (lua, Python, etc.).

      If a platform does not support shared libraries, all the scripts (C++ code) are directly compiled into the main executable.

  9. Hello Stefan,

    I was really fascinated by your use of C++ as a scripting language and wanted to try my hand at implementing something similar in my own engine. The problem is I’ve reached a roadblock with my implementation. I’ve got my object files down to about 5 kb from 1 kb script files, cut out the C and C++ runtimes as well, however I’m at a loss at setting up the vtables for the Script base interfaces. I’m no wizard at C++ so I’ve been confused at how you’re able to call engine functions without linking the engine library, the only way I’m able to do that within my engine is by providing static function pointers, any attempt at calling member functions of my Scene for instance, results in undefined references up the wazoo. Still I was focusing mainly on the Script interface class and I’ve narrowed down the problem to the fact that the vtable of the derived Script class is undefined ( __cxxabiv1_class_info yadda yadda). Now I think I’ve found a way to resolve this issue, by compiling with -fno-rtti, but the problem is my game engine needs dynamic type information for certain component tricks as well as some multiplatform shader magic.

    My question is, do you use a similar method to -fno-rtti? I know you use MSVC, and I’m guessing that ME_NO_VTABLE_INIT masks __declspec(novtable) which I assume removes the vtable entry from the abstract base class but I’m unsure why this has to be done, and there really isn’t an equivalent in GCC/G++ without using -fno-rtti on the entire project.

    I’m also really really wondering how you’re able to call functions from the engine without linking and without getting undefined reference errors.

    Thanks in advance man

    • however I’m at a loss at setting up the vtables for the Script base interfaces. I’m no wizard at C++ so I’ve been confused at how you’re able to call engine functions without linking the engine library, the only way I’m able to do that within my engine is by providing static function pointers, any attempt at calling member functions of my Scene for instance, results in undefined references up the wazoo.

      As long as you use abstract interfaces from within the script code, you can call any function of a particular interface you like and don’t need to care about the underlying implementation. That’s the beauty of virtual functions and abstract classes. The implementation of an interface is always provided by the engine runtime, and set up once the script has been created.

      I’ve assembled a small example here: Gist

      My question is, do you use a similar method to -fno-rtti?

      I always compile everything without exceptions, and without RTTI. Still, to me this doesn’t explain the issue. Sounds like you’re trying to use something from the Script base class in your derived Script, but the base class implementation isn’t there. You can either implement those functions in the header, or include the .cpp (if you want), or link with just that part of the library.

      I know you use MSVC, and I’m guessing that ME_NO_VTABLE_INIT masks __declspec(novtable) which I assume removes the vtable entry from the abstract base class

      Indeed, I use __declspec(novtable) on the abstract base classes. The novtable is a bit misleading though, and the name of the macro hopefully makes it clearer: the abstract base class still has a vtable (it needs one, otherwise you could never call virtual functions on it), but it omits setting up a pointer to the vtable in the constructor of the base class. It’s a tiny MSVC-only micro-optimization.

      Let me know if you have more questions!

      • Ah, see what’s confusing me about the example is that on the DLL side you’re only calling functions through the interface which is completely understandable, but the ScriptInterfaceImpl is compiled with the rest of the engine, so how are you able to change the ScriptInterfaceImpl at runtime?

        I’m wondering is all of your engine’s functionality behind abstract interfaces? So you have, for instance, a GraphicsInterface and a PhysicsInterface etc., that would make much more sense.

      • so how are you able to change the ScriptInterfaceImpl at runtime?

        I don’t. Only scripts are runtime-compiled, not the interfaces or the engine code.
        Of course you could also do that, see Runtime Compiled C++.

        I’m wondering is all of your engine’s functionality behind abstract interfaces? So you have, for instance, a GraphicsInterface and a PhysicsInterface etc., that would make much more sense.

        No, not internally. The engine has very few (abstract) interfaces in the runtime code.
        But everything that I want to be exposed to scripters is accessible by using the provided interfaces, where the implementation of an interface merely calls engine functions. This cuts down compilation time for scripts significantly, and can be used more easily by non-programmers who write script code. Furthermore, with this approach your scripts don’t need to link against engine code.

        People who want to use the engine code directly can link against the engine’s libs, and don’t need to use the interfaces provided to a script. This is mostly for programmers writing scripts, or programmers helping with debugging script code.

      • Oh, also before I forget, I understand how to create a detour _DllMainCRTStartup@12, but the LoadLibrary function doesn’t return the HANDLE, just NULL. Is there anything specific DllMainCRTStartup should be doing other than returning a boolean?

      • Is there anything specific DllMainCRTStartup should be doing other than returning a boolean?

        Depends on what you want to do in your scripts.
        Normally, this function is responsible for initializing the C and C++ runtime(s), calling constructors for static instances, etc. If you don’t need that functionality in your scripts (I don’t), then there is nothing you have to do. Just be aware of what’s not going to work then.

      • Ah ok, I’ve got things working now, thanks Stefan! I never knew using purely virtual functions meant you don’t have to link with engine code, I mean I knew the vtable basically used function pointers but I never thought it completely through. It’s pretty fascinating seeing it in action on my own! I’m looking forward to your implementation of parallel_for on your threading system, I’ve got a similar thread pool implementation to one described by Intel, and it has the skeleton of a work-stealing process ready but I was intimidated on how to tackle the problem effectively. Thanks for the help, and looking forward to future blogs!

  10. Awesome. I’m on my way on making my own lightweight game engine, mostly for school projects, and the way you’re solving design and implementation problems with C++ (a language which is, I think, often awful and complicated) is very inspiring ! I may have missed it but, how do your scripts access other scripts content ? Can a script in a game object call another script’s function in another game object ? You said each script include only ONE file, which is the engine interface, so I’m not sure… Thanks a lot !

    • Each script includes one file, which is the PCH included by all scripts. This PCH is lightweight and contains only the minimum amount of declarations and interfaces that scripts need in order to gain access to engine functionality.

      Those engine interfaces also give scripts access to the entity component system, so a script can grab whatever component (ID) it wants and invoke functions based on that. This would be how you access game objects and their components from within scripts.

      • That works fine for components such as meshes and transforms, but what about a script component ? Can I grab it and call its functions ? Or simply communicate with it ?

  11. Hey stefan! I’ve been implementing a similar system for an engine i’m writing. So far i’ve managed to get it to work on Mac OS X through Clang. Now i attempted to do the same on Windows and it seems to compile and load just fine, but my engine functions don’t seem to be getting called from the script. In the Update() method i called a simple Log function which is used by the engine, but nothing is happening. I’ve linked the DLL against the engine’s lib file. Is there something i’m missing?
    Some googling suggested that you need to use GetProcAddress(GetModuleHandle(NULL), “FunctionName”) to call functions on the executable. I shall try the ScriptInterface method also. What do you think?

    • GetProcAddress is used for getting functions exported from a DLL, not the other way around. Using the approach I described, you’d only call virtual functions on an interface, with the implementation of the interface (a concrete derived class) being provided by the engine.
      If you’re going to call engine functions directly by linking against the engine’s lib file, you need to make sure all state is set up correctly and you don’t have global instances living inside both the .exe and the .dll. What is your log function doing? And how is it implemented internally?

      • Thank you for your reply!!!
        I tried the ScriptInterface approach and everything seems to work now. My Log functions are now called in the concrete classes overridden methods, which seems to work great. I can’t seem to understand why it doesn’t work when linking the engine lib directly.
        As for the logger, it simply writes to a file using std::ofstream. The logger consists of a bunch of free functions and any state required by it is kept in a global LoggerState struct.
        I also tried writing a temporary Log function that uses std::cout to log to the console. That didn’t work either.

      • I can’t seem to understand why it doesn’t work when linking the engine lib directly.

        When you link to the engine lib, you essentially duplicate your global state in each of your scripts/DLLs. But it probably is only set up correctly in your main executable, hence functions that require global state won’t work when being called from your DLL. I wrote about possible solutions in the “Global state” paragraph.
        I recommend reading up on .EXEs & .DLLs before going down that route.

      • Thank you for the reply!! I think i’m starting to understand things now! 😀 I’ll be sure to read up on those.
        Speaking of global variables, since your RenderBackend consists of free functions, i was wondering where do you store things such as your pool of Vertex and Index buffers. I’m assuming you have such a pool, since otherwise you can’t use Handles. On my engine i currently have a RenderDevice class which is equivalent to your RenderBackend. I store a single instance of the RenderDevice along with all the other classes that require global access into a struct named EngineContext (inspired by the Context class from Urho3D). An instance of the EngineContext struct is allocated during engine startup through a custom allocator. So technically there is only one global variable which is an EngineContext pointer. Is this a bad approach?

      • It’s not a bad approach per se, but depends a bit on *what* you store inside the EngineContext. Having an engine context that stores other instances by value is not bad, but I would advise against structs storing *pointers to instances* excessively, because that can lead to a lot of pointer chasing, e.g. g_engineContext->renderDevice->DoSomething().

        The setup I use is the following: render backend functionality lives inside a single namespace, split across several .cpp files. Inside each of these .cpp files I store what the engine needs to implement the functionality in that file, so there is no overarching render context or similar. Individual parts of the backend are initialized when the backend starts up, and free their buffers/pools when the backend is shut down.
