This tackles a similar issue to D1739, but for components, and it is also faster.
Currently, the ComponentManager creates components via AllocFunc -> new, then stores the pointers in several places. The key container is m_ComponentsByTypeID.
Instead of allocating these components with new, it's better to use an arena-like allocator. It's probably better to store components of the same type next to each other than to store the components of the same entity together.
This introduces a container that's also an allocator, guaranteeing pointer stability & fast access. It's essentially a std::deque<CCmpWhatever> backend with an EntityMap<IComponent*> (roughly equivalent to std::unordered_map<entity_id_t, IComponent*>) interface, except that, because of C++ typing, this has to be handled at runtime.
It combines a few advantages:
- Fast random access, one indirection, similar to a vector. The performance of unordered_map might be better with very high sparsity, I'm not sure.
- Very good cache locality, as the data is mostly dense (free slots are reused), making for very fast iteration, and generally good performance when accessing the same components of different entities (good for messaging and various other uses). unordered_map would be slower at iterating, even with arena-backed data, as the iterator isn't in the same place as the data, and there's no guarantee it's iterated in key order. EntityMap would iterate in key order, but it's also sparser, which is bad, particularly for large data.
- Neat interface: there's no need to store a separate allocator (this is slightly arguable, but I think it's neater like that).
- Like our current approach, pointers are stable, so the component cache can stay the same.
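The idea can be illustrated with a rough sketch. This is not the code in this patch, only a minimal illustration under some assumptions: the class name ComponentStore, its method names and the CCmpExample component are hypothetical, and the sketch is templated on the component type, which sidesteps the runtime typing issue mentioned above. It relies on the fact that std::deque never invalidates pointers to existing elements when growing at the end.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <vector>

using entity_id_t = std::uint32_t;

// Hypothetical stand-in for the real component interface.
struct IComponent
{
	virtual ~IComponent() = default;
};

// Hypothetical example component.
struct CCmpExample : IComponent
{
	int value = 0;
};

// Sketch of a container that doubles as an allocator for one component type.
template <typename T>
class ComponentStore
{
public:
	// Returns a stable pointer to the component of `ent`, creating it if needed.
	T* Insert(entity_id_t ent)
	{
		if (ent >= m_Index.size())
			m_Index.resize(ent + 1, nullptr);
		if (m_Index[ent])
			return m_Index[ent];

		T* slot;
		if (!m_Free.empty())
		{
			// Reuse a freed slot so the data stays mostly dense.
			slot = m_Free.back();
			m_Free.pop_back();
			*slot = T(); // a real implementation might destroy/construct in place
		}
		else
		{
			// Growing a deque at the end keeps existing pointers valid.
			m_Data.emplace_back();
			slot = &m_Data.back();
		}
		m_Index[ent] = slot;
		return slot;
	}

	// One indirection through the sparse index, similar to a vector lookup.
	T* Find(entity_id_t ent) const
	{
		return ent < m_Index.size() ? m_Index[ent] : nullptr;
	}

	// Unlinks the entity and marks its slot as reusable.
	void Erase(entity_id_t ent)
	{
		T* slot = Find(ent);
		if (!slot)
			return;
		m_Index[ent] = nullptr;
		m_Free.push_back(slot);
	}

	// Number of slots ever allocated (live or free).
	std::size_t Capacity() const { return m_Data.size(); }

private:
	std::deque<T> m_Data;    // dense, pointer-stable storage
	std::vector<T*> m_Index; // sparse index: entity ID -> slot
	std::vector<T*> m_Free;  // slots available for reuse
};
```

In this sketch the sparse index grows with the highest entity ID, which is the trade-off against unordered_map mentioned in the first point.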
Unfortunately this changes hashes, since the iteration order is not the same as in SVN. But on a simple test like just opening Combat-Demo-Huge and doing nothing, we can measure the raw message overhead, and I'm seeing about 10% improvement in SimUpdate. I would expect the game to be generally slightly faster, particularly as time goes on.
TODO:
- Verify that the sparse key index is actually faster than unordered_map, since that might change at some level of sparsity.
- Run some more profiling.
- Add tests.