On my GitLab repo I implemented the change for only one component, which makes it easier to review.
All performance tests were run on a non-visual replay of "Combat Demo Huge Auto" from trainer tools.
Store system components directly inside CComponentManager instead of in the map.
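To illustrate the idea, here is a minimal sketch of the two storage layouts. All names in it (IComponentSketch, TimerSketch, MapBackedManager, DirectManager) are hypothetical stand-ins and not the real engine classes; it only shows the shape of the change, assuming a system component can be held by value.

```cpp
#include <cassert>
#include <map>
#include <memory>

// Hypothetical stand-ins for illustration; not the real engine types.
struct IComponentSketch
{
	virtual ~IComponentSketch() = default;
	virtual int Update() = 0;
};

struct TimerSketch final : IComponentSketch
{
	int ticks = 0;
	int Update() override { return ++ticks; }
};

// Before: the component lives in its own heap allocation and is reached
// through a map lookup on every access.
class MapBackedManager
{
public:
	MapBackedManager() { m_Components[0] = std::make_unique<TimerSketch>(); }

	IComponentSketch* Get(int id)
	{
		auto it = m_Components.find(id);
		return it == m_Components.end() ? nullptr : it->second.get();
	}

private:
	std::map<int, std::unique_ptr<IComponentSketch>> m_Components;
};

// After: the system component is a direct member of the manager, so
// reaching it needs no lookup and no jump to a separate allocation.
class DirectManager
{
public:
	TimerSketch& GetTimer() { return m_Timer; }

private:
	TimerSketch m_Timer;
};
```

With the direct layout the component's data sits inside the manager object itself, which is where the hoped-for locality gain would come from.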
I tried to improve memory locality (reduce cache misses). I can't compare the cache-miss results yet (I have to learn the tool a bit better ;))
For me, the median frame time improved from 56 ms to 52 ms, +/- 2 ms.
D4844 implements the idea for a non-system component. This POC uses the system components because they were simpler to port, and I assume system components are accessed more often, resulting in a bigger performance difference.
Possible future optimisations:
- Make something corresponding to CmpPtr that doesn't have to query the interface (QueryInterface is currently the function taking the most CPU cycles); see this commit.
- Get rid of interfaces and template HandleMessage on the type of message.
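
Both ideas in the list above could look roughly like the following sketch. Everything here is an assumption for illustration: MsgUpdate, MsgDestroy, PositionSketch, ManagerSketch, Get and Broadcast are hypothetical names and not the existing CmpPtr or message API.

```cpp
#include <cassert>

// Hypothetical message types; the real messages carry different data.
struct MsgUpdate { int turnLength; };
struct MsgDestroy { int entity; };

// Hypothetical concrete component with one HandleMessage overload per
// message type, so the right handler is picked at compile time.
struct PositionSketch
{
	int elapsed = 0;
	int destroyed = 0;

	void HandleMessage(const MsgUpdate& msg) { elapsed += msg.turnLength; }
	void HandleMessage(const MsgDestroy&) { ++destroyed; }
};

class ManagerSketch
{
public:
	// Typed accessor: hands out the concrete component directly,
	// without going through an interface query.
	template <typename Component>
	Component& Get();

	// Dispatch is templated on the message type instead of going
	// through a virtual call on a common interface.
	template <typename Message>
	void Broadcast(const Message& msg) { m_Position.HandleMessage(msg); }

private:
	PositionSketch m_Position;
};

template <>
inline PositionSketch& ManagerSketch::Get<PositionSketch>() { return m_Position; }
```

A caller would then write manager.Get<PositionSketch>() and manager.Broadcast(MsgUpdate{200}); whether this maps cleanly onto the real component set is untested.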