All performance tests were run on a non-visual replay of "Combat Demo Huge Auto" from the trainer tools.
Store system components inside CComponentManager instead of on the heap.
I tried to improve memory locality (reduce cache misses). I can't compare the results yet (I have to learn the tool a bit better ;))
For me, the median frame time improved from 56 ms to 52 ms (+/- 2 ms).
Such a change can also be made for other (non-system) components. I used the system components because they were simpler to port, and I assume system components are accessed more often, resulting in a more significant performance difference.
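
To illustrate the idea (not the actual patch), here is a minimal sketch of the difference between heap-allocated and inline component storage. The names SketchComponentA, SketchComponentB, HeapManager, InlineManager and LivesInside are hypothetical and do not exist in the code base; the real CComponentManager is considerably more involved.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>

// Hypothetical stand-ins for two system components (not the real classes).
struct SketchComponentA { int value = 1; };
struct SketchComponentB { int value = 2; };

// Before: every component is a separate heap allocation, so the manager
// only holds pointers and the components may be scattered in memory.
struct HeapManager
{
    std::unique_ptr<SketchComponentA> a = std::make_unique<SketchComponentA>();
    std::unique_ptr<SketchComponentB> b = std::make_unique<SketchComponentB>();
};

// After: the components are plain members, so they are laid out
// inside the manager object itself, next to each other.
struct InlineManager
{
    SketchComponentA a;
    SketchComponentB b;
};

// Returns true if p points into the storage occupied by owner.
template <typename Owner>
bool LivesInside(const Owner& owner, const void* p)
{
    const auto begin = reinterpret_cast<std::uintptr_t>(&owner);
    const auto addr = reinterpret_cast<std::uintptr_t>(p);
    return addr >= begin && addr < begin + sizeof(Owner);
}

// Small demonstration of where the components end up.
inline void DemoInlineStorage()
{
    InlineManager inlineManager;
    assert(LivesInside(inlineManager, &inlineManager.a));
    assert(LivesInside(inlineManager, &inlineManager.b));

    HeapManager heapManager;
    assert(!LivesInside(heapManager, heapManager.a.get()));
}
```

With inline members, touching one component tends to bring its neighbours into the cache as well, which is the locality effect this change is aiming for.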
Possible future optimisations:
- Make something corresponding to CmpPtr that does not have to query the interface. (QueryInterface is currently the function taking the most CPU cycles.)
- Get rid of interfaces and template HandleMessage on the message type.
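
As a rough sketch of what the second idea could look like (an assumption about one possible design, not something implemented in this patch): components stored by value in a tuple, with message dispatch resolved at compile time through overloads of HandleMessage. MsgUpdate, MsgDestroy, SketchTimer, SketchCounter, SketchManager and Broadcast are hypothetical names used only for illustration. The sketch needs C++17.

```cpp
#include <cassert>
#include <tuple>
#include <type_traits>
#include <utility>

// Hypothetical message types (not the real CMessage classes).
struct MsgUpdate { int turnLength; };
struct MsgDestroy { int entity; };

// Hypothetical components; each one overloads HandleMessage only for
// the message types it cares about. No common interface is needed.
struct SketchTimer
{
    int elapsed = 0;
    void HandleMessage(const MsgUpdate& msg) { elapsed += msg.turnLength; }
};

struct SketchCounter
{
    int destroyed = 0;
    void HandleMessage(const MsgDestroy&) { ++destroyed; }
};

// Detects at compile time whether C has HandleMessage(const Msg&).
template <typename C, typename Msg, typename = void>
struct HandlesMessage : std::false_type {};

template <typename C, typename Msg>
struct HandlesMessage<C, Msg,
    std::void_t<decltype(std::declval<C&>().HandleMessage(std::declval<const Msg&>()))>>
    : std::true_type {};

struct SketchManager
{
    // Components are stored by value, inside the manager.
    std::tuple<SketchTimer, SketchCounter> components;

    // Sends msg to every component that has a matching overload.
    template <typename Msg>
    void Broadcast(const Msg& msg)
    {
        std::apply([&](auto&... c) { (Dispatch(c, msg), ...); }, components);
    }

private:
    template <typename C, typename Msg>
    static void Dispatch(C& c, const Msg& msg)
    {
        // Components without a handler for Msg are skipped at compile time.
        if constexpr (HandlesMessage<C, Msg>::value)
            c.HandleMessage(msg);
    }
};

// Small demonstration of the dispatch.
inline void DemoTemplatedDispatch()
{
    SketchManager manager;
    manager.Broadcast(MsgUpdate{200});
    manager.Broadcast(MsgDestroy{7});
    assert(std::get<SketchTimer>(manager.components).elapsed == 200);
    assert(std::get<SketchCounter>(manager.components).destroyed == 1);
}
```

In such a layout, std::get<SketchTimer>(manager.components) gives typed access to a component without any runtime lookup, which is roughly the property the first bullet asks for. Whether this scales to scripted components and the existing subscription mechanism is an open question.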