A great deal of time consumption is taken by the Vertex Buffer. (As found by profiling) This change attempts to alleviate some of that overhead by lessening the times looping through large collections takes place.
Details
- Reviewers
vladislavbelov wraitii - Commits
- rP24834: Performance improvements to VertexBuffer.
Things continue to work?
Diff Detail
- Repository
- rP 0 A.D. Public Repository
- Branch
- temp
- Lint
Lint OK - Unit
No Unit Test Coverage - Build Status
Buildable 12720 Build 24702: Vulcan Build Jenkins Build 24701: Vulcan Build (macOS) Jenkins Build 24700: Vulcan Build (Windows) Jenkins Build 24699: arc lint + arc unit
Event Timeline
Hi @dm :)
Do not forget to add yourself to binaries/data/mods/public/gui/credits/texts/programming.json
Hey there,
Thanks for the patch. I've added you to the Contributors group so you should get autobuilds from now on. Your patch is missing context depending on how you generated it, you need to pass -U5000 for git or -x -U5000 for SVN.
Here are some comments to keep you waiting for a proper review.
Can you provide some graphs using profiler2 ? So we can see the difference?
Thanks for the patch.
source/renderer/VertexBuffer.cpp | ||
---|---|---|
233 | Use C++ style casts. | |
301 | Don't add braces for single line statements see https://trac.wildfiregames.com/wiki/Coding_Conventions | |
310 | Same here | |
311 | Missing final '.' | |
source/renderer/VertexBuffer.h | ||
164 | m_ModifiedSinceBind maybe m_Dirty ? |
{F1196112}I have attempted to run profile2 (before and after) here. (First time, with some difficulty, so I could have done something incorrectly.) I attempted to attach the "render" image here.
For comparison, I have been using gprof to compare change attempts.
Before this change, I am seeing something like this when I replay a previous multiplayer game:
Flat profile:
Each sample counts as 0.01 seconds.
% cumulative self self total time seconds seconds calls ms/call ms/call name
23.37 859.39 859.39 976221851 0.00 0.00 CVertexBuffer::Bind()
Where after I see :
1.66 2004.42 53.99 1061132679 0.00 0.00 CVertexBuffer::Bind()
CPU utilization seems to be down, at least in the few instances I have monitored, and the multiplayer game seems to play with fewer time errors displaying.
source/renderer/VertexBuffer.cpp | ||
---|---|---|
1 | Year as well. | |
72 | Why here and not in the construction list? After m_Target(...). | |
180 | You can write it as iter = std::prev(m_FreeList.erase(iter));. | |
306 | You don't need needReset, you might do the reset in a else branch of the if (needUpload). | |
source/renderer/VertexBuffer.h | ||
1 | You need to update the year. | |
105–109 | { on a new line. | |
164 | Maybe m_HasNeededChunks? |
With more testing, in high number deathmatch, there was still high usage in CVertextBuffer::Release (gprof)
% cumulative self self total
time seconds seconds calls ms/call ms/call name
23.15 274.51 274.51 9109564 0.03 0.03 CVertexBuffer::Release(CVertexBuffer::VBChunk*)
There seems to be fragmentation of the buffer over time. This new change attempts to reuse buffers of the same size, to reduce the free list count. In addition, this switches from std::list to std::deque. The % time spent in this method is greatly reduced:
1.10 645.87 11.75 12765372 0.00 0.00 CVertexBuffer::Release(CVertexBuffer::VBChunk*)
before and after profiling images are attached replaying the same saved game.
Main thread beforeMain thread after
Render comparison
Render report section
The game is still running correctly. I recognize an performance improvement up to 10%.
It seems to me you don't actually need m_Neededon the chunks since m_HasNeededChunks makes it redundant. Further, I would actually suggest m_NeedsUpload as that appears to be what this does.
It seems like VertexBuffer is a further level down from VertexBufferManager, which also allocates buffers in an std::list. We appear to have a double searching-logic, where we first find a VertexBuffer with room in it, then a Chunk inside the vertex buffer itself.
I feel like this duplicated logic is somewhat un-necessary and we could clean it up in just having chunks in the vertexBufferManager, but I don't have too much time to look into this for now.
If you have the motivation & time, you can look into that direction. From a cursory reading, we don't actually create too many VBs, mostly particle emitters and overlay renderers would be inefficient.
With that being said, this seems like an incremental improvement, so I would agree with committing this is you don't have said time.
source/renderer/VertexBuffer.cpp | ||
---|---|---|
132 | From what I can tell you already have the iterator here in most cases, so you should be able to avoid doing a find call. |
Slight rebase & cleanup.
I'm not entirely sure how this was profiled (I don't notice much of a difference, but perhaps it only starts mattering after a long while), however the graphs above seem rather legit.
Will commit shortly.
Successful build - Chance fights ever on the side of the prudent.
Linter detected issues: Executing section Source... source/renderer/VertexBuffer.cpp | 180| » » » iter·=·std::prev(m_FreeList.erase(iter)); | | [MAJOR] CPPCheckBear (eraseDereference): | | Iterator 'iter' used after element has been erased. source/renderer/VertexBuffer.h | 55| class·CVertexBuffer | | [MAJOR] CPPCheckBear (syntaxError): | | Code 'classCVertexBuffer{' is invalid C code. Use --std or --language to configure the language. Executing section JS... Executing section cli...
Link to build: https://jenkins.wildfiregames.com/job/docker-differential/2855/display/redirect
source/renderer/VertexBuffer.cpp | ||
---|---|---|
180 | From cppcheck above Iterator 'iter' used after element has been erased. |
source/renderer/VertexBuffer.cpp | ||
---|---|---|
180 | seems spurious. erase() returns a new iterator to the element that followed, and that ought to be std::pref-callable, and then we just assign to iter. |
I am restarting the build here to pass the fixed cppcheck on this. Sorry in advance if there is a false negative..
Build failure - The Moirai have given mortals hearts that can endure.
Link to build: https://jenkins.wildfiregames.com/job/docker-differential/2924/display/redirect
Successful build - Chance fights ever on the side of the prudent.
Linter detected issues: Executing section Source... source/renderer/VertexBuffer.cpp [ 180] » » » iter·=·std::prev(m_FreeList.erase(iter)); **** CPPCheckBear (eraseDereference) [Section: Source | Severity: MAJOR] **** ! ! Iterator 'iter' used after element has been erased. Executing section JS... Executing section cli...
Link to build: https://jenkins.wildfiregames.com/job/docker-differential/2925/display/redirect
source/renderer/VertexBuffer.h | ||
---|---|---|
152 | I have little doubts about std::deque, because erasing takes linear time. I think it'd be better to use allocator container to have O(1) for erase and the same cost for the capacity increment as the std::deque. But maybe in the next patch. |
I accepted this but I don't really feel all that confident to merge it. @vladislavbelov I'll let you handle it.
Build has FAILED
builderr-debug-macos.txt /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib: file: ../../../binaries/system/libsimulation2_dbg.a(precompiled.o) has no symbols /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib: file: ../../../binaries/system/libengine_dbg.a(precompiled.o) has no symbols ../../../source/renderer/VertexBuffer.cpp:125:43: warning: '&&' within '||' [-Wlogical-op-parentheses] else if (numVertices < (*iter)->m_Count && best_iter == m_FreeList.end() || (*best_iter)->m_Count < (*iter)->m_Count) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~ ../../../source/renderer/VertexBuffer.cpp:125:43: note: place parentheses around the '&&' expression to silence this warning else if (numVertices < (*iter)->m_Count && best_iter == m_FreeList.end() || (*best_iter)->m_Count < (*iter)->m_Count)
Link to build: https://jenkins.wildfiregames.com/job/macos-differential/2829/display/redirect
See console output for more information: https://jenkins.wildfiregames.com/job/macos-differential/2829/display/redirectconsole
Build is green
builderr-debug-gcc7.txt ../../../source/renderer/VertexBuffer.cpp: In member function 'CVertexBuffer::VBChunk* CVertexBuffer::Allocate(size_t, size_t, GLenum, GLenum, void*)': ../../../source/renderer/VertexBuffer.cpp:125:43: warning: suggest parentheses around '&&' within '||' [-Wparentheses] else if (numVertices < (*iter)->m_Count && best_iter == m_FreeList.end() || (*best_iter)->m_Count < (*iter)->m_Count) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ builderr-release-gcc7.txt ../../../source/renderer/VertexBuffer.cpp: In member function 'CVertexBuffer::VBChunk* CVertexBuffer::Allocate(size_t, size_t, GLenum, GLenum, void*)': ../../../source/renderer/VertexBuffer.cpp:125:43: warning: suggest parentheses around '&&' within '||' [-Wparentheses] else if (numVertices < (*iter)->m_Count && best_iter == m_FreeList.end() || (*best_iter)->m_Count < (*iter)->m_Count) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
See https://jenkins.wildfiregames.com/job/docker-differential/4492/display/redirect for more details.
Build has FAILED
builderr-debug-macos.txt /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib: file: ../../../binaries/system/libsimulation2_dbg.a(precompiled.o) has no symbols /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib: file: ../../../binaries/system/libengine_dbg.a(precompiled.o) has no symbols /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib: file: ../../../binaries/system/libgraphics_dbg.a(precompiled.o) has no symbols /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib: file: ../../../binaries/system/libatlas_dbg.a(precompiled.o) has no symbols /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib: file: ../../../binaries/system/libgui_dbg.a(precompiled.o) has no symbols ld: warning: text-based stub file /System/Library/Frameworks//CoreAudio.framework/CoreAudio.tbd and library file
Link to build: https://jenkins.wildfiregames.com/job/macos-differential/2831/display/redirect
See console output for more information: https://jenkins.wildfiregames.com/job/macos-differential/2831/display/redirectconsole
Build has FAILED
builderr-debug-macos.txt /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib: file: ../../../binaries/system/libsimulation2_dbg.a(precompiled.o) has no symbols /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib: file: ../../../binaries/system/libengine_dbg.a(precompiled.o) has no symbols /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib: file: ../../../binaries/system/libgraphics_dbg.a(precompiled.o) has no symbols /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib: file: ../../../binaries/system/libatlas_dbg.a(precompiled.o) has no symbols /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib: file: ../../../binaries/system/libgui_dbg.a(precompiled.o) has no symbols ld: warning: text-based stub file /System/Library/Frameworks//CoreAudio.framework/CoreAudio.tbd and library file
Link to build: https://jenkins.wildfiregames.com/job/macos-differential/2835/display/redirect
See console output for more information: https://jenkins.wildfiregames.com/job/macos-differential/2835/display/redirectconsole
Build is green
builderr-debug-macos.txt /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib: file: ../../../binaries/system/libsimulation2_dbg.a(precompiled.o) has no symbols /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib: file: ../../../binaries/system/libengine_dbg.a(precompiled.o) has no symbols /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib: file: ../../../binaries/system/libgraphics_dbg.a(precompiled.o) has no symbols /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib: file: ../../../binaries/system/libatlas_dbg.a(precompiled.o) has no symbols /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib: file: ../../../binaries/system/libgui_dbg.a(precompiled.o) has no symbols ld: warning: text-based stub file /System/Library/Frameworks//CoreAudio.framework/CoreAudio.tbd and library file
See https://jenkins.wildfiregames.com/job/macos-differential/2854/display/redirect for more details.
I made some profiling with my new benchmark map.
red vanilla, green deque, blue vector
source/renderer/VertexBuffer.cpp | ||
---|---|---|
171–183 | That's incorrect, since elements aren't sorted. It's not going to be committed. |