Page MenuHomeWildfire Games

Performance improvements to VertexBuffer
Needs ReviewPublic

Authored by dm on Jan 3 2020, 4:14 PM.

Details

Summary

A great deal of time consumption is taken by the Vertex Buffer. (As found by profiling) This change attempts to alleviate some of that overhead by lessening the times looping through large collections takes place.

Test Plan

Things continue to work?

Diff Detail

Repository
rP 0 A.D. Public Repository
Branch
/ps/trunk
Lint
Lint OK
SeverityLocationCodeMessage
Errorsource/renderer/VertexBuffer.cpp:181CPPCheckBear (eraseDereference)CPPCheckBear (eraseDereference)
Unit
No Unit Test Coverage
Build Status
Buildable 15018
Build 32478: Vulcan BuildJenkins
Build 32477: Vulcan Build (macOS)Jenkins
Build 32476: Vulcan Build (Windows)Jenkins
Build 32475: arc lint + arc unit

Event Timeline

dm created this revision.Jan 3 2020, 4:14 PM
Angen added a subscriber: Angen.

Hi @dm :)
Do not forget to add yourself to binaries/data/mods/public/gui/credits/texts/programming.json

Stan added a subscriber: Stan.Jan 3 2020, 4:26 PM

Hey there,

Thanks for the patch. I've added you to the Contributors group so you should get autobuilds from now on. Your patch is missing context depending on how you generated it, you need to pass -U5000 for git or -x -U5000 for SVN.

Here are some comments to keep you waiting for a proper review.

Can you provide some graphs using profiler2 ? So we can see the difference?

Thanks for the patch.

source/renderer/VertexBuffer.cpp
233

Use C++ style casts.

301

Don't add braces for single line statements see https://trac.wildfiregames.com/wiki/Coding_Conventions

310

Same here

311

Missing final '.'

source/renderer/VertexBuffer.h
164

m_ModifiedSinceBind maybe m_Dirty ?

dm added a comment.Jan 4 2020, 1:32 AM

{F1196112}I have attempted to run profile2 (before and after) here. (First time, with some difficulty, so I could have done something incorrectly.) I attempted to attach the "render" image here.

For comparison, I have been using gprof to compare change attempts.
Before this change, I am seeing something like this when I replay a previous multiplayer game:

Flat profile:

Each sample counts as 0.01 seconds.

%   cumulative   self              self     total time   seconds   seconds    calls  ms/call  ms/call  name
23.37    859.39   859.39 976221851     0.00     0.00  CVertexBuffer::Bind()

Where after I see :

1.66   2004.42    53.99 1061132679     0.00     0.00  CVertexBuffer::Bind()

CPU utilization seems to be down, at least in the few instances I have monitored, and the multiplayer game seems to play with fewer time errors displaying.

dm updated this revision to Diff 10873.Jan 4 2020, 1:51 AM

Updated based upon input

dm updated this revision to Diff 10874.Jan 4 2020, 2:38 PM
dm marked 3 inline comments as done.
nani awarded a token.Jan 4 2020, 2:50 PM
vladislavbelov added inline comments.Jan 10 2020, 1:18 PM
source/renderer/VertexBuffer.cpp
1

Year as well.

77

Why here and not in the construction list? After m_Target(...).

181

You can write it as iter = std::prev(m_FreeList.erase(iter));.

307

You don't need needReset, you might do the reset in a else branch of the if (needUpload).

source/renderer/VertexBuffer.h
1

You need to update the year.

104

{ on a new line.

164

Maybe m_HasNeededChunks?

dm updated this revision to Diff 11086.Jan 18 2020, 3:48 AM
dm marked 2 inline comments as done.
dm marked 4 inline comments as done.
dm marked an inline comment as done.Jan 18 2020, 3:50 AM
dm marked an inline comment as done.
dm updated this revision to Diff 11144.Jan 22 2020, 2:42 AM

With more testing, in high number deathmatch, there was still high usage in CVertextBuffer::Release (gprof)

%   cumulative   self              self     total

time seconds seconds calls ms/call ms/call name
23.15 274.51 274.51 9109564 0.03 0.03 CVertexBuffer::Release(CVertexBuffer::VBChunk*)

There seems to be fragmentation of the buffer over time. This new change attempts to reuse buffers of the same size, to reduce the free list count. In addition, this switches from std::list to std::deque. The % time spent in this method is greatly reduced:

1.10    645.87    11.75 12765372     0.00     0.00  CVertexBuffer::Release(CVertexBuffer::VBChunk*)

before and after profiling images are attached replaying the same saved game.

Main thread before
Main thread after
Render comparison
Render report section

Stan added a comment.Jan 22 2020, 8:09 AM

Some notes. Can you add yourself to the credits? Keep up the great work!

source/renderer/VertexBuffer.cpp
138

Range for loop?
std::find?

170

Same here?

231

nullptr?
new u8[0] ?

dm updated this revision to Diff 11224.Jan 30 2020, 1:39 PM
dm changed the visibility from "Public (No Login Required)" to "All Users".

Fixed noted items

dm marked 4 inline comments as done.Jan 30 2020, 1:40 PM

The game is still running correctly. I recognize an performance improvement up to 10%.

elexis changed the visibility from "All Users" to "Public (No Login Required)".Feb 15 2020, 7:02 PM
wraitii added a subscriber: wraitii.EditedJun 2 2020, 6:55 PM

It seems to me you don't actually need m_Neededon the chunks since m_HasNeededChunks makes it redundant. Further, I would actually suggest m_NeedsUpload as that appears to be what this does.

It seems like VertexBuffer is a further level down from VertexBufferManager, which also allocates buffers in an std::list. We appear to have a double searching-logic, where we first find a VertexBuffer with room in it, then a Chunk inside the vertex buffer itself.
I feel like this duplicated logic is somewhat un-necessary and we could clean it up in just having chunks in the vertexBufferManager, but I don't have too much time to look into this for now.
If you have the motivation & time, you can look into that direction. From a cursory reading, we don't actually create too many VBs, mostly particle emitters and overlay renderers would be inefficient.

With that being said, this seems like an incremental improvement, so I would agree with committing this is you don't have said time.

source/renderer/VertexBuffer.cpp
137

From what I can tell you already have the iterator here in most cases, so you should be able to avoid doing a find call.

wraitii accepted this revision.Aug 3 2020, 3:56 PM
This revision is now accepted and ready to land.Aug 3 2020, 3:56 PM
wraitii updated this revision to Diff 13028.Aug 3 2020, 3:56 PM
wraitii edited the summary of this revision. (Show Details)
wraitii removed a reviewer: wraitii.

Slight rebase & cleanup.

I'm not entirely sure how this was profiled (I don't notice much of a difference, but perhaps it only starts mattering after a long while), however the graphs above seem rather legit.

Will commit shortly.

This revision now requires review to proceed.Aug 3 2020, 3:56 PM
Owners added subscribers: Restricted Owners Package, Restricted Owners Package.Aug 3 2020, 3:56 PM
wraitii accepted this revision.Aug 3 2020, 3:56 PM

(accepted too quickly lol)

This revision is now accepted and ready to land.Aug 3 2020, 3:56 PM
Vulcan added a comment.Aug 3 2020, 4:16 PM

Successful build - Chance fights ever on the side of the prudent.

Linter detected issues:
Executing section Source...

source/renderer/VertexBuffer.cpp
| 180| »   »   »   iter·=·std::prev(m_FreeList.erase(iter));
|    | [MAJOR] CPPCheckBear (eraseDereference):
|    | Iterator 'iter' used after element has been erased.

source/renderer/VertexBuffer.h
|  55| class·CVertexBuffer
|    | [MAJOR] CPPCheckBear (syntaxError):
|    | Code 'classCVertexBuffer{' is invalid C code. Use --std or --language to configure the language.
Executing section JS...
Executing section cli...

Link to build: https://jenkins.wildfiregames.com/job/docker-differential/2855/display/redirect

Stan added inline comments.Aug 3 2020, 11:20 PM
source/renderer/VertexBuffer.cpp
181

From cppcheck above Iterator 'iter' used after element has been erased.

wraitii added inline comments.Aug 4 2020, 11:09 AM
source/renderer/VertexBuffer.cpp
181

seems spurious. erase() returns a new iterator to the element that followed, and that ought to be std::pref-callable, and then we just assign to iter.

Itms added a subscriber: Itms.Aug 6 2020, 4:50 PM

I am restarting the build here to pass the fixed cppcheck on this. Sorry in advance if there is a false negative..

Vulcan added a comment.Aug 6 2020, 4:54 PM

Build failure - The Moirai have given mortals hearts that can endure.

Link to build: https://jenkins.wildfiregames.com/job/docker-differential/2924/display/redirect

Vulcan added a comment.Aug 6 2020, 5:10 PM

Successful build - Chance fights ever on the side of the prudent.

Linter detected issues:
Executing section Source...

source/renderer/VertexBuffer.cpp
[ 180] »   »   »   iter·=·std::prev(m_FreeList.erase(iter));
**** CPPCheckBear (eraseDereference) [Section: Source | Severity: MAJOR] ****
!    ! Iterator 'iter' used after element has been erased.
Executing section JS...
Executing section cli...

Link to build: https://jenkins.wildfiregames.com/job/docker-differential/2925/display/redirect

vladislavbelov added inline comments.Oct 9 2020, 1:23 AM
source/renderer/VertexBuffer.h
147

I have little doubts about std::deque, because erasing takes linear time. I think it'd be better to use allocator container to have O(1) for erase and the same cost for the capacity increment as the std::deque. But maybe in the next patch.

wraitii resigned from this revision.Nov 4 2020, 10:29 AM

I accepted this but I don't really feel all that confident to merge it. @vladislavbelov I'll let you handle it.

This revision now requires review to proceed.Nov 4 2020, 10:29 AM

@vladislavbelov any plans to commit this improvement?

Stan updated this revision to Diff 15242.Wed, Jan 13, 11:13 AM

Rebase, add missing includes/reorder them { on new line

Build has FAILED

builderr-debug-macos.txt
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib: file: ../../../binaries/system/libsimulation2_dbg.a(precompiled.o) has no symbols
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib: file: ../../../binaries/system/libengine_dbg.a(precompiled.o) has no symbols
../../../source/renderer/VertexBuffer.cpp:125:43: warning: '&&' within '||' [-Wlogical-op-parentheses]
                else if (numVertices < (*iter)->m_Count && best_iter == m_FreeList.end() || (*best_iter)->m_Count < (*iter)->m_Count)
                         ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~
../../../source/renderer/VertexBuffer.cpp:125:43: note: place parentheses around the '&&' expression to silence this warning
                else if (numVertices < (*iter)->m_Count && best_iter == m_FreeList.end() || (*best_iter)->m_Count < (*iter)->m_Count)

Link to build: https://jenkins.wildfiregames.com/job/macos-differential/2829/display/redirect
See console output for more information: https://jenkins.wildfiregames.com/job/macos-differential/2829/display/redirectconsole

Build is green

builderr-debug-gcc7.txt
../../../source/renderer/VertexBuffer.cpp: In member function 'CVertexBuffer::VBChunk* CVertexBuffer::Allocate(size_t, size_t, GLenum, GLenum, void*)':
../../../source/renderer/VertexBuffer.cpp:125:43: warning: suggest parentheses around '&&' within '||' [-Wparentheses]
   else if (numVertices < (*iter)->m_Count && best_iter == m_FreeList.end() || (*best_iter)->m_Count < (*iter)->m_Count)
            ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
builderr-release-gcc7.txt
../../../source/renderer/VertexBuffer.cpp: In member function 'CVertexBuffer::VBChunk* CVertexBuffer::Allocate(size_t, size_t, GLenum, GLenum, void*)':
../../../source/renderer/VertexBuffer.cpp:125:43: warning: suggest parentheses around '&&' within '||' [-Wparentheses]
   else if (numVertices < (*iter)->m_Count && best_iter == m_FreeList.end() || (*best_iter)->m_Count < (*iter)->m_Count)
            ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

See https://jenkins.wildfiregames.com/job/docker-differential/4492/display/redirect for more details.

Stan updated this revision to Diff 15245.Wed, Jan 13, 12:02 PM

Fix condition merging

Build has FAILED

builderr-debug-macos.txt
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib: file: ../../../binaries/system/libsimulation2_dbg.a(precompiled.o) has no symbols
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib: file: ../../../binaries/system/libengine_dbg.a(precompiled.o) has no symbols
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib: file: ../../../binaries/system/libgraphics_dbg.a(precompiled.o) has no symbols
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib: file: ../../../binaries/system/libatlas_dbg.a(precompiled.o) has no symbols
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib: file: ../../../binaries/system/libgui_dbg.a(precompiled.o) has no symbols
ld: warning: text-based stub file /System/Library/Frameworks//CoreAudio.framework/CoreAudio.tbd and library file

Link to build: https://jenkins.wildfiregames.com/job/macos-differential/2831/display/redirect
See console output for more information: https://jenkins.wildfiregames.com/job/macos-differential/2831/display/redirectconsole

Build has FAILED

builderr-debug-macos.txt
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib: file: ../../../binaries/system/libsimulation2_dbg.a(precompiled.o) has no symbols
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib: file: ../../../binaries/system/libengine_dbg.a(precompiled.o) has no symbols
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib: file: ../../../binaries/system/libgraphics_dbg.a(precompiled.o) has no symbols
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib: file: ../../../binaries/system/libatlas_dbg.a(precompiled.o) has no symbols
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib: file: ../../../binaries/system/libgui_dbg.a(precompiled.o) has no symbols
ld: warning: text-based stub file /System/Library/Frameworks//CoreAudio.framework/CoreAudio.tbd and library file

Link to build: https://jenkins.wildfiregames.com/job/macos-differential/2835/display/redirect
See console output for more information: https://jenkins.wildfiregames.com/job/macos-differential/2835/display/redirectconsole

Stan updated this revision to Diff 15271.Wed, Jan 13, 11:06 PM

Use vector as it's generally faster

Stan added a comment.EditedWed, Jan 13, 11:09 PM

Green std::vector, Red std::deque, Blue Vanilla

Build is green

builderr-debug-macos.txt
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib: file: ../../../binaries/system/libsimulation2_dbg.a(precompiled.o) has no symbols
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib: file: ../../../binaries/system/libengine_dbg.a(precompiled.o) has no symbols
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib: file: ../../../binaries/system/libgraphics_dbg.a(precompiled.o) has no symbols
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib: file: ../../../binaries/system/libatlas_dbg.a(precompiled.o) has no symbols
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib: file: ../../../binaries/system/libgui_dbg.a(precompiled.o) has no symbols
ld: warning: text-based stub file /System/Library/Frameworks//CoreAudio.framework/CoreAudio.tbd and library file

See https://jenkins.wildfiregames.com/job/macos-differential/2854/display/redirect for more details.