Page MenuHomeWildfire Games

[RC] Thread the pathfinder computations
Needs ReviewPublic

Authored by wraitii on Dec 27 2016, 1:51 PM.

Details

Reviewers
Kuba386
Group Reviewers
Restricted Owners Package(Owns No Changed Paths)
Trac Tickets
#4324
Summary

This implements threading of the pathfinder workers.

Details of the architecture:

  • Workers are dispatched paths in a round-robin-ish fashion. It's naive, but it works well enough for now.
  • Paths are then fetched in reverse order and messages are sent.

The Long pathfinder grid is owned by the pathfinder and so safe to use across turns. The obstructions manager' shapes aren't, so protected behind a mutex.

The three bits to comment on are mainly:

  • Do you think the protection in ObstructionManager is sufficient?
  • Are you OK with the serialisation logic / dependency between Pathfinder and ObstructionManager?
  • Do you find it OK to thread_local the pathfinders data?
Test Plan

May review code and architecture. Am interested in performance improvements that people would report. An MP game would be great.

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
wraitii added inline comments.May 1 2019, 10:59 AM
source/simulation2/components/CCmpPathfinder.cpp
748

It's just there to be specialised, and it pushes requests so I don't think so?

source/simulation2/components/CCmpPathfinder_Common.h
141

Should. It also is, but it should be and remain so in the future.

167

I think it's better to init in the definition whenever possible going forward. Not sure if that's a convention we have or not?

source/simulation2/helpers/HierarchicalPathfinder.cpp
402 ↗(On Diff #7886)

nah, who cares about the time profiling debug overlay honestly.

source/simulation2/helpers/Render.cpp
38 ↗(On Diff #7886)

Likewise: I don't think it really matters to profile this unless you're actually looking at how long this takes, but it's a function used by debug code and profiling that seems useless.

Stan added inline comments.May 1 2019, 11:12 AM
source/simulation2/components/CCmpPathfinder.cpp
748

That line doesn't pushes request does it ?

source/simulation2/helpers/Render.cpp
38 ↗(On Diff #7886)

Ah okay :)

Build failure - The Moirai have given mortals hearts that can endure.

Link to build: https://jenkins.wildfiregames.com/job/differential/1314/display/redirect

wraitii marked 6 inline comments as done.May 1 2019, 11:35 AM
wraitii added inline comments.
source/simulation2/components/CCmpPathfinder.cpp
748

It inserts them at the end (by moving), so that's similar to pushing.

Stan added inline comments.May 1 2019, 11:40 AM
source/simulation2/components/CCmpPathfinder.cpp
748

static_assert(sizeof(T) == 0, "Only specializations can be used"); What part of that statement does that ? Oo There is only one line in that function

wraitii marked 2 inline comments as done.May 1 2019, 12:06 PM
wraitii added inline comments.
source/simulation2/components/CCmpPathfinder.cpp
748

As said above, this is just the template function which gets specialised by L709/714. This base template cannot be used (it's not implemented but there's no clean way to write that because of issues in XCode/VS, see https://stackoverflow.com/a/37593094).

Stan added inline comments.May 1 2019, 12:07 PM
source/simulation2/components/CCmpPathfinder.cpp
748

Ah thanks for the explanation :)

wraitii updated this revision to Diff 8148.May 25 2019, 10:56 PM
wraitii marked an inline comment as done.

Cleaned up of D1918.

I've switched to using static, thread-local variables for cached variables. This is a tad ugly until C++ 17 get around the corner (as we won't have to initialise the static members out of line then).

The pro is that it makes it irrelevant to the helpers whether we are threaded or not, making their implementation much easier and foolproof. The con is that one cannot use debug with the threading (well one can but it would never show anything), so you have to de-activate threading for that. Possibly this could be done at runtime, though this isn't implemented yet.

I suspect this has a small performance overhead - I'm assuming it's irrelevant here but it should be profiled.

Build failure - The Moirai have given mortals hearts that can endure.

Link to build: https://jenkins.wildfiregames.com/job/differential/1504/display/redirect

wraitii added inline comments.Jun 10 2019, 9:16 AM
source/simulation2/components/CCmpPathfinder.cpp
854

Realised last night that I need to make sure the results are ordered consistently for threaded/non-threaded/different # of threads here. I think they are by accident right now, but and "ENSURE (is_sorted)" is probably worth adding.

wraitii updated this revision to Diff 9019.Jul 20 2019, 7:06 PM
wraitii added a reviewer: Restricted Owners Package.

Rebased, cleaned up slightly.

Works, still need a mutex for Obstructions but it works, it's only for sanity.

So anybody willing - please test this.

Build failure - The Moirai have given mortals hearts that can endure.

Link to build: https://jenkins.wildfiregames.com/job/docker-differential/152/display/redirect

wraitii updated this revision to Diff 9042.Jul 21 2019, 7:23 PM

Mutex obstructions. This is RC, but we need to agree on whether or not this is acceptable.

wraitii added inline comments.Jul 21 2019, 7:25 PM
binaries/data/mods/public/gui/options/options.json
89

erf, also need to change this.

source/ps/ThreadUtil.cpp
49

well, RC except for this bit which I can now remove :p

Build failure - The Moirai have given mortals hearts that can endure.

Link to build: https://jenkins.wildfiregames.com/job/docker-differential/176/display/redirect

i tried doesnt work for me i watch pyrogenesis thread in top program of linux and have put general options number of thread to 3 but it doesnt show more than one thread at 100% cpu in top so i dont get it to work. any suggestions wut to do more to get them running? i put D1918 and D14 as diff on my a24 svn version r22642.

ffffffff added a comment.EditedAug 12 2019, 10:18 AM

ok its working now but not as expected but i havent look deeply into code now just can say that one thread (main thread) is going when testing pathfinding with 6-16 threads with 1000-2000 units moving in a map very fast to 100% but the other threads are showing only small up usually 1-3% cpu usage some times 20% but very rare and though because main thread is at 100% exhausting and seem to not get more cpu time above 100% it starts lagging ofc and fps dropping even if non visual window (window hidden to taskbar). So i was expecting that the pathfinder work would be go well across all threads and especially not touching the main thread any more because there is also then graphics computation and other game logic stuff hitted which results again in fps droppage and lag game which should be avoided. So i think that would be the first step. then we have to look.

As an FYI: I confirm this is working on my machine, but don't expect all 4 cores to be 100% CPU, pathfinding is only a small part of it.

This screenshot of Profiler2 shows 3 pathfinder threads computing paths: as you can see, it's only a fraction of the total time.


This is the summary of the pathfinder thread activity.

This patch doesn't fix lag, nor does it magically make 0 A.D. multi-threaded, but it helps significantly against pathing lag spikes.

wraitii retitled this revision from Thread the pathfinder computations to [RC] Thread the pathfinder computations.Aug 12 2019, 6:41 PM
wraitii edited the summary of this revision. (Show Details)
wraitii updated the Trac tickets for this revision.
Stan added inline comments.Sep 11 2019, 3:38 PM
source/ps/ThreadUtil.cpp
49

Is the todo above still to do ?

58

Shouldn't it be like when compiling where you type j(numberofcores +1)

source/simulation2/components/CCmpPathfinder.cpp
837

Is it hard to do ?

source/simulation2/components/CCmpPathfinder_Common.h
69

Doxygen ?

77

Doxygen ?

77–94

Doxygen ?

82

Doxygen ?

98

Doxygen ?

source/simulation2/helpers/LongPathfinder.cpp
31

Is it used anywhere ?

384–387

Why the change ? Why not add m_Grid as well ?

wraitii added inline comments.Sep 11 2019, 5:36 PM
source/ps/ThreadUtil.cpp
49

Yes that can be removed now, just need to do it.

58

Nah, we want one core left for the main thread to do its own thing if possible.

source/simulation2/components/CCmpPathfinder.cpp
837

It's trickier, but it's also a micro-improvement that needs further testing.

source/simulation2/helpers/LongPathfinder.cpp
384–387

The items are now thread_local global variables, instead of members of LongPathfinder, unlike M_Grid.

Stan added inline comments.Sep 11 2019, 5:50 PM
source/ps/ThreadUtil.cpp
58

How can we guarantee Oses will do that ?

source/simulation2/helpers/LongPathfinder.cpp
384–387

Ah I see, could nullptr the last while at it :)

Stan added a comment.Sep 13 2019, 11:26 AM

Just played a one hour sp game with it and did not notice any big bugs. Dunno if the performance was better or not :)

wraitii updated this revision to Diff 9756.Sep 14 2019, 11:40 AM

Some adjustments.

This uses a more complex trick in ObstructionManager to ensure that one can't forget to lock the mutex before modifying 'dangerous' data.

Build failure - The Moirai have given mortals hearts that can endure.

Link to build: https://jenkins.wildfiregames.com/job/docker-differential/675/display/redirect

Build failure - The Moirai have given mortals hearts that can endure.

Link to build: https://jenkins.wildfiregames.com/job/vs2015-differential/165/display/redirect

wraitii updated this revision to Diff 9757.Sep 14 2019, 11:41 AM

Don't add a .orig file

Build failure - The Moirai have given mortals hearts that can endure.

Link to build: https://jenkins.wildfiregames.com/job/docker-differential/676/display/redirect

Build failure - The Moirai have given mortals hearts that can endure.

Link to build: https://jenkins.wildfiregames.com/job/vs2015-differential/166/display/redirect

wraitii edited the summary of this revision. (Show Details)Sep 14 2019, 11:44 AM
Stan added inline comments.Sep 14 2019, 12:41 PM
source/simulation2/components/CCmpObstructionManager.cpp
1147

Was any decision reached on the usage of auto ?

1171

Same as above

source/simulation2/components/CCmpPathfinder_Common.h
169

Process worker maybe ?

wraitii updated this revision to Diff 9769.Sep 15 2019, 10:07 AM

Use macros to try and make the 'protected data' thing less ugly. Might not actually be much better.

Build failure - The Moirai have given mortals hearts that can endure.

Link to build: https://jenkins.wildfiregames.com/job/docker-differential/690/display/redirect

Build failure - The Moirai have given mortals hearts that can endure.

Link to build: https://jenkins.wildfiregames.com/job/vs2015-differential/175/display/redirect

wraitii updated this revision to Diff 9776.Sep 15 2019, 12:21 PM

Rerun to hopefully have a passing build now that earlier diffs are committed.

Successful build - Chance fights ever on the side of the prudent.

Link to build: https://jenkins.wildfiregames.com/job/vs2015-differential/180/display/redirect

wraitii added inline comments.Sep 15 2019, 12:57 PM
source/ps/ThreadUtil.cpp
58

There's no such thing as guaranteeing with multithreaded code, but this leaves at least one core open for regular computations (such as graphics).

source/simulation2/helpers/LongPathfinder.cpp
31

Seems unused in the current implementation, but won't make unrelated changes here.

Successful build - Chance fights ever on the side of the prudent.

Linter detected issues:
Executing section Source...

source/ps/ThreadFrontier.h
|  28| class·ThreadFrontier
|    | [MAJOR] CPPCheckBear (syntaxError):
|    | Code 'classThreadFrontier{' is invalid C code. Use --std or --language to configure the language.

source/simulation2/helpers/LongPathfinder.h
|  34| ·*·TODO:·maybe·VC2012·will?
|    | [MAJOR] CPPCheckBear (syntaxError):
|    | Code 'template<...' is invalid C code. Use --std or --language to configure the language.

source/ps/ThreadUtil.h
|  21| namespace·ThreadUtil
|    | [MAJOR] CPPCheckBear (syntaxError):
|    | Code 'namespaceThreadUtil{' is invalid C code. Use --std or --language to configure the language.

source/simulation2/components/CCmpObstructionManager.cpp
| 138| »   BEGIN_PROTECTED_DATA()
|    | [MAJOR] CPPCheckBear (unknownMacro):
|    | There is an unknown macro here somewhere. Configuration is required. If BEGIN_PROTECTED_DATA is a macro then please configure it.

source/simulation2/components/CCmpPathfinder_Common.h
|  34| #include·"graphics/Overlay.h"
|    | [MAJOR] CPPCheckBear (syntaxError):
|    | Code 'template<...' is invalid C code. Use --std or --language to configure the language.

source/simulation2/components/ICmpObstructionManager.h
|  53| class·ICmpObstructionManager·:·public·IComponent
|    | [MAJOR] CPPCheckBear (syntaxError):
|    | Code 'classICmpObstructionManager:' is invalid C code. Use --std or --language to configure the language.

source/simulation2/helpers/VertexPathfinder.h
|  72| class·VertexPathfinder
|    | [MAJOR] CPPCheckBear (syntaxError):
|    | Code 'classVertexPathfinder{' is invalid C code. Use --std or --language to configure the language.
Executing section JS...
Executing section cli...

Link to build: https://jenkins.wildfiregames.com/job/docker-differential/696/display/redirect

Stan added a comment.Sep 15 2019, 6:07 PM

The premacro version did not hang while saving.

wraitii updated this revision to Diff 9793.Sep 15 2019, 6:24 PM

Don't hang when saving.

The issue is complex and actually very annoying.
Because I start processing patsh at the end of Turn N, and send messages on the start of turn N+1, it can happen that we serialize in between.
At that point, the obstruction manager would lock on to the mutex (because it was in Common), and this hanged forever.
I don't actually have to lock in ObstructionManager because serializing can be a const-operation, _but_ there was still a bug: we would have deleted the requests that we were processing when saving, so those would be lost forever...
For this reason I have to clear the requests when fetching messages, which means they must be duplicated, which shouldn't be too bad but is annoying.

The alternative would be to not compute paths between turns, but that loses much of the point of the diff...

wraitii added inline comments.Sep 15 2019, 6:25 PM
source/simulation2/components/CCmpPathfinder_Common.h
177

Forgot to use that in this diff, but it'll be in the next.

wraitii edited the summary of this revision. (Show Details)Sep 15 2019, 6:42 PM

Successful build - Chance fights ever on the side of the prudent.

Linter detected issues:
Executing section Source...

source/ps/ThreadUtil.h
|  21| namespace·ThreadUtil
|    | [MAJOR] CPPCheckBear (syntaxError):
|    | Code 'namespaceThreadUtil{' is invalid C code. Use --std or --language to configure the language.

source/simulation2/components/ICmpObstructionManager.h
|  53| class·ICmpObstructionManager·:·public·IComponent
|    | [MAJOR] CPPCheckBear (syntaxError):
|    | Code 'classICmpObstructionManager:' is invalid C code. Use --std or --language to configure the language.

source/simulation2/components/CCmpPathfinder_Common.h
|  34| #include·"graphics/Overlay.h"
|    | [MAJOR] CPPCheckBear (syntaxError):
|    | Code 'template<...' is invalid C code. Use --std or --language to configure the language.

source/simulation2/components/CCmpObstructionManager.cpp
| 138| »   BEGIN_PROTECTED_DATA()
|    | [MAJOR] CPPCheckBear (unknownMacro):
|    | There is an unknown macro here somewhere. Configuration is required. If BEGIN_PROTECTED_DATA is a macro then please configure it.

source/ps/ThreadFrontier.h
|  28| class·ThreadFrontier
|    | [MAJOR] CPPCheckBear (syntaxError):
|    | Code 'classThreadFrontier{' is invalid C code. Use --std or --language to configure the language.

source/simulation2/helpers/VertexPathfinder.h
|  72| class·VertexPathfinder
|    | [MAJOR] CPPCheckBear (syntaxError):
|    | Code 'classVertexPathfinder{' is invalid C code. Use --std or --language to configure the language.

source/simulation2/helpers/Spatial.h
|   1| /*·Copyright·(C)·2016·Wildfire·Games.
|    | [NORMAL] LicenseYearBear:
|    | License should have "2019" year instead of "2016"

source/simulation2/helpers/Spatial.h
|  35| class·SpatialSubdivision
|    | [MAJOR] CPPCheckBear (syntaxError):
|    | Code 'classSpatialSubdivision{' is invalid C code. Use --std or --language to configure the language.

source/simulation2/helpers/LongPathfinder.h
|  34| ·*·TODO:·maybe·VC2012·will?
|    | [MAJOR] CPPCheckBear (syntaxError):
|    | Code 'template<...' is invalid C code. Use --std or --language to configure the language.

source/simulation2/serialization/SerializeTemplates.h
|   1| /*·Copyright·(C)·2016·Wildfire·Games.
|    | [NORMAL] LicenseYearBear:
|    | License should have "2019" year instead of "2016"

source/simulation2/serialization/SerializeTemplates.h
|  28| template<typename·ELEM>
|    | [MAJOR] CPPCheckBear (syntaxError):
|    | Code 'template<...' is invalid C code. Use --std or --language to configure the language.

source/simulation2/components/CCmpObstruction.cpp
|   1| /*·Copyright·(C)·2018·Wildfire·Games.
|    | [NORMAL] LicenseYearBear:
|    | License should have "2019" year instead of "2018"
Executing section JS...
Executing section cli...

Link to build: https://jenkins.wildfiregames.com/job/docker-differential/707/display/redirect

Stan added inline comments.Sep 18 2019, 1:12 PM
binaries/data/config/default.cfg
409

For some reason, the value does not appear in the box unless it's set manually (it just shows blank)