Details

Reviewers

Sandarac
elexis
bb

Commits

rP20123: Improve wolve target-finder performance

Summary

There are still lagspikes when having many organic NonGaia entities around when wolves are spawned.
Doing some time measuring shows that the GetNonGaiaEntities call only consumes about 3 milliseconds,
but the sorting of entities by distance and sending of commands consumes about half a second.

Tested using ExecQueryAroundPos from the RangeManager (which is also used by the Damage component) as discussed in IRC a couple of days ago in IRC
but that only ended up in 30 second freeze per wave!

Instead found two other optimizations:

We only need to compare the 2D coordinates, as the height is entirely irrelevant on this flat map.
We can more than reasonably assume that all wolves spawned on the same spawn point have the same position.

The optimization initially proposed in D184 to reduce targetCount from 3 to 2 didn't have any measurable performance impact at all!

Test Plan

Read the patch, test quickly that it doesn't break.
If you don't believe me that it enhances the performance, embrace the loop with var t1 = Date.now(); and warn("Took " + (Date.now() - t1));,
press F9, type Engine.SetSimRate(20); and spawn some units with the gift from the gods cheat on a normal or larger map.

My test results:
First run:

WARNING: Took 401 WARNING: Took 428

Second run:

WARNING: Took 385 WARNING: Took 407

Third run:

WARNING: Took 392 WARNING: Took 427

Same replay with the first optimization only:

First run:

WARNING: Took 300 WARNING: Took 319

Second run:

WARNING: Took 297 WARNING: Took 316

Same replay with second optimization only:

First run:

WARNING: Took 309 WARNING: Took 298

Second run:

WARNING: Took 313 WARNING: Took 290

Same replay with both optimizations:

First run:

WARNING: Took 241 WARNING: Took 221

Second run:

WARNING: Took 235 WARNING: Took 222

With both optimizations and targetCount = 2 instead of 3:

WARNING: Took 235 WARNING: Took 257

WARNING: Took 244 WARNING: Took 252

Diff Detail

Repository

rP 0 A.D. Public Repository

Lint

Automatic diff as part of commit; lint not applicable.

Unit

Automatic diff as part of commit; unit tests not applicable.

Event Timeline

elexis created this revision.Mar 3 2017, 12:13 AM

elexis edited the test plan for this revision. (Show Details)Mar 3 2017, 12:14 AM

NOTFASTER2 KBDownload

for the ones who want to test the ExecuteQueryAroundPos freeze

Remove early continue and warn because the trigger helper function always spawns the units, even if the spawn point is entirely blocked.
Therefore the attackerPosition can also be moved outside of the function.

Build is green

Updating workspaces.
Build (release)...
Build (debug)...
Running release tests...
Running cxxtest tests (302 tests)..............................................................................................................................................................................................................................................................................................................OK!
Running debug tests...
Running cxxtest tests (302 tests)..............................................................................................................................................................................................................................................................................................................OK!

http://jw:8080/job/phabricator/458/ for more details.

Harbormaster completed remote builds in B668: Diff 690.Mar 3 2017, 4:26 AM

Build is green

Updating workspaces.
Build (release)...
Build (debug)...
Running release tests...
Running cxxtest tests (302 tests)..............................................................................................................................................................................................................................................................................................................OK!
Running debug tests...
Running cxxtest tests (302 tests)..............................................................................................................................................................................................................................................................................................................OK!

http://jw:8080/job/phabricator/459/ for more details.

Harbormaster completed remote builds in B669: Diff 691.Mar 3 2017, 6:05 AM

Sandarac requested changes to this revision.Mar 3 2017, 7:08 AM

Sandarac added inline comments.

binaries/data/mods/public/maps/random/polar_sea_triggers.js
8 ↗	(On Diff #691)	This is too much of a decrease for `firstWaveTime`; from a quick test, it is clear that AI players are overwhelmed quickly.

This revision now requires changes to proceed.Mar 3 2017, 7:08 AM

Remove debug hunk. Fail. :-)

Build is green

Updating workspaces.
Build (release)...
Build (debug)...
Running release tests...
Running cxxtest tests (302 tests)..............................................................................................................................................................................................................................................................................................................OK!
Running debug tests...
Running cxxtest tests (302 tests)..............................................................................................................................................................................................................................................................................................................OK!

http://jw:8080/job/phabricator/462/ for more details.

Harbormaster completed remote builds in B674: Diff 698.Mar 3 2017, 3:46 PM

DistanceBetweenEntities doesn't use distanceToSquared, but if that sqrt operation is neglible, we should definitely use that helper function

DistanceBetweenEntities is actually relatively slower. But these values are microseconds, so we can still use the more readable code:

WARNING: 1: 821
WARNING: 2: 2003
WARNING: 1: 957
WARNING: 2: 1956
WARNING: 1: 928
WARNING: 2: 1942
WARNING: 1: 932
WARNING: 2: 2021
WARNING: 1: 977
WARNING: 2: 2071
WARNING: 1: 989
WARNING: 2: 1838
WARNING: 1: 846
WARNING: 2: 1793

D204 suffered from spawn lag when the footprint component tried to find spawn points while they were totally unneeded (as the units were garrisoned right afterwards):
(some previous alpha releases also had "shunk ship lag" for the same reason).
Removing the footprint spawn point lookup and just spawning the wolves exactly at the trigger point further improves performance:

with footprint:

WARNING: Spawned 138 in 65ms
WARNING: Spawned 207 in 107ms
WARNING: Spawned 207 in 121ms
WARNING: Spawned 138 in 89ms

without footprint:

WARNING: Spawned 138 in 60ms
WARNING: Spawned 138 in 53ms
WARNING: Spawned 207 in 100ms
WARNING: Spawned 69 in 33ms
WARNING: Spawned 138 in 60ms

Not a performance optimization, but wolves should really not spawn right besides mines, so that the player can build a wall around them

From the gameplay and design point of view (I won't speak of code), I think there should be really more wolves and more powerfull ones but spamming them more or less randomly (and really big wolves hord roaming between players). But that's just a personal feeling.

elexis mentioned this in D204: Danubius Random Map Script with TriggerScripts + DoRepeatedly Trigger function.Apr 18 2017, 10:21 PM

temple mentioned this in D850: Use subdivision stuff in obstruction manager shape tests.Sep 1 2017, 2:42 AM

The first thing I noticed was this line:

	for (let target of targets.sort((ent1, ent2) => getDistance(ent1) - getDistance(ent2)).slice(0, targetCount))

It sorts a huge vector but then only grabs the first few elements, which seems pretty wasteful. And it does that for every attacker.
I wrote a quick ugly thing to skip the sorting and it looks like it has an impact. Here's an a22 game:

And here's a game with D185 and Petras (since I didn't have any replays) where I tried to replicate the settings of the other game (3v3, medium, 200 pop, 30m). They never got their populations up to the same level, so the targets array wasn't as long and there wasn't as much savings:

The hashes were okay in both. Here's the quick ugly thing:

let good_targets = [];
let good_dists = [];
for (let target of targets)
{
	let dist = getDistance(target);
	let i = good_dists.findIndex(element => { return dist < element; });
	if (i != -1 || good_targets.length < targetCount)
	{
		if (i == -1)
			i = targetCount;
		good_targets.splice(i, 0, target);
		good_dists.splice(i, 0, dist);
		if (good_targets.length > targetCount)
		{
			good_targets.pop();
			good_dists.pop();
		}
	}
}

for (let target of good_targets)
	ProcessCommand(0, {
		etc.

Finding the closest targets for each attacker isn't possible without looking at each possible target, right?
So if I understand your patch correctly, it still looks at all entities but doesn't create a sorted array but instead just stores the closest N targets in an array? sounds good. Thanks for the hint.
Also element => { return dist < element; } can be written as element => dist < element.

(Perhaps we could also do a range query with a fourth of the mapsize and then do check the entire mapsize if that fails. Not sure if it's worth the code though.)

In D185#33613, @elexis wrote:

So if I understand your patch correctly, it still looks at all entities but doesn't create a sorted array but instead just stores the closest N targets in an array?

Yep.

(Perhaps we could also do a range query with a fourth of the mapsize and then do check the entire mapsize if that fails. Not sure if it's worth the code though.)

You mean adding a query for every attacker? My original thought is that that would slow things down too much since there's so many attackers, but I tested and it's faster than even my sorting fix.

In the NOTFASTER patch there was a huge typo, for (let target of targets.splice(targetCount)) instead of for (let target of targets.slice(0, targetCount)). Because splice returns the removed objects rather than the remaining ones, you were actually attacking every target except the first three! Of course the 10000m radius was a bit much, too.

Here I search for targets within a 200m radius (210m is a civic center in town phase). In my replay they didn't need more than that until turn 3259 = 163 on the graph, after the game was basically over (Edit: Not quite over, but it was 3v2 at that point, which is why there was an area where wolves weren't near to a player). (It was 3v3 on a medium map.) If the 200m radius failed I used all targets plus the sorting fix, so the hashes still match.

I didn't test any other radii, but maybe even a smaller value would be reasonable? Or letting wolves be idle if there's no targets nearby? But e.g. looking around 75 or 145, it seems maybe there's not really any more lag to eliminate; the remaining spikes are there because there's more units and more activity.

Here's the final version I used. I only tested it with the medium 3v3 Petra game.

polar_sea_triggers.js3 KBDownload

In D185#33735, @temple wrote:

You mean adding a query for every attacker?

One query for each group of attackers (since a bunch of wolves are spawned at the same position).

huge typo

Oh dear, splice vs slice again.

If a player had been wiped out, the wolves spawned down there should be able to move to the other side of the map.
So doing a 200m radius search first and then a complete query would be ok.
The graphis look however like the range queries are negligible.
So that optimization is only useful if its like < 5 lines of code.
Oh I copied that already from some damage dealing component?
If you want to upload a patch combining all the performance optimizations we have figured out, you can commandeer the revision proposal as you can see that Im not too motivated to work on this patch :P

Okay then :)

I did a 4v4 multiplayer, used "gift from the gods" cheat to make a lot of units for each player, and had waves starting at 2m and every 1m after that. The units were idling so the speed gains here might be exaggerated compared to a normal game. I was moving around building stuff during the first wave (ComputeShortPath), that's why that's taller than the others.

All four curves below group the attackers by spawn point and end up finding the same targets (same hashes). The first one sorts a list of all targets to find the closest three, the second one uses EntitiesNearPoint() with a 10000m radius to get a sorted list of targets, the third one is similar to the first but avoids the sort, and the fourth one uses EntitiesNearPoint() with a 200m radius and then does the third one if that fails. I deleted two players before the last two waves, so wolves in that area were far from targets.

The big difference is between the first curve and the others. But since "both" seems to be better than the other two and the code's already written, why not use it? I guess the code's messier so that's one reason. I don't have a strong feeling one way or the other.

Build is green

Updating workspaces.
Build (release)...
Build (debug)...
Running release tests...
Running cxxtest tests (306 tests)..................................................................................................................................................................................................................................................................................................................OK!
Running debug tests...
Running cxxtest tests (306 tests)..................................................................................................................................................................................................................................................................................................................OK!
Checking XML files...

http://jenkins-master:8080/job/phabricator/1959/ for more details.

Harbormaster completed remote builds in B674: Diff 698.Sep 3 2017, 8:12 PM

bb requested changes to this revision.Sep 5 2017, 5:53 PM

bb added a subscriber: bb.

bb added inline comments.

binaries/data/mods/public/maps/random/polar_sea_triggers.js
38–41 ↗	(On Diff #3470)	not sure we we do this here and not only when it is used since i guess the 200 m will work in most cases and then its useless. (or we only make the let here and only execute the query when we need it and its not done yet)
57 ↗	(On Diff #3470)	rangeManager
63 ↗	(On Diff #3470)	space and name goodTargets
65 ↗	(On Diff #3470)	dunno if it help but the ones we already found will be used anyway so we don't need to find them, probably add them already or so?
68 ↗	(On Diff #3470)	store this earlier since we use it above too (L58)
77 ↗	(On Diff #3470)	goodDists

This revision now requires changes to proceed.Sep 5 2017, 5:53 PM

elexis added inline comments.Sep 5 2017, 6:01 PM

binaries/data/mods/public/maps/random/polar_sea_triggers.js
57 ↗	(On Diff #3470)	Why camel case? That format is used for variable names, so if we refer to one, it should be cmpRangeManager. If you want to use proper english, it'd be RangeManager, but the common noun would be ok for me too.
68 ↗	(On Diff #3470)	(thats indeed a reason to request changes in a performance optimization patch)

updated per some comments

temple marked 7 inline comments as done.Sep 5 2017, 8:37 PM

temple added inline comments.

binaries/data/mods/public/maps/random/polar_sea_triggers.js
65 ↗	(On Diff #3470)	I guess, but it's a rare case when one or two units are nearby but not three, and then we have to look through all targets anyway so what's an extra one or two, and we'd have to test if each is already in the goodTargets list. So I didn't include this suggestion.

elexis added inline comments.Sep 5 2017, 8:41 PM

binaries/data/mods/public/maps/random/polar_sea_triggers.js
62 ↗	(On Diff #3520)	space
67 ↗	(On Diff #3520)	Don't need to distinguish multiple falsy value here right? so !allTargets (can be done by bb if the rest is good)

temple updated this revision to Diff 3521.Sep 5 2017, 9:06 PM

temple marked 2 inline comments as done.

bb added inline comments.Sep 5 2017, 10:11 PM

binaries/data/mods/public/maps/random/polar_sea_triggers.js
64 ↗	(On Diff #3521)	aren't perhaps
89 ↗	(On Diff #3521)	I guess this is not correct since what if the first found unit is the closest unit then we put that one on targetCount and push the rest after that, so we get [,,foo,foo2,foo3...fooN] I guess `i = goodTargets.length` but pls verify

temple updated this revision to Diff 3524.Sep 5 2017, 10:43 PM

temple added inline comments.Sep 5 2017, 10:49 PM

binaries/data/mods/public/maps/random/polar_sea_triggers.js
89 ↗	(On Diff #3521)	Splice adds to the end of the array if the starting point is too big rather than creating blank elements, e.g. it does [foo] rather than [,,foo]. But that's not obvious so I've changed it.

performance improvement noticed code looks sane and didn't saw any more performance improvements quickly (nor will see after looking hours on the patch :P).

Closed by commit rP20123: Improve wolve target-finder performance (authored by bb). · Explain WhySep 6 2017, 11:00 AM

This revision was automatically updated to reflect the committed changes.

Thanks for the patch and review!

Performance optimization for Polar Sea spawning
ClosedPublic
Actions

Details

Diff Detail

Event Timeline

Revision Contents
Changeset List

Diff 3537

ps/trunk/binaries/data/mods/public/maps/random/polar_sea_triggers.js

Performance optimization for Polar Sea spawningClosedPublicActions

Details

Diff Detail

Event Timeline

Revision ContentsChangeset List

Diff 3537

ps/trunk/binaries/data/mods/public/maps/random/polar_sea_triggers.js

Performance optimization for Polar Sea spawning
ClosedPublic
Actions

Revision Contents
Changeset List