Page MenuHomeWildfire Games

BoolArray array wrapper.
Needs ReviewPublic

Authored by nani on Sep 22 2018, 2:06 AM.

Details

Reviewers
FeXoR
Group Reviewers
Restricted Owners Package(Owns No Changed Paths)
Restricted Owners Package(Owns No Changed Paths)
Summary

Array wrapper that uses uint16 array to store individual bits.

Has two extra methods useful for TileClass: https://code.wildfiregames.com/D1631

Test Plan

Use in Tileclass.js

Diff Detail

Lint
Lint Skipped
Unit
Unit Tests Skipped

Event Timeline

nani created this revision.Sep 22 2018, 2:06 AM
nani created this object with edit policy "nani".
nani changed the visibility from "Public (No Login Required)" to "nani".Sep 22 2018, 2:13 AM
nani changed the edit policy from "nani" to "All Users".
nani edited the summary of this revision. (Show Details)Sep 22 2018, 5:28 PM
nani changed the visibility from "nani" to "Public (No Login Required)".Sep 22 2018, 5:54 PM

Does our spidermonkey support Uint8Array/Uint32Array? It'd be good to compare them by performance/memory usage.

binaries/data/mods/public/maps/random/rmgen/BoolArray.js
2

A comment starts with a capital letter and a space before.

5

Math.ceil(size / 16) => (size + 15) >> 4.

Also some of your lines have semicolons, some don't. Use the same style for whole file. I assume we usually use semicolon always.

11

size / 16 => size & 0xF. The same below.

25

Follow the code convention for naming.

smiley added a subscriber: smiley.Oct 28 2018, 10:45 AM

Why not move this to globalscript/? There could be use cases outside rmgen (triggerscripts comes to mind)

nani updated this revision to Diff 6963.Nov 2 2018, 2:51 PM
nani added a comment.Nov 2 2018, 2:55 PM

Does our spidermonkey support Uint8Array/Uint32Array? It'd be good to compare them by performance/memory usage.

I checked the map gen times for 8 ,16 ,32 in multiple maps and haven't found any notable difference. Haven't checked the memory usage.

Stan added a subscriber: Stan.Nov 2 2018, 6:22 PM
Stan added inline comments.
binaries/data/mods/public/maps/random/rmgen/BoolArray.js
2

uint16_t I think.

41

++i

54

++i

Stan added a comment.Nov 2 2018, 6:23 PM

Your bitwise operations might be more costly than the potential gain.

nani added a comment.Nov 2 2018, 7:41 PM
In D1637#65843, @Stan wrote:

Your bitwise operations might be more costly than the potential gain.

While that might be true the methods isSomeNotZero and isSomeNotOne used in in TileClass.js do improve map generation time (tested 20% improvement average if i remember correctly).

Stan added a comment.Nov 2 2018, 8:13 PM

Sure I'm not saying this is is useless :)
You could cache the results of the bitwise operator in an array maybe. Array access will be faster than any computation.

elexis added a subscriber: elexis.Nov 3 2018, 9:01 AM
In D1637#65603, @smiley wrote:

Why not move this to globalscript?

If the feature proposal is accepted, it should be moved over there.

Does our spidermonkey support Uint8Array/Uint32Array?

Sure, but the problem is that you would have to go through the JS Interface for every read or write access if we use that.

It'd be good to compare them by performance/memory usage.

We need a dedicated test to check the performance gain. The trade-off between the advantages and disadvantages of code complexity and performance must be sufficient.
It could be a file that when executed, runs the same operation with both data structures and measures the time passed.

binaries/data/mods/public/maps/random/rmgen/BoolArray.js
54

(that's the coding convention we use, postfix incrementor only when the return value is read from)

58

(missing semicolon, since L48 is an assignment,)

Using hexadecimal for bitwise operations would make it more clearer and intuitive,

nani updated this revision to Diff 7071.Dec 26 2018, 5:17 PM
nani marked 10 inline comments as done.

Fix typos.

binaries/data/mods/public/maps/random/rmgen/BoolArray.js
2

@nani

On the importance of the feature:
we need to know how much memory and performance cost this influences.

The maximum mapsize that random map scripts can currently create is 512.
If one sets 1024, it will trigger an OOS error during the random map stage.

It would be desirable if we could use higher-resoluted grids to store TileClasses.
For example 4*4 subtiles for every tile.
This is necessary to place VisualActors more precisely and create maps that look similar to what LordGood keeps posting, see here https://wildfiregames.com/forum/index.php?/topic/25558-skirmish-maps-number-of-players/&do=findComment&comment=371734
But you can see that currently a giant map generates tens of seconds slower than a normal sized map. The same would happen if we would increase the resolution of the tileclass grids.

So it seems we have to replace the current code with code that is both faster and less memory consuming.

So I don't know if this is the case for this patch.
If it isn't, we probably don't want it.
If it is, the patch should not only introduce a prototype, but also replace either the previous code (if covering all previous features) or be used by the maps that don't use any feature that this one covers.

On the importance of having exactly one patch per feature:
It's confusing to split one feature across two patches: D1637 introduces exclusively unused code (hypothetical code), D1631 replaces used code and introduces it with a reference to non-existent code.
So it's one patch split across two revision proposals, but the patches should be self-contained.
I recall asking to split patches, but what should be split are the unrelated features:

  • Patch 1 implementes Feature/bugfix 1.
  • Patch 2 implements Feature/bugfix 2.
  • If patch 1 is committed, but patch 2 isn't committed, there should be no negative impact from having patch 1 committed but not patch 2. It should be possible to checkout any state of the svn repository without encountering a reference to an undefined function.

Unique feature 1 is the boolarray in D1637,
Unique feature 2 is everything in D1631 that is unrelated from the boolarray. For example the <= 27 change.

Since the code here is an intentional performance improvement and might allow for a gamechanging memory impact,
we need to be able to predict or measure the performance and memory impacts of each feature by itself,
since one feature might be good, but the other might be bad.

The measurement that I recall being taken was testing every of your diffs against none of your diffs.
So how can we conclude from that that the BoolArray wrapper is an improvement, if mathmatics say that it could be the <= 27 change that is so benefitial that it exceeds the worsening introduced by the BoolArray wrapper.

Merge conflicts are expected (both features having to change the same line), but that doesn't invalidate the importance of being able to judge the validity of each of each feature separately.

The performance and memory cost of the BoolArray wrapper could be negative, then the patch (but not the 27 thing) should be dismissed,
or it is a step towards being able to create LordGood grade random maps.

So we should not only determine this patch to be a step, but also the direction of the step. Can either measure, or try to argue from the memory use of the implementation.

On the importance of considering all alternatives:
Assume that the BoolArray wrapper is implemented and determined to be a bit faster, and that we want to implement LordGood grade random maps using higher resoluted tileclass grids (boolarrays).
Then exactly this area of code will be the bottleneck, won't it?
That might deny the entire design of both the current code and this patch (mightn't it? ?!)

We need to imagine, construct, determine, make up any possibility that would allow us a performance benefit of two orders of magnitude (100x speed hack) to achieve LordGood grade random maps using higher resoluted tileclass grids.
If I'm not mistaken, @smiley once proposed to involve C++ more into RandomMap generation. I don't recall in which aspect, but at first it sounds meh. It's a great advantage to have everything in JS. No need to compile again. Many newcomers can read JS, whereas many newcomers have problems considering even to compile. It also allows writing code much faster if compilation (JIT) finishes about instantly. People had spent a lot of time moving RandomMap code from C++ to JS to gain these freedoms. So it would be reversing a decision and then reversing again. I.e. there has to be a very good reason to go back to C++.

@vladislavbelov How can we implement a 2D bool array (rasterization of an an arbitrary area) with the best performance and memory-footprint? We need it to be 100x faster and use 16x less memory!! C++ or JS as possible
If that is a problem we can't solve, we need to change the problem!?

Since I was mentioned here and all of this has been discussed with the author before.
Using this would be a significant improvement. Especially for tileclasses. Since it allows eliminating up to 16 tiles in one iteration. This was confirmed by both of us.
Regarding the 27. Just revert the commit that introduced it. Only rangeOp existed initially. And the mismatch issue was found. (I could not reproduce the mismatch you mentioned on water maps, which really does not make any sense to me).
And I proposed moving the tileclasses to c++.

smiley removed a subscriber: smiley.Mar 31 2019, 11:11 PM
Stan added inline comments.Mar 31 2019, 11:16 PM
binaries/data/mods/public/maps/random/rmgen/BoolArray.js
48

Not sure but maybe those functions could use a slice, and array.some()

Stan added reviewers: Restricted Owners Package, Restricted Owners Package.Mar 31 2019, 11:18 PM
elexis added a subscriber: smiley.Apr 1 2019, 10:40 AM
In D1637#73726, @smiley wrote:

Since I was mentioned here and all of this has been discussed with the author before.

Thanks for the requested response (I only recalled you spoke about doing something with C++, but not which aspect would be moved.)

Using this would be a significant improvement. Especially for tileclasses. Since it allows eliminating up to 16 tiles in one iteration.

Which iteration do you mean specifically?

Currently TileClass comes with

this.inclusionCount[i] = new Int16Array(size); //int16
this.rangeCount[i] = new RangeOp(size);

This revision proposal uses an array of an u16-array, the previous code with RangeOp removed also uses only an array of an u16 array?
So the question is whether there is benefit further than the rangeOp removal.

Regarding the 27. Just revert the commit that introduced it. Only rangeOp existed initially.

Reverting the commit is an action, but first we need to establish that this is correct and the best choice among alternatives. First the reasons, then the actions. We discussed all of this few months before, but the reasons should be in this document.
If we discovered all possible choices, discovered all implications of the choices well, the argument will speak for itself which of the choices we should use. If we post an imperative, it may be perceived as a personal request, but if the arguments speak for themselves, the comment poster should be out of the question. Just in general. Of course nothing stops people from still considering this a personal request after having posted a good argument. ¯\_(ツ)_/¯

The question is whether rangeOp could provide value, or whether it was a too hypothetical idea that covers truly no use case.
IIRC it could only allowed a DensityConstraint. One could keep the fancy complex code instead of deleting it, fixing it where needed, making it optional, and add a DensityConstraint, try to come up with a map that uses it, and show that the artistic value added is worthy (not only added to make the code used). I think nani went to that place already, but I didn't check the last step yet.

There are different aspects, the 27 condition, the rangeOp, and the ability to add a tileclass multiple times per square. I need to run the code again, it's not so easy to tell from a webbrowser, but I see the tool of reverting something being likely indeed.

And the mismatch issue was found. (I could not reproduce the mismatch you mentioned on water maps, which really does not make any sense to me).

There is a mismatch between the < 27 and >= 27 code path, creating different maps, nani had investigated it too. It would be good to know which one of the two is wrong in order to fix it if it's the one we keep. IIRC One of the two algorithms included the last item of every line of an area, the other one excluded it.
I might have talked about another mismatch, but I don't recall something water maps.

And I proposed moving the tileclasses to c++.

So the question is

  • whether it's feasible at all (may not result in a JS/C++ interface overhead bottleneck), and if it is,
  • how much it would add (more than nanis patch? so much more that it's worth sacrificing the JS-pureness?).
  • whether the TileClass prototype would be implemented in C++ or whether only a 2D bool array type would be implemented in C++. Both might or might not be promising, JSInterface_IGUIObject gave an example how to define JS object types in C++.

If C++ is truly better, then this patch may be a significant improvement that would still be wrong.
If C++ would add overhead, then we may consider alternative JS magic and possibly conclude this patch to be the ideal solution.

There is a mismatch between the < 27 and >= 27 code path, creating different maps, nani had investigated it too. It would be good to know which one of the two is wrong in order to fix it if it's the one we keep.

I guess you would be out of the loop on this one. But the 27 mismatch was found. And I failed to find any mismatches. But nani claimed that he found mismatches in water maps (only). (This was independent work from both sides, so it is hard to establish conclusions).
Regarding, rangeOP, I have plans to replace it with something working on the same principle but much clearer. But, that is not likely in the near future.

whether it's feasible at all (may not result in a JS/C++ interface overhead bottleneck), and if it is,

Back then, I started implementing it in a way which does not expose any tile class to js.
It would have worked something like,

Engine.AddTileClass(ID, size);
Engine.AddToTileClass(ID, [p, q, r]);
Engine.TileClassGetPointsInRadius(ID, p, radius);

And a C++ Tileclass which gets added to a map or something accessed via ID.
Did not gave much thought to the design implications and to be honest never wrote much either. The end result could have been pretty terrible for all I know.

elexis added a comment.Apr 1 2019, 3:09 PM
In D1637#73755, @smiley wrote:

There is a mismatch between the < 27 and >= 27 code path, creating different maps, nani had investigated it too. It would be good to know which one of the two is wrong in order to fix it if it's the one we keep.

I guess you would be out of the loop on this one. But the 27 mismatch was found. And I failed to find any mismatches. But nani claimed that he found mismatches in water maps (only). (This was independent work from both sides, so it is hard to establish conclusions).

It depends on how one tests, if one tests on a map where all distances are smaller than 27, then it obviously creates the same results as the code where the <= 27 condition is replaced with true.
What one needs to compare is mapgen result of the code where the condition is always true with the result where the condition is always false.
Pretty certain that optimization gave different results than unpatched vanilla code on Jebel Barkal, otherwise I would have used it to play.

The end result could have been pretty terrible for all I know.

Terribly bad, terribly good, or anything in between. We will find out sooner or later if we want 0 A.D. maps to reach the next level. Yay future! or something

smiley added a comment.EditedApr 1 2019, 3:20 PM

Well yes. The current TileClass.js uses wrong values for the rangeOP. The way I tested it, there would have been no false results like that. That was how the error was found, so I guess the methodology is correct.

nani added a comment.Apr 4 2019, 7:22 PM

This diff does two things: reduce the size of the memory allocated needed for the array and exploit the 16 bit type values to to check the binary information of upto 16 tiles in one go.
Would improve generated maps times by 3~4x times for all sizes if combined with the other diff of TileClass.

how much it would add (more than nanis patch? so much more that it's worth sacrificing the JS-pureness?).

How much of a gain do you think it needs to achieve to make it worth sacrificing the JS-pureness? (answer expected as a number, 100%, 200%?)

elexis added a comment.Apr 5 2019, 2:16 AM

Would improve generated maps times by 3~4x times for all sizes if combined with the other diff of TileClass.

The factor 3-4x is still misleading nani (see post above).
If the Boolwrapper would make it 50% slower than the current implementation, then the other patches combined could be 6-8x as fast.
x + y + z = 3 doesn't tell us that x > 0, it could even be x = -100, y = 101, z = 2.

In D1637#74131, @smiley wrote:

How much of a gain do you think it needs to achieve to make it worth sacrificing the JS-pureness? (answer expected as a number, 100%, 200%?)

100% is a lot, if a map generation time can be reduced from 30s to 15s with that, that would put the threshold for an argument not to use that quite high up.
But for 15% I'm not so convinced that it's necessary to sacrifice JS pureness.
A percent number would already imply a constant factor, but it could also scale differently.
The disadvantage of using C++ would have to be so bad that it's even worth things taking twice as long.
I can't predict whether the JS/C++ overhead will make it worse or not.
We can only say for sure if we see can compare two different implementations, compare and derive all consequences of the patch (including cost of developers to become familiar with this, editing this, etc.).
The question is also whether we can achieve higher rmgen quality with JS improvements, so that 100% tileclass performance improvement would only be a difference between 3 seconds and 5 seconds mapgen time.
If you think about trying the C++ implementation, the procedural style using RegisterScriptFunction can be exchanged for object-oriented style, defining the prototype in C++ like in JSInterface_IGUIObject.cpp, so that JS can continue to use var clFoo = new TileClass();. (This might or might not be faster too.)
Notice that if someone was to increase the resolution to go beyond 1 class per tile, one might consider renaming tileclass then (-> JS pureness).
Probably will never find out what's best without having implemented it, and then it would be frustrating to ditch it once it works.

smiley added a comment.Apr 5 2019, 7:40 AM

The tileclass optimization patch is using this patch only. So, the performance improvement is a direct result of using this. Or we can always write a new data structure that can eliminate half the remaining grid on each iteration.

nani added a comment.EditedApr 5 2019, 1:41 PM

@elexis

MapSizesvnsvn + D1637 + D1631time reduction factor
MainlandNormal7.2 s2.3s3.1
MainlandGigant26.5 s5.8s4.5

As for memory consumption:
If we suppose spidermonkey was using 32 bit integer (I know javascirpt doesn't have ints but most probably uses a int32 for this code) for each array value and now we use 1 bit then the memory used is 1/32 of the original code.

Stan added inline comments.Apr 5 2019, 1:49 PM
binaries/data/mods/public/maps/random/rmgen/BoolArray.js
48

Wonder if that's slower or faster but that function can be replaced by

return this.arr.slice(a >> 4, b >> 4).some(bit => ~bit);

and the function above by

return this.arr.slice(a >> 4, b >> 4).some(bit => bit);
smiley added inline comments.Apr 5 2019, 1:54 PM
binaries/data/mods/public/maps/random/rmgen/BoolArray.js
48

slice creates a new Array, so slower. Don't know by how much.

nani added inline comments.Apr 5 2019, 2:31 PM
binaries/data/mods/public/maps/random/rmgen/BoolArray.js
48

Slice doesn't include last index
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/slice

and slice creates a little bit of overhead (not 100% sure), as this function is called a massive number of times is important to use just keep it down to a minimum.

Also, as smiley says, slice "creates" (some kind of reference/pointer I suppose) a new subarray so it potentiates overhead.

Thanks for the input anyway :)

Stan added inline comments.Apr 5 2019, 2:56 PM
binaries/data/mods/public/maps/random/rmgen/BoolArray.js
48

You're welcome, both codes looked a lot alike, so I was looking for a way to simplify them :)

smiley added a comment.Apr 5 2019, 4:47 PM

A giant mainland map doesn’t really prove it. A large grid with a few points is what mainland offers and is exactly what the bool array method excels at. I wonder how it would perform in a grid where it would have to go to the bit-by-bit check quite frequently.

elexis added a comment.Apr 5 2019, 5:37 PM
In D1637#74229, @nani wrote:
MapSizesvnsvn + booArray + tileClass patchtime reduction factor

"svn + booArray + tileClass patch" sounds a lot like measuring multiple patches at a time, and I'm not sure what "tileClass patch" refers to, in particular whether it includes the 27 diff and other logic changes.

We should have one patch per feature and measure one feature at a time.

smiley added a comment.Apr 5 2019, 5:40 PM

Tileclass patch refers to actually using this thing. And there are a few cleanups in there like nuking rangeOP. But the whole performance improvement is the direct result of solely using this.

smiley added a comment.Apr 5 2019, 6:56 PM

(I assume you have checked the other sizes. There is a tradeoff between faster iteration and the number of false negatives.)

smiley added a comment.Apr 5 2019, 7:38 PM

But for 15% I'm not so convinced that it's necessary to sacrifice JS pureness.

Somewhere between 30% (most cases) to 50% faster.
Of course, it depends on how much mapgen is spent in tileclasses. Some maps spend more times in placers. But for a giant Ambush map, the 30% was 86 seconds. So, mixed feelings about this. But, you never know until you write code.

Somewhere between 30% (most cases) to 50% faster.

Thanks for testing it, sounds promising. I will try too once I open my IDE again and closed some hundred revision proposals...

But for a giant Ambush map, the 30% was 86 seconds

Ambush is an animal. Canyon too.

21:20 < nani0> btw vladislavbelov[ how do you organize all your current projects (pending patches), do you use git branches or just save *.diffs?

(Branches are useful if you have multiple patches that depend on each other, so you can rearrage the commits and pop one of them at a time into the review queue. If it's a standalone patch that doesn't have merge conflicts with others, you can create a revision proposal or paste and then there is one backup of it. Also I used a messy directory per year/month/day for 0ad notes, abandoned patches etc.)

smiley added a comment.Apr 7 2019, 6:28 PM

(Maybe add an iterator?, for something like tileclasses, it may not be relevant, but I assumed this was not to be restricted rmgen/ )

In D1637#74458, @smiley wrote:

(Maybe add an iterator?, for something like tileclasses, it may not be relevant, but I assumed this was not to be restricted rmgen/ )

Can't really speak on hypotheticals. But less really speaking on it: Bool-arrays would be generic, so they might have a use case for iterators, but what would be the use case for TileClasses outside of rmgen/? Also depends on how the interface would look like, since the class would mostly be accessed by JS, not C++, unless you're planning for C++ rmgen support, which maybe thinkable, but sounds a bit hardcore (as the map would have to be compiled into the binary or into a separate binary, or shared library). I would guess JS has legitimate reason to iterate over a tileclass grid, but dunno if that would be faster to accomplish via next() prev() JS functions rather than accessing the N'th position?

Mostly we need to know whether TileClasses must follow JS or C++ (comparing the most performant JS with the most performant C++). I have reason to take some measurements, which I can accomplish as soon as I finished my stuff journey, run this application again and fixed defects. Otherwise the other participants remain free to use the service to share or produce more information to help decide on the case.

smiley added a comment.EditedApr 12 2019, 8:14 AM

80% improvement on some maps. Already decided. As mentioned before, js “pureness” is a moot point. Take what you can.
Btw, JS iterators only have a next() function.

Stan added a reviewer: FeXoR.Sep 10 2019, 10:31 AM