Details
See it compiles fine, doesn't trigger weird behaviours, and also that the sound produced by one unit is always the same.
Diff Detail
- Repository
- rP 0 A.D. Public Repository
- Lint
Lint Skipped - Unit
Tests Skipped - Build Status
Buildable 7152 Build 11663: Vulcan Build Jenkins
Event Timeline
source/soundmanager/scripting/SoundGroup.cpp | ||
---|---|---|
190 | What's about static? |
- Make rng static to cut the compute time by half
- Use a custom seed only when cmpVisual is set
Successful build - Chance fights ever on the side of the prudent.
Link to build: https://jenkins.wildfiregames.com/job/differential/1179/display/redirect
Successful build - Chance fights ever on the side of the prudent.
Link to build: https://jenkins.wildfiregames.com/job/differential/1181/display/redirect
Maybe we should use boost after all https://stackoverflow.com/questions/20998470/random-numbers-c11-vs-boost
It makes sense since we use boost::mt19937 everywhere. And at sometime we may replace it at once in all places. Could you compare? I.e. you can add your sample to our tests and you'll be able to compile with boost.
Thanks for the tip, I was able to make a new test (I had some troubles with static variables XD)
Btw should I add the static variable to the class like the test or keep it global ?
STL seems faster on my machine
RandFloat(float min, float max): 0.672794 seconds RandFloatBoost(float min, float max): 26.7208 seconds RandFloatStl(float min, float max): 22.1563 seconds RandFloat(float min, float max): 0.792204 seconds RandFloatBoost(float min, float max): 27.6625 seconds RandFloatStl(float min, float max): 23.7849 seconds
Updated outside script https://pastebin.com/677SM52N with more profiling (now with clang-cl & clang++ https://chocolatey.org/packages/llvm on windows) (script: https://pastebin.com/54DBNbTL)
Compiler | No Optimization | O1 | O2 | O3 | OFast/Ox |
---|---|---|---|---|---|
cl.exe RandFloat(float min, float max) | 0.667871 | 0.550605 | 0.571285 | / | 0.478461 |
cl.exe RandFloatStl(float min, float max) | 102.4880 | 24.23740 | 22.25870 | / | 22.49370 |
mingw32-g++.exe RandFloat(float min, float max) | 0.839638 | 0.342533 | 0.389668 | 0.283480 | 0.281247 |
mingw32-g++.exe RandFloatStl(float min, float max) | 152.8700 | 53.70820 | 53.06500 | 47.0674 | 47.91880 |
clang++.exe RandFloat(float min, float max) | 0.970688 | 0.555659 | 0.286339 | 0.340232 | 0.291435 |
clang++.exe RandFloatStl(float min, float max) | 48.37020 | 23.80280 | 23.25420 | 23.29690 | 21.67440 |
clang-cl.exe RandFloat(float min, float max) | 0.635102 | 0.438460 | 0.314780 | / | 0.268367 |
clang-cl.exe RandFloatStl(float min, float max) | 44.04800 | 21.94620 | 22.59220 | / | 22.25470 |
Warning by clang
file.cpp(61,21): warning: declaration requires a global constructor [-Wglobal-constructors] static std::mt19937 m_StlRng;
mt19937 seems ok for me, but I'm worrying about performance. Could you test how many calls of the UploadPropertiesAndPlay function we have on the biggest map with biggest number of units per second?
@vladislavbelov Three questions
How do you do get the frame reset ?
source/soundmanager/scripting/SoundGroup.cpp | ||
---|---|---|
40 | Should this be SoundGroup::g_Rng ? | |
43 | Should this be SoundGroup::RandFloat ? |
Successful build - Chance fights ever on the side of the prudent.
Link to build: https://jenkins.wildfiregames.com/job/differential/1185/display/redirect
Diff to test Replay
- Use srand(seed) instead. According to my tests it works way faster. (See below)
- We don't care about network generating other numbers because sounds should be out of the simulation.
clang++ -02 base time: 0.000382867 seconds srand time: 0.000917521 seconds std::mt19973 time: 0.0250902 seconds
clang-cl -02 base time: 0.000463292 seconds srand time: 0.000489345 seconds std::mt19973 time: 0.0320702 seconds
msvc /02 base time: 0.000429687 seconds srand time: 0.00112821 seconds std::mt19973 time: 0.0217086 seconds
mingw -02 base time: 0 seconds srand time: 0 seconds std::mt19973 time: 0.051875 seconds
srand(3); float value2 = RandFloat(min, max); srand(3); float value3 = RandFloat(min, max); TS_ASSERT(value3 == value2); // True srand(2533652); float value4 = RandFloat(min, max); srand(2533652); float value5 = RandFloat(min, max); TS_ASSERT(value4 == value5); // True TS_ASSERT(value3 != value5); //False
Successful build - Chance fights ever on the side of the prudent.
Link to build: https://jenkins.wildfiregames.com/job/differential/1189/display/redirect
source/soundmanager/scripting/SoundGroup.cpp | ||
---|---|---|
187 | It's not good to use srand here, because there are other clients of rand(). |
@wraitii Any idea on what could be faster than a MT19973 ? srand() would have been perfect but we cannot use it.
I mean if we don't really care about actual randomness (which it seems we don't), maybe something from there: https://stackoverflow.com/questions/1640258/need-a-fast-random-generator-for-c
Like this should be super-fast and should be enough:
inline int fastrand() { g_seed = (214013*g_seed+2531011); return (g_seed>>16)&0x7FFF; }
Or maybe look at those, they are also a few LOC:
https://arvid.io/2018/07/02/better-cxx-prng/
cl.exe file.cpp /fp:precise file.cpp Microsoft (R) Incremental Linker Version 14.00.24245.0 Copyright (C) Microsoft Corporation. All rights reserved. /out:file.exe file.obj time: 0.114689 seconds time: 0.00824674 seconds time: 0.166148 seconds time: 18.0688 seconds cl.exe file.cpp /EHsc /O1 /fp:precise file.cpp Microsoft (R) Incremental Linker Version 14.00.24245.0 Copyright (C) Microsoft Corporation. All rights reserved. /out:file.exe file.obj time: 0.11172 seconds time: 0.00184184 seconds time: 0.119409 seconds time: 4.03114 seconds cl.exe file.cpp /EHsc /O2 /fp:precise file.cpp Microsoft (R) Incremental Linker Version 14.00.24245.0 Copyright (C) Microsoft Corporation. All rights reserved. /out:file.exe file.obj time: 0.0937283 seconds time: 0.000168023 seconds time: 0.119085 seconds time: 3.99642 seconds cl.exe file.cpp /EHsc /Ox /fp:precise file.cpp Microsoft (R) Incremental Linker Version 14.00.24245.0 Copyright (C) Microsoft Corporation. All rights reserved. /out:file.exe file.obj time: 0.0864048 seconds time: 0.00019974 seconds time: 0.115145 seconds time: 4.15333 seconds clang++ -Xclang -flto-visibility-public-std -Wall -Wextra -Wpedantic file.cpp -o file.exe time: 0.182753 seconds time: 0.01829 seconds time: 0.199486 seconds time: 8.68231 seconds clang++ -Xclang -flto-visibility-public-std -Wall -Wextra -Wpedantic -O1 file.cpp -o file.exe time: 0.0987724 seconds time: 0.0103253 seconds time: 0.121672 seconds time: 4.25378 seconds clang++ -Xclang -flto-visibility-public-std -Wall -Wextra -Wpedantic -O2 file.cpp -o file.exe time: 0.0754372 seconds time: 3.78e-07 seconds time: 0.100211 seconds time: 4.16511 seconds clang++ -Xclang -flto-visibility-public-std -Wall -Wextra -Wpedantic -O3 file.cpp -o file.exe time: 0.0575376 seconds time: 0 seconds time: 0.0823314 seconds time: 4.14444 seconds clang++ -Xclang -flto-visibility-public-std file.cpp -Wall -Wextra -Wpedantic -Ofast -o file.exe time: 0.0538619 seconds time: 0 seconds time: 0.0858678 seconds time: 4.14739 seconds clang-cl file.cpp -o file.exe -Wall -Wextra -Wpedantic time: 0.123457 seconds time: 0.00816971 seconds time: 0.155622 seconds time: 8.44064 seconds clang-cl file.cpp -o file.exe -Wall -Wextra -Wpedantic -O1 time: 0.0752284 seconds time: 0 seconds time: 0.100644 seconds time: 4.16412 seconds clang-cl file.cpp -o file.exe -Wall -Wextra -Wpedantic -O2 time: 0.0780395 seconds time: 0 seconds time: 0.104815 seconds time: 4.13017 seconds clang-cl file.cpp -o file.exe -Wall -Wextra -Wpedantic -Ox time: 0.0775343 seconds time: 0 seconds time: 0.103522 seconds time: 4.16733 seconds mingw32-g++ file.cpp -o file.exe -Wall -Wextra -Wpedantic -static-libgcc -static-libstdc++ time: 0.105719 seconds time: 0.013957 seconds time: 0.132643 seconds time: 28.3352 seconds mingw32-g++ file.cpp -o file.exe -Wall -Wextra -Wpedantic -O1 -static-libgcc -static-libstdc++ time: 0.061683 seconds time: 0 seconds time: 0.082779 seconds time: 9.54547 seconds mingw32-g++ file.cpp -o file.exe -Wall -Wextra -Wpedantic -O2 -static-libgcc -static-libstdc++ time: 0.065681 seconds time: 0 seconds time: 0.08375 seconds time: 9.30311 seconds mingw32-g++ file.cpp -o file.exe -Wall -Wextra -Wpedantic -O3 -static-libgcc -static-libstdc++ time: 0.040891 seconds time: 0 seconds time: 0.063861 seconds time: 8.30778 seconds mingw32-g++ file.cpp -o file.exe -Wall -Wextra -Wpedantic -Ofast -static-libgcc -static-libstdc++ time: 0.039929 seconds time: 0 seconds time: 0.060801 seconds time: 8.38361 seconds end
Too lazy to format those, but the
inline int fastrand() { g_seed = (214013*g_seed+2531011); return (g_seed>>16)&0x7FFF; }
is at worst 10 times faster than the current code. So not only will the game go faster but we'll get the feature. Win/win
- Use the fast random function
- Make function and variables members of the SoundGroup.cpp class.
- Make them non static as it doesn't matter as they are constantly overridden.
Build failure - The Moirai have given mortals hearts that can endure.
Link to build: https://jenkins.wildfiregames.com/job/differential/1222/display/redirect
Successful build - Chance fights ever on the side of the prudent.
Link to build: https://jenkins.wildfiregames.com/job/differential/1223/display/redirect
Successful build - Chance fights ever on the side of the prudent.
Link to build: https://jenkins.wildfiregames.com/job/differential/1224/display/redirect
Successful build - Chance fights ever on the side of the prudent.
Link to build: https://jenkins.wildfiregames.com/job/differential/1262/display/redirect
I don't have any objection to use the fast random function, because it won't be used for visual things where non-uniformness is visible.
source/soundmanager/scripting/SoundGroup.cpp | ||
---|---|---|
43 | Any source, why these constants? | |
44 | Why 0x7FFF and not 0xFFFF? | |
source/soundmanager/scripting/SoundGroup.h | ||
85 ↗ | (On Diff #7826) | I see 0x7FFF, it's usually uint16_t, but you have uint16_t overflow here, so it have to be uint32_t I suppose. |
source/soundmanager/scripting/SoundGroup.cpp | ||
---|---|---|
43 | Well thats used by a lot of number generators but that aside I don't know. Maybe @wraitii does. | |
44 | Was that way I assume its faster to generate smaller numbers. | |
source/soundmanager/scripting/SoundGroup.h | ||
85 ↗ | (On Diff #7826) | Yeah that was the recommendation on stackoverflow |
source/soundmanager/scripting/SoundGroup.cpp | ||
---|---|---|
43 | Seems referred here https://software.intel.com/en-us/articles/fast-random-number-generator-on-the-intel-pentiumr-4-processor but honestly I just found it on stack overflow so no idea. |
Successful build - Chance fights ever on the side of the prudent.
Link to build: https://jenkins.wildfiregames.com/job/differential/1271/display/redirect
Successful build - Chance fights ever on the side of the prudent.
Link to build: https://jenkins.wildfiregames.com/job/differential/1273/display/redirect
std::mt19937 looked good to me, as Vladislav said since it's used in the other places too.
Actually the simulation and AI seed use boost::rand48.
But still better than using custom magic numbers?
The only possible performance problem was reinitializing the seed often right?
Wouldn't it be sufficient initialize the seed upon component construction once? Or upon first soundplay?
(Though the magic numbers aren't so bad, only one line, or four including the wrapping.)
(Especially if it's some orders of magnitude faster. But perhaps there are other std or boost RNG that are also orders of magnitude faster.)
source/soundmanager/scripting/SoundGroup.cpp | ||
---|---|---|
44 | magic numbers warrant a comment explaining whether they're fully arbitrary or not |
Yeah but it slow as hell. I can be called up to 180 per frame on a 4x4 game. This new random function (Fastrand) is ten time faster than the current code and therefore maybe up 100 times faster than mt. So at worst it makes the sound system faster and at best it does what this ticket is about.
Yeah seed reinit is slow. No you need to set the seed each time because you need to reset the random generator as its used more than once.
0XFFF is the max value tat can be reached.
The other two constants are not arbitrary they are what make the seed collision possible.
You refer to the constructor of std::mt19937, right?
You need to set the seed each time because you need to reset the random generator as its used more than once.
How so? An RNG is initialized once and then generates a sequence of random numbers (with the () operator), not only one number.
That's how the simulation and AI seed work too and how it's supposed to work.
The other two constants are not arbitrary they are what make the seed collision possible.
The magic numbers are chosen so that they generate pseudo random numbers over a big enough period
How did you chose them? How did you know that 214013 is a good number, and how does the reader know that 214014 wouldn't be better?
It sounds like Stackoverflow copypaste. Then you should link it.
Doing a websearch seems to confirm this https://stackoverflow.com/questions/4768180/rand-implementation
It's also a Microsoft Implementation, used in Erlang and Cisco projects. (http://irclogs.wildfiregames.com/2019-04/2019-04-25-QuakeNet-%230ad-dev.log 22:05)
https://rosettacode.org/wiki/Linear_congruential_generator
http://pi.math.cornell.edu/~mec/Winter2009/Luo/Linear%20Congruential%20Generator/linear%20congruential%20gen1.html
https://github.com/cisco-system-traffic-generator/trex-core
https://github.com/erlang/otp
You refer to the constructor of std::mt19937, right?
No even the number generation is slow.
How so? An RNG is initialized once and then generates a sequence of random numbers (with the () operator), not only one number.
That's how the simulation and AI seed work too and how it's supposed to work.
Here we generate two numbers at most m_Gain and m_Pitch. Those number should be the same sequence for a given cmpVisualActor seed. If you keep the same one and the sequence is like 10,25,65,12 the second time it's called for the same seed it will generate different numbers.
- M_Gain = 10 M_Pitch = 25 Seed 123456
- M_Gain = 25 M_Pitch = 65 Seed 123456
Hence why it needs to be reset so it starts at the sequence beginning again.
How did you chose them? How did you know that 214013 is a good number, and how does the reader know that 214014 wouldn't be better?
It sounds like Stackoverflow copypaste. Then you should link it.
Doing a websearch seems to confirm this https://stackoverflow.com/questions/4768180/rand-implementation
That's part of the Theorem and used in most implementation, sure I could have found other numbers. But I don't see the point.
Okay, thanks for having tested.
That's part of the Theorem and used in most implementation
(https://en.wikipedia.org/wiki/Linear_congruential_generator#Parameters_in_common_use)
Since the code comment mentions the algorithm, people can lookup the value ranges of the numbers, okay.
Then at last, there is http://www.cplusplus.com/reference/random/linear_congruential_engine/ (@vladislavbelov)
Then at last, there is http://www.cplusplus.com/reference/random/linear_congruential_engine/ (@vladislavbelov)
Ah didnt find that one I guess I could have run more profiling. But the usage is a bit different so maybe not.
AFAIK it's a bit different, it explicitly does the modulo division. Current implementation uses an implicit and probably faster modulo division through unsigned 32 bit overflowing.
I see that http://www.cplusplus.com/reference/random/mt19937/ has so many arguments that it must be slower, and that the other std::random generators only produce unsigned int types, so okay here.
(Still not a fan of reinventing the wheel. If the other components also need fast random numbers, they would have to introduce a copy. It's not more than 9 lines, but it's 9 lines that would be repeated for the next use case.)
it explicitly does the modulo division
This one does & 0xFFFF, if the implementation is smart, it will not perform the modulo operator at all if the modulo argument was set to 0. (Anyhow, not a defect.)