OptimusShepard (Pirmin Stanglmeier)
User

Projects

Contributors
Group

User Details

User Since: Feb 24 2017, 5:41 PM (387 w, 10 h)

Recent Activity

Dec 5 2021

OptimusShepard added a comment to D4366: Increase the frame limiter maximum up to 360fps.

In D4366#185970, @vladislavbelov wrote:

Why not 320Hz or 360Hz?

Dec 5 2021, 6:03 PM

OptimusShepard retitled D4366: Increase the frame limiter maximum up to 360fps from Increase the frame limiter maximum up to 240fps to Increase the frame limiter maximum up to 360fps.

Dec 5 2021, 6:02 PM

OptimusShepard updated the diff for D4366: Increase the frame limiter maximum up to 360fps.

Changed the maximum to 360 as current top end monitors support 360Hz.
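
For context, here is a minimal sketch of what a higher limiter ceiling means for the per-frame time budget. It is not the code from D4366: the helper names `ClampFrameLimit` and `FrameBudget`, and the lower bound passed in by the caller, are assumptions made purely for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>

// Hypothetical helper: keep a requested FPS limit inside [minFps, maxFps].
inline int ClampFrameLimit(int requested, int minFps, int maxFps)
{
    assert(minFps > 0 && minFps <= maxFps);
    return std::min(std::max(requested, minFps), maxFps);
}

// Hypothetical helper: shortest allowed duration of one frame for an FPS limit.
inline std::chrono::nanoseconds FrameBudget(int fps)
{
    assert(fps > 0);
    return std::chrono::nanoseconds(1000000000LL / fps);
}
```

With a 360 fps ceiling the limiter leaves roughly 2.78 ms per frame, compared with about 4.17 ms at the previous 240 fps ceiling.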

Dec 5 2021, 6:01 PM

OptimusShepard requested review of D4366: Increase the frame limiter maximum up to 360fps.

Dec 5 2021, 3:45 PM

Feb 27 2021

OptimusShepard added a comment to D2936: Allow limiting the max number of corpses simultaneously visible in the game.

In D2936#157405, @nani wrote:

X axis: Is this for every turn or every frame?

I'm not sure

Y axis: Is this the total render+simulation+gui+etc?

Yep.

If you want to see a difference, you must split the data by category; for this diff, the improvements are in the render category. I can kind of see a difference between the two, but they don't look much different, which means the noise from the other categories is drowning out the render values. I tested this with the in-game profiler using the AutoCiv mod's corpse limit implementation, so I can guarantee that there is a big improvement. In case you are still not sold on it :) you can do a quick test with AutoCiv (a23 or a24, doesn't matter) and compare; there is not even a need for profiling, as it is very obvious.

Renderer: Red vanilla, green 100, blue 50

Feb 27 2021, 8:37 PM

OptimusShepard added a comment to D2936: Allow limiting the max number of corpses simultaneously visible in the game.

I made a new battle profile map and profiled it.

Vanilla

Limited to 100 corpses

Limited to 50 corpses

profil_data.zip (4 MB)

skirmishes.zip (108 KB)

(D3554 is needed)

Feb 27 2021, 6:31 PM

Feb 19 2021

OptimusShepard added a comment to D3554: JS Interface Profiler2.

In D3554#156273, @wraitii wrote:

I feel like you might have an easier time with "Enable"/"Disable" instead of Toggle, but otherwise this looks OK :)

Feb 19 2021, 9:50 PM

Feb 18 2021

OptimusShepard accepted D3578: Do not generate render data in case Decal calculated wrong coordinates.

Patch works for me, can't reproduce the bug anymore. Thx

Feb 18 2021, 8:55 PM

Feb 14 2021

OptimusShepard updated the summary of D3505: Profile/Benchmark Map.

Feb 14 2021, 3:19 PM

OptimusShepard added a comment to D3505: Profile/Benchmark Map.

In D3505#156053, @Stan wrote:

Missing the XML file?

Feb 14 2021, 3:17 PM

OptimusShepard added a comment to D2938: GL_ARB instancing to reduce draw calls.

Tested with D3505 and D3554, Ryzen 3700X, Radeon 5700, Win 10.

vanilla

patched

Feb 14 2021, 2:05 PM

OptimusShepard updated the test plan for D3505: Profile/Benchmark Map.

Feb 14 2021, 1:56 PM

OptimusShepard updated the diff for D3505: Profile/Benchmark Map.

Adding triggers to start, save and stop recording. Shortening the cinematic path.

Feb 14 2021, 1:54 PM

OptimusShepard updated the test plan for D3554: JS Interface Profiler2.

Feb 14 2021, 1:53 PM

Feb 13 2021

OptimusShepard updated the diff for D3554: JS Interface Profiler2.

Apply Imarok's comment.

Feb 13 2021, 10:03 PM

OptimusShepard updated the diff for D3554: JS Interface Profiler2.

Adding a new version only for testing. Work in progress.
The functionality can be tested by running the attached map.

skirmishes.zip (677 KB)

Feb 13 2021, 8:32 PM

Feb 10 2021

OptimusShepard added a comment to D3554: JS Interface Profiler2.

In D3554#155633, @Stan wrote:

Where do you try to call them? You need to register them differently depending on whether you want them on the gui or the simulation

Feb 10 2021, 10:48 PM

OptimusShepard requested review of D3554: JS Interface Profiler2.

Feb 10 2021, 10:22 PM

Feb 7 2021

OptimusShepard added a comment to D2938: GL_ARB instancing to reduce draw calls.

Ryzen 3700X, Radeon 5700, Windows 10

red vanilla, green patched
I took the map from D3505; profiling starts at 2 s and ends at 120 s. The big offset surprises me, as I start profiling after loading the map.

data.zip (6 MB)

Feb 7 2021, 9:11 PM

Feb 5 2021

OptimusShepard added a comment to D3522: Separates allocated vertex buffers into groups.

In D3522#154824, @vladislavbelov wrote:

Could you attach the sources?

I took the map from D3505; profiling starts at 2 s and ends at 120 s.

data.zip (6 MB)

Feb 5 2021, 9:52 PM

OptimusShepard added a comment to D2857: Matrix3D SSE.

red vanilla, green patch

Feb 5 2021, 9:31 PM

OptimusShepard updated the diff for D2857: Matrix3D SSE.

Updated includes

Feb 5 2021, 9:29 PM

OptimusShepard added a comment to D3522: Separates allocated vertex buffers into groups.

I can't notice a performance improvement. But I only checked the profiler2.

red vanilla, green patched

Feb 5 2021, 8:51 PM

Feb 3 2021

OptimusShepard added a comment to D2528: Performance improvements to VertexBuffer.

I made some profiling with my new benchmark map.

red vanilla, green deque, blue vector

profile_data.zip (10 MB)

Feb 3 2021, 11:10 PM

Feb 2 2021

OptimusShepard added a comment to D3505: Profile/Benchmark Map.

In D3505#154314, @wraitii wrote:

I don't think it's that necessary to add pathfinding calls and such, to be honest, except perhaps as a way to profile animation code.

Feb 2 2021, 7:18 PM

Jan 31 2021

OptimusShepard requested review of D3505: Profile/Benchmark Map.

Jan 31 2021, 5:52 PM

Jan 28 2021

OptimusShepard added a comment to D3454: Modified CFixedVector2D/3D that cache length to reduce calls to isqrt64(), plus a few other fixes.

In D3454#153884, @DanW58 wrote:

Ah, wait! Did you profile this patch here or the one I submitted in the forum?

I profiled this patch here.

Thanks again. When my present work is finished I will submit a new patch, from the sources
folder, and submit it here.

Thank you. I'm excited to test your new patch :)

Jan 28 2021, 9:48 AM

Jan 27 2021

OptimusShepard added a comment to D3454: Modified CFixedVector2D/3D that cache length to reduce calls to isqrt64(), plus a few other fixes.

Can you please remove the "0ad" at the beginning of your file paths and start at "source"?
I did some performance profiling, once with a heavy graphics load and once with a heavy movement load. I didn't see any performance improvements. Are there any other scenarios I should test?

Jan 27 2021, 7:02 PM

Jan 25 2021

OptimusShepard accepted D3474: Alpha 24 name: Xšayaṛša.

It would have been nice if you had explained that this is the Old Persian form of "Xerxes". Xšayaṛša isn't really well known ;)

Jan 25 2021, 4:41 PM

Jan 24 2021

OptimusShepard added a comment to D3437: Move SSE.h to a better place.

Seems like a more useful place. But maybe you should rename the files to simd.cpp/h for more general use and preparation for future usage of AVX/AVX2?

Jan 24 2021, 11:03 PM

Jan 21 2021

OptimusShepard accepted rP24215: Let players remap hotkeys in-game, fix default hotkeys being qwerty-specific..

Jan 21 2021, 2:35 PM

Jan 18 2021

OptimusShepard updated the diff for D2857: Matrix3D SSE.

Update float values 0 -> 0.f (See comment)

Jan 18 2021, 2:36 PM

OptimusShepard updated the diff for D2857: Matrix3D SSE.

Removes the pointer switch and replaces it with a macro condition.
First, Linux loses performance by using the SSE implementation. Therefore, this implementation is now only for Windows.
Second, we lose too much performance by using pointers. We already use SSE2 flags for Windows. For this reason, the pointers have been removed and replaced by Windows + SSE macro condition.
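
As a rough illustration of the compile-time selection described above, the sketch below picks an SSE or a scalar 4x4 matrix multiply with a preprocessor condition instead of a function pointer. It is not the code from D2857: the macro name `SKETCH_MATRIX_SSE`, the function name `Multiply4x4`, the column-major layout and the exact platform check are all assumptions.

```cpp
#include <cassert>

// Assumption: Windows + SSE is detected through MSVC's predefined macros;
// the real condition in the patch may look different.
#if defined(_WIN32) && (defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2))
#define SKETCH_MATRIX_SSE 1
#include <xmmintrin.h>
#else
#define SKETCH_MATRIX_SSE 0
#endif

// Computes out = a * b for 4x4 column-major matrices (element (row, col) is at [4 * col + row]).
// The choice between the two bodies is made by the preprocessor, so the call
// involves no pointer indirection and can be inlined.
inline void Multiply4x4(const float* a, const float* b, float* out)
{
    assert(out != a && out != b); // the output must not alias an input
#if SKETCH_MATRIX_SSE
    // Load the four columns of a once.
    const __m128 c0 = _mm_loadu_ps(a + 0);
    const __m128 c1 = _mm_loadu_ps(a + 4);
    const __m128 c2 = _mm_loadu_ps(a + 8);
    const __m128 c3 = _mm_loadu_ps(a + 12);
    for (int col = 0; col < 4; ++col)
    {
        // Column col of the result is a linear combination of the columns of a.
        __m128 r = _mm_mul_ps(c0, _mm_set1_ps(b[4 * col + 0]));
        r = _mm_add_ps(r, _mm_mul_ps(c1, _mm_set1_ps(b[4 * col + 1])));
        r = _mm_add_ps(r, _mm_mul_ps(c2, _mm_set1_ps(b[4 * col + 2])));
        r = _mm_add_ps(r, _mm_mul_ps(c3, _mm_set1_ps(b[4 * col + 3])));
        _mm_storeu_ps(out + 4 * col, r);
    }
#else
    // Portable scalar fallback used on every other platform.
    for (int col = 0; col < 4; ++col)
        for (int row = 0; row < 4; ++row)
        {
            float sum = 0.f;
            for (int k = 0; k < 4; ++k)
                sum += a[4 * k + row] * b[4 * col + k];
            out[4 * col + row] = sum;
        }
#endif
}
```

The trade-off is the one discussed in the review: the path is fixed when the binary is built, so it cannot adapt to the CPU it later runs on.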

Jan 18 2021, 2:32 PM

Jan 17 2021

OptimusShepard added a comment to D3400: Fix market waypoints..

In D3400#150863, @Imarok wrote:

Tried setting three waypoints? first to somewhere, second to the other market, third to somewhere else?
(That gives quite strange behaviour.)

Jan 17 2021, 10:31 PM

Jan 16 2021

OptimusShepard raised a concern with rP24215: Let players remap hotkeys in-game, fix default hotkeys being qwerty-specific..

Causes #5922

Jan 16 2021, 12:03 AM

Jan 13 2021

OptimusShepard added a comment to D2528: Performance improvements to VertexBuffer.

@vladislavbelov any plans to commit this improvement?

Jan 13 2021, 10:28 AM

Jan 11 2021

OptimusShepard accepted rP23262: Workaround for L3 cache detection of Ryzen 3000.

Solved in rP24550.

Jan 11 2021, 10:53 PM

OptimusShepard abandoned D3031: Workaround AMD hardware detection.

No longer needed as rP24550 has been committed.

Jan 11 2021, 10:51 PM

Dec 20 2020

OptimusShepard added inline comments to D2857: Matrix3D SSE.

Dec 20 2020, 9:45 AM

OptimusShepard added a comment to D2857: Matrix3D SSE.

In D2857#143713, @Stan wrote:

You need to compare generated assembly too :)

So here is the generated assembly for matrix multiplication, SSE2 flag enabled, MSVC2017.

Dec 20 2020, 9:37 AM

OptimusShepard updated the summary of D2857: Matrix3D SSE.

Dec 20 2020, 9:28 AM

Dec 19 2020

OptimusShepard planned changes to D2857: Matrix3D SSE.

In D2857#137872, @wraitii wrote:

I'm not convinced TBH. If this is hardcoded at compile-time, either we drop support or it's pretty much useless for releases. SIMD-capable compilers seem able to vectorise this functions, so custom versions don't seem particularly useful.
If there was a runtime switch that actually increased performance, might be more interesting.

I will wait until @Stan has finished D3212. I plan to implement only the matrix multiplication, the way the other SSE implementations are done, so that the SSE path can be chosen at runtime. Would that be an option?

Dec 19 2020, 11:40 PM

OptimusShepard added a comment to D2857: Matrix3D SSE.

Made some performance profiling on MSVC2017, SSE2 flag enabled, Ryzen 3700X.

Dec 19 2020, 11:31 PM

Dec 18 2020

OptimusShepard added a comment to D3212: Remove SSE duplication in Colors and ModelRenderer.

In D3212#143216, @Stan wrote:

It seems sadly that gcc on Optimus' computer doesnt take fully advantage of it either

GCC 10.2.0

Dec 18 2020, 10:42 AM

OptimusShepard added a comment to D3212: Remove SSE duplication in Colors and ModelRenderer.

I despise the OS solution however because it leads to unnecessary macro complexity.

Maybe it's less complex if we only care about the architecture? That is, always use the SSE functions and flag on x86. As you already said, Windows doesn't support non-SSE2 CPUs. If someone is using such a 20-year-old, non-SSE Linux PC, the computer is already too slow to run the game. So ignore them.

Dec 18 2020, 8:27 AM

Dec 13 2020

OptimusShepard added a comment to D3212: Remove SSE duplication in Colors and ModelRenderer.

In D3212#142196, @Stan wrote:

It's like 1ms slower?

Dec 13 2020, 10:32 PM

OptimusShepard added a comment to D3212: Remove SSE duplication in Colors and ModelRenderer.

In D3212#142191, @Stan wrote:

I don't understand does the current patch make things slower for you?.

Yep, the patch slows down the performance for me.

By default the flag is already set to SSE2. (visual studio says so)

I don't know why, but it makes a difference whether I set the flag in Premake or in Visual Studio. Setting flags in VS doesn't change the performance in any way. Setting the SSE2, AVX, or AVX2 flags in Premake improves the performance.

Also your profile data always have that weird offset which makes it hard to see anything

This is because you can only save the profile data at the end, not at the beginning of a session. Different frame rates then lead to an offset.
Here is the performance comparison of your patch.

frame: red unpatched, green your patch.

Dec 13 2020, 10:18 PM

OptimusShepard added a comment to D3212: Remove SSE duplication in Colors and ModelRenderer.

In D3212#142027, @Stan wrote:

@OptimusShepard is right, as shown in P149, the code is now good (it's possible it wasn't on old MSVC);

Dec 13 2020, 8:24 PM

Dec 12 2020

OptimusShepard added a comment to D3212: Remove SSE duplication in Colors and ModelRenderer.

Maybe I'm a bit off topic, but do we really need those SSE functions? Can't that be done by the compiler itself via compiler flags? I think we should always compile with SSE on x86. SSE was introduced with the Pentium III in 1999; I guess nobody uses 21-year-old hardware?
It would also be nice if Wildfire Games could offer an AVX or AVX2 release version besides the SSE version. I guess replacing the SSE compiler flags with AVX/AVX2 flags shouldn't be that much effort?

Dec 12 2020, 4:23 PM

Dec 6 2020

OptimusShepard added a comment to D14: Thread the pathfinder computations.

In D14#140491, @OptimusShepard wrote:

I made some new profiling.

Dec 6 2020, 5:15 PM

Dec 5 2020

OptimusShepard added a comment to D14: Thread the pathfinder computations.

I made some new profiling.

Green patched - 16 threads, red without patch.
Replay and profiling data see below.

profiling.zip (6 MB)

Dec 5 2020, 5:40 PM

OptimusShepard accepted rP24233: Fix rendering options failures following rP24228.

Dec 5 2020, 3:08 PM

Nov 24 2020

OptimusShepard added a comment to D3138: Fix AA / Sharpness not being correctly enabled at the start..

In D3138#138391, @vladislavbelov wrote:

That breaks/not fixes MSAA.

Nov 24 2020, 11:15 PM

OptimusShepard accepted D3138: Fix AA / Sharpness not being correctly enabled at the start..

Everything works correctly again, thx.

Nov 24 2020, 9:51 AM

OptimusShepard raised a concern with rP24233: Fix rendering options failures following rP24228.

The sharpening and anti-aliasing options fail every time the game starts. To get them to work, I always have to turn them off and on.

Nov 24 2020, 12:41 AM

Nov 13 2020

OptimusShepard added a comment to D2812: Adds MSAA to anti-aliasing techniques.

In D2812#135846, @OptimusShepard wrote:

I tested it on my notebook Windows 10 Intel 8250U, Intel UHD 620 and on my desktop Windows AMD 3700X, AMD RX 5700. Everything works fine.
I also didn't notice any performance impact on my desktop. Is the whole calculation done by the GPU?

Nov 13 2020, 8:33 PM

OptimusShepard added a comment to D2812: Adds MSAA to anti-aliasing techniques.

I tested it on my notebook Windows 10 Intel 8250U, Intel UHD 620 and on my desktop Windows AMD 3700X, AMD RX 5700. Everything works fine.
I also didn't notice any performance impact on my desktop. Is the whole calculation done by the GPU?

Nov 13 2020, 10:52 AM

Nov 11 2020

OptimusShepard added a comment to D2440: Per-Unit Discrete LOD using actor quality levels.

In D2440#135338, @Stan wrote:

I'm gonna push some change, can you try again afterwards ?

Nov 11 2020, 9:12 AM

OptimusShepard added a comment to D2440: Per-Unit Discrete LOD using actor quality levels.

In D2440#135335, @Stan wrote:

@OptimusShepard is it the same if you disable the props?

I ran a new profile with the props deactivated instead of applying the patch; the result is absolutely the same.

I suspect <prop actor="props/units/quiver_greek_back.xml" attachpoint="back"/> to be particularly slow.

You're right, this is the main point.

profiler2 frame: red patched, green only quiver greek back deactivated, blue old
I guess, trees could be interesting too.

Nov 11 2020, 12:38 AM

Nov 10 2020

OptimusShepard added a comment to D2440: Per-Unit Discrete LOD using actor quality levels.

I did some profiling on the new map oceanside using Athenian archers. I noticed really big performance improvements when zooming out.
Test conditions: oceanside map, Athenian archers, no moving of the units, saved at 4:30 min, completely zoomed out, no moving of the camera from its start position (only zooming out)

profiler2 frame: red old, green patched.

Nov 10 2020, 11:54 PM

Nov 7 2020

OptimusShepard added a comment to D1789: Handle unknown APIC IDs in the ACPI SRAT.

Game works for me. (Ryzen 3700X, Windows)
It seems that the patch is needed for some AMD Threadripper CPUs.

Nov 7 2020, 4:22 PM

Nov 5 2020

OptimusShepard added a comment to D2726: AMD Ryzen fix.

The Game still runs on Windows AMD Ryzen 3700X.

Nov 5 2020, 11:19 AM

Nov 4 2020

OptimusShepard added a comment to D3052: Moves terrain lighting calculation to GPU.

I have noticed neither issues nor performance improvements. Windows, AMD Radeon

Nov 4 2020, 12:28 PM

Oct 11 2020

OptimusShepard updated the diff for D3031: Workaround AMD hardware detection.

Reupload of the newest version.

Oct 11 2020, 9:00 PM

OptimusShepard updated the diff for D3031: Workaround AMD hardware detection.

Vendor check before skipping the validation.

Oct 11 2020, 8:54 PM

OptimusShepard added a reviewer for D3031: Workaround AMD hardware detection: Imarok.

Oct 11 2020, 5:24 PM

OptimusShepard requested review of D3031: Workaround AMD hardware detection.

Oct 11 2020, 5:21 PM

OptimusShepard raised a concern with rP23262: Workaround for L3 cache detection of Ryzen 3000.

The workaround is too specific, as it seems to work only for the Ryzens. The Threadrippers still fail. A correct implementation, or a more general solution/workaround, is needed.

Oct 11 2020, 3:52 PM

Aug 22 2020

OptimusShepard added a comment to D2812: Adds MSAA to anti-aliasing techniques.

Patch works for me (Windows 10, RX 5700).
@vladislavbelov your former patch had AA for trees too, without a big performance hit. Does the new performance impact come from the water bugfix?
Also, I don't think we should care about performance hits from trees, as we have FXAA for lower-end hardware. I see no need for the implementation in its current state, as FXAA + sharpening looks better than MSAA as long as trees have aliasing. Especially as FXAA and sharpening have no measurable performance impact compared to MSAA.

Aug 22 2020, 4:25 PM

Aug 8 2020

OptimusShepard added a comment to D2936: Allow limiting the max number of corpses simultaneously visible in the game.

In D2936#128281, @Stan wrote:

Dying animation is the corpse ^^

Aug 8 2020, 6:30 PM

OptimusShepard added a comment to D2936: Allow limiting the max number of corpses simultaneously visible in the game.

A dying animation is not possible here? Or at least a simple visual plop effect?

Aug 8 2020, 6:06 PM

Aug 7 2020

OptimusShepard added inline comments to D2642: Contrast-Adaptiv-Sharpening pass.

Aug 7 2020, 11:30 PM

OptimusShepard awarded D2938: GL_ARB instancing to reduce draw calls a Like token.

Aug 7 2020, 4:19 PM

OptimusShepard added a comment to D2938: GL_ARB instancing to reduce draw calls.

In D2938#128125, @wraitii wrote:

gpuskinning is an option in default.cfg that you have to activate using the config file. You could try (de)/activating that

Setting it disabled doesn't change anything.

What are you reporting as being 10/20% slower exactly?

The whole frame.

Edit: I feel like your GC might be too good. Can you try capping max_matrix_uniform to maybe 64 in hwdetect.js ?

That's it. Thx

Render: red non patch, green patched.

Aug 7 2020, 3:53 PM

OptimusShepard added a comment to D2938: GL_ARB instancing to reduce draw calls.

In D2938#128120, @wraitii wrote:

It should be a straight improvement on svn, so this seems odd. How did you replay exactly? Were you using GPU skinning? Were you using prefer glsl? What map?

I always use a 200-unit replay on sahydian buttes; the camera is not moved or zoomed. What is GPU skinning? GLSL is enabled.

Edit -> Also what is your graphics card?

Radeon RX 5700

Aug 7 2020, 3:15 PM

OptimusShepard added a comment to D2938: GL_ARB instancing to reduce draw calls.

I also got these compiler warnings:
\source\renderer/ModelVertexRenderer.h(158): warning C4100: "model": unreferenced formal parameter (compiling source file ..\..\..\source\renderer\InstancingModelRenderer.cpp)
\source\renderer/ModelVertexRenderer.h(158): warning C4100: "shader": unreferenced formal parameter (compiling source file ..\..\..\source\renderer\InstancingModelRenderer.cpp)
\source\renderer/ModelVertexRenderer.h(158): warning C4100: "model": unreferenced formal parameter (compiling source file ..\..\..\source\renderer\HWLightingModelRenderer.cpp)
\source\renderer/ModelVertexRenderer.h(158): warning C4100: "shader": unreferenced formal parameter (compiling source file ..\..\..\source\renderer\HWLightingModelRenderer.cpp)
\source\renderer/ModelVertexRenderer.h(158): warning C4100: "model": unreferenced formal parameter (compiling source file ..\..\..\source\renderer\ModelRenderer.cpp)
\source\renderer/ModelVertexRenderer.h(158): warning C4100: "shader": unreferenced formal parameter (compiling source file ..\..\..\source\renderer\ModelRenderer.cpp)

Aug 7 2020, 3:02 PM

OptimusShepard added a comment to D2938: GL_ARB instancing to reduce draw calls.

I did some quick profiling. I noticed that the number of draw calls is halved. I also tested value 1 against 64; 64 is much faster. But compared to the unpatched replay, there was a performance hit of ~10-20%.
Does this hit come from your new profiler option "saved draw calls"? I think some profiler calculations are active even if the profiler panel is not open. Do you have a version without the profiler to test?

Aug 7 2020, 2:52 PM

Aug 6 2020

OptimusShepard added inline comments to D2936: Allow limiting the max number of corpses simultaneously visible in the game.

Aug 6 2020, 3:09 PM

OptimusShepard added inline comments to D2936: Allow limiting the max number of corpses simultaneously visible in the game.

Aug 6 2020, 2:53 PM

OptimusShepard awarded D2936: Allow limiting the max number of corpses simultaneously visible in the game a Like token.

Aug 6 2020, 2:52 PM

Jul 25 2020

OptimusShepard added inline comments to D2642: Contrast-Adaptiv-Sharpening pass.

Jul 25 2020, 12:08 AM

OptimusShepard added inline comments to D2642: Contrast-Adaptiv-Sharpening pass.

Jul 25 2020, 12:06 AM

Jul 24 2020

OptimusShepard updated the diff for D2642: Contrast-Adaptiv-Sharpening pass.

Edited the sharpness initialization. Disables sharpening if post-processing or GLSL is disabled.
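
The dependency described in this update can be sketched as a simple predicate. This is only an illustration, not the code from D2642; the struct `RenderOptions`, its field names and the function `EffectiveSharpening` are hypothetical.

```cpp
// Hypothetical option set; the field names are assumptions for illustration.
struct RenderOptions
{
    bool postProcessing; // post-processing enabled by the user
    bool preferGLSL;     // GLSL shaders enabled by the user
    bool sharpening;     // sharpening requested by the user
};

// Sharpening only takes effect when both prerequisites are enabled.
inline bool EffectiveSharpening(const RenderOptions& options)
{
    return options.sharpening && options.postProcessing && options.preferGLSL;
}
```

Evaluating the prerequisites in one place keeps the stored user preference intact, so sharpening can come back automatically once post-processing and GLSL are re-enabled.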

Jul 24 2020, 11:55 PM

Jul 23 2020

OptimusShepard added a comment to D14: Thread the pathfinder computations.

In D14#125536, @wraitii wrote:

The smallest "unit" of computation is a single path, so to get an improvement with 16 threads you'd need more than 16 paths per turn. This is possible in intense combat situation (e.g. combat demo huge), but not very likely.
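
The arithmetic behind this point can be sketched as follows, under the simplifying assumption that every path costs the same and that one path cannot be split across threads. The function names `BusyThreads` and `Rounds` are made up for this illustration and do not come from D14.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Upper bound on the threads that can work at once when each path is indivisible.
inline std::size_t BusyThreads(std::size_t threads, std::size_t paths)
{
    return std::min(threads, paths);
}

// Number of sequential batches needed to finish all paths: ceil(paths / threads).
inline std::size_t Rounds(std::size_t threads, std::size_t paths)
{
    assert(threads > 0);
    return (paths + threads - 1) / threads;
}
```

For example, with 10 paths in a turn, 16 threads keep at most 10 busy, and nothing is gained over 10 threads; the extra threads only start to matter once a turn contains more paths than the smaller thread count can cover in one batch.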

Jul 23 2020, 9:14 AM

OptimusShepard added a comment to D14: Thread the pathfinder computations.

In D14#125537, @Stan wrote:

He has 8 cores 8threads :)

Jul 23 2020, 9:06 AM

Jul 22 2020

OptimusShepard added inline comments to D2642: Contrast-Adaptiv-Sharpening pass.

Jul 22 2020, 11:14 PM

OptimusShepard updated the diff for D2642: Contrast-Adaptiv-Sharpening pass.

Updated the versioning of the used GLSL version.

Jul 22 2020, 11:09 PM

OptimusShepard added a comment to D14: Thread the pathfinder computations.

I did some profiling with different numbers of threads (1, 2, 3, 4, 5, 6, 8, 12, 16). Comparing the results, I guess more than 16 threads won't improve the performance further.

pathfinder.zip (32 MB)

Jul 22 2020, 7:59 PM

Jul 18 2020

OptimusShepard added a comment to D2857: Matrix3D SSE.

In D2857#123243, @Stan wrote:

Can you figure out what's causing the spikes?

I haven't found the cause yet. But I noticed that the profiler amplifies these spikes considerably. Without it, the frame drops were much smaller.

Jul 18 2020, 12:48 AM

Jul 10 2020

OptimusShepard added a comment to D2857: Matrix3D SSE.

In D2857#123243, @Stan wrote:

I like your approach better but interestingly it's done slightly differently in https://github.com/0ad/0ad/blob/d3e68a99e7f715ad7921a81e959f8ac51dfa1248/source/graphics/Color.cpp

Oh, I think they switch the instructions at runtime, don't they? A bit ugly, I think, but we could use FMA :)

Can you figure out what's causing the spikes?

I will try, but at first glance the profiler doesn't show me anything useful.

Jul 10 2020, 2:18 PM

OptimusShepard updated the diff for D2857: Matrix3D SSE.

Updated the year of the license header.

Jul 10 2020, 10:34 AM

Jul 9 2020

OptimusShepard updated the diff for D2857: Matrix3D SSE.

Including the SSE header, because Vulcan fails for the non-Windows tests.

Jul 9 2020, 11:55 PM

OptimusShepard updated the diff for D2857: Matrix3D SSE.

Removed the AVX and FMA versions, as we aren't able to switch instructions at runtime. Furthermore, the AVX instructions aren't faster than SSE here.

Jul 9 2020, 11:32 PM

OptimusShepard added a comment to D2857: Matrix3D SSE.

I have rewritten the patch so that it uses only SSE. That is the version I used for the profiling. I will upload it later today.

Jul 9 2020, 3:40 PM

OptimusShepard added a comment to D2857: Matrix3D SSE.

I tested the build flags; SSE seems to be the only flag with a positive impact. AVX2 makes everything worse. I also did some profiling.
Current version:

Jul 9 2020, 3:38 PM

Jul 3 2020

OptimusShepard planned changes to D2857: Matrix3D SSE.

The hardware request doesn't work as it should.

Jul 3 2020, 1:04 AM

OptimusShepard created D2857: Matrix3D SSE.

Jul 3 2020, 1:02 AM

Jun 18 2020

OptimusShepard added a comment to D2780: Bugfix: GLSL query for FXAA .

In D2780#119381, @vladislavbelov wrote:

Also isn't PostProc locked by GLSL? Since we don't have effects for ARB (iirc).

No it isn't. It's currently independent. I have also tried to lock postproc, to lock antialiasing, but that doesn't work. It would only lock postproc. A bit strange I think.

Jun 18 2020, 9:38 PM

OptimusShepard updated the diff for D2780: Bugfix: GLSL query for FXAA .

Edited some comments.

Jun 18 2020, 9:31 PM

Jun 14 2020

Krinkle awarded D2789: SSE Vector3D a Orange Medal token.

Jun 14 2020, 5:40 PM
