Page MenuHomeWildfire Games

Fix memory leak introduced by rP22511 in SparseGrid
ClosedPublic

Authored by wraitii on Jul 24 2019, 12:51 PM.

Details

Reviewers
elexis
Group Reviewers
Restricted Owners Package(Owns No Changed Paths)
Commits
rP22545: Fix memory leak introduced by rP22511 in SparseGrid
Trac Tickets
#5522
Summary

See here - rP22511 accidentally introduced a memory leak.

Test Plan

Check that the memory leak reported in #5522 (cavalry replay) is no longer there.

Diff Detail

Repository
rP 0 A.D. Public Repository
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

wraitii created this revision.Jul 24 2019, 12:51 PM

Successful build - Chance fights ever on the side of the prudent.

Linter detected issues:
Executing section Source...

source/simulation2/helpers/Grid.h
|  34| template<typename·T>
|    | [MAJOR] CPPCheckBear (syntaxError):
|    | Code 'template<...' is invalid C code. Use --std or --language to configure the language.

source/simulation2/helpers/LongPathfinder.h
|   1| /*·Copyright·(C)·2017·Wildfire·Games.
|    | [NORMAL] LicenseYearBear:
|    | License should have "2019" year instead of "2017"

source/simulation2/helpers/LongPathfinder.h
|  34| ·*/
|    | [MAJOR] CPPCheckBear (syntaxError):
|    | Code 'template<...' is invalid C code. Use --std or --language to configure the language.
Executing section JS...
Executing section cli...

Link to build: https://jenkins.wildfiregames.com/job/docker-differential/218/display/redirect

Angen added a subscriber: Angen.Jul 24 2019, 1:07 PM
Angen added inline comments.
source/simulation2/helpers/Grid.h
211 ↗(On Diff #9097)

could be loop from reset moved here and this call removed?
because reset is allocating new memory for m_Data in this case just so we can free it line after.

wraitii added inline comments.Jul 24 2019, 1:10 PM
source/simulation2/helpers/Grid.h
211 ↗(On Diff #9097)

It's not, placement new doesn't allocate new memory > https://www.geeksforgeeks.org/placement-new-operator-cpp/

Angen added a comment.Jul 24 2019, 1:44 PM

My RAM usage for replay from https://trac.wildfiregames.com/ticket/5530

without patch:

10 minutes of replay 1 gb ram used
16 min 1,5gb
min 20 1.8gb
 hm 27 min  2.4 gb

replay with the patch:

11 min < 1gb
15 min 1gb
19 min 1gb
27 min 1,4gb
wraitii updated this revision to Diff 9098.Jul 24 2019, 1:46 PM
wraitii edited the summary of this revision. (Show Details)

Remove state.tiles cleanup.

Successful build - Chance fights ever on the side of the prudent.

Linter detected issues:
Executing section Source...

source/simulation2/helpers/Grid.h
|  34| template<typename·T>
|    | [MAJOR] CPPCheckBear (syntaxError):
|    | Code 'template<...' is invalid C code. Use --std or --language to configure the language.
Executing section JS...
Executing section cli...

Link to build: https://jenkins.wildfiregames.com/job/docker-differential/219/display/redirect

Some profiling results of the old memset code, vs the default-initializers including this diff: P162

Could use more testing on other platforms, but I see no reason to go back to memset. This was a good change overall IMO and makes SparseGrid more robust.

elexis added a subscriber: elexis.

(1) Shame on me for not noticing it, it makes me wonder how bad hell will be.

(2) On whether or not this is causing _all_ memory leaks that contribute to the constant memory increase throughout the game without training more units:
I've tested with current r22542, 30min game with 20x speed on anatolian plateau, moving cav around in place. The game ended up allocating 1GB, but when adding delete[] m_Data; and
running the visual replay, it stayed at 500mb.
As you tested with macOS leak tools, that's adding to the evidence for this thing leaking and other things not leaking (unless you found more?).

(3) On the code choice:
The construction in place looks much nicer than deleting it and allocating new, but I would still propose to avoid this construction in place when it is called from the destructor.
This can be done by moving the loop into a helper function ResetBuckets or something, call that from the destructor and from Reset.

(4) On the correctness of the previous memset code:

From https://gcc.gnu.org/onlinedocs/gcc/C_002b_002b-Dialect-Options.html on -Wclass-memaccess:

Warn when the destination of a call to a raw memory function such as memset or memcpy is an object of class type, and when writing into such an object might bypass the class non-trivial or deleted constructor or copy assignment, violate const-correctness or encapsulation, or corrupt virtual table pointers. Modifying the representation of such objects may violate invariants maintained by member functions of the class. For example, the call to memset below is undefined because it modifies a non-trivial class object and is, therefore, diagnosed. The safe way to either initialize or clear the storage of objects of such types is by using the appropriate constructor or assignment operator, if one is available.

After investigation what could have triggered the warning for this type by testing a bit, the compiler warning does not occur when commenting out the PathCost non-trivial constructor.

From https://en.cppreference.com/w/cpp/language/default_constructor

The default constructor for class T is trivial (i.e. performs no action)

(See also discussions on IRC today.)

So the previous code was actually broken, just not broken with the one template it was used, and the compiler warning was a false positive.

historic_bruno also noticed that the comment about PODs is obsoleted by not using memcpy and memset anymore (there is still one or more occurrences left).

(5) On the performance:
Now that we know that the previous code was working (at least for the one class instantiated), we can test whether the previous code was preferable for performance.

I've added this crappy but functional testcode{F1019351}.
Results with gcc9:

previous memset code:
Took 23072773
Took 22701399
Took 22662819

patched with first version of D2121:
Took 22652694

So I could not reproduce historic_brunos results.
The performance looks exactly the same.

So conclusively I would agree that this fix is good, even if it is not confirming a 10fold performance improvement on my system.

Further cleanup can be done separately.

source/simulation2/helpers/Grid.h
194 ↗(On Diff #9098)

(can remove braces + add \n)

211 ↗(On Diff #9097)

The value-construction in place is still unnecessary in case of calling reset, so I wouldn't decline moving this loop to a helper deleteBuckets.
On IRC you mentioned that the destructor is called throughout the game, so it would be of both (possibly marginal) performance relevance and code cleanliness (when presenting code statements to the reader, it implies they are necessary, but in this case its a useless statement, i.e. misleading)

elexis accepted this revision.Jul 24 2019, 6:51 PM
This revision is now accepted and ready to land.Jul 24 2019, 6:51 PM

As you tested with macOS leak tools, that's adding to the evidence for this thing leaking and other things not leaking (unless you found more?).

I did not find any other obvious leak, no, though I haven't searched a lot for them. The replays I tried showed plateauing memory usage (or the bandsaw pattern from the JS GC) so it seems to work correctly now.

This can be done by moving the loop into a helper function ResetBuckets or something, call that from the destructor and from Reset.
Further cleanup can be done separately.

Agreed.

This revision was automatically updated to reflect the committed changes.