Page MenuHomeWildfire Games

Don't leak GUI Object on name ambiguity, delete PS_ERROR_GUI codes, CGUI::UpdateObjects() and IGUIObject::AddToPointersMap
ClosedPublic

Authored by elexis on Oct 1 2019, 7:03 PM.
Tags
None
Referenced Files
Unknown Object (File)
Fri, Sep 13, 3:33 PM
Unknown Object (File)
Thu, Sep 12, 7:57 PM
Unknown Object (File)
Tue, Sep 10, 2:47 AM
Unknown Object (File)
Sat, Sep 7, 7:49 PM
Unknown Object (File)
Fri, Sep 6, 4:15 PM
Unknown Object (File)
Wed, Sep 4, 5:10 AM
F1104221: leak_report_comparison.7z
Oct 11 2019, 6:18 PM
Subscribers

Details

Reviewers
Stan
Summary

These old PSError codes were never well implemented, nor are they preferable over catching issues before they turn into an exception.
Most of the catch code is an explicit TODO or resembles one.
Only three of these errors were called in rare occasions, while all the other code relies on LOGERROR + return value, so it is consistent to get rid of them.

While removing them and reading the code more thoroughly, one discovers conditional memory leaks.

The throw call will quit the current function call and return to the first catch.
However at this point there is some data allocated on the heap, specifically the GUI Object to be read.

So in case such a throw is triggered, that data will be leaked.
You can confirm this with valgrind:

Index: binaries/data/mods/public/gui/pregame/mainmenu.xml
===================================================================
--- binaries/data/mods/public/gui/pregame/mainmenu.xml	(revision 23022)
+++ binaries/data/mods/public/gui/pregame/mainmenu.xml	(working copy)
@@ -4,10 +4,13 @@
 	<script directory="gui/common/"/>
 	<script directory="gui/pregame/"/>
 	<script directory="gui/pregame/backgrounds/"/>
 	<script directory="gui/pregame/userreport/"/>
 
+	<object name="anton"/>
+	<object name="anton"/>
+
 	<object>
 		<include file="gui/pregame/backgrounds.xml"/>
 		<include file="gui/pregame/menupanel.xml"/>
 		<include file="gui/pregame/userreport/userreport.xml"/>
==329104== 1,872 (72 direct, 1,800 indirect) bytes in 1 blocks are definitely lost in loss record 3,109 of 3,240
==329104==    at 0x4838DEF: operator new(unsigned long) (vg_replace_malloc.c:334)
==329104==    by 0x58B00D: allocate (new_allocator.h:114)
==329104==    by 0x58B00D: allocate (alloc_traits.h:444)
==329104==    by 0x58B00D: _M_get_node (stl_tree.h:580)
==329104==    by 0x58B00D: _M_create_node<const CStr8 &, CGUISetting<bool> *> (stl_tree.h:630)
==329104==    by 0x58B00D: std::pair<std::_Rb_tree_iterator<std::pair<CStr8 const, IGUISetting*> >, bool> std::_Rb_tree<CStr8, std::pair<CStr8 const, IGUISetting*>, std::_Select1st<std::pair<CStr8 const, IGUISetting*> >, std::less<CStr8>, std::allocator<std::pair<CStr8 const, IGUISetting*> > >::_M_emplace_unique<CStr8 const&, CGUISetting<bool>*>(CStr8 const&, CGUISetting<bool>*&&) (stl_tree.h:2404)
==329104==    by 0x585357: emplace<const CStr8 &, CGUISetting<bool> *> (stl_map.h:575)
==329104==    by 0x585357: void IGUIObject::RegisterSetting<bool>(CStr8 const&, bool&) (IGUIObject.cpp:138)
==329104==    by 0x581AE4: IGUIObject::IGUIObject(CGUI&) (IGUIObject.cpp:47)
==329104==    by 0x5523C2: CGUIDummyObject (CGUIDummyObject.h:36)
==329104==    by 0x5523C2: CGUIDummyObject::ConstructObject(CGUI&) (CGUIDummyObject.h:33)
==329104==    by 0x5484D5: ConstructObject (CGUI.cpp:307)
==329104==    by 0x5484D5: CGUI::Xeromyces_ReadObject(XMBElement, CXeromyces*, IGUIObject*, std::vector<std::pair<CStr8, CStr8>, std::allocator<std::pair<CStr8, CStr8> > >&, boost::unordered::unordered_set<Path, boost::hash<Path>, std::equal_to<Path>, std::allocator<Path> >&, unsigned int) (CGUI.cpp:610)
==329104==    by 0x5472F7: CGUI::Xeromyces_ReadRootObjects(XMBElement, CXeromyces*, boost::unordered::unordered_set<Path, boost::hash<Path>, std::equal_to<Path>, std::allocator<Path> >&) (CGUI.cpp:562)
==329104==    by 0x546D0A: CGUI::LoadXmlFile(Path const&, boost::unordered::unordered_set<Path, boost::hash<Path>, std::equal_to<Path>, std::allocator<Path> >&) (CGUI.cpp:522)
==329104==    by 0x571DD9: CGUIManager::SGUIPage::LoadPage(std::shared_ptr<ScriptRuntime>) (GUIManager.cpp:195)
==329104==    by 0x5711A8: CGUIManager::PushPage(CStrW const&, std::shared_ptr<ScriptInterface::StructuredClone>, JS::Handle<JS::Value>) (GUIManager.cpp:107)
==329104==    by 0x570EDA: CGUIManager::SwitchPage(CStrW const&, ScriptInterface*, JS::Handle<JS::Value>) (GUIManager.cpp:95)
==329104==    by 0x33EAFF: InitPs(bool, CStrW const&, ScriptInterface*, JS::Handle<JS::Value>) (GameSetup.cpp:512)

The CGUI page will also not delete this later on, since the object is not added to the map.

We observe that the throw is not only a fancy way to express an error, but that it is actually necessary in order to break the recursion of the RecurseObject function!

So in order to remove the exception and fix the memory leak at the same time, one has to change the object registration code to be performed per individual object.

That currently leaking unwinding process doesn't even seem necessary if one just inserts the one pointer instead of recreating the entire thing.
Recreating the entire thing every single addition of a GUIObject that is.

So that should also measurably speed up page opening times in theory, as the program doesn't recurse every GUI object for every GUI object anymore.

The other code change is the removal of one condition, the recursion for the GUIM_LOAD message is only sent after all GUI objects were created, rather than once per "root object" (that is a child of the singular "base object").
I.e. recursing the LOAD message only once for the entire tree instead of once per top node after load.
(This might even solve a theoretical bug, because the Load message handler might want to call into a GUI object that wasnt loaded yet?)

Test Plan

Verify the steps in the summary in the given order.

Event Timeline

Stan added inline comments.
source/gui/CGUI.cpp
483–498

Switch case ?

Successful build - Chance fights ever on the side of the prudent.

Linter detected issues:
Executing section Source...

source/gui/CGUI.h
|  53| class·CGUI
|    | [MAJOR] CPPCheckBear (syntaxError):
|    | Code 'classCGUI{' is invalid C code. Use --std or --language to configure the language.

source/gui/IGUIObject.h
|  53| class·IGUIObject
|    | [MAJOR] CPPCheckBear (syntaxError):
|    | Code 'classIGUIObject{' is invalid C code. Use --std or --language to configure the language.

source/gui/CGUISize.h
|  29| class·CGUISize
|    | [MAJOR] CPPCheckBear (syntaxError):
|    | Code 'classCGUISize{' is invalid C code. Use --std or --language to configure the language.
Executing section JS...
Executing section cli...

Link to build: https://jenkins.wildfiregames.com/job/docker-differential/890/display/redirect

Successful build - Chance fights ever on the side of the prudent.

Link to build: https://jenkins.wildfiregames.com/job/vs2015-differential/374/display/redirect

Build failure - The Moirai have given mortals hearts that can endure.

Link to build: https://jenkins.wildfiregames.com/job/docker-differential/934/display/redirect

Successful build - Chance fights ever on the side of the prudent.

Link to build: https://jenkins.wildfiregames.com/job/vs2015-differential/419/display/redirect

The leak comes from rP9.

The throw PSERROR_GUI_NameAmbiguity from IGUIObject::AddChild is caught in the line at LOGERROR("GUI error: %s", e.what()); from CGUI::Xeromyces_ReadObject.

We can count even thousands of leaks and the GUI not being rendered anymore (apparently the map is empty after the error and leaking all? objects)

cat '/tmp/leak_log_without_patch.txt' | grep 'ConstructObject (CGUI.cpp:305' | wc -l
1402

source/gui/CGUI.cpp
483–498
source/gui/CGUI.h
243

Why this uses a reference for child:

  1. reference vs pointer:

It's unexpected to take over ownership of a pointer passed as a reference, however using a pointer would mean requiring error handling in case someone passes a nullpointer, while receiving a reference means that its not possible to receive a nullptr, which either results in a compiletime error or otherwise making it the obligation of the one passing the dereferenced pointer as reference at the time that happens. In that case the new and deref operator are performed in the same function, thus making it more obvious that no null-testing is needed and both less prone to null-deref errors and less prone to introducing writing useless (dead code), weird, buggy or unfinished exception handling.

That's why I chose the pointer for child despite that reference being used to take over ownership.

  1. smart pointers:

The alternative to pointer and reference are smart pointers, but that would also mean passing through their interface each std::map call.
However the derefence operation isnt slower for smart pointers, and allocation / erasure only happens once per GUI page (and not more often if assuming JS created GUI objects).

This way the deletion of GUI objects could also become the responsibility of the shared pointer, removing it from the CGUI destructor, removing the argument between deleting GUI object children in CGUI vs IGUIObject.

Then again that would be better off as a separate restructuring that doesnt fix a bug but changes from a nonbuggy state to a cleaner nonbuggy state. Then one will have more freedom to judge possible alternatives or more elaborate refactoring that can improve the code further and will add to auditability, bla. If there is no explicit use case for smart pointers, then it might even be easier to track where things are deleted (because smart pointers may be deleted in various places implicitly whereas the raw pointers are explicitly deleted, thus making it more evident if they are deleted in the wrong place or if they live longer than they should (leak).
For example if the object map of CGUI and the child object map in IGUIObject were shared pointers linking to the same object, there will be more cases to think about than when using raw pointers.

Committed in rP23067 (if someone can accept this I can close it without abandoning it)

I haven't noticed any issue, I agree with the change as I actually find the debug break window not helpful at all.

This revision is now accepted and ready to land.Oct 28 2019, 12:24 PM