Page MenuHomeWildfire Games

Fix uncaught nullptr exception in TextureManager
Needs ReviewPublic

Authored by LukeV1 on Jul 16 2019, 9:49 PM.

Details

Reviewers
None
Group Reviewers
Restricted Owners Package(Owns No Changed Paths)
Summary

In TextureManager there is a function called MakeProgress which is part of the texture caching system. It creates a new (empty) texture pointer and calls TextureConverter.Poll.
Sometimes this polling fails internally and gets correctly reported to the caller - which then tries to read properties of the empty texture pointer and is then in fact trying to read data of a nullptr what causes a complete appcrash.

TextureConverter.Poll actually fails for me at least one time per launch when the antivirus solution "G Data Internet Security" is activated and the cache completely cleared. A very similar problem in another file is described in D1274.

- As this only happens at texture conversion, this is just a bug faceable in SVN builds, since releases got precached textures -

Test Plan

You'll probably need to have "G Data Internet Security" installed to get TextureConverter.Poll failing at a reasonable chance.

  • completely whipe out the cache in %appdata%
  • apply this patch
  • propably also apply the D1274 since otherwise 0 AD will crash in advance
  • launch pyrogenesis

-> you will get at least one entry in interestinglog.html which shows the new LOGERROR-Message - and 0 AD survives (a bit longer at least)

Diff Detail

Repository
rP 0 A.D. Public Repository
Lint
Lint Skipped
Unit
Unit Tests Skipped
Build Status
Buildable 8671
Build 14207: Vulcan BuildJenkins

Event Timeline

LukeV1 created this revision.Jul 16 2019, 9:49 PM
LukeV1 edited the summary of this revision. (Show Details)

Successful build - Chance fights ever on the side of the prudent.

Link to build: https://jenkins.wildfiregames.com/job/docker-differential/102/display/redirect

wraitii added a reviewer: Restricted Owners Package.Jul 17 2019, 10:00 AM

Could you add a context to the patch?

Could you add a context to the patch?

Actually I wanted to do this already in the first place, but I couldn't figure out how to do this!
I'm using TortoiseSVN and can't find any setting there to increase the amount context lines written to patch files. Also the command line method in the wiki doesn't work for me (cli tools from TortoiseSVN are installed).

Any help on this would be appreciated :)

Stan added a subscriber: Stan.Jul 31 2019, 7:49 PM

I usually use svn diff -x -U5000 | clip.exe :)

LukeV1 updated this revision to Diff 9180.Aug 1 2019, 10:10 AM

Ahh! Thanks a lot. This did the trick - here is the full patch with context.
Maybe your cmd command could be added to the wiki page so others don't run in this trouble too?

Build failure - The Moirai have given mortals hearts that can endure.

Link to build: https://jenkins.wildfiregames.com/job/docker-differential/276/display/redirect

LukeV1 updated this revision to Diff 9181.Aug 1 2019, 10:19 AM

Successful build - Chance fights ever on the side of the prudent.

Link to build: https://jenkins.wildfiregames.com/job/docker-differential/277/display/redirect

Stan added a subscriber: Itms.EditedAug 1 2019, 3:56 PM
In D2088#89258, @LukeV1 wrote:

Ahh! Thanks a lot. This did the trick - here is the full patch with context.
Maybe your cmd command could be added to the wiki page so others don't run in this trouble too?

I'm not sure it works on every platform, maybe @Itms would know.

LukeV1 added a comment.Aug 4 2019, 2:49 PM

I'm not sure it works on every platform, maybe @Itms would know.

It won't since clip.exe is a windows tool...
But on the other hand the current solution in the wiki isn't working at all on windows with TortoiseSVN installed. So just adding this shouldn't be that bad :)

Stan added a comment.EditedAug 4 2019, 3:33 PM

Oh yeah but you could redirect it to a file as well "> file.diff" that should work on most platforms :)
I'm not sure why it doesn't work exactly though.

I believe most people use Arcanist anyway.

elexis added a subscriber: elexis.Aug 12 2019, 10:55 PM

the antivirus solution "G Data Internet Security" is activated and the cache completely cleared

This thing again :| okay, good way to fuzz test the VFS I suppose.

source/graphics/TextureConverter.cpp
0–1

Do you happen to know which return value is triggered for you precisely?

Relevant definitions in vfs.cpp and status.h.
Errors are negative, INFO::OK is 0,
INFO::SKIPPED, INFO::CANNOT_HANDLE, INFO::ALL_COMPLETE are positive.

INFO::SKIPPED is returned from AddFileOrMemory from archive_zip.cpp which is called from AddFile, which is called from CreateFile.
It's triggered for 0 sizes.
But I think this shouldn't happen, because it doesnt write to a zip file.

INFO::ALL_COMPLETE is called from the zip_reader on LoadFile.
This is called from `CTextureConverter::ConvertTexture.
if (m_VFS->LoadFile(src, file, fileSize) < 0) in L312.
But there it is actually correct I suppose.
And LoadFile is not called from CreateFile, so its not the return value here.
INFO::CANNOT_HANDLE only happens when reading a zip file but not being able to locate the dictionary, which I suppose is also not happening here, since it caches a png file for svn.

Well which of the three is it? Or is this change cleanup not strictly necessary to fix the reported issue?

0–1

should this set ok = true?

Actually it looks like ok is uninitialized

source/graphics/TextureManager.cpp
8–1

(ok uninitialized)

8–1

If ok = true, then the pointer should be valid, if it was behaving kindly. One shouldn't have to doublecheck, i.e. the check should be adapted sooner. in m_TextureConverter.Poll.

Either there is an error that can be caught but wasnt caught (one of those 3 positive INFO return values), or everything went as expected, but due the some missing threading safety measure, something cleaned the buffer after the Poll call but before the ok check?

It creates a new (empty) texture pointer and calls TextureConverter.Poll.

That sounds like it knowingly returns ok = true and texture == nullptr.

LukeV1 added a comment.EditedAug 19 2019, 5:11 PM

Thanks for looking into this and giving such detailed info!!

Unfortunately (or fortunately?) something changed in the last few days on the windows or G Data side so that it is crashing somwhere else (wfilesystem.cpp, what means it is more related to D1274).
I will first have to figure out what is going on now before making any further changes here...
EDIT: As of now it is mandatory to apply D1274 otherwise it raises debug assertion failed errors in the above-described file wfilesystem.cpp.

LukeV1 added inline comments.Aug 19 2019, 5:44 PM
source/graphics/TextureConverter.cpp
0–1

Do you happen to know which return value is triggered for you precisely?

I can't remember it - and at the moment I also can't reproduce it..

Or is this change cleanup not strictly necessary to fix the reported issue?

Originally it wasn't. I just made it more "sensitive" to find all possible sources. Now I'm not sure as I can't remember which status actually got returned

0–1

If the return value is false ok doesn't get used by the caller - so it doesn't necessarily needs to be set..

source/graphics/TextureManager.cpp
8–1

Ok. That means I will revert the changes here and place them somewhere in m_TextureConverter.Poll if needed..

or due the some missing threading safety measure, something cleaned the buffer after the Poll call but before the ok check?

That would probably be quite hard to find out. I will check all other opportunities first..

8–1

Then it is probably the best to initialize it with false

LukeV1 added inline comments.Aug 20 2019, 3:09 PM
source/graphics/TextureConverter.cpp
0–1

The return value is 1