HomeWildfire Games

Do not actually Kill in debug_break() in non-debug binaries

Description

Do not actually Kill in debug_break() in non-debug binaries

Revelead by the A23 crash when hosting a game in the lobby. We have a debug_break() call that calls "kill" with SIGTRAP, which generally doesn't get handled correctly in release versions, particularly on OSX.

Differential Revision: https://code.wildfiregames.com/D1499

Event Timeline

Itms added a subscriber: Itms.Jan 2 2019, 11:45 AM

I agree with this change, it makes sense. What do you think of bringing up the usual debugging window instead of killing (in Debug mode)? I'm talking about the window that pops up when an ENSURE fails, for instance.

elexis raised a concern with this commit.EditedJun 15 2019, 5:48 PM
elexis added a subscriber: elexis.

When I compile in release mode on arch linux:

without this changeset:

asddTrace/breakpoint trap (core dumped)

with this changeset:

[xcb] Unknown sequence number while processing queue
[xcb] Most likely this is a multi-threaded client and XInitThreads has not been called
[xcb] Aborting, sorry about that.
pyrogenesis: xcb_io.c:263: poll_for_event: Assertion `!xcb_xlib_threads_sequence_lost' failed.
Aborted (core dumped)

When calling ENSURE("meh"), for example by Engine.StopXmppClient(); in the main menu via F9.

This commit now has outstanding concerns.Jun 15 2019, 5:48 PM
wraitii added a comment.EditedJun 28 2019, 9:03 AM

From IRC yesterday:

21:15 < Vladislav_> So wraitii thought that the game was crashing because of debug_break, but actually it was crashing because of the image, right?
21:15 < elexis> "debug_break() is intended to provide a break point on debug releases where you can jump in debug. It ends up killing the app in release versions."
21:16 < elexis> for me there always came this error dialog where one has the choice continue/break/kill, I dont know what happened on OSX
21:20 < Vladislav_> If it's exact as you state, then reverting seems reasonable.

The above is incorrect. I knew perfectly well that the image was the cause of the crash - but I found it unreasonable to crash anyways.
I did not make it too easy to find why, but this is the follow-up fix to D1498. As commented there, https://trac.wildfiregames.com/browser/ps/trunk/source/lib/tex/tex_dds.cpp#L558 is the culprit: if the mipmap chain is incomplete, it calls WARN_RETURN. That calls DEBUG_WARN_ERR, which ought to

/**

  • display the error dialog with text corresponding to the given error code.
  • used by WARN_RETURN_STATUS_IF_ERR et al., which wrap function calls and automatically
  • raise warnings and return to the caller.

**/

, calls debug_OnError. Going through the calls, this ends up calling CallDisplayError, calling sys_display_error, which itself calls try_gui_display_error, which isn't implemented on OSX (see https://trac.wildfiregames.com/browser/ps/trunk/source/lib/sysdep/os/unix/unix.cpp#L66).
Thus sys_display_error calls GetChar.
On MacOS, this works perfectly well if one has a terminal window open. However, when running the app from a Bundle application (such as releases), it appears the call is just automatically skipped (I haven't been able to find documentation on this so far). Thus we call the default debug_break. Thus we end up killing the app.
Thus my choice to skip this last bit on release calls.

Calling this a proper fix is probably a stretch now that I look at it with fresher eyes, but I maintain that killing-by-default is broken here, so we need some kind of change. Perhaps debug_OnError should return 'continue' by default. I'm not sure that we can open a debug window in MacOS bundles.

As usual, I will complain that this discussion could have been avoided if anybody had come to comment the above revision between May 19 and January 1.

I created a ticket for this at #5793.

I don't think it's very high-priority. As such, I am removing auditors so this no longer occupies a spot in the list of concerns, sine we probably want to try to keep that down.

This commit no longer requires audit.Jul 27 2020, 10:04 AM