Page MenuHomeWildfire Games

#loadingordermatters
Needs RevisionPublic

Authored by elexis on Jan 20 2019, 8:16 PM.

Details

Reviewers
wraitii
Group Reviewers
Restricted Owners Package(Owns No Changed Paths)
Summary

There are currently only two known ways how to create a mod with a modification of an existing JS functions.

  1. The first is copying the entire file and modifying it.
  2. This obfuscates which lines are changed if a mod is maintained for multiple releases. The replaced file has to be updated with every commit to that

file in the "public" mod, otherwise the new commit is not present in the modified copy.

  • It also means that players can't launch two mods that modify the same file, because one mod will overwrite and thereby remove the diff of the other mod.

Examples: fgod, delenda est

  1. The second approach is creating a new file file_mod.js to refer to the functions, variables and prototypes defined in file.js. This approach removes the two disadvantages mentioned above, but as wraitii correctly noted in D1410, this approach has never been verified to be platform independent:

https://docs.microsoft.com/en-us/windows/desktop/api/fileapi/nf-fileapi-findnextfilea

With the NTFS file system and CDFS file systems, the names are usually returned in alphabetical order. With FAT file systems, the names are usually returned in the order the files were written to the disk

There is no single person to blame for making such a recommendation, because many people throughout years have come up with this approach independently.
For example:

Test Plan

Consider whether it's better to just always sort (shouldn't cost any significant performance) or whether to have sorting an optional argument (so that callers of this function can't use
platform-dependent code without consciously picking it. (Another use case for omitting default behaviors, refs D1460.))
If going for the new argument, make sure the sorting choices are all correct, the comments are all correct and not too verbose.
Check whether there are other functions that also list files.

Diff Detail

Repository
rP 0 A.D. Public Repository
Branch
/ps/trunk
Lint
Lint OK
Unit
No Unit Test Coverage
Build Status
Buildable 6824
Build 11221: Vulcan BuildJenkins
Build 11220: arc lint + arc unit

Event Timeline

elexis created this revision.Jan 20 2019, 8:16 PM

How many elements are sorted for a worst case?

source/lib/file/vfs/vfs_util.h
39

Maybe just sort?

I feel like sorting always would be better than sorting sometimes, as it reduces opportunities for inconsistency. Given that I assume you've changed all instances of GetPathNames, this seems safe enough based on what they do.

If not, I would recommend passing an enum VFS::SORTED or something instead of true or false since C++ doesn't have named arguments.

source/lib/file/vfs/vfs_util.h
37

ASCII

lyv added a subscriber: lyv.Jan 20 2019, 8:27 PM

Could be a lot for people who save their games quite often.

elexis edited the summary of this revision. (Show Details)Jan 20 2019, 8:40 PM
Vulcan added a subscriber: Vulcan.Jan 20 2019, 8:45 PM

Successful build - Chance fights ever on the side of the prudent.

Link to build: https://jenkins.wildfiregames.com/job/differential/989/

wraitii added a reviewer: Restricted Owners Package.Jan 27 2019, 3:52 PM

I think the time taken sorting some files by C++ code is irrelevant.

elexis marked an inline comment as done.Jan 28 2019, 1:30 PM

I think the time taken sorting some files by C++ code is irrelevant.

#metoo, but that doesn't make it true.
How many JS files do we load, < 100? That should be done within microseconds or something. The one place where > 100 could be loaded is the ColladaManager.cpp and that already does sort. *.po files are very few too, if they were sorted.

To prove the correctness of the patch, one can try to show that it's impossible to falsify the patch. The patch could furthermore only be false if it would sort in a case where sorting isn't really needed. I had added sorting in some cases only to provide platform consistency for return values to the simulation or even JS GUI, where they might not be needed. So one could complain about that, as it's a subjective / policy / design decision. But there's not much room for arguing about that either, as JS GUI devs usually don't know what C++ code they're calling, so better have it platform consistent.

I think sorting always would make the code much shorter and less complex, but I anticipated the hypothetical criticism that there could be one user where sorting order doesn't matter and it would hurt feelings to make it few microsecond slower, so I wrote the proof into every caller. That has the effect that every caller will have to determine wisely, which isn't too bad, but it's still costing development time that might not be needed after all.

Also that the explicit sorting argument means that the caller is forced to be conscious that it's sorted upon request and thus doesn't think about adding a std::sort call afterwards, like in ColladaManager.cpp. Doesn't make a significant difference, since devs should be aware of what they do anyhow.

source/lib/file/vfs/vfs_util.h
39

(I wanted to avoid using names that we know exist in std::, so that there is no room for doubt when reading the if (sorted) line without reading the function header)

I think returning paths in a deterministic sorted order always is best. IO is a known slow thing, so losing a few microseconds (even a few hunderds) on that seems completely irrelevant and _should_ be completely irrelevant.

So my advice is sorting always, structurally, and if that's slow you need custom caching, and that's it.

vladislavbelov added inline comments.Jan 28 2019, 8:10 PM
source/lib/file/vfs/vfs_util.cpp
51

We can add std::shuffle here on the testing stage to detect possible errors.

source/lib/file/vfs/vfs_util.h
39

Namespaces are for such purposes too: to have the same name for different types or variables. Also this function already contains used words: fs is declared as boost::filesystem, path is declared inside boost::filesystem and in std::filesystem since C++17.

Stan added a subscriber: Stan.Jan 28 2019, 8:29 PM
Stan added inline comments.
source/lib/file/vfs/vfs_util.h
39

The convention I learnt about boolean was more like IsSorted, HasValue, IsDead, could work here.

elexis marked an inline comment as done.Jan 29 2019, 1:43 PM

I think returning paths in a deterministic sorted order always is best. IO is a known slow thing, so losing a few microseconds (even a few hunderds) on that seems completely irrelevant and _should_ be completely irrelevant.
So my advice is sorting always, structurally, and if that's slow you need custom caching, and that's it.

There are few calls here where the result of the function call is only processed in the calling function and no further.
And if in these cases we don't need a deterministic order, one could argue that omitting the sort in these cases in a C++ performance maximization world is plausible.
Still seems like a diceroll so meh.

source/lib/file/vfs/vfs_util.h
39

But path and fs existed already, sorted is a newly introduced one. So we have the choice between avoiding reuses and chosing them consciously. IMO if the second name isn't becoming ugly, avoiding ambiguity even if not syntactically problematic seems preferable on the subjective niceness scale?

What's about wraitii suggestion:

If not, I would recommend passing an enum VFS::SORTED or something instead of true or false since C++ doesn't have named arguments

It makes sense to use a name. It'd be easier to read such code.

source/lib/file/vfs/vfs_util.h
39

I don't mind, I just pointed out that the sort name isn't a problem.

wraitii requested changes to this revision.Apr 13 2019, 11:13 AM

I'd say let's use the enum, default to VFS::SORTED or something, and let's sort everywhere at the moment - if it becomes a problem down the line we'll change then.

This revision now requires changes to proceed.Apr 13 2019, 11:13 AM