Hey, how are you doing in these virologic times? I'm a bit confused by the previous messages (it's been a while): did you work on 0 A.D.? If so, I might need your help. We're having quite a few issues with the new AMD CPUs and the game's CPU counters (HPET, TSC, QPC). It seems HPET is utterly broken on all platforms, and TSC causes issues. If my assumption is right, you wrote that code, so maybe you could help me fix it. (0 A.D. is very short of talented programmers these days...)

If not, sorry for bothering you, and have a great week :)

## janwas:

Hi, doing fine, thanks, hope the same for you! Just super busy. I did indeed write that code, and can at least advise on how to fix it. What's the problem?

## Stan:

Well, HPET doesn't work at all ^^ It fails on all the machines I have. I'm also not exactly sure what the aken.sys driver does or how to update it (I remember reading on the forums that it was a security issue). Here: https://wildfiregames.com/forum/index.php?/topic/15482-akensys-security-concerns/&tab=comments#comment-233728

So that leaves us with two timers, TSC and QPC. On new Ryzen CPUs (3500+), using the TSC timer causes the game to have massive slowdowns and speedups at random. Using QPC seems to fix it (it might also be responsible for a game crash), but all those solutions are hacky at best, and I'd like to find a long-term solution.

The game always defaults to QPC no matter what.

I guess it could use HPET, if it knew how to use the high-precision timer of Windows.

Defaults to TSC, you have to force it to use QPC.

Hope I explained it well, it's totally out of my area of expertise XD

## janwas:

aken is a small driver to read MSRs. TSC only works if the CPU has "invariant TSC" and is ideally synced between cores (or the 0ad main thread is pinned to a core). QPC is broken on some older XP systems, but those are really ancient by now.

## Stan`:

I see. Isn't QPC way slower though?

## janwas:

Yes, QPC is slower.
That is why I went to all the craziness of writing a kernel-mode driver. But if the HPET hardware is broken (sigh) and AMD still can't build an invariant TSC (sigh), then there are no good options left that I know of, and it's best for users if QPC is used; that one can at least be influenced using BCD flags provided by the OS if its implementation is broken and needs to be changed (i.e. punt to the operating system :p). As a second step, see if QPC is called so frequently that it's a problem, and try not to do that. HTH :) Heading out for now.

## Stan`:

Thanks a lot for the answer. I'm not sure what the best way to determine the best timer is, though. If TSC works 90% of the time and is generally better, then we should find a way to detect when it's broken, I guess ^^

Do you have any idea how it works on Linux? The code seems entirely different, and there is no reference to such timers anywhere. It seems the problem might occur there too, although it has only happened once so far...

For HPET, is there any way to do it without a driver and admin privileges? I'd like to make sure that it doesn't work before finding a way to force QPC, for which it is starting to look like I will need a CPU ID mask ^^ and a hacky function to detect the affected CPUs...

## janwas:

Observing TSC breakage is difficult, and does not give a guarantee. You could check the CPUID flags for constant, nonstop, and invariant TSC, and bail out if any are not set. On Linux, gettimeofday is fine: reasonable overhead and precision, and no horrible breakage that I know of. HPET indeed requires ring0 access to program MSRs.

## Stan`:

We already check for the invariant TSC flag, see: https://github.com/0ad/0ad/blob/master/source/lib/sysdep/os/win/whrt/tsc.cpp

I think I will just remove TSC and HPET altogether, and always use QPC. Seems like a bit of a waste though.

Made this patch as a last resort: https://code.wildfiregames.com/D2726

Currently we use a wide variety of timers on Windows.
HPET nearly always fails to initialize because it requires admin rights. The fallback is usually TSC, which is a usually reliable timer that unfortunately seems to have some issues on the latest...

## janwas:

If you'd prefer to save TSC, invariant=nonstop_tsc is not enough; you also need to check constant_tsc. And either pin to a core, or use RDTSCP to read some extra info that indicates which core gave the reading (plus per-core calibration).

## Stan`:

Oh, sounds better! If that works :) I guess pinning is not really possible without a lot of changes.

I guess we should remove the HPET code since it doesn't work (probably because the drivers are not signed).

It seems like all the current FLOSS engines use QPC, and that QPC actually uses HPET when available.

## janwas:

Yes, QPC is reasonable. The systems where it was broken are long gone.

## Stan`:

Thanks a lot for the help. If I may abuse your kindness a bit more: https://code.wildfiregames.com/D2726 Could you tell me if there is any other code I could delete to clean it up?

## janwas:

You're welcome! Not sure I'd delete TSC entirely, it might be useful later. You could instead move it after QPC, which shouldn't fail, so TSC is effectively disabled. Deleting HPET seems fine.

## Stan`:

Okay. Is there any useless code I should remove for HPET other than the two files?

## janwas:

Ah yes, I think you can clear out anything mentioning Mahaf and Aken (case-insensitive); IIRC those were only used by HPET.

## Stan`:

Are you sure? There are quite a few references: https://github.com/0ad/0ad/search?q=mahaf&type=

## janwas:

You're right, I misremembered. It's also referenced by msr, used by tsc.
Might be a good idea to remove the calls to mahaf from those files, because it's not going to work anyway.

## Stan`:

OK, so I nuke the mahaf and aken.sys related files and make it compile? :D

## janwas:

:) Yes. Just treat all mahaf functions as if they always fail.

## Stan`:

That's my kind of programming :D

Mmh, I'm actually not sure what I should replace them with, though...

## janwas:

You can also remove the remnant "// Pentium systems doesn't come up." and explain that QPC is first because it is less buggy. You'll also want to remove the PMT file entirely; that doesn't work without mahaf.

## Stan`:

Oh okay, thanks! MSR too, I guess.

It seems we are experiencing the same bug on Linux... I'm trying to see if replacing

```cpp
(void)clock_gettime(CLOCK_REALTIME, &start);
```

by

```cpp
(void)clock_gettime(CLOCK_MONOTONIC, &start);
```

will work.

## janwas:

Yeah, monotonic is better than realtime. There's also monotonic_raw, which skips NTP frequency adjustments; not sure that's better for long-running sessions.

## Stan`:

Since people usually play 10-hour sessions, I guess not :D I hope that will fix it, because I have no other ideas.

I removed every reference to mahaf and aken, I think: https://code.wildfiregames.com/D2726
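
For reference, here is a minimal sketch of the `CLOCK_MONOTONIC` swap discussed above, assuming a POSIX system. The helper names `MonotonicNanoseconds` and `SecondsSince` are hypothetical and are not taken from the 0 A.D. source; only `clock_gettime` and `CLOCK_MONOTONIC` come from the conversation.

```cpp
#include <cassert>
#include <cstdint>
#include <time.h>   // clock_gettime, CLOCK_MONOTONIC (POSIX)

// Hypothetical helper (not from the 0 A.D. source):
// current CLOCK_MONOTONIC reading, in nanoseconds.
std::int64_t MonotonicNanoseconds()
{
    timespec ts{};
    const int rc = clock_gettime(CLOCK_MONOTONIC, &ts);
    assert(rc == 0);   // fails only if the clock ID is unsupported
    (void)rc;          // avoid an unused-variable warning in NDEBUG builds
    return static_cast<std::int64_t>(ts.tv_sec) * 1000000000LL + ts.tv_nsec;
}

// Hypothetical helper: seconds elapsed since an earlier reading.
// CLOCK_MONOTONIC is not affected by wall-clock jumps (manual changes or
// NTP steps), so the result should never be negative.
double SecondsSince(std::int64_t startNs)
{
    return static_cast<double>(MonotonicNanoseconds() - startNs) * 1e-9;
}
```

A caller would record `MonotonicNanoseconds()` once at startup and pass that value to `SecondsSince` whenever it needs the elapsed time. Swapping in the Linux-specific `CLOCK_MONOTONIC_RAW` would be a one-line change, with the long-session trade-off janwas mentions.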