Project catch-ctd

meson800

Addon Developer
Addon Developer
Donator
Joined
Aug 6, 2011
Messages
405
Reaction score
2
Points
18
We've all experienced them...the dreaded crash to desktop.

Wouldn't it be great if there was a way to get more information for debugging, or to know which plugin called which Orbiter API function which crashed Orbiter?

catch-ctd is a module which aims to do that.

What is it
Currently, catch-ctd hooks all* of the "free-standing" Orbiter API functions.

These are the functions starting with oapi. Currently, class functions (like functions inside Orbiter classes, like CameraMode::GetMode() ) are not hooked.

Then, catch-ctd simply logs the function and the arguments passed to it to ORBITER_ROOT/catch-ctd/catch-ctd.log.

All Orbiter vessels are now hooked, so you get context about where each Orbiter call comes from.

Log files look like this right now:
Code:
Attaching hooks...
Original oapiRegisterModule :66CE68E8
Original oapiGetObjectByName :66CE6764
Original oapiGetObjectByIndex :66CE682C
Original oapiGetObjectType :66CE6620
---snipped list of functions hooked---
---snipped functions called during load of scenario---
Inside DeltaGlider::GL-01::clbkPostStep:simt:1.94122, simdt:0.00735763, mjd:51981.6
...
Inside DeltaGlider::GL-02::clbkPostStep:simt:1.94122, simdt:0.00735763, mjd:51981.6
...
Inside DeltaGlider::GL-03::clbkPostStep:simt:1.94122, simdt:0.00735763, mjd:51981.6
...
Inside ShuttleA::SH-02::clbkPostStep:simt:1.94122, simdt:0.00735763, mjd:51981.6
...
Inside ShuttleA::SH-03::clbkPostStep:simt:1.94122, simdt:0.00735763, mjd:51981.6
...
	oapiMeshGroupCount:(hMesh:08706C20)...
	oapiMeshGroup:(hMesh:08706C20 , idx:0)...
	oapiMeshGroup:(hMesh:08706C20 , idx:1)...
	oapiMeshGroup:(hMesh:08706C20 , idx:2)...
	oapiMeshGroup:(hMesh:08706C20 , idx:3)...
	oapiMeshGroup:(hMesh:08706C20 , idx:4)...
	oapiMeshGroupCount:(hMesh:0852D718)...
	oapiMeshGroup:(hMesh:0852D718 , idx:0)...
Inside DeltaGlider::GL-01::clbkPanelRedrawEvent:id:14, event:1, surf:00000000, context:125BA090
...
Inside DeltaGlider::GL-01::clbkPanelRedrawEvent:id:13, event:1, surf:00000000, context:125BA0E8
		oapiGetNavType:(hNav:085C9D10)...
...
---snipped additional function calls----
Detaching hooks...
Hooks detached successfully

What it will be
I want to expand catch-ctd to know which modules are calling each function.

That should be fairly straightforward- just intercept the call to create a new vessel, install an empty "forwarding" class that simply forwards all vessel calls to the actual vessel class. Done :)

This will be harder for orbiter modules using the old deprecated, non-Module class method. (Where DLLs directly implement functions like oapiPreStep, instead of defining a subclass of Module and implementing Module::preStep)


Install

Requires the VS 2013 redistributable here. (If that direct link doesn't work, use this link and select the x86 redistributable)

Download the current version here, and unzip into Orbiter as usual, then enable in Modules list.

Caveats
catch-ctd doesn't actually catch CTDs :lol:
Even if Orbiter was throwing exceptions, catch-ctd could catch them, but the simulation still has to terminate because it's in a weird state.

As catch-ctd has to write to a file for every OAPI call, it is slow. On a simple scene (DG Brighton Beach scenario), I get around 70 FPS with catch-ctd enabled.

I'm trying to come up with a method to speed up logging, but most involve buffering, which might lose the crucial last call before an Orbiter crash.


*I use Detours to hook the functions, so functions that are too small (like the second oapiMeshGroup overload), as most overloads are, do not get hooked.
This shouldn't matter, as the really small functions probably just call the larger overload, so it gets hooked that way.

Question
  • Is there interest in a plugin like this? Is this something the community needs, or is the current CTD-debugging method enough?
  • Does anyone have an idea to speed up logging, without possibly losing information when Orbiter crashes?
  • How can I hook modules using the deprecated plugin method?
  • Any better names for this?


---------- Post added at 05:58 PM ---------- Previous post was at 03:15 PM ----------

Updated logging.

Now, there is only one file open/close per line of logging. This increases the FPS in that simple scene from 8 to 21 on my computer. Still not great, but better.

Also, now logging parameters have the argument names along with the arguments. If an argument is a struct, it is printed out in nested parenthesies.

For example,
Code:
oapiGlobalToEqu:(hObj:08495038 , glob:(data:00AEEBE0, x:1.43041e+011, y:-2.04222e+007, z:4.69909e+010) , lng:084957E0 , lat:084957E8 , rad:00AEEE48)...
oapiGlobalToEqu takes five arguments. Every argument except glob is a pointer, and gets their memory address printed out.

The second argument, named glob, is a VECTOR3, and gets the internal address and three components printed out.
 
Last edited:

kamaz

Unicorn hunter
Addon Developer
Joined
Mar 31, 2012
Messages
2,298
Reaction score
4
Points
0
Very cool.

To speed up logging try moving file write to a separate thread / process.

BTW you *can* hook object methods with Detours...
 
Last edited:

meson800

Addon Developer
Addon Developer
Donator
Joined
Aug 6, 2011
Messages
405
Reaction score
2
Points
18
To speed up logging try moving file write to a separate thread / process.

Will Orbiter crashing take down the logging thread with it?

BTW you *can* hook object methods with Detours...

Yes, thanks, I've looked at the example Detours provides for hooking member functions. It's just that right now, I wrote a script that does the "hard" work-it reads in OrbiterAPI.h and auto-generates the hook h/cpp code.

Parsing function declarations is easier than parsing whole classes, so I'll end up writing the class hooks manually, which takes time.
 

dbeachy1

O-F Administrator
Administrator
Orbiter Contributor
Addon Developer
Donator
Beta Tester
Joined
Jan 14, 2008
Messages
9,217
Reaction score
1,563
Points
203
Location
VA
Website
alteaaerospace.com
Preferred Pronouns
he/him
Will Orbiter crashing take down the logging thread with it?

Yes, unfortunately it will. The only way I can think of to make logging fast (i.e., to be able to buffer messages but still not lose them when Orbiter crashes) would be to send the log messages to another process via some form of interprocess communication. Then the sending process (orbiter.exe) will not have to wait for the messages to be flushed to disk before continuing -- it can just send them to the custom logging process and continue executing.
 

Linguofreak

Well-known member
Joined
May 10, 2008
Messages
5,034
Reaction score
1,273
Points
188
Location
Dallas, TX
Yes, unfortunately it will.

Would Windows have any equivalent to installing a signal handler for SIGSEGV on *nix? In that case, the handler could stop the main orbiter thread, record its state at the time of the crash, and pass control to the logging thread.
 

meson800

Addon Developer
Addon Developer
Donator
Joined
Aug 6, 2011
Messages
405
Reaction score
2
Points
18
Yes, unfortunately it will. The only way I can think of to make logging fast (i.e., to be able to buffer messages but still not lose them when Orbiter crashes) would be to send the log messages to another process via some form of interprocess communication. Then the sending process (orbiter.exe) will not have to wait for the messages to be flushed to disk before continuing -- it can just send them to the custom logging process and continue executing.

Thanks :tiphat:

I'll look into shared pipes- it seems like a fairly straightforward way to do it.
 

kamaz

Unicorn hunter
Addon Developer
Joined
Mar 31, 2012
Messages
2,298
Reaction score
4
Points
0
Would Windows have any equivalent to installing a signal handler for SIGSEGV on *nix? In that case, the handler could stop the main orbiter thread, record its state at the time of the crash, and pass control to the logging thread.

Yes it allows you to register your own debugger which will be called upon crash.

Cant tell you more at the moment because Im on vacation without a computer.

Very likely that there is an existing tool / code which could help here.

---------- Post added at 08:26 AM ---------- Previous post was at 08:15 AM ----------

Parsing function declarations is easier than parsing whole classes, so I'll end up writing the class hooks manually, which takes time.

https://pypi.python.org/pypi/CppHeaderParser/
 
Last edited:

fred18

Addon Developer
Addon Developer
Donator
Joined
Feb 2, 2012
Messages
1,667
Reaction score
104
Points
78
this has a huge potential for addon developers! thank you very much for this!

even if for the time being slows down orbiter it can be used only on spots to catch the reason of a ctd of a new addon! that's amazing!

I always have to put a lot of log tests in my addons to find out the exact point and the exact reason of a crash, now it will be a lot easier!!!

Cheers! :tiphat:
 

meson800

Addon Developer
Addon Developer
Donator
Joined
Aug 6, 2011
Messages
405
Reaction score
2
Points
18
Yes, unfortunately it will. The only way I can think of to make logging fast (i.e., to be able to buffer messages but still not lose them when Orbiter crashes) would be to send the log messages to another process via some form of interprocess communication. Then the sending process (orbiter.exe) will not have to wait for the messages to be flushed to disk before continuing -- it can just send them to the custom logging process and continue executing.

Ok, I split logging into a logging dameon that Orbiter launches.

The FPS drop is getting better. I'm going to try different buffer settings on the pipe, and see if that helps.

But it's getting much better :thumbup:

Normally, that scene (using the DX7 inline client), gets around 70-150 FPS. With the new catch-ctd, I get 50-70 FPS. That's almost a 3x improvement over before!

The FPS drop still exists, but is a lot better than before.

I will update the binary above here soon.

---------- Post added at 02:39 PM ---------- Previous post was at 02:35 PM ----------

Done

---------- Post added at 02:46 PM ---------- Previous post was at 02:39 PM ----------

How big do you guys think the log files can realistically be?

There will probably be at least one OAPI call per vessel per frame, which means even at 50FPS, the log files grow to megabytes pretty quickly...

In fact, for the same test scenario, it seems to generate around 500KB of log files per second.

My eventually goal is to have catch-ctd generate an "interesting" short log file, but I think that the full log files still need culling.

Should I just limit the maximum size of the log files?
 

Lisias

Space Traveller Wanna-be
Joined
May 31, 2015
Messages
346
Reaction score
3
Points
18
Website
www.youtube.com
As catch-ctd has to write to a file for every OAPI call, it is slow. On a simple scene (DG Brighton Beach scenario), I get around 70 FPS with catch-ctd enabled.

On the "good old times", when mobile processors with 100 MHz were the fastest thing that don't (usually) explode in your pockets, I had to optimize a mesh loading algorithm for a game. The guys that made the loading algorithm used plain ASCII flles, to make development easy - but the damned thing took over 7 minutes to toad a mesh.

I rewrite the C++ loading routine to read bziped binary representations of the mesh, and reduced the loading time to... 15 seconds. The packed stream of course saved time by reducing the amount of data to be read from a slow SD card, but most of the speed came from avoiding the serialization from ASCII representation of numeric data.

Perhaps writing out binary data, and then inputing it to a external ASCII serializer in post-mortem time wouldn't help?
 

meson800

Addon Developer
Addon Developer
Donator
Joined
Aug 6, 2011
Messages
405
Reaction score
2
Points
18
Ok, it appears like my code isn't the bottleneck.

Pointing the camera towards the sky, I get close to 170 FPS, so it appears logging doesn't cut FPS that much.
 

Lisias

Space Traveller Wanna-be
Joined
May 31, 2015
Messages
346
Reaction score
3
Points
18
Website
www.youtube.com
Ok, it appears like my code isn't the bottleneck.

Pointing the camera towards the sky, I get close to 170 FPS, so it appears logging doesn't cut FPS that much.

Now that you mention it...

Currently I'm playing Orbiter on a very old machine using Athlon XP 3800+. It works fine on single threads applications, but it slugs in multithreading - debugging some problems I faced in the past, I discovered that the Main thread of Orbiter is instanced in real time priority.

However, sometimes something needs to be loaded from disk (as a texture for a region of Earth that just became visible), and this is done by the book with another I/O thread.

However, since my CPU has only one Control Unit, the main thread waits for the loading, but the loading is delayed since it runs on a lower priority thread - a textbook example of Starvation.

After some (a lot, to tell you the true) seconds, the I/O finally happens and the Main thread starts to update the screen again. This can be very nasty if you are time warping...

Perhaps you are facing something similar?
 

meson800

Addon Developer
Addon Developer
Donator
Joined
Aug 6, 2011
Messages
405
Reaction score
2
Points
18
catch-ctd updated, now it hooks all vessels! :woohoo:

Now onto hooking modules :lol:

Now your log files look nice and pretty, such as:
Code:
Inside DeltaGlider::GL-01::clbkPostStep:simt:1.94122, simdt:0.00735763, mjd:51981.6
...
Inside DeltaGlider::GL-02::clbkPostStep:simt:1.94122, simdt:0.00735763, mjd:51981.6
...
Inside DeltaGlider::GL-03::clbkPostStep:simt:1.94122, simdt:0.00735763, mjd:51981.6
...
Inside ShuttleA::SH-02::clbkPostStep:simt:1.94122, simdt:0.00735763, mjd:51981.6
...
Inside ShuttleA::SH-03::clbkPostStep:simt:1.94122, simdt:0.00735763, mjd:51981.6
...
	oapiMeshGroupCount:(hMesh:08706C20)...
	oapiMeshGroup:(hMesh:08706C20 , idx:0)...
	oapiMeshGroup:(hMesh:08706C20 , idx:1)...
	oapiMeshGroup:(hMesh:08706C20 , idx:2)...
	oapiMeshGroup:(hMesh:08706C20 , idx:3)...
	oapiMeshGroup:(hMesh:08706C20 , idx:4)...
	oapiMeshGroupCount:(hMesh:0852D718)...
	oapiMeshGroup:(hMesh:0852D718 , idx:0)...
Inside DeltaGlider::GL-01::clbkPanelRedrawEvent:id:14, event:1, surf:00000000, context:125BA090
...
Inside DeltaGlider::GL-01::clbkPanelRedrawEvent:id:13, event:1, surf:00000000, context:125BA0E8
		oapiGetNavType:(hNav:085C9D10)...
...

The details

------------
Turns out it was a pain to hook vessel creation. With Detours, you can't pass additional information to your hooked functions, so it's hard to do a many-to-one hook (All ovcInit calls needed to be hooked to a single MyOvcInit).

Turns out, Orbiter re-loads each vessel DLL just before creating a new vessel, so I ended up hooking upon vessel DLL load, then unhooking once ovcInit was called.

There, I generated a VesselProxy class, which simply logs and forwards calls to the underlying vessel.

Now, you would think that you could simply call
Code:
delete originalVessel //where originalVessel is a VESSEL*
Nope, guess again :facepalm:

That vessel pointer has to be casted to the original type to call the correct deconstructor, and the only code that knows what that original type is is the original vessel DLL. So I had to hook ovcExit...

ovcExit was way harder...
You can't be certain when a vessel is going to be destroyed, so I needed to keep all of those ovcExits hooked into one function. I ended up having to check if I had already detoured a certian vessel DLL on LoadLibrary, store those in a map, pass some additional information to VesselProxy, so it could call the original ovcExit, and wrap everything up in a nice MyOvcExit function.

With that done though, we get pretty-printing and can understand which vessel called which OAPI calls.

---------- Post added at 09:01 PM ---------- Previous post was at 08:58 PM ----------

Binary updated.
 
Top