Decompiling Tips IDA
Here are some tips that may come in handy while decompiling the game.
First, get a copy of IDA. You can use the free version, though it will prompt you to upgrade and you can't save the binary. That's ok because we don't really need to change the binary, just look at it.
When you open IDA, load the openrct2.exe file from this repository. You will see a large number of instructions without any information attached, and will probably want the debugging information that people have added so far. Email IntelOrca for the latest copy of the IDC file.
Once you have the IDC file, load it via "File -> Load Script".
RCT2 is written in x86 assembly, which is about as close to the actual CPU instructions as you are going to get. Each of the 8 general purpose CPU registers can store 32 bits (in this game), and you can perform operations on the actual bits contained in those registers.
Lots of numbers and addresses in the Tycoon Technical Depot are written in hex. The prefix "0x" generally denotes a hex value. The digits used for hex are a superset of those used for decimal, so numbers which look like decimals ("12") can actually refer to a different value than you think (in this case, 0x12 = 1*16 + 2 = 18). In IDA these are represented as numbers with the letter "h" as a suffix, like "1Ah".
To convert between these, I find it very convenient to keep a Python REPL up and make transformations between the types as necessary. It's also easy to view the binary representation of a number. Alternatively Calculator can be used.
# Print the decimal representation of a hex value:
>>> 0x12
18
# Convert from decimal to hex:
>>> hex(18)
'0x12'
# View binary representation of a number (works for hex or decimal)
>>> bin(0x12)
'0b10010'
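The same REPL can also read IDA's "h"-suffixed notation. The helper below is only a sketch: `parse_ida_hex` is a hypothetical name (not part of IDA or OpenRCT2), and it assumes that values without the suffix are plain decimal.

```python
def parse_ida_hex(text):
    """Convert an IDA-style literal such as '1Ah' or '260h' to an int.

    Assumption: a value without the trailing 'h' is plain decimal.
    """
    text = text.strip()
    if text.lower().endswith("h"):
        # Drop the suffix and parse the rest as base 16.
        return int(text[:-1], 16)
    return int(text)


print(parse_ida_hex("1Ah"))   # 26
print(parse_ida_hex("260h"))  # 608
print(parse_ida_hex("12"))    # 12
```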
Note that RCT2 uses a little endian encoding for integers that span multiple bytes, so the most significant byte of a 16-bit integer is stored in its 2nd byte.
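To see the byte order for yourself, the standard `struct` module can pack a value the way it would sit in memory. The value 0x1234 is an arbitrary example, not taken from the game.

```python
import struct

value = 0x1234

# "<H" means little-endian, unsigned 16-bit integer.
raw = struct.pack("<H", value)

# The low byte (0x34) is stored first, the high byte (0x12) second.
print(raw.hex())                            # 3412

# Reading the bytes back with the right and the wrong byte order.
print(hex(int.from_bytes(raw, "little")))   # 0x1234
print(hex(int.from_bytes(raw, "big")))      # 0x3412
```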
The registers beginning with e and ending with x or i (edx, eax, ecx etc.) are 32 bits each (4 bytes). The general purpose registers that are two letters ending in x (dx, ax, cx) are 16 bits each and are the low halves of the corresponding 32 bit registers. The registers ending in "h" or "l" stand for "high" and "low", are 8 bits each, and are the high and low bytes of the corresponding 16 bit register.
When converting to C, use an int type to represent a 32 bit register, a short to represent a 16 bit register, and uint8 to represent an 8 bit register.
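The smaller registers are views onto parts of the larger one, so they can be modelled with masks and shifts. This is a rough illustration in the Python REPL; the value in eax is arbitrary.

```python
eax = 0x12345678

ax = eax & 0xFFFF         # low 16 bits of eax
ah = (eax >> 8) & 0xFF    # high byte of ax
al = eax & 0xFF           # low byte of ax

print(hex(ax), hex(ah), hex(al))   # 0x5678 0x56 0x78

# Writing to al only changes the lowest byte of eax.
eax_after = (eax & 0xFFFFFF00) | 0xFF
print(hex(eax_after))              # 0x123456ff
```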
The general unit of work in the x86 codebase is the subroutine. Subroutines are called like this:
call sub_6CFFFF
This will cause execution to jump to the subroutine, and the subroutine will execute until a retn instruction is encountered.
There are three functions in C you can use to replace a subroutine call.
- RCT2_CALLPROC_EBPSAFE: Use this function if the subroutine does not use any registers from the calling program. For example, if the subroutine starts with a pusha instruction, this saves all of the registers from the calling routine.
call sub_6CAB00
And then in the subroutine:
sub_6CAB00 proc near
    pusha
This generally means that the registers are saved. You may also see
sub_6CAB00 proc near
    push edx ; Save this register before overwriting
    push eax
    ... Code in the function..
    pop eax
    pop edx ; Restore the registers to their values
    retn
- RCT2_CALLPROC_X: Use this function if the subroutine starts using registers without loading any data first. For example:
sub_6CAB00 proc near
    add ebx, 7
This subroutine begins operating on whatever value is stored in ebx, so it's safe to assume the caller has deliberately put a value there to be manipulated.
- RCT2_CALLFUNC_X: Use this function if the subroutine stores values in registers to be used by the caller. For example:
sub_6CAB00 proc near
    add ebx, ecx
    retn
It's safe to assume that the result left in ebx is being used by the calling program.
You should brush up on how pointers work if you are unfamiliar with them or coming from a higher level language like Python. I haven't found a great tutorial for this yet, but here's something on pointers that might help.
In IDA you can determine the presence of a pointer by the square brackets around an expression. This generally means the value stored in the register is an address.
add [ebp+2], 7
This means take the value stored in EBP (which should be an address like 0x579992), add two to it (0x579994), and then add 7 to the value stored at address 0x579994. Sometimes a register represents an address, and sometimes it may represent an integer like the height of a ride.
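The effect of that instruction can be imitated with a small block of fake memory. This is only a sketch: `BASE`, `memory`, `add_byte` and the starting value 10 are made up for illustration, and it assumes the operand is a single byte (IDA normally shows the operand size, e.g. "byte ptr").

```python
BASE = 0x579990            # hypothetical address of memory[0]
memory = bytearray(16)     # stand-in for a slice of the game's memory

ebp = 0x579992             # the register holds an address, not a value
memory[ebp + 2 - BASE] = 10   # pretend 10 is stored at 0x579994


def add_byte(memory, address, amount):
    """Add `amount` to the byte stored at `address`, wrapping at 8 bits."""
    index = address - BASE
    memory[index] = (memory[index] + amount) & 0xFF


# add [ebp+2], 7
add_byte(memory, ebp + 2, 7)

print(hex(ebp + 2), memory[ebp + 2 - BASE])   # 0x579994 17
print(hex(ebp))                               # 0x579992
```

Note that ebp itself is unchanged; only the value it points near is modified.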
Always remember that pointers are unsigned. Do not try to use them as signed integers, otherwise you may end up at the wrong address.
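A quick way to see the problem is to reinterpret the same 32 bits as a signed value. The address below is an arbitrary example with the top bit set, and `as_signed32` is a hypothetical helper written for this illustration.

```python
def as_signed32(value):
    """Reinterpret an unsigned 32-bit pattern as a signed 32-bit integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


pointer = 0x80000010   # arbitrary example address with the top bit set

print(hex(pointer))            # 0x80000010
print(as_signed32(pointer))    # -2147483632

# Addresses below 0x80000000 look the same either way, which is why the
# bug can stay hidden for a long time.
print(as_signed32(0x00F440AE) == 0x00F440AE)   # True
```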
If you see a value like this in the code:
mov bh, byte ptr word_F440AE
This roughly says, get the value at 0x00F440AE as a byte (8 bits) instead of a word (16 bits) and copy the value into the register bh.
This converts into code as
int bh = RCT2_GLOBAL(0x00F440AE, uint8);
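Because of the little endian layout, reading a word's address as a byte gives you the low byte of that word. A minimal sketch, assuming the word happens to hold 0x1234 (an invented value):

```python
import struct

word_F440AE = 0x1234                       # invented contents of the word
stored = struct.pack("<H", word_F440AE)    # how the word sits in memory

# "byte ptr" reads only the byte at the word's own address,
# which is the first (lowest addressed) byte.
bh = stored[0]

print(stored.hex())   # 3412
print(hex(bh))        # 0x34
```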
If you see an instruction like this:
jz nullsub_65
This represents a call that actually existed in a version of the program but doesn't exist in the final version. In this case, if the zero flag is set (the previous comparison or operation produced zero), execution will jump to the empty subroutine, which returns immediately (e.g. a return in C); otherwise it will proceed to the next instruction in the code.
If you see a set of instructions in the code that looks like this:
movzx edx, current_ride_index
imul edx, 260h
movzx edx, rides[edx]
In C code this would be:
edx = RCT2_ADDRESS(RCT2_ADDRESS_RIDE_LIST, rct_ride)[current_ride_index];
This stores in edx the beginning of data from a ride instance. The ride instance data follows the layout described here.
In general, if you see something that is multiplied by 0x260 then it's quite likely that a ride is being referenced, as rides are 0x260 bytes each. Sprites are 0x100 bytes each, and instead of being multiplied by that size the index is normally left shifted by 8 (<<8). This makes it very easy to work out what a loop is iterating over.
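The arithmetic behind those two patterns is easy to check in the REPL. The sizes come from the paragraph above; the function names and the indices are hypothetical.

```python
RIDE_SIZE = 0x260     # bytes per ride entry
SPRITE_SIZE = 0x100   # bytes per sprite entry


def ride_offset(index):
    """Byte offset of a ride entry, as computed by imul edx, 260h."""
    return index * RIDE_SIZE


def sprite_offset(index):
    """Byte offset of a sprite entry, as computed by a left shift by 8."""
    return index << 8


print(hex(ride_offset(3)))                    # 0x720
print(hex(sprite_offset(5)))                  # 0x500
print(sprite_offset(5) == 5 * SPRITE_SIZE)    # True

# Going the other way: recover the index from a byte offset.
print(divmod(0x720, RIDE_SIZE))               # (3, 0)
```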
If you see an instruction that looks like this:
add ebx, offset sprites
(where sprites is a named address in IDA, like 0x123456). This means, roughly, add the register on the left to the value on the right, and store it in the register on the left. In this case, this would mean
ebx = ebx + RCT2_ADDRESS_SPRITE_LIST
where RCT2_ADDRESS_SPRITE_LIST is a value like 0x123456. In the binary, ebx could be any register, and offset can refer to any address in the code.
This will eventually end up like the following once we have the offset properly mapped to a C array:
ebx = RCT2_ADDRESS_SPRITE_LIST[ebx];
Use printf or, to print statements to the Visual Studio output after RCT2 begins, include the following header:
#include "windows.h"
Then use the OutputDebugString function.
OutputDebugString("Hello World!\n");
- Use the spacebar to shuffle between the graphical layout and the line-by-line instructions.
- Press semicolon to add a comment at the end of a line.
- Press x to show all read / write / offset / jump references to an address.
- Press n to rename an address.
- If you are trying to read from the binary, note every address in the code is 0x400000 higher than its physical address in the binary. So if you have an address at 0x900123 in the code and you want to read from it in an external program, start reading at 0x500123 instead.
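The address rule in the last tip can be written as a small helper for use in an external script. This sketch simply applies the 0x400000 rule as stated above; the function name is hypothetical.

```python
CODE_BASE = 0x400000   # difference between code addresses and file positions


def code_address_to_file_offset(address):
    """Translate an address seen in IDA to a position in the binary file."""
    if address < CODE_BASE:
        raise ValueError("address is below the code base: " + hex(address))
    return address - CODE_BASE


print(hex(code_address_to_file_offset(0x900123)))   # 0x500123
```

The result can then be passed to `seek()` on a file opened in binary mode before reading.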
This is tricky, and note that the addresses in the Tycoon Technical Depot are only valid for RCT1. I would try starting with the work that's already been done and trying to branch from there to find sections of the code that are useful for you.
You can also read through the code in the OpenRCT2 project, especially the addresses in src/addresses.h, which contains a very useful list of important addresses in the game. Most of the functions in the OpenRCT2 C code list the address of the corresponding subroutine in the docstring.
Another approach is to work backwards from the strings or windows that exist in the game to the subroutines that you want to change. That is, find a string like "Too high for supports!" and try to figure out where it is used in the game, by searching for the hex representation of its ID.