5.6 Non specific assembly discussion.
The following section covers various things that are often of great interest to many of those working in assembly hacking but that do not necessarily fit in other areas as well as a few concepts and techniques.
5.6.1 Language mod example
The game Advance Wars days of ruin/dark conflict had the Japanese language locked out despite it being present in the game. This would not be so bad but it never actually saw a released in Japan on the DS (it appeared on the 3ds in downloadable form in late 2013). However it was found that by holding part of the RAM with a cheat the Japanese language files would be used. Games can use an assortment of methods here but in this case it mirrored the firmware selection method somewhat and is worth analysing further. The following is based on this GBAtemp thread, uses a cheat from the GBAtemp cheat database and in many ways acts as more of a tutorial in how to port a cheat than anything to do with language. This firmware mirror concept is seen a lot in games and is useful for those games which pull from firmware but either you can not change (some emulators) or you do not want to change all the time, here though it actually unlocked hidden functionality.
22168F8C 000000?? is the cheat code where the ?? part is 00 through 05 to select the language you want. In this case corresponds to the firmware order (Japanese, English, French, German, Italian and Spanish).
The 2 at the start means do an 8 bit write and 2168F8C is probably better written as 02168F8C which is to say 168F8C hex in the main ram section.
Now the cheat probably works by constantly writing the value to that point in ram, something which might well be pointless after the game bootup.
Neither section is in the binary or an overlay for this game (it is just before the overlay lowest in ram) and setting the DS to Japanese does not net Japanese as the game language (it needing this cheat to get it). This means there is something odd happening somewhere for it.
In the disassembly the location is mentioned several times around 020E31E0, said location being the the first one to mention it. Running the game in iDeaS and setting a run to option it first appears shortly after the opening title/developer screen.
Here the ARM9 instruction as obtained from ndsdis
:020E31E0 E59F0004 ldr r0,\[r15, \#+0x4\] ;r15+0x4=*(020e31ec)=\35032972(0x02168f8c)
Crystaltile2 can do it just as well though
Stripped to the more relevant parts
ldr r0, 0x02168f8c
Now there are a couple of schools of thought here
Go backwards and consider what got things to this stage (there seems to be several branches to 020E31E0)
Go forwards and see what happens.
In this case forwards is probably more interesting and likely necessary as it might well set pointers for the text sections and that there had to be a cheat to select the language in the first place means it is probably better to go forwards in this case. It turned out as expected that the value was checked very early on so the game was reset and then allowed to run for a fraction of a second before being paused. 020E31E0 was selected (the go command will get you there) and run to cursor was used to get there.
From there the trace into command (f8) was used to advance instructions one at a time
In short without a cheat it sees 01 there and more or less jumps immediately to another area (it checks for French and Spanish as well) and with a cheat 00 appears before it goes off on a massive tangent ending with it loading the Japanese language version.
The obvious thing to do is change the load of that memory section (which holds 01 in the case of English) to return 00 instead of 01. As the value is known that mean ldr can be replaced with a mov or for the full instruction as arm-eabi will want it
mov r0, #0x00
This assembles as
0000A0E3 (it will account for endianness but many emulators and disassembly readouts will make it more human readable as indeed you can see in the iDeaS debugging shot
From ndsts earlier we know this is the ARM9 and that it loads from 4000 hex in the ROM and Crystaltile2 says it is uncompressed.
0E31E0 was the location in the binary which means 0E71E0 is the location in the ROM. If it was compressed or a lot of work had to be done then it would just be a matter of extracting the file (decompressing if necessary) before injecting it back in as is done with any other file.
There are of course multiple ways by which the game could have been changed to achieve the same effect and even though it was a single instruction change in the end there may have been better methods as far as game optimisation goes.
5.6.2 Non code in ASM
If you want a term to search for then this usually known as incbin (include in binary). The DS binaries and overlays, especially in games without many other files, can contain code that is not executable, including graphics, text, fonts, level data and more. Extracting this code is seldom a problem, indeed if it is compressed it is almost certainly compressed with the binary specific compression, compressed as part of the whole binary/overlay or something equally obvious. However the getting it back in part may well prove more challenging. There are various reasons for this, the most important of all being that the binary has probably been assembled down to the instruction level to handle the layout as it exists in the original ROM; even if it is something basic like text your pointers for each line are probably now memory addresses as opposed to some of the simpler methods the rest of the DS enjoys.
This is not limited to any one console either and although programmers are encouraged to keep their data and executable code separate as has been covered elsewhere and should be kept in mind at all times when hacking is programmers are people too and are no stranger to the quick and easy method when “it will do”.
5.6.3 Destructive vs non destructive assembly editing
In the simplest sense most assembly hacks are in some way destructive as they edit the original ROM but that is not a useful distinction so it is not often made, and if it is it is usually for when dealing with secured systems that check binaries during runtime.
In an ideal world you would just be able to add your function to the end of the binary and have it work after a branch but the world is not ideal and even if it was if you are replacing a function or editing just a handful of opcodes it might make less sense to do that so to this end we have destructive assembly editing where things get replaced from the original ROM. The crude method is to overwrite and accept the consequences which many have done to test things to no ill effect and even used in “production” hacks but beyond that DS binaries might contain all sorts of potentially viable space like those in wifi and other error messages; assuming you have the uncompressed version do a string search for ASCII or unicode data and you will quite often end up with a large number of bytes given over to the error messages which remember is loaded at all times and even if you do not want to repoint the text values wifi error messages rarely come up so it can be a safe place to branch to and inject some code or otherwise use for calculations/values.
If you are not up for that though there is the “proper” route of branching, binary extension and optimisation however on the DS once you add in binary overlays and other aspects of memory mapping this can get very tricky indeed as the memory on most DS games is aggressively managed both by design and because it has to be (4 megabytes give or take a bit for TCM and extras for video and such can do some amazing things but it is not enough to sustain a slightly sloppier style of coding that cares not about memory leaks and freeing unused information).
DEADBEEF padding and finding free memory A technique which is common enough outside DS hacking, the idea is you flood the RAM (or save file in some occasions) with a series of values unlikely to come up in everyday code and DEADBEEF is quite valid hexadecimal so it is often chosen (some older computers even initialised their RAM to it and other machines use similar plays on words for various things). After you have run the ROM for as long as you need anything with DEADBEEF in the RAM is possibly fair game to put something in.
It is probably not going to be that hard to make your own (you are making something do a simple task once early in the boot sequence) but DSATM 4.3.4 should have the option to add it in (it was dropped from later versions of the program which you can get from the same link).
This technique is more useful for code that needs to run once and run early as there are all sorts of edge conditions that could have to be accounted for but with the general lack of run time memory management outside of what has been programmed into the game expect things to cause trouble in the long run.
There is also a modified build of desmume that will attempt to track memory as it is used and will report which areas have not been used by given points.
Some games have been seen to initialise/flood fill the RAM and possibly expect certain things to be as such which can be broken if the RAM now reads DEADBEEF. Related problems have been seen when trimming GBA and DS ROMs down to the last 00 which is might well have expected as an end of file token or something similar.
No GBA or DS game has ever been observed to do it but reading “uninitialised” RAM is considered a reasonably high quality way of getting random numbers to use as a seed for some encryption or something, such a thing is technically possible though. With the hardware being fixed there are probably better methods using the sound hardware not to mention most games are unlikely to need security grade random numbers so a psuedorandom algorithm or even lookup table is quite suitable.
5.6.4 Polymorphic and dynamic code
There are various methods available to programmers to change code and use existing code to do things using information gathered at runtime and these can trouble assembly hacking.
Polymorphic Usually a term that comes up in the discussion of modern virus programming and signature analysis it is worth being mentioned here.
The idea is that if you can read and write the binary location in memory you can change how it runs at runtime but with a few exceptions this is rarely seen in commercial games for the likes of the GBA and DS; indeed the anti piracy protection for commercial games on the DS often revolves around checking the binary has not changed which prevents this method from being that useful (any potential speedboosts and memory savings usually being negated by the checks or having to work around the checks). On the flip side though some of the interpreted languages and emulators using techniques like just in time compilation and dynamic recompilation can be said to use this although that usually devolves into a semantic debate.
A basic example of this being used in a ROM hack could be something like hardpatching a cheat to the ROM and allowing a trainer to change it before you run the ROM by use of a menu. Say you have found the location in the memory of the lives and now found the instruction that removes them and as has been seen it is possible for cheat makers quite often patch the binary or overlays in memory with the payload being a changed instructions. True polymorphic code would do this multiple times at runtime and keep adapting the code most likely according to a set group of patterns/techniques but not necessarily25.
Some hackers have in the past swapped out portions of the actual game code, jumped to it and had it run before swapping out the changed portion back for the original code and jumping to wherever things need to be and others have compressed the code that comes after their code and expanded that back after their code is done running deleting their code in the process and allowing the game to run as normal. Both of these are quite an advanced techniques though and not terribly useful if you can otherwise find some RAM or some unused part of the binary to use instead..
Although the binary itself might be protected other parts of memory are not so “lucky” and things can happen there with one of the most common examples would be the palette used in 2d imagery; 2d animation was already covered but a method there comes in that the palette can be updated in real time to create the impression of changing colours. A great example of this is the rainbow blocks in Mr Driller 2 on the GBA and you can see the palette being changed in just about every version of VBA as long as you are in a level and if you prefer a DS example there is a nightmare sequence very early in the second game that does a similar thing to a background.
Heading into the future although it is a valuable technique some of the protection systems do a lot to prevent it from working with consoles like the 360 even gaining encrypted memory thus preventing simple manipulation (some of the hacks actually added an area of unencrypted memory to allow emulators that use related techniques to work) and home computers often now a memory protection technique known on Windows as ASLR (address space layout randomisation) that randomises the memory layout in an attempt to avoid having data at a given location; just quickly every windows program is compiled to run in a certain memory location but Windows itself hides and changes the commands into what they need to be which is how multiple programs work at once on Windows. On the other hand there are languages that get classed as reflective languages that do have such abilities as something of a core function but this again risks returning to the semantic debate.
Dynamic (calculated values) It could be said most pointers on the DS are a variation on this and although it is not strictly a type of polymorphic code it is related to it and a lot more common than by virtue of it being used in calculated pointers and similar code; it is quite common in C type languages but if you recall back to the cheat to assembly ports and the instructions that would add things to values later used as memory locations. Such things are also the reason it is not often advised to even try changing locations of things in RAM or in ROM for things like the GBA even if you are willing to undertake the near Herculean task of repointing everything. It is quite often seen in games programmed largely in assembly or plain C which is to say just about everything older than a PS1.
Here if you know the memory location of something the next thing might well be a known/fixed distance away so rather than fetching a new pointer from memory or worse the cart and incurring a speed penalty you could just add this fixed distance to a value already in memory and then direct it to a new location with this new value. Related to this is the idea of dodging having to call a function and spend several cycles setting up for it (even if it is as simple function everything will have to be pushed, new values written, the function jumped to and done, values returned and everything popped back before returning (although the last few things can be swapped around a bit).
Cheat makers quite often face a similar problem with pointer codes where a game will store a value in memory but instead of just having a plain value somewhere written directly it instead has a pointer to it which can change and the game calculates where to write the value at a given point based on this pointer which may or may not be static throughout a given run of the game. If it is not static this prevents the basic alter and check cheat finding method as well as always write unless the pointer is accounted for and such things have been seen in areas not generally covered exclusively by cheats as well.
5.6.5 Slowdown and speedup
Occasionally it is useful to slow a game down or speed it up and there are several ways to do this but if you are on an emulator just use the inbuilt features as they will be far more stable and easy to do than these hacks.
Slowdown Slowdown can be caused in one of several ways
The way most DS flash carts and programs like DSATM use aims to interrupt and then flood the CPU with useless instructions (maybe a selectable amount of them) that mean it can not do everything it normally would and will slow down the game as a result (and might well crash a game not coded to take it).
CPU halt commands can also be used although this is often even harder.
Methods similar to those some of those seen in speeding up but made to slow things instead.
Speeding up Typically this is seen where games have elaborate movement sequences (or just movement sequences) which might not sit well with someone wanting a quick fire game with pokemon and similar titles, RPGs and “tactics” games being common targets for such hacks. Simple emulator style speed options are not that doable in hardware, though you can try. You typically employ one of three methods
Animation based tweaks. If a game will wait for an animation to happen you force it towards the end state faster – if it waits for an animation to happen over ten frames you have it happen over one, either by changing the frame count or by changing it so it reaches the end state faster (the animals came on two by two but if instead it was four by four it would be faster). Works for 2d and 3d animations. Hopefully said animations are not a loading or calculation mask. You need not stop there; many years ago I played pokemon blue, it had a little experience sharing device which I took it off in the end because the thing made me press A a bunch of times at the end of battles. Should you eliminate that then you gain some more speed/less annoyance.
- Crank it to 11. If a game has a setting for text speed and such then it might have an option to go to even faster. For instance if a game has a text speed variable it might use said variable to speed things along and might not be a simple flag. Say then if the value is 90 but you have the full 8 bits to play with you might gain some more speed. You could also figure out how it changes the speed and change that instead.
Vblank loops. Probably the easiest to attack but gains the most fallout. Screen updates tend to be tied to vblanks and that also accounts for various pieces of game logic that might happen. You disable some of the vblank loops and have them happen all the time (or some more frequent event) instead and you probably get some garbage on the screen and some speed increases too. Traditionally this was all tied to the clock speed of a device which in turn went for the screen it went on, hence bad PAL conversions, different clock speeds between regions and even old dos games being crazy fast on modern machines. Newer devices often have a separate timer, hence PC games working just as fine on an old P4 as they do on an overclocked i7, assuming the P4 can output the required number of frames for the screen, or indeed the same on the PSP whether it was underclocked or clocked faster.
Full blown hack. If you don’t have anything nice to abuse like in 1) and 1a) you fully analyse the game code and make it happen, first step there though is to see if you can force something like 1) or 1a) to happen if there is a bounds check or upper limit put in place for whatever reason. Actually optimising games is not really going to be covered but people do it and it does increase speed, sometimes increase stability and increase battery life. As time goes on and higher level languages get used there is an increasingly large amount either actually redundant or potentially redundant code as well.
Some have attempted to disable vblank/vsync or prevent it from waiting to do things there which also speeds things up but often at the cost of corrupt graphics.
5.6.6 Cryptography (encryption, checksums and signatures)
Although you were already warned this document would probably ruin your ability to enjoy films, TV shows and such that try to portray computers this section carries another warning as it is more dangerous than almost any of the others for that.
For longer than there has been computers or even the maths to describe it there have been people which have wanted to verify information and/or have it only able to be read by those it is intended for. Modern consoles which have designs on being something of a single purpose device with no user interaction beyond the parameters of the game and any menu the console has then have made extensive use of it. Although with the exception of save editing which is covered in the next section this is usually taken care of by the console itself (and later on hopefully any tools you used to pull apart and rebuild the filesystem, the hacks to or emulators of the system).
Although in practice cryptography includes aspects of security as a whole and is also concerned with time taken to crack them as well as the proper/effective implementations of cryptographic systems it will have to be considered somewhat outside the scope of this document. If you do want to read further though two books by Bruce Schneier known as Applied Cryptography and a more recent book Liars and Outliers are starting points and reference materials for most that use the field.
Back on topic there are three main areas of interest.
- Checksums/Hashes - small numbers that are made using mathematical functions to describe the contents of a message/file.
- Encryption - using mathematical functions to prevent those without the relevant information (keys) from reading the message/file
- Signatures - using mathematical functions to generate a checksum but using a key so only the person possessing the key can generate the signature.
You have just seen some of the terms/jargon but more should be described before going on
Message - the term used to describe the data being checksummed, encrypted or signed
Plaintext - a synonym for the message
Ciphertext - a synonym for the encrypted message
Hash - a synonym for checksum
Plain text - the original unencrypted message
Key - simply a number used to do/as part of the encryption or signing process
Symmetric key - using the same key to encode and decode
Asymmetric key - using two or more keys to allow some groups to read and others to encrypt but one in possession of just one key should not be able to do both.
Private key - depending upon what you are doing the key you keep locked down
Public key - depending upon what you are doing the key you give to anybody that wants it
Collision - when two different messages produce a file with the same signature/checksum
Rainbow table - a list of all the hashes for every single message up to a certain length
Brute force - trying every possible key or every change to a file to get it to match the original. For a modern well implemented system this is not considered viable
Hole - the term for when an encryption or signing method has a flaw that allows people to bypass it (usually in a time shorter than brute force)
Security by obscurity - the process of trying to ensure security by using a custom encryption/hashing/signing process which is a method that frequently fails spectacularly
Checksums/Hashes There are two simple examples of this that will be covered in an attempt to explain this.
Parity - in short if you took the file as a whole or a collection of parts is each of those odd or even.
Bytesums - if you added up the contents of each byte a number would be produced that would vary if one of the bytes that made it up was changed.
Here are the requirements for checksums/hashes used for security (if you hash for searching purposes some of these change in a fairly obvious way), sometimes these are combined into more broad requirements but the ideas remain the same
- Each message must produce a unique hash, aka messages must not produce the same hash
- Each hash must be equally likely, aka each bit must have a 50/50 chance of being 1
- Each hash must be unpredictable, aka each minor change to a message must produce a big change in the resulting hash
- Each hash function must make the same hash for the same message (no randomness in the output)
Salting This is the act of introducing another piece of data to the data to be hashed. In the better cases something unique to that piece of data but in lesser cases something common to all the hashes to be made for a system. It is mainly used where password databases are kept; storing raw passwords is risky as they might be exposed in a leak and if you are given a password then a hash of said password can also be used to verify it (remember each message should produce a unique but unpredictable hash). It came about as rainbow tables, tables which list hashes for every combination of values (occasionally just dictionary words or a given character set) up to a certain length, became viable for even general use in the early 2000s as a result of increasing disc space. By changing the data being hashed to something else you stop basic rainbow tables from being useful, however as they are easily generated it is then desirable for a unique/different salt for each hash of a password.
Encryption As mentioned this is the process by which data or, to use the standard parlance, the message is made so that it can not be read by someone without the required key values. As far as the GBA and DS are concerned encryption is not in common use but other consoles have been seen to use it and on the PC games and other programs use it extensively so it is worth knowing about
There are two main classes of encryption known as
- Public/Asymmetric key - in the basic form one key is used to encrypt and another which is mathematically related is used to decrypt.
- Private/Symmetric key - in the basic form the same key is used to encode and decode.
There are ways to blur the lines and the terms do not always match up precisely and it is also possible to combine the two; commonly your bank will use public key to allow you to send a key to them to then do private key for the rest of the transaction and you can nest encryptions if you want. Private/Symmetric key is in many ways inferior but as it is less computationally expensive to do and when done right it works well it has stuck around since the invention of asymmetric encryption.
Symmetric XOR
Going back to the binary XOR example
Message
0110 1100
Key
1110 0001
XORing the two
1000 1101
Asymmetric The basis of asymmetric cryptography is the idea of one way/trapdoor functions.
Typically the functions employed for encryption are that it is hard to factor numbers (with cryptography techniques being available for prime numbers as a result) and a related problem with ellipses (leading to elliptical encryption).
Examples of implementations are usually given as part of a programming course and are quite lengthy and will do little here so it will not be covered beyond the most basic
Prime numbers
181733 - what are the factors?
The answer is 691 multiplied 263 but other than being told or trying every possible prime number (which remember there is no known pattern in) you can not do it and there is also the chance the number is not a prime number but to basic checks looks like one (only by checking every viable number can you tell but there are weak tests called primality tests that can indicate a prime number).
RSA is a popular prime number based encryption method which you can read about Berkley’s website
Basic attacks Some of the methods you will learned to find text have their roots in attacks on encryption, and indeed technically speaking text encodings are a type of substitution cipher which is a technique which has been in use for thousands of years. Being so well known by themselves these attacks are not typically of great use but as many methods derive from basic principles they are worth knowing of.
Known plaintext The simplest form of this is for XOR and was seen on the 3ds save encryption. Here the last bytes of the save chip which are typically larger than the save they contain was padded with 00 and XORing your password with 00 leaves the password as plain text.. Even if it was not 00 just knowing the original text (or enough of it) allows you to get the XOR key by simply doing the XOR.
Chosen plaintext Again if you go back to XOR if you can choose a file of all 00 (this also works on many more advanced algorithms at various levels) and get the program to encrypt it you will have valuable information about the key if not the key itself. If you use an action replay cheat to change a value in memory and then let the game make the save thus not having to work out how a game hashes the save you are doing a variation on this.
Key recovery This is more of an assembly level thing but most of the time developers will not manage the key well and leave it in memory or leave it in an obvious place at a given point in time thus allowing you to extract it. Equally if you are dealing with a PC or an emulator then much of encryption is merely designed to frustrate those basic skills or those without much patience so you can watch what happens and figure things out here if you want.
Oracle abuse The idea of an oracle is actually a specific thing within cryptography but in loose terms it can be summarised as a black box of unknown mechanism that assists in cryptography related problems and if you have/gain unfettered access to the oracle you can do some interesting things. The PSP saw a related attack with the game decryption program that allowed decrypting of game files for later firmwares (the PSP allowed for new game encryption keys to be introduced), several other late stage hacks in consoles use it as a core mechanic rather than hack things more extensively (although in some cases this may be viewed as exploiting a different bug), compression workarounds have been made where rather than figure out the compression form then if you just wait for it to appear in uncompressed form… and if you ever inject content into the ram and have the save created (and hashed) using the content you just injected that would be another example, or alternatively if you disabled the hash check on a save knowing that it would eventually redo the hash properly for you that would be another.
Example of known and chosen plaintext For an example of why using a long run of 0 with XOR is a bad idea if you want security and a further example of known plaintext
Message of all 0
0000 0000
Key
0110 1110
XOR
0110 1110
Message longer than key but padded with 0
1101 1111 0000 0000
Key
0110 1110
XOR
1011 0001 0110 1110
Known/chosen plaintext
So you have not padded with 0 but you have padded with a repeated value of 7h (0111)
Message
0111 1111 1101 0111 0111 0111
Key
0101
XOR
0010 0000 0000 0010 0010 0010
Unknown message (but with a known or suspected portion)
XXXX XXXX XXXX 0111 0111 0111
Reversing the XOR
0111 gives 0010
This means
To get 0 from 0 XOR has to be 0
To get 0 from 1 XOR has to be 1
To get 1 from 1 XOR has to be 0
To get 0 from 1 XOR has to be 1
0101 is the key.
Signatures In brief a signature is just like a hash in that it produces a number that represents the data signed but it uses cryptography (typically asymmetric but not always) to mean only someone with the keys can create and/or check the signature. Any encryption algorithm can be turned into a signature although in many cases you will want to read up and make sure it is a wise idea lest you end up like the original xbox (PDF version of the article).
Many times though making a signature for a large file is less than ideal for various reasons, the big two being speed and the possibility of key leakage, so the file will instead be hashed and the list of hashes (which remember should change if the file does) are then treated as a message which can be signed; changing the file changes the resulting hash which makes the signature, which is in essence a hash itself, incorrect and thus things can be treated appropriately.
A popular implementation of the signed list of hashes/checksums goes by the name HMAC.
Checks and workarounds Hashes and signatures are great but if you fail to verify them properly they are useless. Several consoles have failed to verify them properly over the years, most notably in the Wii which used a text compare function and thus could be made to only needed the first byte to match (a trivial thing to “brute force”) and early PSP which checked one folder and launched another if you put a $ in front of the name. A related failure can be seen in the xbox 360 DVD attacks (the basis for flashed DVD drives on the 360) where the disc is verified solely by the drive firmware, said drive firmware can be overwritten (no bad thing per se as it eases manufacturing) but the original console is also unable to check it properly.
Related to this are three concepts which are often used and even used in consoles, although not so much for anything ROM hacking is usually concerned with.
- Blacklist - if something is detected as being on a blacklist it will be prevented from being run or something similar.
- Whitelist - unless something is on a whitelist it will be prevented from running, or at the very least heavily restricted depending upon your setup.
- Heuristics/greylist - Here certain functionality is probed for and if it is found warnings or limits will be put in place. Greylist is more commonly used when discussing email where it refers to the technique of sending a reply to the originating email asking did you really send this.
Consoles will often blacklist earlier versions of the firmware to stop things being downgraded and the Wii saw a couple of devices/discs from Datel aimed at the Gamecube blacklisted as they allowed various things to be done to allow gamecube homebrew on the Wii.
Whitelists have been seen in consoles when signatures get broken (like the PS3) or for legacy support where there is not the option to upgrade old code (remember the term is called ROM for a reason) but the desire to run the old code is still there. The DSi and 3DS have seen this where all the old DS games were put on a whitelist but not any unofficial code and all newer DS games carried additional signatures26
5.6.7 Multiplayer and the failure of Nintendo’s online DS security.
Networking and all that goes into a modern multiplayer game is a bit outside the scope of this. However the demise of the DS (and Wii) online game services provided an interesting example of something that is technically editing a binary but in practice involved very little assembly.
As the DS at least pretended to be a partway modern computing device and sported TCP-IP it also tried its hand at security when doing online things. Part of this was a secure handshake that it did with Nintendo’s servers. Theoretically there were ways by which this could have been subverted, however in practice it was observed that by changing the web addresses hardcoded in binaries/overlays found in DS games from https:// to http:// (padding the end of the address back out to make it fit) it would happily broadcast in plaintext instead. Nintendo could have also done some server side checks for it but they had not.
With it, and in the short period before the shutdown came, the save Nintendo wifi project was able to observe the protocol and reverse engineer it far enough that many games are still playable on custom servers. Most games were not as extensive as something like World of Warcraft, which has full databases and servers doing lots of computation beyond that, but if a hacker is able to replicate the functionality of your hidden servers then it is something of a failure in network security.
Earlier failures in online security usually came from people being able to run their own code on carts, and modified versions of existing games, and with it cheat to the point that games were broken. Traditional anti cheat measures might have helped but in general security you should always assume those accessing your services are compromised/hostile and do what you can within your own servers.
5.6.8 Save editing
Although games could just take a snapshot of their memory at the time and restore it later (indeed this is all save states are) most of the time the key values that make up the game at that point (characters, names, levels, experience, states, inventory, location, point in the story….) are stored in a writeable section of memory somewhere using some custom format. Either as a type of cheating or for hacking purposes (both ROM hacking and general hacking27 ) it can be useful to edit these files.
Figuring out how this format works usually involves a combination of a method like searching for cheats (change one small thing and see what changed in the file format), actual cheats (makes changing the resulting values much easier), simple analysis (it is certainly not sure to gain you anything but there is nothing bad about looking at the file in a hex editor and seeing if you can figure something out based upon what you expect should be in there), level/stat editing (corruption and logical analysis) and assembly (seeing how it stores and restores the data from the save file into RAM and relating that to what you know). Usually the formats are quite logical and this can work to your advantage for much like most relative text systems simply bumping the value of an item by 1 will often select the next in the list and allow you to fill out the entire list quite quickly possibly including some items, locations and such the developers did not intend to have available in general use.
On top of this games will frequently do some checks on the file both in the form of a basic checksum or maybe a signature of some sort (rarely and quite often something relatively custom/nonstandard in the case of the GBA and DS) and in a manner similar to complex cheats with values mirrored to other locations in various ways, pointers used to change locations, values manipulated before storage and such. Finding the checksum section will usually come as part of the initial editing (remember checksums should change for even the smallest change in the data they cover) although finding out the method can be troublesome so alternatively you can hack the game to ignore the checksum as it tends only to be checked once right at the start of the loading procedure or some similar opportune point.
More modern systems and especially those with online components (Microsoft is well known to ban people from Xbox live and revoke their save making keys should they manipulate their save games and the Wii has seen similar measures in some cases) and in some cases even things like pokemon will have detection based not on cryptography type methods but on impossibility; a value might well be 16 bits but if the game under the best possible circumstances can only see a value of a stat reach the low thousands (as opposed to the 65535 that a 16 bit value affords) and someone has exceeded that it can be detected.
It should be noted some consoles like the Wii and 360 have developers store files in a universal container format for that console and then allow developers to stick their custom format in that.
Saves in different regions Saves are largely a custom format and although most localisation teams do not really touch much in the source code as far as games running goes it is quite nice to have saves work between different regions. In many cases you can use the saves directly with any region or version but occasionally there can be problems with the most common although not so troubling problem being where a Japanese game might have used a 16 bit encoding and the European or North American localisation will more likely use an 8 bit one or simply not include all the glyphs from the original encoding making the names of characters display oddly.
More troubling is that game saves are usually a fairly specific affair with locations hard coded so if a variable is changed in length between versions every subsequent variable will be out by a given amount and things will go sideways and if multiple variables are changed in length it can get even worse; here is might even be better to build a full save editor or collection of cheats instead of “fixing” the save.
5.6.9 Interpreted languages
Once exclusively the domain of homebrew a few commercial games have been seen using interpreted programming languages of which the most notable was the puzzle quest series (in this case a version of lua) and other times games have used scripting languages after a fashion for their text and general operations (recall the wii scripting example and wizard of oz game mentioned back in text editing) to say nothing of the DS sound format SSEQ being a limited type of scripting language. Trying to interpret these (that is to say the actual game code as opposed to the interpreter itself) as ASM is not going to happen easily and much like using a hex editor for everything it is not a terribly good idea but more importantly many of these interpreted languages are built in a way that allows for decompilation and/or simple editing using the language itself. Equally few games are programmed in ASM any more and most will not have any or at worst only a tiny fraction of ASM (it is a technique called inline assembly) so being able to recognise a given or standard C function in assembly can make reverse engineering that much easier.
5.6.10 Game AI, game logic and game theory
The mathematical discipline known as game theory is a fascinating subject and one well worth reading up on (some basic coverage will be here but it goes far more in depth) as it informs quite a bit of ROM hacking owing to it speaking to the nature of games. As one of the favourite things for people to do in ROM hacking is to twist the original concept into something quite different, or indeed remove the ability to exploit certain techniques that might be seen as game breaking, then it can pay to know the terms used to describe what has been done (“in doing this I turn the game back into a perfect information game”).
Game AI on the other hand follows on from stats and level editing but in general the idea behind game AI is to provide an opponent to a human player that, in the confines of the game at least, is relatively challenging despite no computer being able to think at the level (and more importantly in a similar manner) to humans. Games in general exist almost solely in the realms of using a limited set of rules/confines to attempt to produce a compelling experience for the player and this applies even more so for computer based games; think how much of ROM hacking and cheat making aim to change how the rules work. Layered onto this is various amounts of psychology but even small facets of this are the subject of far longer documents than this one, to this end and to keep it more applicable to ROM hacking the discussion will be limited.
Quite often would be hackers are wanting to change the stats of a collectable card game and quite often doing as such renders the in game AI almost useless. To understand why it is useful to analyse the principles of a card game. Going for one of the simplest card games known as Top Trumps where cards featuring various related items (characters, vehicles, animals, locations and much more) would be pitted against each other based on a certain stat and win accordingly (although here the standard rules will be ejected so as to create some room for examples) and comparing it loosely (a kind of proto game version for the most part) to some of the card game types popularised by the likes of Wizards of the Coast. Speaking of Wizards of the Coast then if the following section and the related ones interest you then they have a nice writeup of their AI for the Xbox Live Arcade version of their Magic the Gathering card game, you can read it here.
Now the cost for playing is zero (no “mana” or monster sacrifice)
There is no option to negate a play (no “interrupts” or traps)
There is no persistence of a card (once played a trump is removed from the game vs the potential for a persistent threat.)
For the sake of this example alternating turns will happen (no skipped moves and no winner stays on)
A fairly simple game but one some thought needs to go into to make a good AI.
Questions that need answering is does the AI know all the cards in the deck and calculate from there or use a probability estimation? How does the game draw new cards (7 and new set when they are over or 20 and 20 for each to be available to play from the start?), are the cards an equal spread (being based on real life items it is not likely but it could be made to be so), how is the game played round by round (put down a card and have to beat it or refuse and lose a random card)? Is just one stat mentioned for play or are all the stats up for the chance to lose?
The win condition of a given round is probably fairly obvious but the good player will not waste a card much higher than the other and also will not want to waste a card if the other stats are good. In practice when designing a game such things might not be considered in favour of a more human approach from the programmer or indeed something very basic; if it looks like enemies in a bad FPS title just run at you and shoot it is because they probably do just that.
Comparing this to the proto card game there are far more possibilities of play (the mathematical term being known as the problem space) than even the largest supercomputer could account for28 , however there are things that can be done to approximate a good game which will boil down to two main categories.
- Game approximation
- Game restriction
Approximation says things like “most of the time casting cost is X and the necessity to do things varies with game time according to a formula, therefore try to allow for this”.
Game restriction relies upon restricting the potential amount of options available to reduce it to a more calculable level. Related to approximation a developer may also provide the game AI with a reasonably loose advantage over the player either in the actual AI or design of the opponent, such things usually being taken care of/fine tuned in playtesting. Restrictions include limited customisation of a deck and limited amounts of cards versus the real world, quite often limiting the options for certain play techniques in the process. Related to both the AI can be given a measure of superhuman ability (for instance it might know the cards you have or at least some data on them) which would break the game if a human had the ability but when properly tuned it can make the AI work.
All of this provides the option to somewhat intuitively design an AI that does the job. However with a bit of analysis using the ideas presented by game theory and mechanism design you can refine your AI and play mechanics quite considerably; techniques for certain games, most notably the likes of backgammon and draughts/checkers, have almost served to break those games as well and create AI most humans have little chance of beating.
Game theory It has been mentioned and partially covered several times now but some more depth is warranted. The basic idea is what if complex actions and strategies could be formulated, understood and predicted using fairly simple maths (it does get complex later) as a basis for it? It turns out it really can be represented as some fairly basic mathematical equations, however before that there needs to be a short discussion of some ideas and terms, many of them describe somewhat simplified concepts and what levels they apply at can vary with how you analyse things and where you start analysing things
- Perfect information - everything will be known about what has gone before
- Imperfect information - not everything that has gone before will be known about. Card games that you can fold in without having to show cards are an example, as are games that use fog of war.
- Prisoner’s dilemma - a situation where two potential opponents have the chance to gain at the cost of the other or cooperate to both win but not win as big. Important whether it is continuous with chances for more rounds or distinct. Related to this is the game of chicken which has a worst case scenario if both players attempting to win will cause the worst outcome.
- Nash equilibrium - combinations of strategies on the parts of multiple players can result in a given payoff but any one player changing strategy will cause a net loss in the ability to accrue points to both or that one player.
- Complete information - different to the perfect and imperfect above (it is possible to have a complete imperfect game and an incomplete perfect information game) it refers to the idea that the goals and strategies are able to be known by all players. Risk when hidden goals are in play is incomplete but Risk when world domination is the goal is complete (albeit imperfect if you account for the hidden cards).
- Metagame - a game within a larger game. RPG games where the objective is to move from one town to the next to continue the story will often have battles which use entirely different mechanics to town part of the game. Sticking with RPGs they have long had minigames using certain actions which are radically different to normal play and maybe even at odds with the previously established game mechanics that have to be performed to progress.
- Utility - the “points” a player seeks in a game. It is quite possible for two players to seek different utility in the same game, whether by choice (arguably this is what “griefing” is) or by design, and this follows on from complete information with regards to goals. Again mission based Risk is a good example of this.
- Symmetric - strategies available to each player are the same. Advance Wars without CO bonuses (on an equal map) is symmetric but with CO bonuses it affords a whole different set of strategies.
- Asymmetric - where different strategies are available to different players.
- Strategy - a possible move (one of many)
- Zero sum game - if a person wins the other will lose the equal amount.
- Non zero sum game - The situation where winning or losing does not mean players will win and lose equally (competitions with places being a good example). Monopoly where there is an infinite bank is not zero sum which also brings in the concept of players having resources that are infinite, finite to the player or finite to the game.
- Cooperative - a game enforces agreements between players.
- Simultaneous - players act at the same time
- Sequential - players act at taking turns and possibly have the chance to analyse and react to earlier moves.
- Stateful - previous plays of a game have an effect on the later ones. A standard RPG is stateful in many regards by virtue of experience/levelling.
- Stateless - previous games have no effect on the later ones. A standard fighting game for example is stateless.
Breaking a game or sections thereof using this concept can be used to create a good AI or good game mechanics and be used to compel certain types of play which leads into mechanism design.
Mechanism design By combining terms of game theory and creating rules you can aim to create certain play styles or indeed AI to conform to certain play styles. In Tetris for instance you have the option to do an infinite spin but when you are playing against an AI/player then you are allowing them effectively free reign for several “turns” which in turn will mean you probably end up with a screen of garbage, equally the game lessens garbage if you make lines but you might want to take some and thus allow yourself the potential to give more lines of garbage to the other side. This further feeds into AI as you can build a series of strategies for a game and turn one or more of them off to make the game easier for the human players, and going back to Tetris, in this case have a look at Tetris DS, then if you play the AI on easier modes it will stop doing things like hard drops or even soft drops, it might stop holding pieces to make for a tetris at a slightly later time and similar things.
Further to this you have things like scoring systems and experience rates which also bring on aspects of psychology in greater and greater amounts, however as a ROM hacker you might be interested in lessening or furthering the game engine. For instance in an RPG you might not wish to have to “grind” to gain experience for a couple of hours to further the story but the game more or less demands it (a trade off developers of RPGs almost invariably have to make) and you can bypass this. Equally minigames have been extracted from larger games on several occasions and in others gameplay was radically altered, albeit within the same engine, to do something like a boss rush mode in a game like Castlevania or Metroid (both titles noted for their exploration elements) or even a rush/“boss”alley in a game like pokemon via a premade save and level hack which negates the entire explore, capture and train (possibly/normally to overpowered) element the games are based on.
The main problem then of the mechanism designer is that game theory largely assumes that humans are both rational and playing for self interest, although things have been made to compensate for the lack of this.
Another for the list of things considered outside the scope of this document would be evolutionary algorithms and evolutionary programming where groups of algorithms are collected and tweaked slightly in a random manner before being tested and the best of those being selected and randomly tweaked again and again and again for thousands of rounds in a process that often creates far faster algorithms than humans can make. This area is a hot topic for research right now and has widespread implications for all fields ranging from processor design where it is already being used to and even game design (we already have procedural generation and a handful of things using this in a token manner but this is the next step). A nice video on the subject can be seen in the 28c3 presentation entitled “Automatic Algorithm Invention with a GPU”.↩︎
Although these signatures were not checked entirely properly with the overlays (recall they are sections of code that can be dropped in place well after boot time to provide additional functions). Hackmii has a nice writeup of the events here.↩︎
More than a few hacks for a great many consoles have used errors in save handling and the handling of other ostensibly user editable content to launch and/or the saves themselves to contain the payload of the hack.↩︎
A secondary term that arises when discussing this and the further subject of mechanism design is emergent gameplay where a few simple concepts combine to create a good game that does not need several hundred rules. A large problem space is relatively easy to create but emergent gameplay is closer to an art type subject where the aim is to create a good game than it is to game theory which is usually more focused on mathematics and modelling a situation. For a slightly less theoretical example then consider that the Metroid and Castlevania engines by themselves do not dictate that backtracking for various items/perks is necessary but the level designers add in such ideas.↩︎