3.5 Scripting and layout

The concept has been mentioned a few times at this point, but it is still not seen that much on the handhelds and earlier consoles. As some notable examples exist, though, it is going to be covered here. The idea is that, beyond the basic text display, the rest of the game revolves around a complex engine/interpreter.

The main example of choice here is Riz-Zoawd/The Wizard of Oz - Beyond the Yellow Brick Road on the DS.

The game actually uses all three main methods of handling text and, for good measure, some XML as well (see op.dat).

  1. Fixed length sections
  2. Conventional pointer techniques
  3. Scripting

The first two have been covered, but the scripting engine is worth seeing and can be found in event.dat in the data directory.

The initial entries are the setup for the maps and scenes but make a nice example.

PIC

PIC

Initial analysis of the scripting engine mimicked the markup reverse engineering techniques, which is to say a combination of static analysis and testing things to see what happens. Also of possible interest are the opening few lines, which appear to be setting up a debug scene. As mentioned elsewhere, the things developers leave in for debugging are often simple examples of the engine/concept in question and can demonstrate concepts that might otherwise have to be extrapolated from the complex ones in the game itself.

Zombie daisuki Of course the scripting seen is not always as complex as the example above, and a nice example can be seen in the game Zombie Daisuki, which has scripting as seen in the following picture. If you want to look, the data directory of the game contains some files with the extension .ini which are Shift JIS compatible. Note the variable names, including spelling mistakes (a hacker who accidentally corrects them can cause lots of issues), as well as the markup and the lack of pointers.

PIC

Lua The programming language Lua was seen a handful of times on the DS. It was, however, converted to a bytecode-esque arrangement rather than the plaintext it is usually left as on the PC.

El Tigre - Make My Mule The game features a nice archive format worth exploring a bit, as it showcases a lot of things seen in archive formats.

PIC

With the first part of the file deleted and the window width set very wide, the entries line up. The first part of each entry, which was the names, was chopped off for this shot (do not assume names come first, as they quite often follow the rest of the information covering the file in question), but the long name has been highlighted to give an idea of what is going on.

PIC

There appear to be a bunch of plain ASCII names (underscore allowed) in alphabetical order within each extension, although the extensions themselves are not in alphabetical order. Looking later in the file suggests that upper or lower case does not matter for the names (on some systems it does). Remember that the fewer changes made in a thing like this the better, so that order probably wants to be maintained when reassembling the archive.

The numbers counting up in the 8 bits following the name section (actually flipped 16 bits, as you will see in a moment) might well be file numbers (ordinals might also be an acceptable term). These are useful to the system, as referring to things by name is quite troublesome whereas a file number can be used directly in maths.

0100 will want to be returned to later.

Three sets of three numbers. If possible it would be nice to leave them as they are in the original, but that makes simply looking at them more troublesome than it has to be, so one 32 bit byte flip later:

PIC

The 0100 became 0001 in what is now the upper 16 of the 32 bits.
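The flip itself is easy to reproduce outside a hex editor. The following is a minimal sketch, assuming the data length is a multiple of 4; the function name and the sample file number 0005 are made up for illustration.

```python
def flip32(data: bytes) -> bytes:
    """Reverse the byte order inside every 4-byte group of data."""
    if len(data) % 4:
        raise ValueError("length must be a multiple of 4 bytes")
    return b"".join(data[i:i + 4][::-1] for i in range(0, len(data), 4))


# Hypothetical raw bytes: a flipped 16 bit file number (05 00)
# followed by the 01 00 seen in the dump.
raw = bytes.fromhex("05000100")
print(flip32(raw).hex())  # 00010005
```

After the flip the 0001 sits in the upper 16 bits and the file number in the lower 16, matching the screenshot. Flipping twice gives back the original bytes.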

Three numbers then. The first seems to be larger than the second, and the second plus the third gives the third number of the next entry in the sequence.

Size and location then. For the time being no padding between files is assumed. There is frequently padding to make sure things line up with 32 bits or even more, but seeing files start on odd values makes it fairly likely that there is no padding here.

Scrolling down a bit further

PIC

The swav files were checked (they have a fairly distinctive start of file, aka a magic stamp) and they were indeed the swav audio format. As audio is not usually compressed beyond the encoding it was made with, that accounts for the rest.

The 0001 now in the upper 16 bits is indeed a compressed flag (a few files were extracted and compression was then attempted on them without any real success, which is a quick and easy check).

In the swav examples the first and second of the three numbers are the same (and the pattern for the third holds).

The first of the three values is the uncompressed size, the second the compressed size and the third the location.
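The "stored size plus location gives the next location" pattern is worth checking across the whole table rather than by eye. A minimal sketch, assuming the three values have already been read into tuples; the function name and the sample numbers are invented for illustration and are not taken from the game.

```python
def find_layout_breaks(entries):
    """entries: list of (uncompressed_size, stored_size, location) tuples.

    Returns the indices of entries whose location + stored_size does not
    equal the location of the following entry (padding or a wrong guess).
    """
    breaks = []
    for i in range(len(entries) - 1):
        _, stored, location = entries[i]
        if location + stored != entries[i + 1][2]:
            breaks.append(i)
    return breaks


# Hypothetical values; note the odd locations, as seen in the archive.
table = [
    (0x40, 0x21, 0x100),
    (0x10, 0x10, 0x121),  # uncompressed: both sizes match
    (0x80, 0x33, 0x131),
]
print(find_layout_breaks(table))  # []
```

An empty list supports the no padding assumption; any index returned is a place to look at more closely.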

pkg archive data table

Defined as follows

32 bytes for the name presumably ending with the first 00 in the name and padded out from there (the last 4 bytes might be necessary though)

2 bytes for the file number (flipped and counting from 0)

2 bytes for a compressed flag (flipped, 1= compressed, 0 = uncompressed)

4 bytes for uncompressed size (flipped)

4 bytes for compressed size (flipped)

4 bytes for file location (flipped, standard pointers (not relative) and starting from start of main file (not offset))

4 bytes padding (00 filled)
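The table entry above maps directly onto a fixed 52 byte record. The following is a sketch of reading one with Python's struct module, assuming "flipped" means little endian and that names are plain ASCII; the field names, the sample file number and the sample location are illustrative choices rather than anything found in the game.

```python
import struct

# 32 byte name, 16 bit file number, 16 bit compressed flag,
# three 32 bit values, 4 bytes of padding: 52 bytes in total.
ENTRY = struct.Struct("<32sHHIII4x")


def parse_entry(raw: bytes) -> dict:
    """Decode one 52 byte table entry into a dictionary."""
    name, number, flag, usize, csize, location = ENTRY.unpack(raw)
    return {
        "name": name.split(b"\x00", 1)[0].decode("ascii"),
        "number": number,
        "compressed": flag == 1,
        "uncompressed_size": usize,
        "compressed_size": csize,
        "location": location,
    }


# Sizes for tmpCopy.txt are from the text; the file number (5) and the
# location (0x9C64) are made up for this example.
sample = ENTRY.pack(b"tmpCopy.txt", 5, 1, 0x2DAC, 0xC86, 0x9C64)
entry = parse_entry(sample)
print(ENTRY.size, entry["name"], hex(entry["compressed_size"]))
```

Keeping the format string in one place means the same Struct can be used with pack when the archive is rebuilt.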

The header format is not finished, as there is still the part that was deleted at first to make everything line up nicely (flipped here to make things easier).

PIC

pkg is clearly the magic stamp for this format.

301… the last file number is 300 hex and starting with 0000 for numbers means that is likely the file count.

008C 9446 is the length of the file. File length is usually a common sight in headers but that value is absent here.

008B F7E2 plus 9C64 (the location of the first file and the end of the header) is 008C 9446, however, and having lengths ignore headers is quite common.
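The sum is quick to confirm in any language that takes hex literals. A small sketch using the three values quoted above; the variable and function names are just labels for this example.

```python
size_field = 0x008BF7E2   # value found in the header
header_end = 0x9C64       # location of the first file
file_length = 0x008C9446  # length of the archive itself


def size_excludes_header(size_field, header_end, file_length):
    """True if the stored size is the file length minus the header."""
    return size_field + header_end == file_length


print(hex(size_field + header_end))  # 0x8c9446
print(size_excludes_header(size_field, header_end, file_length))  # True
```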

Header format

4 bytes 706B6700 hex (pkg[00])

4 bytes 00 filled.

4 bytes file count (flipped)

4 bytes 0000 0001 when flipped

4 bytes unknown (019F3323 when flipped)

4 bytes size of file - header (flipped)

4 bytes unknown (0009 4584 when flipped)

4 bytes unknown (0002 DDAA when flipped)

4 bytes unknown (0000 0003 when flipped)

12 bytes 00 filled (padding?).
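Put together, the header is 48 bytes and can be read in one go. The sketch below assumes little endian values ("flipped") and uses invented field names for the unknowns; the sample header is rebuilt from the values listed above rather than read from the game. It also shows a cross check: 48 bytes of header plus 301 hex entries of 52 bytes each comes to 9C64, the location of the first file.

```python
import struct

# magic, 4 padding bytes, seven 32 bit values, 12 padding bytes = 48 bytes
HEADER = struct.Struct("<4s4x7I12x")
ENTRY_SIZE = 52  # size of one table entry as defined earlier


def parse_header(raw: bytes) -> dict:
    """Decode the 48 byte pkg header; unknown fields keep placeholder names."""
    (magic, count, one, unk1, data_size,
     unk2, unk3, unk4) = HEADER.unpack(raw[:HEADER.size])
    if magic != b"pkg\x00":
        raise ValueError("not a pkg archive")
    return {"file_count": count, "always_one": one, "data_size": data_size,
            "unknowns": (unk1, unk2, unk3, unk4)}


def table_end(file_count: int) -> int:
    """Offset where the file table stops and the file data should start."""
    return HEADER.size + file_count * ENTRY_SIZE


sample = HEADER.pack(b"pkg\x00", 0x301, 1, 0x019F3323, 0x008BF7E2,
                     0x00094584, 0x0002DDAA, 3)
info = parse_header(sample)
print(hex(table_end(info["file_count"])))  # 0x9c64
```

That the computed table end lands exactly on the first file location is a good sign that the entry size and file count have been read correctly.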

The compression The header is a nice example of a custom file format and most of the time that is where it ends (give or take building something to remake the archive), but compression was detected. Sadly it is one of the few times a custom compression format has been seen on the DS. The extensions appear as though they can be trusted, so for the time being they were. This section might be more useful once compression (covered in game logic) has been covered.

Other than the swav files there were a handful of uncompressed files, but they were usually quite small. That it happened is nice, as it points to file level compression rather than archive wide library compression (formats like 7zip do this to achieve very high compression rates for groups of similar files, at the cost of decompression time, resources, the potential for the archive to be corrupted beyond recovery of anything via simple means and, in the case of split archives, not being able to extract without the complete archive set).

tmpCopy.txt was extracted. It sounds like a debug text file if ever there was one.

C86 is the length of the compressed file. According to the header it should be 2DAC hex long once decompressed.

entities.xml was extracted. xml should have lots of nice brackets to look at.

16D8 is the length of the compressed file. According to the header it should be 00010344 hex long once decompressed.

Neither appeared to have any flags at the start (the first clue it might be custom), and neither showed any obvious pattern of starting out readable and degrading as it went on. LZ usually starts out fairly readable and becomes less readable as things repeat and get picked up, and RLE is much the same. That points to something like Huffman (the DS BIOS does support Huffman, but this was not a standard BIOS compatible version by the looks of things).
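The "flags to start" check can be automated when the archive table already gives the expected decompressed size. BIOS compatible data starts with a 4 byte header: one type byte, then the decompressed size as a flipped 24 bit value. The sketch below tests for that; the function name is made up, and only the commonly seen type bytes are listed, so treat it as a first filter and not a full detector.

```python
# Type bytes for the BIOS compatible formats (high nibble is the type;
# for Huffman the low nibble is the symbol size in bits).
BIOS_TYPES = {
    0x10: "LZ77",
    0x24: "Huffman (4 bit)",
    0x28: "Huffman (8 bit)",
    0x30: "RLE",
}


def guess_bios_compression(data: bytes, expected_size: int):
    """Return the BIOS compression name if the first 4 bytes look like a
    valid header for a file of expected_size, otherwise None."""
    if len(data) < 4:
        return None
    kind = BIOS_TYPES.get(data[0])
    stated_size = int.from_bytes(data[1:4], "little")
    if kind is not None and stated_size == expected_size:
        return kind
    return None


# Hypothetical file starts, both checked against the 2DAC size from the text.
looks_standard = bytes.fromhex("10ac2d00") + bytes(8)
looks_custom = bytes.fromhex("7f3c91e2") + bytes(8)
print(guess_bios_compression(looks_standard, 0x2DAC))  # LZ77
print(guess_bios_compression(looks_custom, 0x2DAC))    # None
```

A None result for every compressed file in an archive, as was effectively the case here, is a strong hint that the format is custom.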

[to finish]

Assembly reverse engineering (full assembly as seen across the decompression function).

Memory viewing reverse engineering (files have to be decompressed to run)

dat files (some naturally decompressed and compressed, partial known plaintext analysis).

The lua The header is a nice example of a custom file format, but later in the file there are some files with the extension lua. That is the chosen extension of a fairly powerful scripting language of the same name that has been seen on the DS (in the Puzzle Quest series and several times in homebrew). These were all compressed using the custom compression format.

[to finish]

Puzzle Quest and Theta [to add]

Further reading A scripting engine for the Wii game Tales of Symphonia: Dawn of the New World was reverse engineered and although GBA and DS games rarely require anything so extensive it is well worth a read. Links to the matter at hand at blog.delroth.net part 1 and blog.delroth.net part 2.

3.5.1 Layout and limits

Covered in part earlier (the Megaman ZX markup) but worth a quick subsection. Outside games, text that reaches the edge of a screen will automatically wrap back around most of the time. Games, and especially earlier games and games on the handhelds, can never be assumed to do this.

The methods games employ here, and what you will run up against, are as varied as any other area of hacking. The first thing to note is that it might not be the screen dimensions that cause you trouble but a text box or, worse, an imaginary/invisible text box, which usually means an ASM hack to change. Equally important and somewhat more troubling are memory limits, be it in the console’s memory (the DS cart is not available in normal memory so everything has to be copied in), cart memory (seldom a problem on the GBA or DS) or format memory (if the game only uses 16 bit pointers you might have limitations). Strictly speaking this too is an assembly hack, but you can get a lot done by simply viewing the memory and adjusting your habits accordingly.
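The 16 bit pointer limit is easy to test for before a translated script goes back into a game. A minimal sketch, assuming text blocks are stored end to end and each pointer is a plain offset from the start of the text area; the function name and the layout are assumptions for illustration, and real games may use relative pointers or an extra base value.

```python
def build_pointer_table(blocks, base=0, pointer_bits=16):
    """Lay the encoded text blocks end to end starting at base and return
    their offsets, raising if any offset no longer fits in the pointer."""
    limit = (1 << pointer_bits) - 1  # FFFF for 16 bit pointers
    offsets = []
    position = base
    for index, block in enumerate(blocks):
        if position > limit:
            raise OverflowError(
                f"block {index} starts at {position:#x}, past {limit:#x}")
        offsets.append(position)
        position += len(block)
    return offsets


# Three dummy blocks that only just fit within 16 bit pointers.
blocks = [b"a" * 0x8000, b"b" * 0x7FFF, b"c"]
print([hex(o) for o in build_pointer_table(blocks)])
# ['0x0', '0x8000', '0xffff']
```

If the check fails, the choices are shortening the text or an assembly hack to widen the pointers, which is the trade off described above.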

Auto screen (press to continue) making Not really in the same class as the others mentioned, save for some games that are quite annoying to handle, but definitely worth knowing about as it is something you will probably run into sooner or later. The game might have more text in a given conversation than can be displayed in the text box given, and as such it will either have to change to the next one automatically or allow the user to control what happens. Equally, some games are fairly concise and there might not be provisions for it, or adding it will require editing of the game’s text engine itself.

The possibilities here are extensive. Some will have a basic end of section command on which the game will pause pending user input, and others will scroll automatically. This eventually leads to the choice menus in games (the classic “yes/no” option, a concept that very frequently sees dual tile encoding used for it), which can be a sort of linked list/level design approach or something buried deep in the game.

OAM/tile driven wrapping This is more reserved for text in images/tile maps like those often seen in puzzles, however those doing a variable width font hack might use reads of the OAM or BG tile management to direct things. There have been games like Kenshuui Tendo Dokuta that used a font representation and a nitroSDK graphics format to repeat tiles as necessary to display the scene.

Line wrapping Much like pointers being used to indicate the location and end of a section of text, line breaks often have to be marked explicitly, as a game might not have the ability to automatically wrap a line when it comes to the edge of the screen/boundary box. Sometimes it is automatic, sometimes it is pointer driven, sometimes it has a unique character or set of values (Megaman ZX used FC if you recall) and sometimes it uses the same character as another ending value.
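Where the game wants an explicit line break value, the wrapping can be done by a tool before the text is inserted. The following is a rough sketch of a greedy word wrap for a fixed width font, assuming a limit counted in characters; the function name and the "|" marker are placeholders, and in a real insertion the marker would be swapped for the game's own value (FC in the Megaman ZX case) during encoding.

```python
def wrap_line(text: str, max_chars: int, break_marker: str = "|") -> str:
    """Greedy word wrap: start a new line whenever the next word would
    push the current line past max_chars."""
    lines = []
    current = ""
    for word in text.split():
        candidate = word if not current else current + " " + word
        if len(candidate) <= max_chars or not current:
            # A single word longer than max_chars is kept whole here and
            # will overflow; handling that is left to the hacker.
            current = candidate
        else:
            lines.append(current)
            current = word
    if current:
        lines.append(current)
    return break_marker.join(lines)


print(wrap_line("the quick brown fox jumps", 10))
# the quick|brown fox|jumps
```

A variable width font needs the character count replaced with a sum of per character pixel widths, but the loop stays the same.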

Section wrapping Much the same as line wrapping, but note that it is not always done the same way as the line wrapping in the game you are dealing with. It may not even be the same concept (pointers to end a section and characters to wrap a line is actually a pretty good way for devs to do a lot of this).