Reverse engineering Mischief file format Part 2

In this part, we’re going to look at the unpacking function in Mischief.

← previous part next part →

After fiddling with Mischief a bit, I found this standalone function which unpacks bytes.

First thing to note: it is ~1000 CPU instructions, that’s a lot for a single function.
I assume it is this big is because it contains inlined code.

The code at 00467F50..00467FD9 initiates all the variables in the function.

SS:[ESP+NN] is basically access to the memory on the stack.

DS:[ADDR] is access to “data segment”, which is mostly used for input and output bytes in the rest of the function. In this part of the code however, a structure is traversed and pointer to the raw data is extracted.

It is (roughly) translated to this python code:

class UnpackerState():
    def __init__(self):
        self.EAX = 0
        self.ESI = 0
        self.in_pos = 0 # sp_10
        self.sp_14 = 0
        self.sp_18 = 0
        self.out_pos = 0 # sp_1c
        self.sp_20 = 0
        self.sp_24 = 0
        self.sp_2c = 0
        self.sp_28 = 0
        self.sp_30 = 0
        # self.garbage_p = 0 # sp_3c
        self.sp_38 = 0
        self.sp_40 = 0
        self.sp_44 = 0
        self.sp_4c = 0
        self.sp_64 = 0

def mischief_unpack(byte_input):
    state = UnpackerState()
    state.EAX = 0xFFFFFFFF
    # unpacked size is not actually computed in the give function,
    # but somewhere before, and passed as an argument.
    (unpacked_size,) = struct.unpack('I', byte_input[0:4])
    (state.ESI,) = struct.unpack('>I', byte_input[5:9])
    state.in_pos = 9
    packed_length = len(byte_input)
    byte_input += bytearray([0,0,0,0])
    decoded_length = unpacked_size
    state.sp_20 = 1
    state.sp_28 = 1
    state.sp_2c = 1
    state.sp_40 = 1
    state.sp_64 = decoded_length
    state.sp_38 = decoded_length
    sp_50 = 0

Near the end, there is this part:

00468AC8   > 8B4C24 1C        MOV ECX,DWORD PTR SS:[ESP+1C]
00468ACC   . 3B4C24 64        CMP ECX,DWORD PTR SS:[ESP+64]
00468AD0   . 73 37            JNB SHORT Specimen.00468B09
00468AD2   . 8B5424 10        MOV EDX,DWORD PTR SS:[ESP+10]
00468AD6   . 3B5424 68        CMP EDX,DWORD PTR SS:[ESP+68]
00468ADA   . 73 2D            JNB SHORT Specimen.00468B09
...
00468AE8   .^E9 F4F4FFFF      JMP <Specimen.00467FE1>

What it does is it jumps forward if SS:[ESP+1C] >= SS[ESP+64] or SS:[ESP+10] >= SS:[ESP+68]. Otherwise, it jumps to the beginning of the function.

It is a while loop and is roughly translated to:

while state.in_pos < packed_length\
      and state.out_pos < decoded_length:
    ...

By locating all conditional jumps and indenting assembly for visual clarity, I’ve created this file to work from.

Then there was a notorious part of manually translating the code to another language.

Some things to note, though:

  • There is a WORD array, and all pointers are calculated by doubling the index.
    This caused a lot of errors when I tried to convert the doubling to direct indexing.
    You can see a many places where index is divided by two (e.g. ecx = garbage[ebx//2]), which closely corresponds to what is done in the machine code. All indexes are guaranteed to be multiples of two due to the way they are calculated.

  • I’ve translated the code at 004689FC..00468A1B to (state.sp_28, state.sp_2c) = (state.sp_20, state.sp_28) which might not be obvious, but it does not use an additional variable to set the values, akin to swapping the variables using tuples.

  • There is a chunk of code that occurs a lot of times, and it yields a packed byte when “buffer” register is low. It was moved to a separate function _pop_byte.

  • There is an interesting assembly trick to implement a ternary operator without conditional jumps, e.g.

00468A31   . 837C24 18 13     CMP DWORD PTR SS:[ESP+18],13
00468A36   . 8B7C24 24        MOV EDI,DWORD PTR SS:[ESP+24]
00468A3A   . 1BC9             SBB ECX,ECX
00468A3C   . 83E1 FD          AND ECX,FFFFFFFD
00468A3F   . 83C1 0A          ADD ECX,0A
00468A42   . 894C24 18        MOV DWORD PTR SS:[ESP+18],ECX

is translated to

state.sp_18 = 0x7 if state.sp_18 < 0x13 else 0xa
  • Even if the translated unpacker works, I still don’t have a full understanding of the code.
    Why does it require so much work done for each input bit?
    What is the purpose of garbage WORD array?
    However, I do know that it is using some kind of Lempel-Ziv coding, there are “commands” to output raw bytes and back-referencing byte ranges.

After all the translation and painful debugging updated version of the parser was born.

Now it prints raw unpacked bytes from an .art file to standard ouput. Let’s see what we have.

m1el@wbox $ python3 artparser.py empty.art | xxd
0000000: 0200 0000 0000 0000 0000 0000 dcdc dc00  ................
0000010: 0080 3f01 0000 0001 0000 0000 0000 0000  ..?.............
0000020: 0000 0001 0000 00f8 f1f1 cdcc 4c3e 25c2  ............L>%.
0000030: 2142 c5bf 0d3f 0000 803f 0000 803f 0000  !B...?...?...?..
0000040: 0000 0400 0000 0000 803f 0000 803f 0000  .........?...?..
0000050: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000060: 803f 0000 0000 0000 0000 0000 0000 0000  .?..............
0000070: 0000 0000 803f 0000 0000 0000 0000 0000  .....?..........
0000080: 0000 0000 0000 0000 803f 0000 803f 0100  .........?...?..
0000090: 0000 0000 0000 0100 0000 0100 0000 0000  ................
00000a0: 803f 4c61 7965 7220 3100 0000 0000 0000  .?Layer 1.......
00000b0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000c0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000d0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000e0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000f0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000100: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000110: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000120: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000130: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000140: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000150: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000160: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000170: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000180: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000190: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00001a0: 0000 0000 0000 0000 803f 0000 0000 0000  .........?......
00001b0: 0000 0000 0000 0000 0000 0000 803f 0000  .............?..
00001c0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00001d0: 803f 0000 0000 0000 0000 0000 0000 0000  .?..............
00001e0: 0000 0000 803f 0000 803f 0000 0000 0100  .....?...?......
00001f0: 0000 0000 0000 0800 0000 0000 0000 0000  ................
0000200: 0000

This is much better — we can see raw text (Layer 1 name, for example), we can see the background color dcdcdc, there is also a bunch of floats: 0000 803f is float 1.0. This can be further parsed.

In the next part we’re going to parse these unpacked bytes.

← previous part next part →

2015-04-18T02:05