The home of Barnaby Smith

Author Archives: mvi

Colour Code Snippets

May 27th, 2013 | Posted by mvi in Colour - (Comments Off)

I’ve previously produced a number of mod tools for Dune 2000 that convert colours from various formats to 24 bit and back. Although most of the source code is already online, I thought I’d share the colour conversion C# source code in one place.

Convert 15 bit high colour to 24 bit true colour

This converts 15 bit colour (with 1 bit of padding) to 24 bit by first extracting the red, green and blue channels. In 15 bit colour each channel has 5 bits, giving 2^5 = 32 possible values, but we want 8 bits per channel, which gives 2^8 = 256 possible values. So we divide by the highest 5 bit value (31) and multiply by the highest 8 bit value (255) to convert the range from 0-31 to 0-255.

private Color ConvertColor(UInt16 colour)
{
    // bits = XRRRRRGGGGGBBBBB
    byte red = (byte)((colour >> 10) & 0x1f);
    byte green = (byte)((colour >> 5) & 0x1f);
    byte blue = (byte)(colour & 0x1f);

    red = Convert.ToByte(255.0f * (float)red / 31);
    green = Convert.ToByte(255.0f * (float)green / 31);
    blue = Convert.ToByte(255.0f * (float)blue / 31);

    return Color.FromArgb(red, green, blue);
}

Convert 24 bit true colour to 15 bit high colour

Similarly to convert to 15 bit, we convert each channel from 8 bit to 5 bit. Then we bit shift the red and green values and combine the three channels using a bitwise OR, so that we can fit all three 5 bit channels in the format XRRRRRGGGGGBBBBB.

private UInt16 ConvertColor(Color colour)
{
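    // Round to the nearest 5 bit value; plain truncation would turn 8 (the 24 bit form of 1) back into 0
    int r = (int)Math.Round(colour.R / 255.0f * 31) << 10;
    int g = (int)Math.Round(colour.G / 255.0f * 31) << 5;
    int b = (int)Math.Round(colour.B / 255.0f * 31);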

    int combined = r | g | b;
    return Convert.ToUInt16(combined);
}

Convert 24 bit true colour to existing palette

This algorithm calculates the entry in an existing palette which is closest to the target colour. The algorithm treats colours as points in 3 dimensional space – instead of x,y,z we use r,g,b – and uses Euclidean Distance (the distance in a straight line between two points) to calculate the closest colour. For efficiency, all distances are kept at their square values, since it is not necessary to use the relatively expensive square root operation in distance comparisons. The assumption here is that the palette is 24 bit, however if you have a 15 bit palette you could use the conversion technique described above to convert it to a 24 bit palette.

Color[] palette = new Color[256];

internal int GetNearestColorIndex(Color targetColor)
{
    int index = 0;
    // Set the initial color distance to the maximum
    double currentDistance = (255 * 255) * 3;
    Color paletteColor;
    for (int i = 0; i < palette.Length; i++)
    {
        paletteColor = palette[i];
        int redDistance = paletteColor.R - targetColor.R;
        int greenDistance = paletteColor.G - targetColor.G;
        int blueDistance = paletteColor.B - targetColor.B;

        // Calculate the square distance from the target colour to this palette entry
        int distance = (redDistance * redDistance) +
                            (greenDistance * greenDistance) +
                            (blueDistance * blueDistance);

        if (distance < currentDistance)
        {
            index = i;
            currentDistance = distance;
        }
    }
    return index;
}
internal Color GetNearestColor(Color targetColor)
{
    return palette[GetNearestColorIndex(targetColor)];
}

Calculate palette from 24 bit

Technique description coming soon.

Back in 2009 a friend and I were sat in the university library when we decided to have a go at reverse engineering Dune 2000, a classic RTS released by Westwood in the late 90s. I can’t remember why we decided to do it, other than that it seemed like a good idea at the time. Neither of us had experience reverse engineering binary files at that point, although as a teenager I had spent a fair amount of time in the Grand Theft Auto mod scene, mostly making missions, buildings and multiplayer scripts. My friend ended up not going further with the reverse engineering, but I decided to continue with it. Little did I know the size of the project I was embarking on.

We started by looking at the map files and tilesets that went with them. We quickly established the structure of the map files, starting fresh research that a couple of years later led to the first campaign map editor for Dune 2000 by Klofkac which extensively used the research by myself, Lewis and DaxxXyrax. I also looked at the tileset files, which were reasonably straight forward and ultimately led to a tool I developed which allowed tilesets to be exported and imported to and from PNGs that could be edited in Photoshop.

The vast majority of my research, however, went into the mission files. The community was hungry for new campaigns: for over a decade the only missions for Dune 2000 were the original three campaigns made by Westwood. Very little progress had been made in the community on modifying mission files. People had realised that in the first mission they could modify the amount of money needed to win, and that it was possible to change the diplomacy between sides (for example making sides enemies, allies or neutral). Otherwise, mission files were almost entirely 68 thousand byte black boxes.

I spent the next two years reverse engineering the mission files. Detailed discussion of what that involved would probably be tedious to read and also I don’t think I can remember too much of it. So I’ll instead briefly discuss the biggest (solved) challenge of the mission files. Early on someone found that modifying certain bytes in certain mission files affected reinforcements. We knew from substituting mission files for each other that the mission specific logic was all contained in these files, but we had no idea how that logic was structured. Getting my head round how this logic was structured let alone the byte representation in the files took a very long time.

Unfortunately the way the actual logic is stored does not have strong indicators of repetition or patterns. However, I did have a vague idea of the part of the file this logic was stored in, and through perseverance and repeated testing of file modifications I eventually isolated the exact range of data and grouped the data into events and conditions. Each event has a type indicator and its own specific information. It also references conditions, with boolean flags to flip the value of each condition. Conditions themselves are similar in that they have a condition type and their own data. For example, a reinforcement event would store the side it belonged to and what it contained, and then link to a time based condition so that the reinforcement would occur after a specified time. Fully specifying the conditions and events took a fair amount of time (and there are still a couple of unknowns). Once this was complete, however, I had enough pieces to finish the mission editing tool.

Dune 2000 Mission Editor

After over two years of research, head scratching and tool writing I released the Dune 2000 Mission Editor along with initiating the research that allowed a campaign map editor to be created. Between these two tools it was possible for the first time in the history of Dune 2000 for modders to create new missions for the game. I also created a new string table editor, tile set editor, sound effect packing tool, as well as partially complete UI layout and resource file editors. Working with CCHyper I also released a patch for Dune 2000 making it possible to play as the sides in the game other than the three playable sides that shipped with the game (such as smugglers, mercenaries, Fremen, Corrino). As all of the editor and mod releases were on the forums and quickly slipped down the topics to be hidden I created a dedicated website to host both the toolkit of mod tools I had created as well as modifications made with the tools by the modding community.

Reverse engineering Dune 2000 was a long process and I learned a great deal during the time I spent doing it. While for most of the time I was working alone, I am extremely grateful for the times other modders got involved and made progress. There are now a number of new missions and campaigns released by the community, and seeing these mods created feels extremely rewarding after spending a long period of time trying to get meaning out of a screen full of hexadecimal numbers. I recently announced that I was leaving the Dune 2000 modding scene. My original goal was to make it possible for modders to create new campaigns. With this goal complete I feel it’s a good time to stop.

I will continue to administer the D2K+ website and add any new mods or tools that other people create and would like listed. As always I’ll read the forums and I’ll be excited to see what developments are made. This is the end of my involvement in modding Dune, but as the blog post title alludes, there’s plenty more life left in the mod scene.

Links:

D2K+ Website

Dune 2000 Forums

Unity3D Tips Tweet Aggregator Launched

September 1st, 2012 | Posted by mvi in Unity3D | Unity3DTips - (Comments Off)

As a Unity3D developer, I often find useful features, tricks and good practices for developing using the engine. I was frustrated by the way that there wasn’t a place to share brief snippets of information, so I’ve created a site that aggregates tweets that use certain hash tags. Tweets made using those hash tags will be stored (forever) and displayed.

Unity3D Tips

Announcing UnityXNA – XNA Inside Unity3D

July 7th, 2012 | Posted by mvi in Unity3D | UnityXNA | XNA - (Comments Off)
Disclaimer:
I’ve received a lot of emails asking if UnityXNA is suitable for porting an existing XNA game to Unity. The short answer is: it is, if you’re prepared to do the leg work yourself. UnityXNA only implements the bare minimum of the XNA API in order to get the Platformer sample working. If your project uses any parts of the XNA API that the Platformer sample does not, you will need to add support for this yourself. Due to personal time constraints and the nature of this project as a tech demo only, I can offer no support for UnityXNA. Source code is provided for the curious and as a nice starting point for more serious emulation of XNA in Unity.

I had a theory that it would be straightforward to get XNA games running in Unity3D so I decided to give it a go.

This is a proof of concept showing the Platformer XNA sample running inside Unity3D. Zero code changes have been made to the original game code. Using a mixture of new code and some code from MonoXNA I’ve implemented XNA emulation by having a game object with a script attached run an XNA game performing updates and drawing.

You can see the game in your browser here and download the source here.

Implemented so far:

  1. Basic game loop and GameTime calculation.
  2. ContentManager loads Texture2Ds, SoundEffects and Songs, each wrapping the relevant Unity3D object.
  3. SpriteBatch Draw implemented using a draw queue, specifically created for the purpose. Currently supports colour tinting, source rectangles, and sprite effect flip modes.
  4. SpriteBatch DrawString has limited support, rendering the text in the correct position and with the correct colour.
  5. Support for playing Songs through MediaPlayer and playing SoundEffects
  6. KeyboardStates emulated for a limited set of keys which are mapped from their XNA values to Unity3D KeyCodes.
  7. Zero code changes to the game needed to run Platformer sample

Known issues, immediate areas for improvement:

  1. SpriteFont is not supported, all DrawStrings render with the default GUI Label font.
  2. Frame rate is currently vsync’d at 60 frames per second. When vsync is disabled GameTime is not calculated correctly.
  3. Windows Media Audio (.wma) is not supported by Unity3D, so I’ve converted the sample audio files to Ogg Vorbis (.ogg).
  4. Keyboard input is currently limited to a small set of keys, more mappings between XNA Keys and Unity3D KeyCodes need creating.
  5. Mouse, gamepad and touch input are not currently implemented.

Proof of concept live in browser
Download source code at GitHub

Edit: This post has had some overwhelming support, so thanks for all the tweets and mentions. I have made the full source code available on GitHub and I think it provides a great starting point if you have a finished 2D XNA game which you want to port to Unity3D and you’re prepared to spend a few weeks (instead of months) on it. However, as a proof of concept it has achieved my original goal, and I’m afraid I can offer no support to XNA projects wishing to port to Unity3D.

In the last article I talked about some of the theory of how data types are stored in binary. Of particular importance was the concept of endianness, which defines the order of the bytes that make up a data type. This article will use XVI32 to practice determining and changing the values of data entries that use some of the data types we discussed in the last article.

Determining Data Values

I’ve put together a small sample file for this article. Before continuing you’ll need to download it and open it in XVI32.

Download here

Once you have the file open in XVI32 you should see the hex and text values in the above screenshot.

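To make this article more straightforward I’m going to tell you the actual structure of this file:

Offset  Size     Type
0       1 byte   byte
1       2 bytes  little-endian int16
3       4 bytes  big-endian int32
7       4 bytes  little-endian int32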

Remember that there are 8 bits to a byte, so to get the number of bytes in a 32 bit integer you divide 32 by 8 and get 4.

In this part of the practical, you need to answer four questions:

  1. What is the decimal value of the byte at offset 0?
  2. What is the decimal value of the little-endian int16 at offset 1?
  3. What is the decimal value of the big-endian int32 at offset 3?
  4. What is the decimal value of the little-endian int32 at offset 7?

Tips for the struggling

If you struggle with question 1, reread part 2.

If you struggle with questions 2 or 4, reread part 4.

If you struggle with question 3, remember that the conversion is the same as for little-endian numbers only you don’t need to bother with reversing the order of the bytes.
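The manual conversions in these questions can also be checked in code. Below is a minimal C# sketch, assuming the values are signed; the `OffsetReader` class and its method names are made up for illustration, and the bytes in the usage note are invented rather than taken from the sample file, so it does not give away the answers.

```csharp
using System;

public static class OffsetReader
{
    // Little-endian int16: the low byte is stored first.
    public static short ReadInt16LittleEndian(byte[] data, int offset)
    {
        return (short)(data[offset] | (data[offset + 1] << 8));
    }

    // Big-endian int32: the high byte is stored first, so no reversal is needed.
    public static int ReadInt32BigEndian(byte[] data, int offset)
    {
        return (data[offset] << 24) | (data[offset + 1] << 16)
             | (data[offset + 2] << 8) | data[offset + 3];
    }

    // Little-endian int32: the low byte is stored first.
    public static int ReadInt32LittleEndian(byte[] data, int offset)
    {
        return data[offset] | (data[offset + 1] << 8)
             | (data[offset + 2] << 16) | (data[offset + 3] << 24);
    }
}
```

With the invented bytes 2A 10 27 00 00 04 D2 E8 03 00 00, the byte at offset 0 is 42, `ReadInt16LittleEndian(data, 1)` gives 10000, `ReadInt32BigEndian(data, 3)` gives 1234 and `ReadInt32LittleEndian(data, 7)` gives 1000.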

Answers

The answers can be found here. If you didn’t get the same answer, check out the tips above.

Changing Data Values

Before starting this section, you need to make sure you’ve completed the questions above and got the correct answers to every question.

In this section, we want to change the values of the data entries encoded in this sample file. If you want to refresh your memory of making hex edits with XVI32, reread part 3 now. Remember to use overwrite mode in XVI32 rather than insert.

Again we have four exercises:

  1. Change the value of the byte to 54 (expressed as a decimal)
  2. Change the value of the little-endian int16 to 40 (expressed as a decimal)
  3. Change the value of the big-endian int32 to EDA0 (expressed as a hexadecimal)
  4. Change the value of the little-endian int32 to 6767 (expressed as a decimal)

If you struggle with any of these questions, the best idea is to reread the previous articles.

Answers

The answers can be found here. Open this file in XVI32 to compare against your own edits.

In this article you’ve determined the values of different data types and also changed those values. The skill you’ve just picked up means that you’re now able to hex edit a large range of data and files. Being able to determine the value of and edit data which occupies more than one byte is the most core practical skill in hex editing and reverse engineering binary files.

Core Concepts Of Reverse Engineering: Part 4 – Data Types

February 4th, 2012 | Posted by mvi in Reverse Engineering - (Comments Off)

A Brief Intro

The last post gave a practical example of hex editing. In the post before that I talked about bytes and hexadecimal numbers. This post continues that discussion of theory.

With regard to hexadecimal numbers, it’s important to note that you don’t need to be able to convert between hexadecimal and decimal in your head, or even on paper. Using a calculator for the conversion is fine. All that’s important is to know that the same number can be written both as a decimal and as a hexadecimal. For example, if I had a byte in my file which has the value AB, as a decimal this is 171. You may sometimes see numbers prefixed with 0x, like 0xAB. This is simply standard notation for a hexadecimal number.

While being able to store a value up to 255 in a byte is useful, being able to store larger numbers is more useful. In this post, I shall discuss some basic types.

Signed and Unsigned Numbers

In mathematics, numbers can be either positive or negative. In computing, sometimes we’ll want numbers that can be either positive or negative, or sometimes we know that a number will always only ever be positive. Why differentiate the two you might ask? Storing whether a number is positive or negative takes up a small amount of data (specifically one bit). If all numbers were treated this way, we would be able to hold a smaller range of data even if we knew that data would never be negative, which while a small limitation is still wasteful.

Signed numbers are numbers which carry positive/negative sign information. Supporting negative numbers comes at the expense of a smaller maximum value that can be represented.

Unsigned numbers are numbers which must be of the same sign (typically positive). These numbers can support a larger range but at the cost of not being able to store both positive and negative values.
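To make the difference concrete, here is a small C# sketch that reads the same 8 bits both ways. The `SignednessExample` class is hypothetical. It relies on C# storing signed integers in two's complement, which is a detail this article does not go into.

```csharp
using System;

public static class SignednessExample
{
    // Interprets the 8 bits as an unsigned value (0 to 255).
    public static byte AsUnsigned(byte raw)
    {
        return raw;
    }

    // Interprets the same 8 bits as a signed value (-128 to 127).
    public static sbyte AsSigned(byte raw)
    {
        return unchecked((sbyte)raw);
    }
}
```

The byte FF reads as 255 when unsigned and as -1 when signed, while 7F reads as 127 either way because its top bit is not set.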

Integers

Integers are one of the most basic and ubiquitous data types in computing. They represent whole numbers such as 1, 5, 98 and cannot store fractional numbers such as 0.24, 1.7, 5.5. Integers can be signed or unsigned and come in various sizes. The most common sizes of ints are 16 bit, 32 bit and 64 bit. The number of bits refers to the size that the integer occupies. There are 8 bits to a byte, so a 16 bit integer is 2 bytes and a 32 bit integer is 4 bytes. By combining bytes together we extend the range of the data type significantly: the more bytes there are, the more variations can be stored. In the table below I show the range of the above three types of ints as both signed and unsigned numbers.

Size     Signed range                                              Unsigned range
16 bit   -32,768 to 32,767                                         0 to 65,535
32 bit   -2,147,483,648 to 2,147,483,647                           0 to 4,294,967,295
64 bit   -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807   0 to 18,446,744,073,709,551,615

Little/Big Endian

When it came to writing, say, the number 123 as a hexadecimal byte, it was quite straightforward. We just worked out it was 7B using the calculator and that was it. If we now take the number 1234, which is bigger than the maximum value a byte can hold (255), we clearly need to use an integer. So let’s take a 32 bit integer, which consists of four bytes. If you put 1234 into your calculator and convert to hex you’ll get the result 4D2. If we stick some zeroes in front of it to occupy four bytes we get 00 00 04 D2. That’s great and this is a viable way of writing an integer, however it’s not the only way.

Big endian means that the most significant (highest value) byte comes first and the least significant byte comes last. For example, 1234 is quite a small number compared to what a 32 bit int can hold, so its non-zero bytes sit on the right side. Larger numbers would occupy more of the bytes towards the left.

Little endian numbers reverse the byte ordering so that the above example would be written as D2 04 00 00.

It’s common for x86 architecture (PC) files and Intel Macs to be little endian and for PowerPC Macs and many older UNIX workstations to use big endian. Particular file formats may choose to use little or big endian regardless of the architecture and operating system, however as a starting point I would assume the endianness matches the architecture.

In the case of Dune 2000 and most PC formats, files are stored in little endian. If you would like to read more about endianness try here.

Converting from decimal to a little endian 32 bit integer

  1. Convert to hex using calculator
  2. Prefix with 0s until the number is represented as 4 bytes (has the structure 00 00 00 00).
  3. Reverse the bytes, each grouping is a byte. So 12 34 56 AB becomes AB 56 34 12

Converting from a little endian 32 bit integer to decimal

  1. Reverse the bytes, so AB 56 34 12 becomes 12 34 56 AB
  2. Convert to decimal using calculator
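Both conversions can be sketched in a few lines of C#. The `LittleEndianExample` class and its method names are illustrative only, and unsigned values are assumed to keep the shifting simple.

```csharp
using System;

public static class LittleEndianExample
{
    // Decimal value -> four bytes, lowest byte first (little endian).
    public static byte[] ToLittleEndianBytes(uint value)
    {
        return new byte[]
        {
            (byte)(value & 0xFF),
            (byte)((value >> 8) & 0xFF),
            (byte)((value >> 16) & 0xFF),
            (byte)((value >> 24) & 0xFF)
        };
    }

    // Four little-endian bytes -> decimal value.
    public static uint FromLittleEndianBytes(byte[] bytes)
    {
        return (uint)bytes[0]
             | ((uint)bytes[1] << 8)
             | ((uint)bytes[2] << 16)
             | ((uint)bytes[3] << 24);
    }

    // Formats bytes the way a hex editor shows them, e.g. "D2 04 00 00".
    public static string ToHexString(byte[] bytes)
    {
        return BitConverter.ToString(bytes).Replace("-", " ");
    }
}
```

For example, `ToHexString(ToLittleEndianBytes(1234))` gives "D2 04 00 00", and passing the bytes AB 56 34 12 to `FromLittleEndianBytes` gives the value 0x123456AB.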

Bit Representations in Bytes

A byte is made up of 8 bits. Each bit holds a 1 or 0 value, so a byte that holds the value zero can be represented as 00000000. The value that each digit represents doubles from right to left, starting at 1.

Value  128  64  32  16  8  4  2  1
Bit      0   0   0   0  0  0  0  0

Adding across we have zero lots of each number, so a byte with the bit representation 00000000 = 0.

If we take a byte whose bits have a value of 11111111 and use the above grid, we get:

Value  128  64  32  16  8  4  2  1
Bit      1   1   1   1  1  1  1  1

Adding across we get 128 + 64 + 32 + 16 + 8 + 4 + 2 + 1

Which if we add it up, we find that a byte with a bit representation of 11111111 is 255 – the max value of a byte. The number of bits is the reason for the number of variations a number can hold. Try continuing the table across to represent 16 bits rather than 8 bits, add up all the top row numbers and compare against the integer data type table above.

Let’s take another example:

A byte with bit representation 00110010

Value  128  64  32  16  8  4  2  1
Bit      0   0   1   1  0  0  1  0

Adding across again we get 32 + 16 + 2 = 50
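The "adding across" procedure can be written as a short loop. This is only a sketch; the `BitValueExample` class is a made-up name, and it takes the bits as a string of 1s and 0s to mirror how they are written in this article.

```csharp
using System;

public static class BitValueExample
{
    // Sums the place value of every bit that is set, working from right to left.
    public static int ValueOfBits(string bits)
    {
        int total = 0;
        int placeValue = 1; // the right-most bit is worth 1
        for (int i = bits.Length - 1; i >= 0; i--)
        {
            if (bits[i] == '1')
            {
                total += placeValue;
            }
            placeValue *= 2; // each step to the left doubles the value
        }
        return total;
    }
}
```

`ValueOfBits("00110010")` gives 50 and `ValueOfBits("11111111")` gives 255, matching the worked examples above.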

Bit Fields

While the above usage of bits in a byte is the most common, it can be considered just one possible interpretation of the bits in the byte. Data only has meaning when we give it meaning. For example, rather than the right-most bit representing the value 1, it could mean whether or not a tank can move onto a terrain tile. We could then say that the second bit from the right, which normally represents the value 2, means whether a player can build structures on this terrain tile. This usage of the bits in a byte is called a bit field. While bit fields are decreasing in usage now that data limits are getting higher and higher, they are still important in many areas where small data sizes are critical.
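A bit field like the terrain example can be sketched in C# with a flags enum. The `TileFlags` names and the `BitFieldExample` helpers below are hypothetical and only mirror the example in the text; they are not the actual Dune 2000 tile format.

```csharp
using System;

[Flags]
public enum TileFlags : byte
{
    None = 0,
    TankCanEnter = 1, // right-most bit (value 1)
    CanBuildOn = 2    // second bit from the right (value 2)
}

public static class BitFieldExample
{
    // True if the given flag's bit is set in the tile byte.
    public static bool HasFlag(byte tile, TileFlags flag)
    {
        return (tile & (byte)flag) != 0;
    }

    // Returns the tile byte with the given flag's bit switched on.
    public static byte SetFlag(byte tile, TileFlags flag)
    {
        return (byte)(tile | (byte)flag);
    }
}
```

A tile byte of 02 (bits 00000010) would then mean "can be built on, but tanks cannot enter", and 03 would mean both are allowed.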

Sub-byte data types

On a related note, you may want to hold more data than a simple true/false but want to use less data than a full byte. If, say, your maximum value was 15, then you’d only need 4 bits. You could then hold two different 4 bit variables in a single byte, one after the other. Alternatively, if you needed to hold a number with a range larger than a byte but smaller than 2 bytes, say a 12 bit number, and you also wanted to hold a 4 bit number, you could combine the two by making use of 2 bytes. The first number occupies the first byte and the first half of the second byte, while the second number occupies the second half of the second byte. Writing A for a bit of the first number and B for a bit of the second, the two bytes are laid out as AAAAAAAA AAAABBBB.
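Here is a hedged C# sketch of the 12 bit plus 4 bit packing described above. It assumes "first half" means the upper four bits of the second byte; a real file format could just as easily use the opposite order. The `NibblePacking` class is a made-up name.

```csharp
using System;

public static class NibblePacking
{
    // Packs a 12 bit value and a 4 bit value into two bytes.
    public static byte[] Pack(int twelveBit, int fourBit)
    {
        if (twelveBit < 0 || twelveBit > 0xFFF)
            throw new ArgumentOutOfRangeException(nameof(twelveBit));
        if (fourBit < 0 || fourBit > 0xF)
            throw new ArgumentOutOfRangeException(nameof(fourBit));

        byte first = (byte)(twelveBit >> 4);                       // top 8 bits of the 12 bit value
        byte second = (byte)(((twelveBit & 0xF) << 4) | fourBit);  // its bottom 4 bits, then the 4 bit value
        return new[] { first, second };
    }

    // Splits two packed bytes back into the 12 bit and 4 bit values.
    public static void Unpack(byte[] packed, out int twelveBit, out int fourBit)
    {
        twelveBit = (packed[0] << 4) | (packed[1] >> 4);
        fourBit = packed[1] & 0xF;
    }
}
```

Packing the values ABC and D (in hex) gives the two bytes AB CD, and unpacking AB CD returns ABC and D again.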

What’s next?

This post has been quite heavy in theory, in the next post I will explore data types through practical examples. So if you felt this was a lot to take in, don’t worry there will be opportunity to practice what I’ve talked about here and hopefully make sure you get your head around it.

In the last post I discussed why hex editors are useful for working with binary files. I also talked about XVI32 my hex editor of choice. During this series I will be using XVI32 for examples, so if you are not on a Windows machine or if you want to use a different hex editor then you will need to adapt my examples.

So if you haven’t done so already, download XVI32. XVI32 doesn’t need installing, you can just unzip it and run it from there but you may want to copy it to a more memorable location and set up any relevant shortcuts.

In this post I’ll be taking more of a practical approach, I’ll start by talking about file signatures and then we’ll open up a few common file formats and take a look inside.

File Signatures

An extension does not make a type.

Or to put it more clearly, just because a file has the extension .png doesn’t mean there’s actually a png image inside. This is a very important lesson, since it is incredibly common for games just to give a common file format a different extension. When I was browsing the Call of Duty 4 files I realised instantly when I opened a file format up in a hex editor that it was just a zip with a different extension, meaning I could unzip it and see the files inside.

So how did I realise it was a zip at a glance? What kind of technomagery is this? Many files have a signature at the very start saying what format they’re in. It doesn’t matter what the file is called or what extension it has, if it has a signature then you have a way of identifying the file type. These are often called magic numbers because a piece of text can be represented as a sufficiently long number. In fact, any and all data can be considered just a very, very long number, but I may be straying off the point.

There are a number of very common file signatures you’ll see, including:

Signature  Hex                                Type
MZ         4D 5A                              Executable code (.exe)
PK         50 4B 03 04                        Compressed Zip (.zip)
Rar!       52 61 72 21                        Compressed Rar
BM         42 4D                              Bitmap Image (.bmp)
-          FF D8 FF E0 ** ** 4A 46 49 46 00   JPEG Image (.jpg, .jpeg)
%PDF       25 50 44 46                        PDF Document (.pdf)
‰PNG       89 50 4E 47                        PNG Image (.png)
OggS       4F 67 67 53                        Ogg Vorbis Media e.g. audio, video (.ogg)

For a more extensive list of file signatures check this page out.
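Checking a signature by eye can also be done in code. The following is a minimal C# sketch that compares the first bytes of a file against a handful of well-known signatures. The `SignatureSniffer` class and the returned labels are made up for illustration, and a real tool would want a much longer list.

```csharp
using System;

public static class SignatureSniffer
{
    // True if data begins with the given signature bytes.
    private static bool StartsWith(byte[] data, params byte[] signature)
    {
        if (data.Length < signature.Length) return false;
        for (int i = 0; i < signature.Length; i++)
        {
            if (data[i] != signature[i]) return false;
        }
        return true;
    }

    // Guesses the file type from its first few bytes, ignoring the extension entirely.
    public static string Identify(byte[] header)
    {
        if (StartsWith(header, 0x89, 0x50, 0x4E, 0x47)) return "PNG image";
        if (StartsWith(header, 0x50, 0x4B, 0x03, 0x04)) return "Zip archive";
        if (StartsWith(header, 0x25, 0x50, 0x44, 0x46)) return "PDF document";
        if (StartsWith(header, 0x52, 0x61, 0x72, 0x21)) return "Rar archive";
        if (StartsWith(header, 0x4F, 0x67, 0x67, 0x53)) return "Ogg media";
        if (StartsWith(header, 0xFF, 0xD8, 0xFF)) return "JPEG image";
        if (StartsWith(header, 0x42, 0x4D)) return "Bitmap image";
        if (StartsWith(header, 0x4D, 0x5A)) return "Executable";
        return "Unknown";
    }
}
```

The bytes could come from something like `System.IO.File.ReadAllBytes(path)`. Short signatures such as BM and MZ are only two bytes long, so a match on those is a strong hint rather than proof.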

Some Examples

Before continuing there are three examples you need to download, in the case of the images you will need to right click the link and hit “save as” or the equivalent option on your browser.

Example 1
Example 2
Example 3

Now for each of the examples, I want you to open it up in XVI32 and have a look inside.

Example 1

Once we open the first example in XVI32 we see that the file header starts with the per mil symbol followed by PNG in the text view. This file clearly is a PNG image.

Example 2

With example two, we see that the file signature is PK (named after the format’s author Phil Katz, but I often translate PK as Packed). This file is a zip as we can see in the table of signatures above.

Example 3

The final example has the file signature BM and is therefore a bitmap image. Ignore the F, that byte is actually part of a variable in the bitmap format that says how big the file is.

Example 3 In Depth

Let’s take a deeper look at the third example now that you’ve got it open. We know that the file is a bitmap image, so let’s take a look at it in Windows Photo Viewer. It’s a small image so you’ll have to zoom in.

If we open up the file in Paint, view it in Explorer or open its properties we’ll see the file is a 2 by 2 pixel image. I’ve created this small image to demonstrate the format more simply.

File Headers

In addition to the file signature, most files have a file header which includes some basic information about the file. In the case of an image this may include its dimensions and colour depth/quality. In the case of audio this may be the duration, number of channels and bit rate.

In the above image I’ve highlighted the file’s header. In the case of a Windows Bitmap the file header is 54 bytes long. To highlight a section in XVI32, select the first byte then select Edit -> Block <n> chars and type 54 in decimal mode.

You can see a couple of 02s in the header, so a reasonable assumption would be that one represents width and one represents height. We can also see an 0x18, which as a decimal is 24, so another reasonable assumption is that this is the colour depth, specifying 24 bit colour. For now don’t worry about colour depth, I’ll talk about colour in a dedicated article later in this series.
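Those assumptions can be checked in code. In the standard 54 byte Windows bitmap header, the width is a little-endian 32 bit integer at offset 18, the height is one at offset 22 and the colour depth is a little-endian 16 bit integer at offset 28. The sketch below reads those three fields; the `BitmapHeaderReader` class is a made-up name and it only handles this one header layout.

```csharp
using System;

public static class BitmapHeaderReader
{
    private static int ReadInt32(byte[] d, int o)
    {
        return d[o] | (d[o + 1] << 8) | (d[o + 2] << 16) | (d[o + 3] << 24);
    }

    private static int ReadInt16(byte[] d, int o)
    {
        return d[o] | (d[o + 1] << 8);
    }

    // Reads width, height and colour depth from a bitmap with a 54 byte header.
    public static void Read(byte[] file, out int width, out int height, out int bitsPerPixel)
    {
        // 42 4D is the "BM" signature.
        if (file.Length < 54 || file[0] != 0x42 || file[1] != 0x4D)
            throw new ArgumentException("Not a Windows bitmap with a 54 byte header.");

        width = ReadInt32(file, 18);
        height = ReadInt32(file, 22);
        bitsPerPixel = ReadInt16(file, 28);
    }
}
```

For a 2 by 2 pixel, 24 bit image this should report a width of 2, a height of 2 and 24 bits per pixel.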

After the header we have 16 bytes. We know there are four pixels (2×2) in the image, so it’d be sensible to assume that those four pixels are represented in these 16 bytes.

Opening the file in Paint, we can use the dropper tool to pick the colour of each pixel. By going to edit colour, we can then see the colour in its red, green and blue components. You can do this manually, or use the figures I’ve shown below. I’ve also converted the values to hex for you.

Pixel         Channel  Decimal  Hex
Top Left      Red      34       22
Top Left      Green    177      B1
Top Left      Blue     76       4C
Top Right     Red      255      FF
Top Right     Green    242      F2
Top Right     Blue     0        00
Bottom Left   Red      255      FF
Bottom Left   Green    127      7F
Bottom Left   Blue     39       27
Bottom Right  Red      237      ED
Bottom Right  Green    28       1C
Bottom Right  Blue     36       24

Now using the hex values worked out for each colour, we can spot them in the file. We can spot each set of three in reverse order, stored as Blue, Green then Red. The reason for this different ordering is something I’ll talk about in a later article. We can also see that the bottom left pixel is first, followed by the bottom right pixel, then followed by two 00 bytes. Immediately after is the top left pixel, followed by the top right pixel and two more null (00) bytes.

Ignore the two sets of two null bytes. These are due to a nuance of the bitmap format, which pads the number of bytes representing a row of pixels to a multiple of 4 (in this case we have 6 bytes representing a row, so it adds 2 blank bytes to reach a total of 8 bytes and therefore a multiple of 4).
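The padding rule and the bottom-up row order can be combined to work out where any pixel lives in the file. This is a rough C# sketch, assuming a 24 bit image with the 54 byte header discussed earlier; the `BitmapLayout` class and its method names are made up for illustration.

```csharp
using System;

public static class BitmapLayout
{
    public const int HeaderSize = 54;

    // Bytes used per row: 3 bytes per pixel, rounded up to a multiple of 4.
    public static int RowStride(int width)
    {
        int rawBytes = width * 3;
        return (rawBytes + 3) / 4 * 4;
    }

    // File offset of a pixel's first (blue) byte.
    // x counts from the left and y counts from the top of the image;
    // rows are stored bottom-up, so the top row comes last in the file.
    public static int PixelOffset(int width, int height, int x, int y)
    {
        int rowInFile = height - 1 - y;
        return HeaderSize + rowInFile * RowStride(width) + x * 3;
    }
}
```

For the 2 by 2 example, `RowStride(2)` is 8, the bottom left pixel starts at offset 54 (36 in hex) and the top right pixel starts at offset 65 (41 in hex).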

Editing Data

So now that we know where the colour data is in the file, let’s try changing it.

Let’s pick the top right pixel, which is yellow. Let’s change it to blue. Right now it’s represented as 00 F2 FF, so since this is in Blue Green Red order rather than Red Green Blue, changing the value to FF 00 00 will give a strong blue. To edit the values, select the first byte in the “00 F2 FF” sequence and make sure that it says Overwrite in the status bar. If it says Insert then tap the insert key once. The insert key toggles between Overwrite and Insert modes. Now simply type FF 00 00 on your keyboard and hit save.

Opening the file up in Windows Photo Viewer and zooming in, we now see that the top right pixel is blue.

Congratulations, you have made your first successful and practical edit in a hex editor!

In this article, I’ve talked briefly about how to identify common file types regardless of their file extension. I’ve also shown some basic hex editing in a practical example, editing a bitmap image. In the next article I’m going to go into data types which combine multiple bytes to represent larger numbers.

Before I start this article, I need to define a couple of terms:

  1. Byte – This is a data type which can store 256 discrete variations. Typically it’s said to have a minimum value of 0 and a maximum value of 255 (thus 256 variations including the 0). All files can be considered to just be a series of bytes.
  2. Hex or Hexadecimal – A decimal or base 10 number only allows 10 different variations per digit and must add another digit to include numbers which exceed that range (e.g. 8, 9, 10, 11 / 98, 99, 100, 101). A hexadecimal number instead stores 16 variations per digit. After 9, the first six letters of the alphabet are used (e.g. 8, 9, A, B, C, D, E, F, 10, 11 / FE, FF, 100, 101).

All files are divided into two categories – plain text and binary.

Plain text files, as the name suggests, just contain text. They cannot contain images, sound, video or any form of text styling unless they mark it up. Examples of these files include .txt, .ini, .csv, .html, .php. These files can be opened in your system’s default plain text editor, such as Notepad on Windows, and will load and display fine.

Plain text files load and display fine in plain text editors

Binary files however can store a much wider range of data. Your camera photos, mp3s and videos are all binary files. Rather than limit the data storage to just text, binary files can make use of a larger range of encoding which means that we can’t view these files properly in a plain text editor. This is shown in the image below, in which I’ve tried to load a bitmap image into Notepad.

By contrast, loading a binary file in a plain text editor is not a good idea

So to view and edit binary files, we clearly need a different tool. If we know the file format then we could load the file into the relevant editor, loading images into Photoshop or Paint for example. But this is no good to us if we don’t know the file format, if there isn’t a relevant editor yet or if we want to examine the internal data structure.

Hex editors can display and edit binary data in a very helpful and effective way. They are not limited to text characters and can be used to display and edit the full range of values in each byte. Text editors, by contrast, display binary data badly and don’t support changing the value of non-text data.

XVI32 is a freeware hex editor for Windows

My personal favourite hex editor is XVI32 which can be downloaded for free on Windows. It’s quite a lightweight and functional hex editor and while there are many hex editors out there offering a greater range of features, I like the simplicity and straightforwardness of XVI32.

In the above screenshot we can see three columns. The thin left column is the line number, displayed in hex. The number shown represents the index or offset of the first entry on that line. For example, B means 11 in decimal, and if you count across the boxes in either of the other two columns you’ll see that each line is also 11 boxes across. These line groupings do not exist in the actual file; this is just how it’s displayed in the editor, a bit like word wrapping text.

The middle column displays the hexadecimal view of the file, while the right column shows an ASCII or text view of the file. Each box in the hex and text views represents a byte in the file.
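The three-column layout can be imitated in a few lines of code. This is a minimal sketch in Python (an assumed language), not how XVI32 itself is implemented. The function name `hex_dump`, the sample data and the choice to print a dot for bytes with no printable ASCII character are all assumptions made for the example.

```python
def hex_dump(data, width=16):
    """Return display lines made of: offset in hex, hex view, text view."""
    lines = []
    for offset in range(0, len(data), width):
        row = data[offset:offset + width]
        # Middle column: every byte as exactly two hex digits.
        hex_part = " ".join(format(b, "02X") for b in row)
        # Right column: printable ASCII as-is, anything else as a dot.
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in row)
        # Left column: offset of the first byte on this line, in hex.
        lines.append(f"{offset:X}  {hex_part.ljust(width * 3 - 1)}  {text_part}")
    return lines


# Made-up sample data, wrapped at 11 bytes per line as in the description above.
sample = b"Hello, hex editor!\x00\x01"
for line in hex_dump(sample, width=11):
    print(line)
```

With a width of 11 the second line starts at offset B, and changing `width` changes only the wrapping, not the data.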

So why is hex useful? Why not represent the values of each byte as a decimal?

Hexadecimal numbers have the useful property that with two digits they can represent 256 discrete variations, just like a byte. So rather than use 3 digits to represent the value of each byte, we can use two digits to their full range. The minimum hex value for a byte is 0 (or 00) and the maximum hex value for a byte is FF.

To convert between decimal and hexadecimal you can use the built-in calculator on Windows (or use a site like this). Depending on your version of Windows you may need to change to either scientific or programmer mode before the hex and decimal options are available. To convert a number, type it into one mode then select the other mode. For example, typing 255 in decimal mode and then selecting hex mode displays FF.
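The same conversions can also be done in code. The sketch below uses Python (an assumed language) and its built-in `format` and `int` functions; the wrapper names `dec_to_hex`, `hex_to_dec` and `byte_to_hex` are made up for this example.

```python
def dec_to_hex(number):
    """Decimal integer to hexadecimal digits, e.g. 255 gives 'FF'."""
    return format(number, "X")


def hex_to_dec(digits):
    """Hexadecimal digits to a decimal integer, e.g. 'FF' gives 255."""
    return int(digits, 16)


def byte_to_hex(value):
    """A byte always fits in two hex digits, padded with a leading 0."""
    if not 0 <= value <= 255:
        raise ValueError("a byte must be between 0 and 255")
    return format(value, "02X")


# Convert a few numbers to hex and back again.
for number in (0, 11, 255, 1998):
    digits = dec_to_hex(number)
    print(number, "->", digits, "->", hex_to_dec(digits))

# Single bytes as a hex editor would show them.
print(byte_to_hex(0), byte_to_hex(11), byte_to_hex(255))
```

Converting each number to hex and back should give the number you started with, which is a handy check when practising by hand.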


In the next post, I’ll explore how to use a hex editor and look at some common data types. Before reading that post however, it would be useful to try opening a few different file types in a hex editor just to get a feel for it. It would also be very helpful to try converting a few different numbers between hex and dec.

Core Concepts Of Reverse Engineering: Part 1 – Introduction

January 29th, 2012 | Posted by mvi in Reverse Engineering - (Comments Off)

This is the first in a series of mini articles on reverse engineering binary files, particularly those in computer games, although most of what I will discuss applies to all areas of computer software. While I may touch on reverse engineering executable code, memory hacking and plain text formats, the main focus is on actual binary files.

Since 2009 I’ve been reverse engineering and modifying the classic Westwood RTS Dune 2000. Although the game was released in 1998, its modding community was held back by a limited number of tools. For example, while it was possible to edit and make new multiplayer maps for the game, it was not possible to make campaign maps and missions. A couple of guys had been exploring the mission files and had made small, but significant, discoveries. This was when I joined the modding scene, and since then I’ve reverse engineered a whole range of different files in the game and released a bunch of tools based on my research.

When I started, I had no reverse engineering experience. I’d never used a hex editor and had no idea about the binary representation of data. If you’re in the same boat, then don’t worry: in this series of articles I’m going to talk about what I’ve learned, both the conceptual theory and the practical techniques that I employ. The series is particularly aimed at computer programmers who have no reverse engineering experience and would like to get started.

As with programming, the initiative is all on you. You will apply the theory and techniques to a different set of problems and formats than I did. As such you will need to extend and modify them to fit your needs and make a few leaps on your own to successfully reverse engineer a complex format. But if you start off small and work your way up, once you understand the basic theory you’ll find reverse engineering is actually very straightforward.

DodgeFoot Released

September 16th, 2011 | Posted by mvi in Production - (Comments Off)

My first Facebook game, DodgeFoot, has just been released. I’ve been working on it part-time for the last four weeks with a couple of friends from University. It was the first time we’d made a game for Facebook and it was also the first time we’d made a game with the Unity3D game engine. It was a great experience and I will be releasing a post-mortem very shortly, but in the meantime:
Please check out the game :-)