Notice, this document contains information that may be sensitive, but the contents are protected entirely under law through Free Speech in the United States, the country through which it was written.
The document writer is not responsible for any actions taken with this information or the outcomes thereof.
The terms Obfuscation and Encryption may be used interchangeably within this document.
I begin by giving an overview of this obfuscation. The overall obfuscation is made up of 3 parts. A needle in a haystack (or in this case, a seed in a junkyard), a lookup table (or swap table), and the encrypted data itself. The encrypted data is then made up of two parts: the lookup table, and basic addition encryption.
The lookup table is like one of those kid's spy decoders, or a newspaper cryptogram, where each letter is associated (and replaced) with another letter. The lookup table is generated by swapping translation results with their neighbors multiple times, making a fairly poor and inefficient randomized table, but quickly and easily done with sufficient results. The addition encryption uses a seed for how much to add to each character (after it's been translated via the lookup table) − it just basically adds that seed to the character (thus A + 1 = B), wrapping it back to 0 if it exceeds 255. The seed is the position of the character in the stream (or the file position of the character if in a file), or more efficiently, the seed begins as the position of the first character, and each successive call increases the seed by 1.
Although you may bypass the table by mathematically calculating each byte with its lookup, it would be inefficient and more work on your part, whereas a table provides a very elegant and quick lookup for each byte. Memoization may also be possible, but if every cell is used and any 1 cell is reused, it would be less efficient than just generating the table in the beginning, since a lookup is faster than an if…lookup.
To generate the swap table, a seed is necessary. This seed is then split into its upper 250 and lower 250 parts, so 249 = 0,249 and 250 = 1,0 and 251 = 1,1. Then 6 is added to the lower part. The table starts out with 2 columns and 256 rows, each cell in the left column initially containing its row value (starting with 0, ending at 255).
Cells in the first column are then swapped with their next neighbor down to an extent of 10000 swaps, such that the cell that each swap is performed on is 1 + ((iteration * upper + lower) % 254)
Where
iteration
is the current swap iteration (starting with 1, each swap then adding 1, proceeding to and including 10000),
%
indicates a modulation, and
upper
and lower
indicate the parts of the seed
You will notice that the topmost cell (cell 0,0) does not get swapped. Furthermore, if the seed is 248, the list will remain in order, unchanged.
To populate the right column of the table, each cell is assigned the row number of the left cell that points to it. Thus, the left cell points to a right cell, and the right cell points back to its corresponding left cell. This, of course, need not apply to the topmost cells, which both contain 0.
Python (By Quadduc)
def generateSwapTable(seed): a = 6 + (seed % 250) b = seed / 250 table = (range(0, 256), range(0, 256)) for i in range(1, 10001): j = 1 + ((i * a + b) % 254) table[0][j], table[0][j + 1] = table[0][j + 1], table[0][j] for i in range(1, 256): table[1][table[0][i]] = i return table
C/C++ (By IsmAvatar, translated from Python and tested in gcc)
int table[2][256]; void generateSwapTable(int seed) { int a, b, i, j, t; a = 6 + (seed % 250); b = seed / 250; for (i = 0; i < 256; ++i) { table[0][i] = i; } for (i = 1; i < 10001; ++i) { j = 1 + ((i * a + b) % 254); t = table[0][j]; table[0][j] = table[0][j+1]; table[0][j+1] = t; } table[1][0] = 0; //this operation is optional, as 0 is default for (i = 1; i < 256; ++i) { table[1][table[0][i]] = i; } }
Java (By IsmAvatar, translated from C and tested in javac)
int[][] generateSwapTable(int seed) { int[][] table = new int[2][256]; int a = 6 + (seed % 250); int b = seed / 250; for (int i = 0; i < 256; i++) table[0][i] = i; for (int i = 1; i < 10001; i++) { int j = 1 + ((i * a + b) % 254); int t = table[0][j]; table[0][j] = table[0][j + 1]; table[0][j + 1] = t; } table[1][0] = 0; //this operation is optional, as 0 is default for (int i = 1; i < 256; i++) table[1][table[0][i]] = i; return table; }
In order to de-obfuscate a set of data, we will treat this data like a stream (in other words, a file would work fine). Generally, the data is stored in little-endian. We'll have two stream, one is our input stream, which stores the obfuscated data, while the other will be our output stream, which will store the de-obfuscated data. You may consider using the data as soon as it is de-obfuscated, or use the stream as a buffer, so as to preserve efficiency (as opposed to completing the stream and then re-opening it to read the data from it.
We start the output stream off by transferring over the first 8 bytes or 64 bits from input to output, since this data is not obfuscated so as to determine what kind of file it is and what version it is. This is so older versions know not to bother with it, and newer versions know that it indeed needs to go out of its way to de-obfuscate it. Next, we read in two 32-bit integers (but don't store them in our output stream), we'll call them Bill and Fred, respectively. Bill and Fred are what I like to call Trash-Treasure-Men. I call them this because they each hold a number, which is how much trash we have to pass until we reach the treasure, which is the swap table seed, and how much trash we have to pass until we get out of the dump. Indeed, one of the tricks of this obfuscation method is to store a lot of junk just to hide the treasure. Thus, we skip forward Bill * 4 bytes (that is, if Bill is 3, we'll skip 3 integers, or 3*4=12 bytes, or 3*32=96 bits). We read one more 32-bit integer, and it will be our Seed, so we can use that to generate our swap table (which I explained above). We then skip Fred * 4 bytes (remember Fred from before? He's still with us). Now we read in 1 byte (8 bits) since that byte is not obfuscated. The remainder of the bytes are then read one at a time, translated, and then written to the output stream.
The translation consists of the input stream/file position of the byte subtracted from the byte's swap table lookup (on the right column), or lookup − position
. Don't forget that the position is 0-offset, so if you're reading 1-offset or have already read the byte before reading its position, you may need to subtract 1 from the position or add 1 to the result. You are encouraged to, for efficiency, only note the position once in a variable, and from then on refer to that variable, rather than the actual stream position repeatedly (increase the variable each time it is used). We then ensure that the number is only an 8-bit byte by saying result & 255
where & indicates bitwise and, thus truncating all bits except the least-significant 8 bits (resultantly truncating the negative bit, if one exists, and wrapping the number back to 0 each time it exceeds 256). The resulting algorithm is this (notice, I've added 1 because the position is read after the byte in this case:
out.write((table[1,byte] − in.position() + 1) & 255)
You should read de-obfuscation in order to understand the concept of the obfuscation the best. Again, we will use two streams − one for the unadjusted data, called input, and one for the obfuscated data, called output this time. The first 8 bytes again are transferred over as-is, for use in determining file type and version. We then randomly generate two 32-bit integers, called Bill and Fred, each generated by random(3000) + N
where N
is 123
for Bill, and 231
for Fred, and where random(N)
produces a random integer R such that 0 ≤ R < N
. Both of these numbers are then written to the file, in the order of Bill, and then Fred. Then, we write several junk 32-bit integers to the file, each randomly generated by random(3000)
(with no number added this time). The number of these integers is determined by Bill. We then randomly generate a 32-bit Seed using the formula random(25600) + 3328
. This seed is written to file and used to generate the Swap Table. We then write more junk to the file, in the same fashion as before, this time using Fred to determine how many junk 32-bit integers, each generated by random(3000) again. 1 byte is then transferred over from input to output, as is, since, again, this byte is not obfuscated. The remainder of the bytes are then read one at a time, translated, and then written to the output stream. The translation is now reverse that of De-Obfuscation. We take the byte's value and add the output stream/file position. Don't forget that the position is again 0-offset, so if you are reading 1-offset, subtract one. Since we can guarantee that the position is read before the byte is written, no change is necessary there. The result is then wrapped back into the 255 range via & 255
and then the overall result is looked up on the left side of the table, which gets written to the output stream. Thus:
out.write(table[0,(byte + out.position()) & 255])
Notice that the code below is only tested in Python, but I made sure the Java and C code compiled in javac and gcc respectively.
Python (By Quadduc with modifications by IsmAvatar)
def deobfuscate(stin, stout): stout.write(stin.read(8)) junk = struct.unpack('<II', stin.read(8)) stin.seek(4 * junk[0], 1) (seed,) = struct.unpack('<I', stin.read(4)) table = generateSwapTable(seed) stin.seek(4 * junk[1], 1) stout.write(stin.read(1)) while True: b = stin.read(1) if len(b) == 0: break stout.write(chr((table[1][ord(b)] - stin.tell() + 1) & 0xFF))
C/C++ (By IsmAvatar, uses a 512 byte buffer for efficiency, translated from Python)
void deobfuscate(FILE *stin, FILE *stout) { char buffer[BUFSIZ]; int a, b, c; fread(buffer,1,8,stin); fwrite(buffer,1,8,stout); fread(&a,4,1,stin); fread(&b,4,1,stin); fseek(stin,(long) a * 4, SEEK_CUR); fread(&c,4,1,stin); generateSwapTable(c); fseek(stin,(long) b * 4, SEEK_CUR); fread(buffer,1,1,stin); fwrite(buffer,1,1,stout); c = (int) ftell(stin); while ((a = fread(buffer,1,BUFSIZ,stin)) > 0) { for (b = 0; b < a; ++b) { buffer[b] = (table[1][(int) buffer[b]] - c++) & 255; } fwrite(buffer,1,a,stout); } }
Python (By Quadduc with modifications by IsmAvatar)
def obfuscate(stin, stout): stout.write(sin.read(8)) a = random.randint(123,3122) b = random.randint(231,3230) stout.write(struct.pack('<II', a, b)) for x in range(0,a) : stout.write(struct.pack('<I', random.randint(0,2999))) seed = random.randint(3328, 28927) stout.write(struct.pack('<I', seed)) table = generateSwapTable(seed) for x in range(0,b) : stout.write(struct.pack('<I', random.randint(0,2999))) stout.write(stin.read(1)) while True: b = stin.read(1) if len(b) == 0: break stout.write(chr(table[0][(ord(b) + stout.tell()) & 0xFF]))
C/C++ (By IsmAvatar, uses a 512 byte buffer for efficiency, translated from Python)
void obfuscate(FILE *stin, FILE *stout) { char buffer[BUFSIZ]; int a, b, junk; srand(time(NULL)); fread(buffer,1,8,stin); fwrite(buffer,1,8,stout); a = rand() % 3000 + 123; b = rand() % 3000 + 231; fwrite(&a,4,1,stout); fwrite(&b,4,1,stout); while (a-- > 0) { junk = rand() % 3000; fwrite(&junk,4,1,stout); } junk = (rand() % 25600) + 3328; fwrite(&junk,4,1,stout); //seed generateSwapTable(junk); while (b-- > 0) { junk = rand() % 3000; fwrite(&junk,4,1,stout); } fread(buffer,1,1,stin); fwrite(buffer,1,1,stout); junk = (int) ftell(stout); while ((a = fread(buffer,1,BUFSIZ,stin)) > 0) { for (b = 0; b < a; ++b) { buffer[b] = table[0][(((int) buffer[b]) + junk++) & 255]; } fwrite(buffer,1,a,stout); } }
Java (By IsmAvatar, translated from Python)
int readInt(InputStream stin) throws IOException { int a = stin.read(); int b = stin.read(); int c = stin.read(); int d = stin.read(); return (a | (b << 8) | (c << 16) | (d << 24)); } void deobfuscate(InputStream stin, OutputStream stout) throws IOException { byte[] buf = new byte[8]; stin.read(buf); stout.write(buf); int a = readInt(stin); int b = readInt(stin); stin.skip(a * 4); int[][] table = generateSwapTable(readInt(stin)); stin.skip(b * 4); stout.write(stin.read()); b = a * 4 + b * 4 + 21; while ((a = stin.read()) != -1) stout.write((table[1][a] - b++) & 255); } void writeInt(OutputStream stout, int value) throws IOException { stout.write(value & 255); stout.write((value >> 8) & 255); stout.write((value >> 16) & 255); stout.write(value >> 24); } void obfuscate(InputStream stin, OutputStream stout) throws IOException { byte[] buf = new byte[8]; stin.read(buf); stout.write(buf); int a = (int) (Math.random() * 3000 + 123); int b = (int) (Math.random() * 3000 + 231); writeInt(stout,a); writeInt(stout,b); while (a-- > 0) stout.write((int) (Math.random() * 3000)); int seed = (int) (Math.random() * 25600 + 3328); int[][] table = generateSwapTable(seed); while (b-- > 0) stout.write((int) (Math.random() * 3000)); stout.write(stin.read()); b = a * 4 + b * 4 + 21; while ((a = stin.read()) != -1) stout.write(table[0][(a + b++) & 255]); }