Advertisement

Decryption Puzzle (Not computer related)

Started by June 05, 2010 11:43 PM
5 comments, last by sunandshadow 14 years, 5 months ago
Is anyone a hobbyist cryptanalyst? I have a situation where I have a set of original unencrypted data, and the encoded versions of the same data, and I want to use these two sets to figure out how they were encoded. But I have no idea how to do that.

Here's what the data looks like:
- Both the original data and the encoded data are 16 characters long (I'm referring to them as A through P).
- Every 'message', whether original or encoded, has one of each character (one A, one B, one C, etc.) The encoding only changes the order they are in. The code is not a simple rotation though.
- For each original message there is only one possible encoded version. I don't know if the same is true in reverse.
- I have 1024 examples of original message and their encoded versions.
- Example 'message' and encoded 'message':
NOPM-ABCD-EFGH-IKJL -> HGFE-NOPM-IJKL-BCDA


So, what strategy could I use to find the encryption algorithm from my samples? As a first step, I am making a list of all the original messages next to all the encoded versions, but I don't know what to do after that besides the half-assed approach of color-coding the data and hoping a pattern jumps out at me.

I want to help design a "sandpark" MMO. Optional interactive story with quests and deeply characterized NPCs, plus sandbox elements like player-craftable housing and lots of other crafting. If you are starting a design of this type, please PM me. I also love pet-breeding games.

Do you know if the original system was able to decode the result back to the original? Do you know if there is a secret key involved or if it uses those 16 characters as the only input?

Are the input letters always A-P, or are those placeholders you're using to look for patterns more easily? If they're placeholders, are there any rules about what the original system will accept as input?

In your example, there are four groups of letters. After encoding, the same letters are in the same groups, though the groups themselves and the letters in each group are shuffled. Does that happen for all 1024 versions? If so, it drastically reduces the number of permutations.


If the letters-always-in-same-group rule holds true, I would start by sorting your 1024 examples by the group-shuffling pattern (there are only 24 permutations, so you should have around 42 examples each for an even distribution).

Make a permutation indexing table:

0 = ABCD
1 = ABDC
2 = ACBD
3 = ACDB
4 = ADBC
5 = ADCB
6 = BACD
...
23 = DCBA

Then substitute in the permutation index for the inter-group permutation and well as each intra-group permutation.

My suspicion is that the algorithm either: a) Swaps stuff many times wastefully, based on each input letter it encounters or b) Performs some hash on the overall input to pick permutation indices to use for each of the 5 permutation sets.

[Edited by - Nypyren on June 6, 2010 12:04:08 AM]
Advertisement
Any chance we could get a few more samples, would make it easier to determine what the pattern is. Looking at the sample you gave us, it seems as if the letters just get shuffled around a bit. I made up a quick and dirty test, so you can run a few more through to see if the letters are always shuffled the same way.

 #include <iostream>int main (int argc, char * const argv[]) {	char *in= "NOPMABCDEFGHIKJL";	char out[16] = "";	//Part1 HGFE	out[0] = in[11];	out[1] = in[10];	out[2] = in[9];	out[3] = in[8];	//Part2 NOPM	out[4] = in[0];	out[5] = in[1];	out[6] = in[2];	out[7] = in[3];	//Part3 IJKL	out[8] = in[12];	out[9] = in[14];	out[10] = in[13];	out[11] = in[15];	//Part4 BCDA	out[12] = in[5];	out[13] = in[6];	out[14] = in[7];	out[15] = in[4];			std::cout << out << std::endl;	    return 0;}
Quote: Original post by Nypyren
Do you know if the original system was able to decode the result back to the original? Do you know if there is a secret key involved or if it uses those 16 characters as the only input?

I don't know if the encoding was reversible, but I should be able to tell that by the time I am done making the list, because if 1 unique input always corresponds to one unique output it should be reversible, otherwise not.

I don't think there is a secret key, but it's possible the algorithm could use the input to decide what to do with that input. For example, I am just making this up, but maybe it could check if the first four letters are in alphabetical order and do one thing if they are, a different thing if they aren't.

Quote: Are the input letters always A-P, or are those placeholders you're using to look for patterns more easily? If they're placeholders, are there any rules about what the original system will accept as input?

The input letters are always A-P.

Quote: In your example, there are four groups of letters. After encoding, the same letters are in the same groups, though the groups themselves and the letters in each group are shuffled. Does that happen for all 1024 versions? If so, it drastically reduces the number of permutations.

This part I can answer, yes groups of 4 always stay together although the group can move within the message and the letters can be shuffled within the group. Also, the ABCD and MNOP groups, in either order, are always in either the first two positions or the last two positions in the input. Same for the EFGH and IJKL groups. This is not true for the output.


Quote: If the letters-always-in-same-group rule holds true, I would start by sorting your 1024 examples by the group-shuffling pattern (there are only 24 permutations, so you should have around 42 examples each for an even distribution).

Make a permutation indexing table:

0 = ABCD
1 = ABDC
2 = ACBD
3 = ACDB
4 = ADBC
5 = ADCB
6 = BACD
...
23 = DCBA

Then substitute in the permutation index for the inter-group permutation and well as each intra-group permutation.

My suspicion is that the algorithm either: a) Swaps stuff many times wastefully, based on each input letter it encounters or b) Performs some hash on the overall input to pick permutation indices to use for each of the 5 permutation sets.

Interesting, I will count permutations once I get the list of all the examples made (have to do it by hand, may take a couple of days).

I want to help design a "sandpark" MMO. Optional interactive story with quests and deeply characterized NPCs, plus sandbox elements like player-craftable housing and lots of other crafting. If you are starting a design of this type, please PM me. I also love pet-breeding games.

Quote: Original post by thannett
Any chance we could get a few more samples, would make it easier to determine what the pattern is.


Sure, here are more samples. I'm grateful for the help. [smile]
NOPM-ABCD-EGHF-LKJI -> MNOP-ABCD-EFGH-IJKL
NOPM-ABDC-KJLI-GHEF -> MNOP-GFHE-BCDA-IKJL
NOPM-ABDC-LIKJ-GHEF -> ABCD-IJKL-MNOP-GHFE
NOPM-ACBD-HEGF-IJKL -> ABCD-IJKL-GHFE-MNOP
NOPM-BCAD-LIJK-EFGH -> NPMO-ABCD-LIJK-FEGH
MNOP-CBDA-JKLI-FGHE -> NPOM-ABCD-KLIJ-GEHF

[Edited by - sunandshadow on June 6, 2010 1:57:45 AM]

I want to help design a "sandpark" MMO. Optional interactive story with quests and deeply characterized NPCs, plus sandbox elements like player-craftable housing and lots of other crafting. If you are starting a design of this type, please PM me. I also love pet-breeding games.

Quote: Original post by sunandshadow
NOPM-ABDC-LDKJ-GHEF -> ABCD-IJKL-MNOP-GHFE


LDKJ should be LIKJ?
Advertisement
Quote: Original post by Nypyren
Quote: Original post by sunandshadow
NOPM-ABDC-LDKJ-GHEF -> ABCD-IJKL-MNOP-GHFE


LDKJ should be LIKJ?


Yes thanks. That's always a problem when working with codes and typos, there's no spell check. -_-

I want to help design a "sandpark" MMO. Optional interactive story with quests and deeply characterized NPCs, plus sandbox elements like player-craftable housing and lots of other crafting. If you are starting a design of this type, please PM me. I also love pet-breeding games.

This topic is closed to new replies.

Advertisement