Just to set you all off again, I'm gonna say a few things.
First, the stage "1" does work very well, but really only on things like low-bit WAVs and text files (they almost always have 7-bit values, so 1/8th of the file is just a load of 0's), and only if it is the first thing done to them. The name "stage 1" is misleading, because it really comes after another pre-processor that sorts out the basic bit-length of the data (i.e. 4-bit for icons, 8-bit for 256-colour bitmaps, and so on). It seems to work well.
You start with 3 bits, read through the whole file, and create an index of the occurrences of every possible 3-bit combo:
Code  Value  Occurrences
000   0      7
001   1      3
010   2      6
011   3      3
100   4      65
101   5      7
110   6      3
111   7      5
Repeat for 4-, 5-, 6-, 7-, 8- and 16-bit patterns (this takes most of the time!).
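For anyone who wants to try the counting step, here is a minimal Python sketch. It is not the actual code. The names `count_patterns`, `build_index` and `WIDTHS` are made up for illustration, and it assumes the file is read as non-overlapping chunks, most significant bit first, starting at bit 0 (the post does not say whether the patterns overlap).

```python
from collections import Counter

# Pattern widths named in the post: 3 bits first, then 4, 5, 6, 7, 8 and 16.
WIDTHS = (3, 4, 5, 6, 7, 8, 16)


def count_patterns(data: bytes, width: int) -> Counter:
    """Count every width-bit pattern in data.

    ASSUMPTION: chunks are non-overlapping and read MSB-first.
    Trailing bits that do not fill a whole chunk are ignored.
    """
    total_bits = len(data) * 8
    as_int = int.from_bytes(data, "big")  # whole input as one big integer
    mask = (1 << width) - 1
    counts = Counter()
    for i in range(total_bits // width):
        # Shift so that chunk i sits in the lowest `width` bits.
        shift = total_bits - (i + 1) * width
        counts[(as_int >> shift) & mask] += 1
    return counts


def build_index(data: bytes) -> dict:
    """Return {width: Counter of patterns} for every width in WIDTHS."""
    return {w: count_patterns(data, w) for w in WIDTHS}
```

Calling `build_index(open(path, "rb").read())` gives one occurrence table per width. Treating the whole file as a single integer keeps the sketch short but gets slow on big inputs, which fits the remark that this step takes most of the time.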
Then, sort from the highest occurrence count to the lowest (I just use qsort). The bit-length that has the most frequently occurring bit pattern (of course the occurrence count is reduced if the bit size is smaller) is generally the data size.
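A rough Python sketch of the sort-and-pick step follows. The post does not say how the count is reduced for smaller bit sizes, so the weighting used here (top count multiplied by the width) is purely an assumption to make the example concrete, and `rank_patterns` and `guess_width` are hypothetical names. `Counter.most_common()` stands in for qsort.

```python
from collections import Counter


def rank_patterns(counts: Counter) -> list:
    """Return (pattern, count) pairs, most frequent first."""
    return counts.most_common()


def guess_width(index: dict):
    """Pick the width whose most common pattern scores highest.

    ASSUMPTION: score = top count * width. The post only says the count
    is "reduced" for smaller bit sizes and gives no formula.
    Returns None if every table is empty. Ties go to the smaller width.
    """
    best_width, best_score = None, -1
    for width in sorted(index):
        ranked = rank_patterns(index[width])
        if not ranked:
            continue
        score = ranked[0][1] * width
        if score > best_score:
            best_width, best_score = width, score
    return best_width


# The 3-bit table from the post: pattern 100 (value 4) is on top with 65.
three_bit = Counter({0: 7, 1: 3, 2: 6, 3: 3, 4: 65, 5: 7, 6: 3, 7: 5})
top_pattern, top_count = rank_patterns(three_bit)[0]
```

Here `index` is a dict mapping each width to its occurrence table. A different weighting could easily pick a different width, so treat the result as a guess, not a measurement.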
Then, "stage 1" uses this bit-length information to arrange the bits accordingly.
The same occurrence-counting process is then used again, and works a lot better on the re-organised data. Then, a version of the Huffman codec is used that deals with bit-patterns rather than byte-patterns.
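To show what Huffman coding over bit patterns can look like, here is a small Python sketch that builds a code table straight from an occurrence table. It is the textbook Huffman construction, not the codec described in the post, and `huffman_code` is a name invented for this example.

```python
import heapq
from collections import Counter


def huffman_code(counts: Counter) -> dict:
    """Map each pattern (an int) to a prefix-free string of '0'/'1'."""
    if not counts:
        return {}
    if len(counts) == 1:
        # A lone symbol still needs one bit to be written out.
        return {next(iter(counts)): "0"}
    # Heap entries: (weight, tiebreaker, {pattern: code so far}).
    # The unique tiebreaker means the dicts are never compared.
    heap = [(n, i, {sym: ""}) for i, (sym, n) in enumerate(sorted(counts.items()))]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)   # two lightest subtrees
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]


# The 3-bit table from the post (99 patterns, 297 bits uncoded).
table = Counter({0: 7, 1: 3, 2: 6, 3: 3, 4: 65, 5: 7, 6: 3, 7: 5})
codes = huffman_code(table)
encoded_bits = sum(table[p] * len(codes[p]) for p in table)
```

With that table, the dominant pattern 100 gets a one-bit code and the rarer patterns get longer ones, so `encoded_bits` comes out well below the 297 bits of the uncoded patterns. The code table itself would also have to be stored, which this sketch ignores.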
As I'm slowly giving away my so-called secrets I might as well make the final thing open-source [perhaps to prove you all wrong......I can't wait to see your faces!!]. No point in getting that patent now.
And for those of you who've spammed my mailbox asking for a demo: STOP EMAILING ME ASKING FOR A DEMO
So, what do you think of my compression methods so far?
Any comments? (Like I need to ask)..