
MP3-Beating Compression

Started by April 06, 2000 01:58 PM
494 comments, last by kieren_j 24 years, 8 months ago
quote: Original post by kieren_j

Original File:     100mb
CAR File:          100mb
ZIPped Original:   99mb
ZIPped CAR:        3 bytes




Well.

I just created a 2byte file in dos edit. No content typed.

Zipped: 110bytes
Size of file within zip: 2 bytes

Try it for yourselves.

3 byte zip is not possible.
3 bytes packed content from a 100mb file? Doubt it :-)
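
The ZIP overhead mentioned above can be checked by building a one-file archive in memory and measuring it. This is a minimal sketch using Python's standard `zipfile` module; the member name `a.txt` and the two-byte payload are assumptions standing in for the DOS edit file.

```python
import io
import zipfile

def zip_size(name: str, payload: bytes) -> int:
    """Return the size in bytes of a ZIP archive holding one stored member."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_STORED) as archive:
        archive.writestr(name, payload)
    return len(buffer.getvalue())

# The archive carries a local file header, a central directory entry and an
# end-of-central-directory record, so it is much larger than its payload.
size = zip_size("a.txt", b"\r\n")
print(size, "bytes for a 2-byte payload")
```

The end-of-central-directory record alone is 22 bytes, so no valid ZIP archive can be 3 bytes long.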

Suppose the 100mb was all blank spaces. A run-length compression algo would need 1 byte for the value (32) and 2 bytes for the counter (how many) to fit in 3 bytes.
A 2-byte counter tops out at 65535 (65536 if it stores the length minus one), so that is the most that can be compressed before needing either a second such entry or a bigger counter.
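
The arithmetic can be sketched with a toy run-length coder. This is only an illustration of the 1-byte-value plus 2-byte-counter layout described above, not anything taken from CAR; the function names are made up, and the counter is assumed to store the run length directly, so one entry covers at most 65535 bytes.

```python
import struct

MAX_RUN = 0xFFFF  # largest run a 2-byte counter can hold (assumption: raw count)

def rle_encode(data: bytes) -> bytes:
    """Encode data as (value, run length) entries of 3 bytes each."""
    out = bytearray()
    i = 0
    while i < len(data):
        value = data[i]
        run = 1
        while i + run < len(data) and data[i + run] == value and run < MAX_RUN:
            run += 1
        out += struct.pack("<BH", value, run)  # 1 byte value, 2 byte counter
        i += run
    return bytes(out)

def rle_decode(blob: bytes) -> bytes:
    """Invert rle_encode."""
    out = bytearray()
    for offset in range(0, len(blob), 3):
        value, run = struct.unpack_from("<BH", blob, offset)
        out += bytes([value]) * run
    return bytes(out)

def entries_needed(length: int) -> int:
    """Number of 3-byte entries needed for one long run of `length` bytes."""
    return -(-length // MAX_RUN)  # ceiling division

spaces = b" " * 65536
print(len(rle_encode(spaces)), "bytes for 65536 spaces")
print(entries_needed(100 * 1024 * 1024) * 3, "bytes for 100 MB of spaces")
```

Even for this best case, 100 MB of a single repeated byte needs well over a thousand entries, so the result is a few kilobytes rather than 3 bytes.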




regards,

GeniX
www.cryo-genix.net
Everyone has already said that that is a gross exaggeration; he was just showing that the CAR file is only scrambled, not compressed. Of course 3 bytes is impossible.

For the patterns: perhaps if one did some data analysis to see what kind of patterns show up most frequently, you could bring that number down from 65k to maybe a couple hundred. As he said, that one bit order pattern tends to yield 20% higher compression rates. So if you had a bunch to try out, bit-wise only, and then a few on a larger scale (maybe), then it's fine. For any size file, only a handful of bits would be required to describe the descrambling pattern needed.

From his posts, however, he only had the one pattern. But from there you could have intelligent processors that can find a few better ones given the particular file.

I seem to think he's sincere and maybe right, maybe not. What I'm debating is the mathematical possibility. Since 1s and 0s are easy to compress when in a straight order, finding a common pattern that can put them in that order most of the time for most of the file will certainly allow 99% compression or better, depending on the number of patterns you allow it to check out.


Lack

Christianity, Creation, metric, Dvorak, and BeOS for all!
Just to set you all off again, I'm gonna say a few things.
First, stage "1" does work very well, but really only on things like low-bit WAVs and text files (they almost always have 7-bit values, making 1/8th of the file just a load of 1's), and only if it is the first thing done to them. The name "stage 1" is misleading, because it really comes after another pre-processor that works out the basic bit-length of the data (i.e. 4-bit for icons, 8-bit for 256-colour bitmaps, and so on); it seems to work well.
You start with 3 bits and read through the whole file, creating an index of the occurrences of every possible 3-bit combo:

Code   Value   Occurrences
000    0       7
001    1       3
010    2       6
011    3       3
100    4       65
101    5       7
110    6       3
111    7       5


Repeat for 4,5,6,7,8 and 16 bit patterns (this takes most of the time!).
Then, sort from highest occurrences to lowest occurrences (I just use qsort). The bit-length that has the most frequently occurring bit pattern (of course the occurrence count is reduced if the bit size is smaller) is generally the data size.
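
The counting step might look roughly like the sketch below. No CAR source was posted, so several things here are assumptions: the file is cut into non-overlapping, aligned chunks of each width; leftover bits at the end are ignored; and widths are compared by the share of chunks taken by the single most common pattern. The function names are hypothetical.

```python
from collections import Counter

def count_patterns(data: bytes, width: int) -> Counter:
    """Count each `width`-bit pattern over aligned, non-overlapping chunks."""
    bits = "".join(f"{byte:08b}" for byte in data)
    usable = len(bits) - len(bits) % width  # drop trailing bits that do not fill a chunk
    return Counter(bits[i:i + width] for i in range(0, usable, width))

def guess_bit_length(data: bytes, widths=(3, 4, 5, 6, 7, 8, 16)) -> int:
    """Pick the width whose most common pattern covers the largest share of chunks."""
    best_width, best_score = widths[0], -1.0
    for width in widths:
        counts = count_patterns(data, width)
        total = sum(counts.values())
        if total == 0:
            continue  # input shorter than one chunk of this width
        score = counts.most_common(1)[0][1] / total  # normalised top count (assumption)
        if score > best_score:  # strict comparison: the first width wins a tie
            best_width, best_score = width, score
    return best_width

sample = bytes([0x92, 0x49, 0x24])  # the bit pattern 100 repeated eight times
for pattern, occurrences in count_patterns(sample, 3).most_common():
    print(pattern, occurrences)
print("guessed width:", guess_bit_length(sample))
```

`Counter.most_common()` does the highest-to-lowest sort that the post does with qsort. A score this simple is easy to fool (a single 16-bit chunk also scores 1.0), so treat it as a starting point only.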
Then, "stage 1" uses this bit-length information to arrange the bits accordingly.
The same occurrence process is then used again, and works a lot better on the re-organised data. Then, a version of the huffman codec is used that deals with bit- rather than byte-patterns.
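
For reference, here is a minimal sketch of textbook Huffman coding over bit patterns rather than bytes, fed with the 3-bit counts from the table in this post. It is the standard construction, not the CAR codec (which has not been published), and `huffman_code` is a made-up name.

```python
import heapq
from itertools import count

def huffman_code(freqs: dict) -> dict:
    """Build a prefix code (symbol -> bit string) from symbol frequencies."""
    if len(freqs) == 1:
        return {next(iter(freqs)): "0"}
    tiebreak = count()  # unique counter so the heap never compares dicts
    heap = [(f, next(tiebreak), {sym: ""}) for sym, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, codes1 = heapq.heappop(heap)  # two least frequent subtrees
        f2, _, codes2 = heapq.heappop(heap)
        merged = {sym: "0" + code for sym, code in codes1.items()}
        merged.update({sym: "1" + code for sym, code in codes2.items()})
        heapq.heappush(heap, (f1 + f2, next(tiebreak), merged))
    return heap[0][2]

# Occurrence counts for the 3-bit patterns, as listed in the post.
freqs = {"000": 7, "001": 3, "010": 6, "011": 3,
         "100": 65, "101": 7, "110": 3, "111": 5}
code = huffman_code(freqs)
fixed_bits = 3 * sum(freqs.values())
huffman_bits = sum(freqs[sym] * len(code[sym]) for sym in freqs)
print(fixed_bits, "bits at fixed width,", huffman_bits, "bits with Huffman codes")
```

The saving comes entirely from how skewed the counts are: the dominant pattern gets a 1-bit code. On data where every pattern is about equally common, as in an already-compressed MP3, the same code saves almost nothing.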
As I'm slowly giving away my so-called secrets I might as well make the final thing open-source [perhaps to prove you all wrong......I can't wait to see your faces!!]. No point in getting that patent now.

And for those of you who've spammed my mailbox asking for a demo:
STOP EMAILING ME ASKING FOR A DEMO


So, what do you think of my compression methods so far?
Any comments? (Like I need to ask)..
Well, after posting about my extremely skeptical view of kieren_j last night, I've had a day to think things over. A few things clicked into place, and now I'm a lot less skeptical. After reading kieren_j's last post, I feel even less skeptical. It seems he could very well be onto something. But compressing 4 megs into 32k seems unlikely. I'm no longer saying it's not possible to get decent compression rates on an mp3 file, but those numbers just don't seem right. I still need a demo before I'll accept them. And in light of past hoaxes, I'll probably still be skeptical until I see some source code.

However, I most certainly hope you ARE right, kieren! Good luck!

-Lutrosis

#define WHOOPS 0
class DogClass {
public:
    CDog() { printf("Ruff!"); }
    Run() { printf("Run!"); }
    Crash() { printf("%d",100/WOOPS); }
};
DogClass CDog;
CDog.Run();
CDog.Crash();
Sell the ideas to a big company if you need the money. If you've got the money, patent it. If you don't care, release it! You'll be a legend.

This'd be the electronic equivalent to discovering how to do controlled nuclear fusion (almost). Revolution!

You're right. I do now find myself asking 'why didn't I think of that?'. Good job.

By the way, are you going to release some details on what stage you're at and what you're planning on doing or what needs to be done? Are you working on it still or are all the bugs gone?


Lack

Christianity, Creation, metric, Dvorak, and BeOS for all!

Edited by - LackOfKnack on 4/13/00 4:08:02 PM
kieren_j, sounds like a winner. I really like the idea of open-source. The people at slashdot will go wild with it when you do release it.

One thing: you said a while back that you fixed a bug but it cut performance in half or something... so what is the current performance level of the compressor?
Assassin, aka RedBeard. andyc.org
kieren_j, I'm not going to doubt you for a second. I'm getting a very good idea just from reading your posts of how you're going about solving it. Don't worry, I'm not going to try to hit the goldmine before you, because I'm not going to try. But what you're doing is very feasible. I don't think the numbers you have posted are correct, however. For instance, a 100MB file could not be compressed to under 100kb unless it was very repetitive. But I do believe that with your method very random binary files, such as a 100MB MP3 file, could be brought down to, say, a couple of MBs. I just can't believe no one thought about what you're doing before! You're incredible! It would be nice to see you patent your algorithm, then release it just so we can all see how it works. We couldn't use it, but hey, it would be a great experience to look at a masterpiece at hand (it does work, doesn't it?!). Or maybe you could just become extremely famous by releasing it as free source. If you did, you obviously wouldn't receive as much money, but you definitely could get a killer job. If money isn't an object to you, I'd go for the latter. Take it easy, and don't let the skeptics put you down. Einstein's parents thought he was retarded as a child, and he created special relativity. Who knows? Keep it up...


ColdfireV
Well kieren, I'm sorry to say this.. But for someone who came up with something as brilliant as what you have just described (100mb->10mb, or whatever), you sound like a fool. You are going to make it open source? And give away this invention for free? I don't think you understand this situation. If that really works like you have described, it's worth millions! Don't be a dumbass at this point...

Haha, now everyone else is going to hate me
Well, not everyone's a greedy bastard.

Just kidding! Just kidding! Ow! Ow!

[out cold]

On second thought, get the biggest pile of money you can, and then send it to starving children programs and such. I'm sure they'll appreciate it.


Lack

Christianity, Creation, metric, Dvorak, and BeOS for all!

Edited by - LackOfKnack on 4/13/00 6:48:59 PM
Go for the money!

(I am not suggesting that I believe in CAR,
but I am not suggesting that I don't)

Brutes!

This topic is closed to new replies.
