
MP3-Beating Compression

Started by April 06, 2000 01:58 PM
494 comments, last by kieren_j
Lack, even checking various window sizes isn't going to help much on random data. Random means random - no matter which window size you check, you won't find any size that stands out much (if any) better than any other window size. I appreciate what you are saying about "impossible", but if this thread is going to degrade into a formal debate, the burden of proof is on the one making the positive assertion. ;-)

Assume the header size we used was 4 bytes. That would let us use 2 bytes to store sequence sizes up to 16 bits, and the other 2 bytes to store the common sequence itself. Now imagine you compressed a file from 10MB to 5MB + 4 bytes. Why not recompress it again, header and all? Squish that sucker down to 2MB, then again, again, again, until all you've got is the 4-byte header, a zero bit for the common sequence and 17 bits (16+1) for the non-common sequence. 7 bytes total, with 6 bits to spare. That's the lower limit. (If you don't think it will go this low, show me why, and then you'll understand why random data won't compress.)
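For anyone who wants to watch that arithmetic play out, here is a quick sketch (my illustration only; it assumes the impossible halve-it-every-pass compressor purely to show where the claim leads):

#include <cstdio>

int main() {
    // Hypothetical claim: every pass halves the data and adds a 4-byte header.
    long long size = 10LL * 1024 * 1024;   // 10MB, in bytes
    const long long header = 4;
    int pass = 0;
    while (size / 2 + header < size) {     // stop when "halving" no longer pays
        size = size / 2 + header;
        std::printf("pass %2d: %lld bytes\n", ++pass, size);
    }
    return 0;
}

It hits the handful-of-bytes floor after about 21 passes - and a scheme that turns every 10MB file into 8 bytes obviously cannot be reversed, which is the contradiction.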

If that doesn't help, look at it backwards. The key for uncompressing is in the header, 16 bits. The data itself is compressed into 18 bits. Now imagine all the possible byte combinations that could be present in a 10MB file (the size of the original file) - is there any way you could represent that number of combinations (256^10485760) with only 18 bits, let alone pick out which of all those combinations was a copy of the original file? Ain't gonna happen, even with a 32-bit header key to help you. When I thought I had a super-duper compression algorithm a couple of years ago, this perspective is what made me realize I had a flaw in my initial concept.
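To spell the counting out: a 10MB file is 10485760 bytes, so there are 256^10485760 = (2^8)^10485760 = 2^83886080 possible original files, while 18 bits can distinguish only 2^18 = 262144 of them. Even counting the 32-bit header, the output can encode only 2^50 distinct files - still nothing next to 2^83886080.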

Hey, and I don't think anyone here is a fool. It takes everyone a bit of effort to dig through this problem the first time they see it.

aig
(I'm not irritable, I just like anagrams)
LackOfKnack: Hmmm... in some of the first posts, he said he didn't have the decompression algorithm yet, but I guess he does now (I just really didn't feel like looking through 100 posts to find out).
Lack, what are you trying to tell me? I know that you don't need to build a pattern index with all patterns - if there are "most common" patterns. But we are talking about compressed, effectively random data like mp3 files. And there are no most-common patterns in random files.

Visit our homepage: www.rarebyte.de.st

GA
Lack: I don't know if you were referring to my post, but it has been proven that a single compression algorithm cannot compress all possible files without loss. There is a nicely worded proof at that URL that was provided a few pages back. It's discussed in the section on the counting argument and pigeonholes.
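For anyone who doesn't want to dig up that link, here is a minimal sketch of the counting argument (my own illustration, not code from the linked page): there are 2^n bit strings of length n, but only 2^n - 1 strings of any shorter length, so no lossless (invertible) encoder can map every n-bit input to a shorter output.

#include <cstdio>

int main() {
    for (int n = 1; n <= 16; ++n) {
        unsigned long long inputs  = 1ULL << n;        // bit strings of length n
        unsigned long long shorter = (1ULL << n) - 1;  // lengths 0..n-1: 1+2+...+2^(n-1)
        std::printf("n=%2d: %llu inputs, only %llu shorter outputs\n",
                    n, inputs, shorter);
    }
    return 0;
}

At every n there is at least one pigeon without a hole, and iterating the encoder only makes the shortfall worse.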

If you weren't referring to my post when you said this stuff hasn't been proven, ignore this post.

Jesse

Edited by - Jesse Chounard on 4/14/00 10:51:39 PM
So many people posted their opinions, here's my 2 cents.

I am not too clear on information theory, but I am pretty sure that you can only compress random data up to a certain point, and there was enough math given in this thread. Yes, information theory is relatively young, but it's math, nothing more, and if you take a file full of random data, I guarantee you that you can't create an algorithm that's better than the ones that exist right now.

However, it's different when the data is not random. Information theory never said anything against that, and here's where you can shine. Example: sounds are basically waves. So, if you can break up the wav into curves, with a lookup table for the most common curve equations, you can replace a thousand bytes with 3 bytes. That's why mp3 is not the end. However, I don't think that one person can create something better than multimillion-dollar sponsored projects, but... you never know.
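As a toy illustration of that idea (mine, and not how mp3 actually works): approximate a smooth wave by keeping one sample per block and reconstructing the rest by linear interpolation. It is lossy, but it exploits exactly the structure that raw random data lacks.

#include <cmath>
#include <cstdio>
#include <vector>

int main() {
    const int N = 16;                       // keep one sample per block of 16
    std::vector<float> wave(256);
    for (int i = 0; i < 256; ++i)
        wave[i] = std::sin(i * 0.1f);       // a smooth "sound" wave

    std::vector<float> kept;                // the "compressed" data: 16x smaller
    for (int i = 0; i < 256; i += N)
        kept.push_back(wave[i]);

    float worst = 0.0f;                     // "decompress" and measure the error
    for (int i = 0; i + N < 256; ++i) {
        float t = float(i % N) / N;
        float approx = kept[i / N] + t * (kept[i / N + 1] - kept[i / N]);
        worst = std::fmax(worst, std::fabs(approx - wave[i]));
    }
    std::printf("16:1 reduction, worst-case error %.4f\n", worst);
    return 0;
}

Swap the sine for white noise and the error becomes useless - the same reason this trick does nothing for random data.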

To wrap it all up: it's useless to argue about random data compression. It was proven a long time ago. And mp3/zip output is effectively random data, so no matter how much code you post, it's a load of crap.

If you would take advantage of sound/video structure, that's completely different - then I could totally believe you. But you don't, so please, grow up.
since kieren didn't react to my last post where i asked him to tell us exactly how well his "algorithm" works (which doesn't surprise me), did you notice that he doesn't claim anymore that he can compress random data?
we've explained for the 100th time that this is *impossible* and i think even lack has understood that by now.
so kieren, why don't you show us some numbers again? look at my last post.

it's really pretty much worthless to keep arguing about something like this. you might as well start another thread saying that the earth is a disc.
quote: Original post by Ridcully

since kieren didn't react to my last post where i asked him to tell us exactly how well his "algorithm" works (which doesn't surprise me), did you notice that he doesn't claim anymore that he can compress random data?
we've explained for the 100th time that this is *impossible* and i think even lack has understood that by now.
so kieren, why don't you show us some numbers again? look at my last post.


Correct me if I am wrong, but why would he discuss how the algo works? If it actually does what he claims, I have to admit, I would run laughing to the patent office myself.

What I would like to see is an example. Not just posted examples, I want a working .exe. So.... Submit the patent, get a copyright for the source (faster) and post a damn exe file so we can put this sucka to rest.


#include <iostream>

struct Reply {  // stand-in for the post being answered (joke signature)
    bool IsSpam() const { return false; }
    void RandomInsult() const { std::cout << "...\n"; }
} reply;

int main() {
    if (reply.IsSpam()) {
        while (true) {
            new int[1000000];   // deliberately leak memory at the spammer
            reply.RandomInsult();
        }
    }
    else std::cout << "mailto:amorano@bworks.com\n";
}
Man, this here is getting plain boring... ain't funny no more. Only so much ground-breaking-algorithm crap I can take.
And mr. CAR, if you ever get this thing working, send me a demo. I want to see history being made. (sheesh!!!)

Drkool.

p.s.: Go get some computer science lessons... maybe that will clear up your mind.
kieren_j,
I was impressed by your work, but it seems somebody has already beaten you. A slightly better scheme than yours already exists; you can check out the details here.

andrew-s@ihug.co.nz
ZIPs and MP3s are not essentially random data!
If I run my bit-based Huffman on them (after bit-reordering), it thinks it's 16-bit data. Although this process actually makes the file larger, if I run it again on the data it says it's 4-bit data and compresses it smaller than the original data. Run it again, 16-bit and it enlarges. Again, 4-bit and it's smaller again.
I found you can actually do this about 256 times before it stops getting smaller.
I WILL post a demo of this bit-based Huffman, and you can watch it repeatedly compress the data.
I'm coding the demo now.
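While we wait for that demo, here is a check anyone can run (my sketch, not kieren_j's code): measure the Shannon entropy of a file's 4-bit symbols. For ZIP or MP3 payloads it comes out very close to the 4.0-bit maximum, which leaves a nibble-level Huffman code nothing to remove - no matter how many times you rerun it.

#include <cmath>
#include <cstdio>

int main(int argc, char** argv) {
    if (argc < 2) { std::printf("usage: %s file\n", argv[0]); return 1; }
    std::FILE* f = std::fopen(argv[1], "rb");
    if (!f) { std::printf("cannot open %s\n", argv[1]); return 1; }

    long long count[16] = {0}, total = 0;
    int c;
    while ((c = std::fgetc(f)) != EOF) {   // tally both nibbles of every byte
        ++count[(c >> 4) & 0xF];
        ++count[c & 0xF];
        total += 2;
    }
    std::fclose(f);

    double entropy = 0.0;                  // Shannon entropy, bits per nibble
    for (int i = 0; i < 16; ++i) {
        if (count[i] == 0) continue;
        double p = double(count[i]) / double(total);
        entropy -= p * std::log2(p);
    }
    std::printf("%.4f bits of entropy per 4-bit symbol (max 4.0)\n", entropy);
    return 0;
}

If a Huffman pass really shrank such a file 256 times in a row, this number would have to start far below 4.0 - it doesn't.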

This topic is closed to new replies.
