
MP3-Beating Compression

Started by April 06, 2000 01:58 PM
494 comments, last by kieren_j
Lack, even checking various window sizes isn't going to help much on random data. Random means random - no matter which window size you check, you won't find any size that stands out much (if any) better than any other window size. I appreciate what you are saying about "impossible", but if this thread is going to degrade into a formal debate, the burden of proof is on the one making the positive assertion. ;-)

Assume the header size we used was 4 bytes. That would let us use 2 bytes to store sequence sizes up to 16 bits, and the other 2 bytes to store the common sequence itself. Now imagine you compressed a file from 10MB to 5MB + 4 bytes. Why not recompress it again, header and all? Squish that sucker down to 2MB, then again, again, again, until all you've got is the 4-byte header, a zero bit for the common sequence and 17 bits (16+1) for the non-common sequence. 7 bytes total, with 6 bits to spare. That's the lower limit. (If you don't think it will go this low, show me why, and then you'll understand why random data won't compress.)
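For anyone who wants to watch that arithmetic play out, here is a quick sketch (my illustration only; it assumes the impossible halve-it-every-pass compressor purely to show where the claim leads):

#include <cstdio>

int main() {
    // Hypothetical claim: every pass halves the data and adds a 4-byte header.
    long long size = 10LL * 1024 * 1024;   // 10MB, in bytes
    const long long header = 4;
    int pass = 0;
    while (size / 2 + header < size) {     // stop when "halving" no longer pays
        size = size / 2 + header;
        std::printf("pass %2d: %lld bytes\n", ++pass, size);
    }
    return 0;
}

It hits the handful-of-bytes floor after about 21 passes - and a scheme that turns every 10MB file into 8 bytes obviously cannot be reversed, which is the contradiction.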

If that doesn't help, look at it backwards. The key for uncompressing is in the header, 16 bits. The data itself is compressed into 18 bits. Now imagine all the possible byte combinations that could be present in a 10MB file (the size of the original file) - is there any way you could represent that number of combinations (256^10485760) with only 18 bits, let alone pick out which of all those combinations was a copy of the original file? Ain't gonna happen, even with a 32-bit header key to help you. When I thought I had a super-duper compression algorithm a couple of years ago, this perspective is what made me realize I had a flaw in my initial concept.
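To spell the counting out: a 10MB file is 10485760 bytes, so there are 256^10485760 = (2^8)^10485760 = 2^83886080 possible original files, while 18 bits can distinguish only 2^18 = 262144 of them. Even counting the 32-bit header, the output can encode only 2^50 distinct files - still nothing next to 2^83886080.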

Hey, and I don't think anyone here is a fool. It takes everyone a bit of effort to dig through this problem the first time they see it.

aig
(I'm not irritable, I just like anagrams)
LackOfKnack: Hmmm... in some of the first posts, he said he didn't have the decompression algorithm yet, but I guess he does now (I just really didn't feel like looking through 100 posts to find out).
Lack, what are you trying to tell me? I know that you don't need to build a pattern index with all patterns - if there are "most common" patterns. But we are talking about compressed, effectively random data like mp3 files. And there are no most-common patterns in random files.

Visit our homepage: www.rarebyte.de.st

GA
Lack: I don't know if you were referring to my post, but it has been proven that a single compression algorithm cannot compress all possible files without loss. There is a nicely worded proof at that URL that was provided a few pages back. It's discussed in the section on the counting argument and pigeonholes.
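For anyone who doesn't want to dig up that link, here is a minimal sketch of the counting argument (my own illustration, not code from the linked page): there are 2^n bit strings of length n, but only 2^n - 1 strings of any shorter length, so no lossless (invertible) encoder can map every n-bit input to a shorter output.

#include <cstdio>

int main() {
    for (int n = 1; n <= 16; ++n) {
        unsigned long long inputs  = 1ULL << n;        // bit strings of length n
        unsigned long long shorter = (1ULL << n) - 1;  // lengths 0..n-1: 1+2+...+2^(n-1)
        std::printf("n=%2d: %llu inputs, only %llu shorter outputs\n",
                    n, inputs, shorter);
    }
    return 0;
}

At every n there is at least one pigeon without a hole, and iterating the encoder only makes the shortfall worse.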

If you weren't referring to my post when you said this stuff hasn't been proven, ignore this post.

Jesse

Edited by - Jesse Chounard on 4/14/00 10:51:39 PM
So many people posted their opinions, here's my 2 cents.

I am not too clear on information theory, but I am pretty sure that you can only compress random data up to a certain point, and there was enough math given in this thread. Yes, information theory is relatively young, but it's math, nothing more, and if you take a file full of random data, I guarantee you that you can't create an algorithm that's better than the ones that exist right now.

However, it's different when the data is not random. Information theory never said anything against that, and here's where you can shine. Example: sounds are basically waves. So, if you can break up the wav into curves, with a lookup table for the most common curve equations, you can replace a thousand bytes with 3 bytes. That's why mp3 is not the end. However, I don't think that one person can create something better than multimillion-dollar sponsored projects, but... you never know.
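As a toy illustration of that idea (mine, and not how mp3 actually works): approximate a smooth wave by keeping one sample per block and reconstructing the rest by linear interpolation. It is lossy, but it exploits exactly the structure that raw random data lacks.

#include <cmath>
#include <cstdio>
#include <vector>

int main() {
    const int N = 16;                       // keep one sample per block of 16
    std::vector<float> wave(256);
    for (int i = 0; i < 256; ++i)
        wave[i] = std::sin(i * 0.1f);       // a smooth "sound" wave

    std::vector<float> kept;                // the "compressed" data: 16x smaller
    for (int i = 0; i < 256; i += N)
        kept.push_back(wave[i]);

    float worst = 0.0f;                     // "decompress" and measure the error
    for (int i = 0; i + N < 256; ++i) {
        float t = float(i % N) / N;
        float approx = kept[i / N] + t * (kept[i / N + 1] - kept[i / N]);
        worst = std::fmax(worst, std::fabs(approx - wave[i]));
    }
    std::printf("16:1 reduction, worst-case error %.4f\n", worst);
    return 0;
}

Swap the sine for white noise and the error becomes useless - the same reason this trick does nothing for random data.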

To wrap it all up: it's useless to argue about random data compression. It was proven a long time ago. And mp3/zip output is effectively random data, so no matter how much code you post, it's a load of crap.

If you would take advantage of sound/video structure, that's completely different - then I could totally believe you. But you don't, so please, grow up.
since kieren didn't react to my last post where i asked him to tell us exactly how well his "algorithm" works (which doesn't surprise me), did you notice that he doesn't claim anymore that he can compress random data?
we've explained for the 100th time that this is *impossible* and i think even lack has understood that by now.
so kieren, why don't you show us some numbers again? look at my last post.

it's really pretty much worthless to keep arguing about something like this. you might as well start another thread saying that the earth is a disc.
quote: Original post by Ridcully

since kieren didn't react to my last post where i asked him to tell us exactly how well his "algorithm" works (which doesn't surprise me), did you notice that he doesn't claim anymore that he can compress random data?
we've explained for the 100th time that this is *impossible* and i think even lack has understood that by now.
so kieren, why don't you show us some numbers again? look at my last post.


Correct me if I am wrong, but why would he discuss how the algo works? If it actually does what he claims, I have to admit, I would run laughing to the patent office myself.

What I would like to see is an example. Not just posted examples, I want a working .exe. So.... Submit the patent, get a copyright for the source (faster) and post a damn exe file so we can put this sucka to rest.


#include <iostream>

struct Reply {  // stand-in for the post being answered (joke signature)
    bool IsSpam() const { return false; }
    void RandomInsult() const { std::cout << "...\n"; }
} reply;

int main() {
    if (reply.IsSpam()) {
        while (true) {
            new int[1000000];   // deliberately leak memory at the spammer
            reply.RandomInsult();
        }
    }
    else std::cout << "mailto:amorano@bworks.com\n";
}
Man, this here is getting plain boring... ain't funny no more. Only so much ground-breaking-algorithm crap I can take.
And mr. CAR, if you ever get this thing working, send me a demo. I want to see history being made. (sheesh!!!)

Drkool.

p.s.: Go get some computer science lessons... maybe that will clear up your mind.
kieren_j,
I was impressed by your work, but it seems somebody has already beaten you. A slightly better scheme than yours already exists; you can check out the details here.

andrew-s@ihug.co.nz
ZIPs and MP3s are not essentially random data!
If I run my bit-based Huffman on them (after bit-reordering), it thinks it's 16-bit data. Although this process actually makes the file larger, if I run it again on the data it says it's 4-bit data and compresses it smaller than the original data. Run it again, 16-bit and it enlarges. Again, 4-bit and it's smaller again.
I found you can actually do this about 256 times before it stops getting smaller.
I WILL post a demo of this bit-based Huffman, and you can watch it repeatedly compress the data.
I'm coding the demo now.
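While we wait for that demo, here is a check anyone can run (my sketch, not kieren_j's code): measure the Shannon entropy of a file's 4-bit symbols. For ZIP or MP3 payloads it comes out very close to the 4.0-bit maximum, which leaves a nibble-level Huffman code nothing to remove - no matter how many times you rerun it.

#include <cmath>
#include <cstdio>

int main(int argc, char** argv) {
    if (argc < 2) { std::printf("usage: %s file\n", argv[0]); return 1; }
    std::FILE* f = std::fopen(argv[1], "rb");
    if (!f) { std::printf("cannot open %s\n", argv[1]); return 1; }

    long long count[16] = {0}, total = 0;
    int c;
    while ((c = std::fgetc(f)) != EOF) {   // tally both nibbles of every byte
        ++count[(c >> 4) & 0xF];
        ++count[c & 0xF];
        total += 2;
    }
    std::fclose(f);

    double entropy = 0.0;                  // Shannon entropy, bits per nibble
    for (int i = 0; i < 16; ++i) {
        if (count[i] == 0) continue;
        double p = double(count[i]) / double(total);
        entropy -= p * std::log2(p);
    }
    std::printf("%.4f bits of entropy per 4-bit symbol (max 4.0)\n", entropy);
    return 0;
}

If a Huffman pass really shrank such a file 256 times in a row, this number would have to start far below 4.0 - it doesn't.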

This topic is closed to new replies.
