About a year ago I thought I had the world's greatest compression algorithm. I had it all worked out, and was already counting the money I was going to get, and thinking of all the accolades I would receive for redefining the internet (downloads, video and audio, etc.), as well as all aspects of computing (archives and backups, the push for bigger and bigger hard drives, etc.). But after a couple of weeks, I discovered a fundamental flaw and my compression worked only half as well (but still much better than anything else) - and then another flaw reared its ugly head and the whole algorithm went belly up, along with my hopes and dreams. ;-)
Kieren, I wish you luck and hope this works for you. If it does, this is the biggest thing to hit the industry since, well, the transistor, I guess. That's why I'm still skeptical. ;-)
But if it *does* work, forget the open source - go for the money. It'll get reverse engineered anyway as soon as it hits the market, so you might as well make some serious cash. Oh yeah, and remember us little people at the gamedev forums...
Brian
MP3-Beating Compression
Frankly, I'm a little surprised by how many people seem to think this is a real program. Heck, feel free to prove me wrong... but here's a fact: the reason random data cannot, and I mean _cannot_, be compressed past a certain amount (and still be considered lossless) is that if you could compress a 100 meg file into 400k, then what stops you from putting together 100 megs of those 400k files, zipping them, CARing them, zipping them, etc.? If you could compress a truly random bitstream, then by repeating the process you could compress any data down to nothing, which is, I repeat, impossible. Not "nobody can fly into space" impossible, but mathematically impossible.
It's like saying, "Hey, if I can compress 100 megs into 400k, why can't I compress 400k into 20k? If I can compress 400k into 20k, I can compress 20k into 1k, etc." Eventually you'll just compress everything down to 1 bit? Good luck.
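If you don't believe me, here's a quick sketch you can run yourself (plain Python, with zlib standing in for any real compressor - nothing to do with CAR itself): zip a pile of random bytes, then try to zip the result again.

import os, zlib

data = os.urandom(100 * 1024)       # 100k of truly random bytes
first = zlib.compress(data, 9)      # pass 1: comes out a shade BIGGER, not smaller
second = zlib.compress(first, 9)    # pass 2: the "compressed" data looks random too

print(len(data), len(first), len(second))
# All three numbers land within a few dozen bytes of each other;
# no pass ever shrinks the data.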
Feel free to read
http://www.faqs.org/faqs/compression-faq/part1/section-8.html
over and over until you get the idea.
Here is a challenge, though, from one of my professors:
Steve Tate srt@cs.unt.edu suggests a good challenge for programs that are claimed to compress any data by a significant amount:
Here's a wager for you: First, send me the DEcompression algorithm. Then I will send you a file of whatever size you want, but at least 100k. If you can send me back a compressed version that is even 20% shorter (80k if the input is 100k) I'll send you $100. Of course, the file must be able to be decompressed with the program you previously sent me, and must match exactly my original file. Now what are you going to provide when... er... if you can't demonstrate your compression in such a way?
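Just to make the terms concrete, the whole check fits in a few lines (the file names below are invented, but the test is exactly "20% smaller plus a byte-for-byte round trip"):

import filecmp, os

ORIGINAL   = "challenge.bin"    # the 100k+ file Steve sends (name made up here)
COMPRESSED = "challenge.car"    # what the claimant sends back
RESTORED   = "restored.bin"     # output of the decompressor submitted in advance

assert os.path.getsize(COMPRESSED) <= 0.8 * os.path.getsize(ORIGINAL), "not 20% smaller"
assert filecmp.cmp(ORIGINAL, RESTORED, shallow=False), "round trip does not match the original"
print("pay the man $100")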
So far no one has accepted this challenge (for good reasons).
If anyone *cough* does feel like accepting, please let me know, so I can see this for myself.
Deathy
Edited by - deathlok on 4/13/00 6:36:00 PM
You skeptics seem to be missing the BIG POINT: it's NOT compression that makes it amazing! There is no magic compression algorithm! It's the PREPARATION. After that, any regular compression method will achieve great results. And no, you'll never get it down past a certain point, that's true. But that point is lower than 1%. It's about as low as a zip of NULLs can go, but you probably won't get it that low any time in the near future. And yes, you could repeatedly scramble and compress the thing.
Lack
Christianity, Creation, metric, Dvorak, and BeOS for all!
Nope. By the counting argument (feel free to read the page I referenced in my previous post), any "preparation" of random data results in... random data. You still can't compress it any better, and because you'll need a key to 'unprepare' it, you'll just end up making it worse in the end.
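The counting argument itself fits in a couple of lines of Python if you want to see the numbers (this is just the textbook argument, nothing specific to CAR):

# There are 2**n distinct files of exactly n bits, but only 2**n - 1 files
# of FEWER than n bits (all lengths 0 through n-1 put together).
n = 16
files_of_n_bits = 2 ** n
shorter_files = sum(2 ** k for k in range(n))
print(files_of_n_bits, shorter_files)    # 65536 vs 65535
# So no lossless scheme, "prepared" or not, can shrink every n-bit file;
# if it shrinks some, it has to grow others.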
But that's not true if you use a different byte length.
Lack
Christianity, Creation, metric, Dvorak, and BeOS for all!
April 13, 2000 07:36 PM
kieren,
Explain your re-arranging during the compression, and un-re-arranging during the decompression. You haven't spoken to the fact that a compressed file is essentially random information, and the nature of random information is that it has no pattern, so no lossless compression algorithm can make it any smaller (via encoding the more frequent patterns with less information). Optimal character length or not, Huffman-compressing random data isn't going to help: given an infinitely long input stream of truly random characters, there is NO way Huffman can make it any smaller, because all characters should have the same frequency.
And that's why I say you need to explain your re-arranging and un-re-arranging, because it's the only point of contention here. The incompressibility of random data is something that cannot be questioned. You're saying, "Well, I re-arrange the data, so it's not random any more." Fine. I say: how much does it cost you to encode the re-arrangement scheme in the compressed file, so you know how to un-re-arrange it? I'm going to go out on a limb and say that for CAR to really be able to compress/decompress without error a file of truly random data (i.e., a zip file), the cost of encoding the re-arrange/un-re-arrange scheme is going to be as high as the savings you're getting from the re-arrangement. Until you've proven otherwise, I'm afraid you either look like a liar, or someone who is a bit naive. I'm hoping it's the latter.
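For what it's worth, the "same frequency" point is easy to check: measure the entropy of a pile of random bytes (generic Python again, no CAR involved).

import math, os
from collections import Counter

data = os.urandom(1000000)
counts = Counter(data)
entropy = -sum((c / len(data)) * math.log2(c / len(data)) for c in counts.values())
print(round(entropy, 4), "bits per byte")
# Prints something like 7.9998: every byte value turns up about equally often,
# so a Huffman code can't beat the 8 bits per byte you started with.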
That's what it does. You find a pattern mask that seems to create a pattern in the file, then you can use traditional compression. You may only need to check a few hundred or so patterns, and it takes barely any space to specify which one to use for descrambling.
Lack
Christianity, Creation, metric, Dvorak, and BeOS for all!
And that is impossible. Using a pattern to re-arrange random data will just result in more random data. If it can be re-arranged, using a pattern, into something that isn't random, then the original data wasn't random.
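You can even try the "few hundred patterns" idea directly. Nobody outside kieren_j's hard drive knows what CAR's masks actually look like, so the XOR mask below is just an invented stand-in, but the outcome is the same for any fixed family of scramblers:

import os, zlib

data = os.urandom(100 * 1024)                    # stand-in for random/already-compressed input
baseline = len(zlib.compress(data, 9))

best = baseline
for mask in range(256):                          # "a few hundred or so patterns"
    scrambled = bytes(b ^ mask for b in data)    # toy 'preparation': XOR every byte with the mask
    best = min(best, len(zlib.compress(scrambled, 9)))

print("zip as-is:            ", baseline)
print("best zip after a mask:", best)
# The two numbers come out essentially identical - and you would still owe space
# in the output to record which mask was used. Random in, random out.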
April 13, 2000 08:37 PM
LackOfKnack, you are putting your toe about an inch into the waters of reason. Think about it man.
Suppose that changing all bit sequences '110' into '011' somehow increased the frequency of a pattern in the input stream, which made the file more "vulnerable" to compression, as Kieran so technically put it. How do you propose to also properly encode all the naturally occurring appearances of '011' so that, upon un-re-arrangement, you don't change them back to '110'? More to the point, how would you do it without taking up more space than the savings you would gain from making the re-arrangement in the first place?
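A two-line example (plain string replacement, not claiming this is literally what CAR does) shows the problem:

def rearrange(bits):
    return bits.replace("110", "011")

print(rearrange("110"))   # -> 011
print(rearrange("011"))   # -> 011 as well
# Two different inputs collapse to the same output, so "un-re-arranging" needs
# extra bookkeeping bits to record which '011's used to be '110's - and that
# bookkeeping eats the savings.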
SERIOUSLY, go read that compression FAQ someone posted a link to, and think REAL HARD about the part that talks about the pigeonhole principle. Then think REAL HARD some more.
April 13, 2000 08:56 PM
deathlok - Thanks for the link. Brought back my graduate school daze. Thank god for pigeons.
kieren_j - I once again offer to sign a Non-Disclosure Agreement to act as an independent tester of your claims. We can use the method detailed in the link provided by deathlok. I had been trying to figure out how we could do a double-blind test without you turning your software over to anyone, but was stumped. From the link I see where I was stuck: you have to give up an executable for one half of your algorithm - compression or decompression (the idea in the link can be reversed).
For the record, I don't think kieren_j is lying in any way. I think he is merely mistaken about the nature of what he has created. Why do I believe this? I myself once thought I had a cool new way of reordering data for compression, and it took about two weeks to uncover my flawed thinking. This was 12 years ago, so at least I didn't have access to the Internet to make public claims before I had completely implemented my idea and had taken time to carefully review what was really happening in my code. At some point, kieren_j is going to realize that he has a problem (maybe he'll understand the problem, too). He'll work to solve it, but he won't be able to. If he does make it work, I've got $100 for kieren_j and a bucket of crow for me.
And now a bit of humor at kieren_j's expense:
A better name for the CAR 'compression' algorithm would be Cold Fusion.
Michael L. Roberts
aka milo
mlbobs@telocity.com