TheHuff: Aha! You changed something. Now I see what you're getting at, and I agree.
Oh, dingdingdingding... I think I'm swaying. I have a headache.
AIG, so those two would be... the only random two? If so, what good are they?
Lack
Christianity, Creation, metric, Dvorak, and BeOS for all!
MP3-Beating Compression
April 18, 2000 08:26 PM
dethlok: You''re right that 0.99999 != 1. But it is a mathematically proven fact that 0.99999... == 1. For more information, see recent posts to sci.math
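A sketch of the usual algebraic argument, spelled out for anyone who has not seen it (this is the standard proof, not quoted from the sci.math posts themselves):

\begin{align*}
x &= 0.999\ldots \\
10x &= 9.999\ldots \\
9x &= 10x - x = 9 \\
x &= 1
\end{align*}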
this is my first post in this discussion, but i read all the previous messages. i just want to tell everybody my opinion on compression, and i don't care what other people think about it. i also believe that everybody should support kieren_j.
i think that kieren_j's bit pattern idea is pretty cool, because those patterns can differ in size and can also be generated by the compressor at runtime (by a 'for' loop. tell me the beginning and the end, and i will give you a lot of bit patterns ;-) ).
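something like this, just as an illustration (my own sketch, not kieren_j's actual code - the function name and the endpoints are made up):

#include <stdio.h>

/* enumerate every k-bit pattern between two endpoints;
   the patterns are generated at runtime instead of being stored */
void patterns(unsigned begin, unsigned end, int k)
{
    for (unsigned p = begin; p <= end; p++) {
        for (int b = k - 1; b >= 0; b--)
            putchar((p >> b) & 1 ? '1' : '0');
        putchar('\n');
    }
}

int main(void)
{
    patterns(0, 7, 3);   /* prints all eight 3-bit patterns */
    return 0;
}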
there is a lower limit (sizewise) to which data can be compressed, and kieren's idea could achieve compression down to this very limit. i believe that that's what everybody is talking about. there will never be a way to compress data down to 1 bit, because one bit has only two different states.
your pigeonhole discussion also does not lead to anything. of course it is not possible to represent 8 bits by 7 bits without losing data. but it is always possible to find a pattern (even in data that has already been compressed). the patterns will just be harder to find.
//.athias
___________________________________________
//.athias .\\üller mmathias@magnet.at
¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯
Quote----------------------------------
I say that the only files that can't be compressed at all are the ones that are too small to contain any pattern large enough to weed out with good results. But then I've also said that those that are that small, nobody would need to compress anyway.
--------------------------------------
It can be shown fairly easily that files compressed with Huffman encoding have no patterns large enough to compress them further. Among symbol-by-symbol prefix codes, it is optimal.
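If you want to check this yourself, here is a quick sketch (my own illustration, not any particular tool): compute the zeroth-order byte entropy of a file. For the output of a Huffman coder it comes out close to 8 bits per byte, which is exactly why a second pass finds nothing left to squeeze.

#include <stdio.h>
#include <math.h>

/* print the zeroth-order entropy of a file in bits per byte;
   a value near 8.0 means no byte-frequency redundancy is left */
int main(int argc, char **argv)
{
    unsigned long count[256] = {0}, total = 0;
    int c;
    FILE *f;
    if (argc < 2 || !(f = fopen(argv[1], "rb")))
        return 1;
    while ((c = getc(f)) != EOF) { count[c]++; total++; }
    fclose(f);
    double h = 0.0;
    for (int i = 0; i < 256; i++)
        if (count[i]) {
            double p = (double)count[i] / total;
            h -= p * log2(p);
        }
    printf("%.4f bits/byte\n", h);
    return 0;
}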
Quote---
Also, you speak of using a universal algorithm, and with that, perhaps your statements are true. But if your algorithm is an incorporation of many methods, including rearranging and masking and such, just about anything could be compressed.
-----------------------
Sorry, but an algorithm that incorporates many methods, no matter what the methods are, is still ONE algorithm. And you agree that a universal algorithm cannot compress every file, so where's the problem?
Here is the problem with your 'array of methods' approach. You are increasing the header size so that you know which 'method' you used to compress. Averaged over all inputs, any savings you get will be more than offset by the header. Period. You act like this is a new idea that the people who wrote all those articles didn't have in mind, but they did.
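And here is the counting fact underneath that 'Period.', as a sketch you can run (my own illustration): no matter the method, the bit strings shorter than n bits number 2^n - 1, one fewer than the 2^n possible inputs - before you even spend header bits saying which method you used.

#include <stdio.h>
#include <stdint.h>

/* for each input length n, compare the number of distinct n-bit
   files (2^n) with the number of bit strings of any shorter
   length (2^0 + 2^1 + ... + 2^(n-1) = 2^n - 1) */
int main(void)
{
    for (int n = 1; n <= 16; n++) {
        uint64_t files   = 1ULL << n;
        uint64_t shorter = files - 1;
        printf("n=%2d: %6llu files, %6llu shorter codes\n",
               n, (unsigned long long)files,
               (unsigned long long)shorter);
    }
    return 0;
}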
Quote----------------------
Someone even said I was a fool because I'm Christian
---------------------------
I believe he said you were a fool because you believed this BS, and I believe he was referring to your belief in unconventional creationism.
Quote---------------------
Since no-one had mathematics to back up the statement that data cannot be compressed repeatedly and were making vague statements as I was, I thought I'd introduce an analogy
--------------------------
I have some idea of what you are analogizing, but DNA still operates under the laws of information theory, and I have done quite a bit of reading on this. You have a minor point here: it is believed that every species decodes DNA differently, so that the same DNA sequence could code a completely different animal. BUT the way to decode the sequence is also encoded by the species. So there is no magic compression in DNA either. And there are definitely no seeds within seeds, as you earlier mentioned (I have read every one of your posts).
quote:
--------------------------------------------------------------------------------
The sign of a flawed compression algorithm is one that continues to compress a file after being run more than once.
--------------------------------------------------------------------------------
Says who? Why not? This is where the dna/seed analogy comes in.
---------------------------
I have no clue how the DNA/seed analogy comes in here. I say it is flawed because it is not optimal if it can be further compressed.
Quote---------------------
More like vague scientific opinions versus vague scientific opinions. The faith-only (at least those who were on topic) stopped posting awhile back.
---------------------------
Actually, I was referring to you. You have no scientific backup, while the rest of us have plenty. You have wild and uneducated guesses. I say uneducated because it is fairly obvious that you have not worked with compression before. I'm trying my best not to be vague, and there are many posts that spell out why in great, specific detail. As do the posted web pages. Definitely scientific versus hopeful faith.
You act as if, after billions have been spent on compression research, this far-from-new idea of an array of methods will somehow change everything. Sorry, not going to happen.
Could someone tell me how to use the quote flag? Thanks,
Mike
Edited by - Vetinari on 4/18/00 8:56:43 PM
"Unintentional death of one civilian by the US is a tragedy; intentional slaughter of a million by Saddam - a statistic." - Unknown
As for .9999... == 1, I did agree that it was true, so long as we were talking about .9 repeating, not just .9999 *grin*
Perhaps I just misread the original post.
And a nod to Vetinari for pointing out the flaw in the DNA argument before I could.
Deathy
mmathias, you can't compress every one of the possible n-bit patterns - they carry n bits of information, no more and no less.
And you can't compress what we call random data - data without evident patterns. Look at my definition of random data. Sure, we can find only approximations - but approximations which have most of the characteristics of my "idealized" random pattern.
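To make "n bits of information" precise (a sketch of the standard source-coding bound, not my own result): a uniformly random n-bit source has entropy

\[
H = -\sum_{x \in \{0,1\}^n} 2^{-n} \log_2 2^{-n} = n \text{ bits},
\]

and by Shannon's source coding theorem no lossless code can achieve an expected length below H.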
Visit our homepage: www.rarebyte.de.st
GA
April 18, 2000 09:44 PM
is this compression for real? if it ain't, say so plz, and stop posting. if it is, make a compression program and sell it. we neeeeed better compression programs!!! winzip is not doin the trick anymore
quote: Original post by ga
mmathias, you can't compress every one of the possible n-bit patterns - they carry n bits of information, no more and no less.
And you can't compress what we call random data - data without evident patterns. Look at my definition of random data. Sure, we can find only approximations - but approximations which have most of the characteristics of my "idealized" random pattern.
Visit our homepage: www.rarebyte.de.st
GA
i am not talking about random data, because there is no true random data. every string of chars can be expressed by an equation. let's look at a very simple function...
for (int i = 0; i <= 10; i++)
    printf("%d ", i);   /* print each value of i */
the sequence i would get from this "equation" is:
0 1 2 3 4 5 6 7 8 9 10
now instead of storing this sequence of bytes, you can simply store the source function. and please don't tell me that there is data without evident patterns. you will always find a pattern - even in your so-called pseudo-random data! define random! you can't, can you? ;-)
___________________________________________
//.athias .\\üller mmathias@magnet.at
¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯
Y'know, we haven't heard from kieren_j in a while. I wonder if he's encountered a little snag? =) I certainly hope that he at least makes a post saying he was wrong, so this thread can be closed, and we can stop seeing the link to the compression faq and hearing about pigeons. I don't think I've ever heard so many people repeat themselves so many times, in so many different ways. I'll admit though, it *has* been fun, and I learned a bit about compression. And pigeons too.
-Lutrosis
#define WHOOPS 0
class DogClass {
public:
    CDog() { printf("Ruff!"); }
    Run() { printf("Run!"); }
    Crash() { printf("%d", 100/WOOPS); }
};
DogClass CDog;
CDog.Run();
CDog.Crash();
You want me to define random? Fine. A random bit stream is a sequence of binary values, every one of which has the value 0 with probability 0.5 and the value 1 with probability 0.5. Furthermore, all of those probabilities are independent. A "random file" is a finite random bit stream. That wasn't so hard, was it? And no, you can't write sequence generators for every N-bit sequence that will produce the sequence and which are also smaller than N bits. Period.
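To put a number on that "Period." (the standard counting argument, sketched): the bit strings strictly shorter than N bits number

\[
\sum_{k=0}^{N-1} 2^k = 2^N - 1 < 2^N,
\]

so even if every shorter string is pressed into service as a "generator", at least one N-bit sequence is left with no description shorter than itself.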
Lack: Just to clarify, I hope the above makes some more sense (at least our definition of random file). I think I'm beginning to understand why you were not agreeing with us, and where some confusion came from.

When we say random file, we don't mean that the data must have certain properties, or be one of a certain set of files... The random bit stream described above is just as likely to produce a file containing nothing but 0s as it is to generate any other file. That's the real problem... you have no idea a priori (ahead of time) what types of files you're going to see, so you can't write your algorithm with them in mind. Thus, you can produce compressors that work very well on a small subset of files of a given length (those with long repeated sequences, for example), but if you look at every possible file of length N, you realize that most of them don't have any such property.

In the end, you can't produce a compression algorithm that compresses every file of a given size. You can even pick which two files it won't be able to compress, but you still have to make that decision. (And at this point, your program has used EVERY bit pattern of less than N bits to represent a file of length N, so you can't actually compress anything shorter than that either.)
-Brian