MP3-Beating Compression

Started by April 06, 2000 01:58 PM
494 comments, last by kieren_j 24 years, 8 months ago
I have a way for you to prove that you are not lying (I am giving you the benefit of the doubt). Give me a demo of the source, and someone else on the page will give me some files to compress. I will use the compressor to compress the files, then decompress them and compare the results with the originals. Now, I know what you're thinking, but I truly CANNOT hack your code. I have never been able to hack anyone's code (besides, I believe a person's code is his own). Then, at the end of the test, I will delete the program (even if it would hurt me to do so).
PLEASE kieren_j, just UPDATE us on what's happened/happening, PLEASE!!! You're really starting to make me a little mad. I just keep checking here, waiting for you to answer. JUST ANSWER!
From what I recall from university, there is a fairly concise definition of what constitutes a random bit stream. A purely random bit stream is one in which the probability of finding each of the possible 2^n values of an n bit word is equal. So, for 1 bit words, you'd have 50% of '1' and 50% of '0'. For 2 bit words, 25% of '11', 25% of '10', 25% of '01' and 25% of '00'. And so on... For larger words, the individual probabilities become vanishingly small, but they stay equal. With no value more likely than any other, any lossless, frequency-based compression algorithm is doomed on such a stream.
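
To make that definition concrete, here is a minimal sketch in Python (a language picked for this example, not something from the thread) that splits a stream into n bit words and tallies how often each value turns up. The helper name `word_frequencies`, the seed, and the stream length are made up for illustration, and the stream comes from Python's pseudo-random generator, so it only approximates a purely random source.

```python
import random
from collections import Counter

def word_frequencies(data: bytes, n: int) -> dict:
    """Split data into consecutive n-bit words and return each word's relative frequency."""
    bits = "".join(f"{byte:08b}" for byte in data)
    usable = len(bits) - len(bits) % n  # drop a trailing partial word
    words = [bits[i:i + n] for i in range(0, usable, n)]
    counts = Counter(words)
    return {word: count / len(words) for word, count in sorted(counts.items())}

# Pseudo-random test stream; the seed only makes the run repeatable.
rng = random.Random(2000)
stream = bytes(rng.getrandbits(8) for _ in range(100_000))

for n in (1, 2, 3):
    freqs = word_frequencies(stream, n)
    print(n, {word: round(freq, 3) for word, freq in freqs.items()})
```

For a stream like this, each frequency should land close to 1/2^n, which is exactly the property the definition asks for.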

No real compressed (or pseudo-random) data *exactly* fits this definition, but does generally approximate it fairly well, which is why a compressed file can *generally* not be compressed any further, losslessly. A random bit stream, or something close to it, resulting from an efficient compression algorithm already contains the maximum information content that it is possible to represent in that stream size.

Please correct me if I'm wrong. It's been years since I did information theory, and I wish I could remember Shannon's theorem and the equations for calculating entropy. It's an interesting discussion. The proofs presented are perfectly logical when you think them through, but they aren't intuitive when you're starting out and have your hopes up high, thinking you've discovered something.
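
For reference, the entropy equation in question is the standard one from information theory. It is written here for the n bit words used in the definition of a random stream, with p_i as the probability of the i-th word value (the symbols are just notation picked for this note):

```latex
H = -\sum_{i=1}^{2^n} p_i \log_2 p_i \quad \text{(bits per word)}

% Purely random stream: every value equally likely, p_i = 2^{-n}
H = -\sum_{i=1}^{2^n} 2^{-n} \log_2 2^{-n} = 2^n \cdot 2^{-n} \cdot n = n
```

So a purely random stream of n bit words carries the full n bits per word, and Shannon's source coding theorem says no lossless code can average fewer than H bits per word.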

I, too, once thought I'd created a killer compression algorithm, which would enable an entire game to be stored in a handful of bytes. But when I thought it through, I realised that it was like the old story of the flat earth being carried around on the back of a turtle. What does the turtle stand on? It's turtles all the way down. By that, I mean you can compress any file down to a few bytes, but it's not "standing" on anything if you don't provide an enormous amount of header information to reconstruct it! That was the problem with my algorithm (which I came up with when I was in my mid-teens), and it took me a few weeks to realise it.
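
The counting argument behind that realisation can be sketched in a few lines of Python. The function names are invented for this example; the only fact used is that there are 2^k distinct files of exactly k bits.

```python
def count_files_of_length(n_bits: int) -> int:
    """Number of distinct files that are exactly n_bits long."""
    return 2 ** n_bits

def count_shorter_files(n_bits: int) -> int:
    """Number of distinct files strictly shorter than n_bits (lengths 0 .. n_bits - 1)."""
    return sum(2 ** k for k in range(n_bits))

for n in (8, 16, 64):
    inputs = count_files_of_length(n)
    outputs = count_shorter_files(n)
    # There is always exactly one fewer shorter file than there are inputs,
    # so at least two inputs would have to share the same output.
    print(n, inputs, outputs, inputs - outputs)
```

However large n gets, the shorter files always number one fewer than the n bit inputs, so no lossless scheme can map every n bit file to a shorter one. Whatever is missing has to live somewhere else, such as a header.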

- Richard (not registered yet...)
Where is this "nearly done now" demo??
I suppose I got a little carried away the other day. In all truthfulness I wasn't really aiming my comments at anyone, but was merely trying to have a good time in a thread that is by far the most entertaining I've ever read.
I guess I'm kinda wondering where Kieren went in the past few pages. Maybe someone should go find him. He could be lying in a ditch after being run over by an angry game developer. Or maybe the CIA found out about this, and they have him in a large underground bunker somewhere trying to dissect his brain for this key bit of information.

Or if you are preferink, Mother Russia sent her finest 'diplomats' to be solvink the mystery. But they are waitink for the overnight deliverance of rabid Georgian lemurs via FedEx. 10:00 AM the next day. Damn this Slavic accent.

He could have also decided to wrap his head in a towel and jump from the edge of his bed into a box of styrofoam shipping popcorn.

Who knows?

It's a mystery.

Pythius - A simple game developer concerned about the whereabouts of his good friend, Kieren_J. Make that a Code God concerned about the whereabouts of his good friend, Kieren_J.
"The object of war is not to die for your country, but to make the other bastard die for his"
mmathias, I defined random data on page 12, I think, and we'll only ever find approximations to that, but osmanb and the anonymous poster gave a better explanation on page 13.

Visit our homepage: www.rarebyte.de.st

GA

Edited by - ga on 4/19/00 5:06:34 AM
Athias has it.

quote: Original post by Vetinari

It can be shown fairly easily that files compressed with Huffman encoding have no patterns large enough to compress them further. It is, in many senses, completely optimal.


You're saying no bit pattern of any size is ever repeated more than any other?

quote: Sorry, but an algorithm that incorporates many methods, no matter what the methods, is still ONE algorithm.


No, it's a bunch of algorithms.

quote: And you agree that a universal algorithm cannot compress every file, so where's the problem?
Here is the problem with your 'array of methods' approach. You are increasing the header size so that you know which 'method' you used to compress. Any savings you get will be more than offset by the header.


How do you know this? Does this not depend on the particular file?
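
One way to look at that question is to actually try it. The sketch below is only an illustration: the one-byte tag, the `compress_best` helper, and the choice of zlib, bz2 and lzma from Python's standard library are all assumptions made for this example, not anyone's actual 'array of methods'. It tries each method, keeps the shortest output, and spends one byte of header to record which method won.

```python
import bz2
import lzma
import random
import zlib

# Hypothetical container format: one tag byte naming the method, then the payload.
METHODS = {
    0: lambda data: data,  # stored as-is (fallback when nothing helps)
    1: zlib.compress,
    2: bz2.compress,
    3: lzma.compress,
}

def compress_best(data: bytes) -> bytes:
    """Try every method and keep the shortest result, prefixed with its tag byte."""
    tag, payload = min(
        ((tag, method(data)) for tag, method in METHODS.items()),
        key=lambda pair: len(pair[1]),
    )
    return bytes([tag]) + payload

rng = random.Random(12)
random_data = bytes(rng.getrandbits(8) for _ in range(4096))
text_data = b"It's turtles all the way down. " * 128

for name, data in (("random", random_data), ("text", text_data)):
    packed = compress_best(data)
    print(name, len(data), "->", len(packed), "method", packed[0])
```

So it does depend on the file. On redundant input the tag byte is negligible next to the savings. On input that already looks random, none of the methods can shrink it, and the tag byte is pure loss.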

quote:
I believe he said you were a fool because you believed this BS, and I believe he was referring to your belief in unconventional creationism.


Maybe so, he didn't specify. It's still a cheap potshot at my beliefs, which in that case happen to go along with my religion.

quote: I have some idea of what you are analogizing, but DNA still operates under the laws of information theory, and I have done quite a bit of reading on this. You have a minor point here: it is believed that every species decodes DNA differently, so that the same DNA sequence could code a completely different animal, BUT the way to decode the sequence is also encoded by the species. So, there is no magic compression in DNA either. And there are definitely no seeds within seeds, as you mentioned earlier (I have read every one of your posts).


Okay, that's correct that you don't give birth to a seed that can grow into something, but that's where it differs: Within, say, a human, you have the potential once they are born to give birth to many more unique humans. So the analogy works there.

quote: I have no clue how the DNA/seed analogy comes in here. I say it is flawed because it is not optimal if it can be further compressed.


I see what you're saying. But if your compression adds to and rearranges the data, it can have patterns again, especially at a different bit level.

quote: Actually, I was referring to you. You have no scientific backup, while the rest of us have plenty. You have wild and uneducated guesses. I say uneducated because it is fairly obvious that you have not worked with compression before. I'm trying my best not to be vague, and there are many posts that spell out why in great unvague detail. As do the posted web pages. Definitely scientific vs. hopeful faith.


For the parts we agree on, you have used mathematics and such. For the parts we disagree on, we have both been putting out statements that seem logical to ourselves but have no backup.

quote:
You act as if, after billions are spent on compression research, this very not-new idea of an array of methods will somehow change everything. Sorry, not going to happen.


Why not? It could. But I see your point. But it's still possible.

quote: Could someone tell me how to use the quote flag?


Sorry, I forgot to mention that when I saw ga doing that. Click the reply-to icon above the person's message to quote the entire thing.

deathlok: You've proven .9 repeating equals 1, but it doesn't really, right? Very close, but not exactly.

Lutrosis: That'd be a shame.

osmanb: But you could compress _most_ of them, you're saying?

AP Richard: Then is truly random data possible? Where in groups of 8, all bit patterns occur at the same frequency, and at 5 or 3 or 6 they also all occur at the same frequency?

Pythius: That'd surely kill him.




Lack

Christianity, Creation, metric, Dvorak, and BeOS for all!
And sure, the header info may be gigantic, but it can be compressed with the rest of the file, since initially it's small but you compress over and over with different methods each time. As long as the file's still big enough to have a savings of more than the header, it's good.


Oh, and somehow I've found a way to compress a 'random' file wherein all byte combinations are represented equally, at least at the three-bit size. Do you agree that I can do this? Do you want me to show how? I believe it was AIG who said using only one bit for the most frequently represented bit pattern, you'd never get enough savings if all of the bit patterns were represented equally. At best, you'd break even or lose a little, unless there was some imbalance. Well, for a 792 bit file, I saved 33 bits or 132 bits, depending on how I went about it. And that 132 bits is definitely enough to cover the header information and still garner some savings.
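
The 792 bit file itself isn't posted here, so the following is only a guess at what such a file could look like, not a reconstruction of the method described. It builds one possible 792 bit file in which every three-bit pattern occurs exactly 33 times, then hands it to zlib (chosen purely as a convenient stand-in compressor).

```python
import zlib
from collections import Counter

# All eight 3-bit patterns in order: 000 001 010 ... 111 (24 bits = 3 bytes).
one_round = "".join(f"{value:03b}" for value in range(8))
bits = one_round * 33  # 33 rounds -> 792 bits
data = int(bits, 2).to_bytes(len(bits) // 8, "big")

# Every 3-bit pattern shows up exactly 33 times.
counts = Counter(bits[i:i + 3] for i in range(0, len(bits), 3))
print(len(bits), dict(counts))

# A general-purpose compressor still shrinks it, because the order is predictable.
packed = zlib.compress(data, 9)
print(len(data), "bytes ->", len(packed), "bytes")
assert zlib.decompress(packed) == data
```

Equal counts at the three-bit size do not make a file random. This particular file repeats every 24 bits, and that repetition is where the savings come from. A file that is balanced at every word size and has no such structure is a different matter.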


Lack

Christianity, Creation, metric, Dvorak, and BeOS for all!
Lack, you can't compress a file where every pattern occurs with the same frequency. I think I proved it (maybe not as rigorously as a mathematician would) a few pages earlier.
Generally I agree with you if you say that it's no real argument that billions were spent on research and one man can't find a better method on his own, but in this case it's mathematically impossible.
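
For a coder that looks only at pattern frequencies, this can be checked directly. The sketch below uses Huffman coding as the example coder (picked for illustration; the function name and the 33-per-pattern counts are assumptions matching a 792 bit file with balanced three-bit patterns) and reports the code length each pattern would get.

```python
import heapq
from itertools import count

def huffman_code_lengths(frequencies: dict) -> dict:
    """Return the Huffman code length in bits for each symbol."""
    tiebreak = count()  # keeps heap entries comparable when weights are equal
    heap = [(weight, next(tiebreak), {symbol: 0}) for symbol, weight in frequencies.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {s: depth + 1 for s, depth in {**left, **right}.items()}
        heapq.heappush(heap, (w1 + w2, next(tiebreak), merged))
    return heap[0][2]

# Eight 3-bit patterns, each seen 33 times (a 792-bit file).
equal = {f"{v:03b}": 33 for v in range(8)}
lengths = huffman_code_lengths(equal)
total_bits = sum(equal[s] * lengths[s] for s in equal)
print(lengths)
print(total_bits)
```

Every pattern ends up with a 3 bit code, so 792 bits go in and 792 bits come out, before any header is added. Savings from this kind of coder only appear once some pattern is more frequent than the others.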

Visit our homepage: www.rarebyte.de.st

GA

This topic is closed to new replies.
