TheHuff: Aha! You changed something. Now I see what you're getting at, and I agree.
Oh, dingdingdingding... I think I'm swaying. I have a headache.
AIG, so those two would be... the only random two? If so, what good are they?
Lack
Christianity, Creation, metric, Dvorak, and BeOS for all!
MP3-Beating Compression
April 18, 2000 08:26 PM
dethlok: You''re right that 0.99999 != 1. But it is a mathematically proven fact that 0.99999... == 1. For more information, see recent posts to sci.math
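A sketch of the usual algebraic argument, spelled out for anyone who has not seen it (this is the standard proof, not quoted from the sci.math posts themselves):

\begin{align*}
x &= 0.999\ldots \\
10x &= 9.999\ldots \\
9x &= 10x - x = 9 \\
x &= 1
\end{align*}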
this is my first post in this discussion, but i read all the previous messages. i just want to tell everybody my opinion on compression, and i don't care what other people think about it. i also believe that everybody should support kieren_j.
i think that kieren_j's bit pattern idea is pretty cool, because those patterns can differ in size and can also be generated by the compressor at runtime (by a 'for' loop. tell me the beginning and the end, and i will give you a lot of bit patterns ;-) ).
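something like this, just as an illustration (my own sketch, not kieren_j's actual code - the function name and the endpoints are made up):

#include <stdio.h>

/* enumerate every k-bit pattern between two endpoints;
   the patterns are generated at runtime instead of being stored */
void patterns(unsigned begin, unsigned end, int k)
{
    for (unsigned p = begin; p <= end; p++) {
        for (int b = k - 1; b >= 0; b--)
            putchar((p >> b) & 1 ? '1' : '0');
        putchar('\n');
    }
}

int main(void)
{
    patterns(0, 7, 3);   /* prints all eight 3-bit patterns */
    return 0;
}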
there is a lower limit (sizewise) to which data can be compressed, and kieren's idea could achieve compression down to this very limit. i believe that that's what everybody is talking about. there will never be a way to compress data down to 1 bit, because one bit has only two different states.
your pigeonhole discussion also does not lead to anything. of course it is not possible to represent 8 bits by 7 bits without losing data. but it is always possible to find a pattern (even in data that has already been compressed). the patterns will just be harder to find.
//.athias
___________________________________________
//.athias .\\üller mmathias@magnet.at
¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯
Quote----------------------------------
I say that the only files that can't be compressed at all are the ones that are too small to contain any pattern large enough to weed out with good results. But then I've also said that those that are that small, nobody would need to compress anyway.
--------------------------------------
It can be shown fairly easily that files compressed with Huffman encoding have no patterns large enough to compress them further. Among symbol-by-symbol prefix codes, it is optimal.
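If you want to check this yourself, here is a quick sketch (my own illustration, not any particular tool): compute the zeroth-order byte entropy of a file. For the output of a Huffman coder it comes out close to 8 bits per byte, which is exactly why a second pass finds nothing left to squeeze.

#include <stdio.h>
#include <math.h>

/* print the zeroth-order entropy of a file in bits per byte;
   a value near 8.0 means no byte-frequency redundancy is left */
int main(int argc, char **argv)
{
    unsigned long count[256] = {0}, total = 0;
    int c;
    FILE *f;
    if (argc < 2 || !(f = fopen(argv[1], "rb")))
        return 1;
    while ((c = getc(f)) != EOF) { count[c]++; total++; }
    fclose(f);
    double h = 0.0;
    for (int i = 0; i < 256; i++)
        if (count[i]) {
            double p = (double)count[i] / total;
            h -= p * log2(p);
        }
    printf("%.4f bits/byte\n", h);
    return 0;
}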
Quote---
Also, you speak of using a universal algorithm, and with that, perhaps your statements are true. But if your algorithm is an incorporation of many methods, including rearranging and masking and such, just about anything could be compressed.
-----------------------
Sorry, but an algorithm that incorporates many methods, no matter what the methods are, is still ONE algorithm. And you agree that a universal algorithm cannot compress every file, so where's the problem?
Here is the problem with your 'array of methods' approach. You are increasing the header size so that you know which 'method' you used to compress. Averaged over all inputs, any savings you get will be more than offset by the header. Period. You act like this is a new idea that the people who wrote all those articles didn't have in mind, but they did.
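And here is the counting fact underneath that 'Period.', as a sketch you can run (my own illustration): no matter the method, the bit strings shorter than n bits number 2^n - 1, one fewer than the 2^n possible inputs - before you even spend header bits saying which method you used.

#include <stdio.h>
#include <stdint.h>

/* for each input length n, compare the number of distinct n-bit
   files (2^n) with the number of bit strings of any shorter
   length (2^0 + 2^1 + ... + 2^(n-1) = 2^n - 1) */
int main(void)
{
    for (int n = 1; n <= 16; n++) {
        uint64_t files   = 1ULL << n;
        uint64_t shorter = files - 1;
        printf("n=%2d: %6llu files, %6llu shorter codes\n",
               n, (unsigned long long)files,
               (unsigned long long)shorter);
    }
    return 0;
}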
Quote----------------------
Someone even said I was a fool because I'm Christian
---------------------------
I believe he said you were a fool because you believed this BS, and I believe he was referring to your belief in unconventional creationism.
Quote---------------------
Since no-one had mathematics to back up the statement that data cannot be compressed repeatedly and were making vague statements as I was, I thought I'd introduce an analogy
--------------------------
I have some idea of what you are analogizing, but DNA still operates under the laws of information theory, and I have done quite a bit of reading on this. You have a minor point here: it is believed that every species decodes DNA differently, so that the same DNA sequence could code a completely different animal. BUT the way to decode the sequence is also encoded by the species. So there is no magic compression in DNA either. And there are definitely no seeds within seeds, as you earlier mentioned (I have read every one of your posts).
quote:
--------------------------------------------------------------------------------
The sign of a flawed compression algorithm is one that continues to compress a file after being run more than once.
--------------------------------------------------------------------------------
Says who? Why not? This is where the dna/seed analogy comes in.
---------------------------
I have no clue how the DNA/seed analogy comes in here. I say it is flawed because it is not optimal if it can be further compressed.
Quote---------------------
More like vague scientific opinions versus vague scientific opinions. The faith-only (at least those who were on topic) stopped posting awhile back.
---------------------------
Actually, I was referring to you. You have no scientific backup, while the rest of us have plenty. You have wild and uneducated guesses. I say uneducated because it is fairly obvious that you have not worked with compression before. I'm trying my best not to be vague, and there are many posts that spell out why in great, specific detail. As do the posted web pages. Definitely scientific versus hopeful faith.
You act as if, after billions have been spent on compression research, this far-from-new idea of an array of methods will somehow change everything. Sorry, not going to happen.
Could someone tell me how to use the quote flag? Thanks,
Mike
Edited by - Vetinari on 4/18/00 8:56:43 PM
"Unintentional death of one civilian by the US is a tragedy; intentional slaughter of a million by Saddam - a statistic." - Unknown
As for .9999... == 1, I did agree that it was true, so long as we were talking about .9 repeating, not just .9999 *grin*
Perhaps I just misread the original post.
And a nod to Vetinari for pointing out the flaw in the DNA argument before I could.
Deathy
mmathias, you can't compress every one of the possible n-bit patterns - they carry n bits of information, no more and no less.
And you can't compress what we call random data - data without evident patterns. Look at my definition of random data. Sure, we can find only approximations - but approximations which have most of the characteristics of my "idealized" random pattern.
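To make "n bits of information" precise (a sketch of the standard source-coding bound, not my own result): a uniformly random n-bit source has entropy

\[
H = -\sum_{x \in \{0,1\}^n} 2^{-n} \log_2 2^{-n} = n \text{ bits},
\]

and by Shannon's source coding theorem no lossless code can achieve an expected length below H.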
Visit our homepage: www.rarebyte.de.st
GA
April 18, 2000 09:44 PM
is this compression for real? if it ain't, say so plz, and stop posting. if it is, make a compression program and sell it. we neeeeed better compression programs!!! winzip is not doin the trick anymore
quote: Original post by ga
mmathias, you can't compress every one of the possible n-bit patterns - they carry n bits of information, no more and no less.
And you can't compress what we call random data - data without evident patterns. Look at my definition of random data. Sure, we can find only approximations - but approximations which have most of the characteristics of my "idealized" random pattern.
Visit our homepage: www.rarebyte.de.st
GA
i am not talking about random data, because there is no true random data. every string of chars can be expressed by an equation. let's look at a very simple function...
for (int i = 0; i <= 10; i++)
    printf("%d ", i);   /* print each value of i */
the sequence i would get from this "equation" is:
0 1 2 3 4 5 6 7 8 9 10
now instead of storing this sequence of bytes, you can simply store the source function. and please don't tell me that there is data without evident patterns. you will always find a pattern - even in your so-called pseudo-random data! define random! you can't, can you? ;-)
___________________________________________
//.athias .\\üller mmathias@magnet.at
¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯
Y'know, we haven't heard from kieren_j in a while. I wonder if he's encountered a little snag? =) I certainly hope that he at least makes a post saying he was wrong, so this thread can be closed, and we can stop seeing the link to the compression faq and hearing about pigeons. I don't think I've ever heard so many people repeat themselves so many times, in so many different ways. I'll admit though, it *has* been fun, and I learned a bit about compression. And pigeons too.
-Lutrosis
#define WHOOPS 0
class DogClass {
public:
    CDog() { printf("Ruff!"); }
    Run() { printf("Run!"); }
    Crash() { printf("%d", 100/WOOPS); }
};
DogClass CDog;
CDog.Run();
CDog.Crash();
You want me to define random? Fine. A random bit stream is a sequence of binary values, every one of which has the value 0 with probability 0.5 and the value 1 with probability 0.5. Furthermore, all of those probabilities are independent. A "random file" is a finite random bit stream. That wasn't so hard, was it? And no, you can't write sequence generators for every N-bit sequence that will produce the sequence and which are also smaller than N bits. Period.
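To put a number on that "Period." (the standard counting argument, sketched): the bit strings strictly shorter than N bits number

\[
\sum_{k=0}^{N-1} 2^k = 2^N - 1 < 2^N,
\]

so even if every shorter string is pressed into service as a "generator", at least one N-bit sequence is left with no description shorter than itself.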
Lack: Just to clarify, I hope the above makes some more sense (at least our definition of random file). I think I'm beginning to understand why you were not agreeing with us, and where some confusion came from.

When we say random file, we don't mean that the data must have certain properties, or be one of a certain set of files... The random bit stream described above is just as likely to produce a file containing nothing but 0s as it is to generate any other file. That's the real problem... you have no idea a priori (ahead of time) what types of files you're going to see, so you can't write your algorithm with them in mind. Thus, you can produce compressors that work very well on a small subset of files of a given length (those with long repeated sequences, for example), but if you look at every possible file of length N, you realize that most of them don't have any such property.

In the end, you can't produce a compression algorithm that compresses every file of a given size. You can even pick which two files it won't be able to compress, but you still have to make that decision. (And at this point, your program has used EVERY bit pattern of less than N bits to represent a file of length N, so you can't actually compress anything shorter than that either.)
-Brian