
MP3-Beating Compression

Started by April 06, 2000 01:58 PM
494 comments, last by kieren_j 24 years, 8 months ago
quote: Original post by Pythius

Actually, JPEG and MP3 *DO* compress the data, but you are right that they rely on the fact that humans can't notice the difference to a certain degree. Each requires this, because it limits how much they actually need to compress (meaning fewer bits per symbol). JPEG first breaks the RGB into Chrominance/Luminance. Since we don't notice the difference in luminance as much, they first of all take this down to 1/4 the size of the actual image and stretch it at a loss of quality. Then it uses a DCT (discrete cosine transform) to get the symbols in a proper order in blocks of 8x8. This is compiled in a table, which is then passed into an RLE compressor, since that is optimal on these tables.



This isn't quite correct - first, human eyes are less sensitive to chrominance information than to luminance information, so in the JPEG algorithm the chrominance information gets downsampled, and the luminance information is left untouched.

Secondly, each channel of the image (one luminance, two chrominance) is compressed separately. Each channel is cut into a grid of 8x8 blocks. Each block is then processed with a DCT, which transforms the data from the spatial domain into the frequency domain. Then the frequency data is quantised: low frequencies finely, higher frequencies more coarsely. In addition, JPEG uses different quantisation factors (quanta) depending on the channel being quantised. It is this quantisation process which is the main loss of quality in the image.
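As a rough illustration of that step (my own sketch, not JPEG reference code - the function name is made up and the table values would come from the chosen quality setting), quantisation boils down to an element-wise divide-and-round of the 8x8 DCT coefficients by a table of quanta:

/* Minimal sketch of JPEG-style quantisation of one 8x8 block.
   Larger table entries (used for higher frequencies) throw away more
   precision - this is where the loss of quality happens. */
void quantise_block(const double dct[8][8], const int quanta[8][8], int out[8][8])
{
    for (int y = 0; y < 8; y++)
        for (int x = 0; x < 8; x++)
            out[y][x] = (int)(dct[y][x] / quanta[y][x]
                              + (dct[y][x] >= 0 ? 0.5 : -0.5));  /* round to nearest */
}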

Once the quantisation process is complete, the block is encoded into a bitstream using a static frequency table (generated using statistical averaging of the symbol frequencies from thousands of tests).

On a side note - throughout the DCT and quantisation process the data is always in an 8x8 block. However, when it is converted into a bitstream it is read out in a zig-zag fashion, starting at the lowest frequency and ending at the highest. The idea is that after quantisation most of the high frequencies will be zero, so the bitstream for this block can be terminated early with a special code (see the sketch below).
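Here is a sketch of that zig-zag readout, assuming the block has already been quantised. The function is mine, purely for illustration, and is not taken from any JPEG implementation:

/* Sketch: read an 8x8 block of quantised coefficients in zig-zag
   order and report where the trailing run of zeros begins (the point
   at which a real encoder would emit its end-of-block code). */
int zigzag_read(const int block[8][8], int out[64])
{
    int n = 0;
    for (int d = 0; d < 15; d++) {            /* 15 anti-diagonals in an 8x8 grid */
        if (d % 2 == 0) {                      /* even diagonals run bottom-left to top-right */
            for (int y = (d < 8 ? d : 7); y >= 0 && d - y < 8; y--)
                out[n++] = block[y][d - y];
        } else {                               /* odd diagonals run top-right to bottom-left */
            for (int x = (d < 8 ? d : 7); x >= 0 && d - x < 8; x--)
                out[n++] = block[d - x][x];
        }
    }
    int last = 63;                             /* index of the last non-zero coefficient */
    while (last > 0 && out[last] == 0)
        last--;
    return last;                               /* everything after this is the early-termination run */
}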

I am very sceptical about this CAR compression - this kind of thing comes around about once every two years. From what the original poster has said, that the CAR compression simply massages the data so that it can be compressed better, it sounds very much like the Burrows-Wheeler Transform. The BWT reorders the symbols so that the redundancy in the data is grouped together, making it easier for a subsequent compressor to exploit (a rough sketch follows).
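For anyone curious, a deliberately naive BWT fits in a few dozen lines: sort every rotation of the input and take the last column. The sample string and names below are mine, chosen only for illustration; real implementations use suffix sorting rather than this O(n^2 log n) approach:

/* Naive Burrows-Wheeler Transform sketch - illustrative only. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static const unsigned char *g_text;
static size_t g_len;

static int cmp_rotation(const void *a, const void *b)
{
    size_t i = *(const size_t *)a, j = *(const size_t *)b;
    for (size_t k = 0; k < g_len; k++) {
        unsigned char ci = g_text[(i + k) % g_len];
        unsigned char cj = g_text[(j + k) % g_len];
        if (ci != cj)
            return ci < cj ? -1 : 1;
    }
    return 0;
}

/* Writes the BWT of text[0..len) into out[0..len) and returns the index
   of the original rotation (needed later to invert the transform). */
size_t bwt(const unsigned char *text, size_t len, unsigned char *out)
{
    size_t *rot = malloc(len * sizeof *rot);
    size_t primary = 0;

    for (size_t i = 0; i < len; i++)
        rot[i] = i;

    g_text = text;
    g_len = len;
    qsort(rot, len, sizeof *rot, cmp_rotation);  /* sort all rotations */

    for (size_t i = 0; i < len; i++) {
        out[i] = text[(rot[i] + len - 1) % len]; /* last column of the sorted rotations */
        if (rot[i] == 0)
            primary = i;                         /* row holding the original string */
    }
    free(rot);
    return primary;
}

int main(void)
{
    const char *s = "banana_bandana";
    unsigned char out[64];
    size_t idx = bwt((const unsigned char *)s, strlen(s), out);
    printf("BWT: %.*s  (primary index %zu)\n", (int)strlen(s), out, idx);
    return 0;
}

Notice how the transform tends to pull identical characters next to each other, which is exactly what makes a simple run-length or move-to-front stage behind it so effective.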

Chris


Chris Killpack : ckillpack@eaeurope.com
Technology Group : Bullfrog Productions Ltd
quote: Original post by Harvester

If I recall well, it can save a wav file that is 60MB in some kbytes!!!!


I can save one in just a few bytes, if it's a sine wave
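Which is really the point: a pure tone carries almost no information, so a handful of parameters is enough to regenerate the whole 60MB file. A toy sketch (the struct layout is entirely made up for illustration, not a real format):

/* A "60MB" recording of a pure sine tone can be rebuilt from a few
   parameters.  This struct is an illustrative stand-in only. */
struct sine_wav {
    float frequency_hz;   /* e.g. 440.0f              */
    float amplitude;      /* 0.0 .. 1.0               */
    int   sample_rate;    /* e.g. 44100               */
    int   num_samples;    /* how long the "file" is   */
};
/* sample[i] = amplitude * sin(2 * PI * frequency_hz * i / sample_rate) */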

When's the demo coming out? Do you have a web site?


Lack

Christianity, Creation, metric, Dvorak, and BeOS for all!
First - I appreciate everybody who quoted information theory; they are quite correct. Before you start on groundbreaking inventions, it's good to know what science says about your topic and what other people have already done in this area. The perpetuum mobile has been reinvented more times than there are different car models in the world; to the best of my knowledge, none of them has worked so far.
But there's a question: is interstellar travel possible? A couple of hundred years ago everybody would have said there IS NO INTERSTELLAR TRAVEL. A couple of dozen years ago everybody said: practically not possible, because of the amount of time, the energy consumed, etc. Now there are theories... I know it's a physics question, and math is a totally different subject, but remember just HOW YOUNG information theory is. What if Kieren happens to be the Einstein of informatics? Remember, nothing is final; everything is possible.

1=0.999.....
Mathematically proven fact.
-kertropp
Yes! Exactly! Thank you.


Lack

Christianity, Creation, metric, Dvorak, and BeOS for all!
Finally decided to weigh in here. The Master's of Engineering in Engineering Mathematics and Computer Science (whew!) finally pays off. Yes!

1) As my message icon indicates, I'm skeptical of the claims for this new algorithm. I got $5 saying that kieren_j has not invented a super duper new general purpose compression algorithm. I think he has made some fundamental flaw in his thinking that his experience and training have failed to catch. If he can't find a way to have it independently verified (i.e. under a non-disclosure agreement) by someone who does have the experience and training (it ain't me without a lot of work), we may never know what his mistake is.

2) To kieren_j, I am willing to sign an NDA that does not allow me to reverse engineer your algorithm or disseminate information about how your algorithm works. It must allow me to publish the results of the testing in this forum (if it works I'll do an article for the site if they wish to post it). You will need to provide a compression program and a separate (this is important) decompression program to me, which I will destroy when testing is complete. I will then be able to verify your claims. Hell, you get the $5 if I find that it really works, though that won't be much after you get the millions.

3) If my age(34)-addled brain remembers correctly, straight Huffman encoding using the optimal substitution table (optimal being one built from the actual frequency of occurrence of each symbol in the original data set) is mathematically provable to give the best compression ratio of any whole-bits-per-symbol code, with no loss of data. A lossless algorithm is needed to compress executables, text, word documents, spreadsheets, financial data, etc., etc., since it is imperative to get the original file back with no error (though for a Word document you could trade tabs for groups of spaces or vice versa).
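As a rough illustration of the bound he is gesturing at (my own sketch, not milo's, and the function name is made up): the Shannon entropy of the byte frequencies is the floor that no symbol-by-symbol lossless code, Huffman included, can beat on average:

/* Sketch: estimate the per-byte entropy of a buffer.  No symbol-wise
   lossless coder can average fewer bits per byte than this on the
   same data; Huffman gets within one bit of it. */
#include <math.h>
#include <stddef.h>

double entropy_bits_per_byte(const unsigned char *data, size_t len)
{
    size_t count[256] = {0};
    double h = 0.0;

    for (size_t i = 0; i < len; i++)
        count[data[i]]++;                 /* byte histogram */

    for (int s = 0; s < 256; s++) {
        if (count[s] == 0)
            continue;
        double p = (double)count[s] / (double)len;
        h -= p * log2(p);                 /* H = -sum p*log2(p) */
    }
    return h;                             /* multiply by len for the total bit count */
}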

4) Lossy (I believe this is the correct term) compression algorithms cannot recreate the original data from the compressed data. The usual route is to identify unneeded or less important information and either remove it or reduce its bandwidth. These algorithms are great for media where the human brain will ignore (or sometimes fill in) missing data. But look at a DVD frame by frame and artifacts (i.e. errors) are easy to spot. Sometimes they're easy to spot just watching in real-time.

5) Our brains actually do this ignoring (or selectively paying attention) all the time. Look around you. It appears to be a continuous image (i.e. no framing). It has color. You can see in dim light or bright light. But the reality of our hardware is that the retina is composed of little individual pickups (a hell of a lot of them). Some pickups see color, some see in the dark better. Of the color ones, sensitivity to red, blue, and green light varies. The brain actually does frame the processing, but unlike with a monitor, where the brain is paying attention between frames, inside a brain it essentially ignores the time between frames. And like (and unlike) a video card, the brain uses a pipeline to efficiently process the information. Dots are assembled into little lines. Little lines make bigger lines. Bigger lines make edges and intersections. Edges and intersections make up a line drawing (sort of!). And then the brain textures the image with color near the end of the process. From all this we get what we consider a nice, clean, smooth image. Now look around again and think about what you're really seeing. Muhahahahaaaaaaaaaaa!

6) Gawd, I ramble!

7) There is no 7.

Michael L. Roberts
aka milo
mlbobs@telocity.com
It's just like molecular structure: carbon in one form is diamond, which is tough to compress, but in another form, it makes people, which comparatively are soft and squishy.

Seriously though, if you test the file against, say, 32 bit-reordering patterns to see which ones make it the most susceptible to traditional compression, then find a pattern of these patterns (maybe of a possible 16 of the most frequent), then find a pattern to those patterns, you could maybe use the first two bytes to describe which scrambling patterns to use. Of course, the less hit-and-miss you want and the bigger the file, the LONGER it will take to find a good scrambling pattern-within-a-pattern-within-a-pattern. But once you've found it, compression is as fast as usual (since the next step is to run it through WinZIP or whatever).

Once the user gets the file, decompression time is also normal. But descrambling time is really fast, since it doesn't have to figure out which pattern it needs to use. Do you see what I mean?
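For what it's worth, here is a sketch of that search in C, using zlib as the "WinZIP or whatever" stage. The three permutation tables and all the names are invented for illustration; this only illustrates the idea in the post above, not kieren_j's actual algorithm (link with -lz):

/* Try a few bit reorderings and keep whichever one zips smallest. */
#include <stdlib.h>
#include <zlib.h>

/* Each table says where bit i of every byte should move to. */
static const int patterns[3][8] = {
    {0, 1, 2, 3, 4, 5, 6, 7},   /* identity                     */
    {7, 6, 5, 4, 3, 2, 1, 0},   /* reversed bit order           */
    {0, 4, 1, 5, 2, 6, 3, 7},   /* interleave low and high bits */
};

static unsigned char reorder_byte(unsigned char b, const int *perm)
{
    unsigned char out = 0;
    for (int i = 0; i < 8; i++)
        if (b & (1u << i))
            out |= (unsigned char)(1u << perm[i]);
    return out;
}

/* Returns the index of the pattern whose reordered data compresses
   smallest with zlib, or -1 on error. */
int best_pattern(const unsigned char *data, size_t len)
{
    unsigned char *scrambled = malloc(len);
    uLong bound = compressBound((uLong)len);
    unsigned char *packed = malloc(bound);
    int best = -1;
    uLong best_size = (uLong)-1;

    if (!scrambled || !packed) { free(scrambled); free(packed); return -1; }

    for (int p = 0; p < 3; p++) {
        for (size_t i = 0; i < len; i++)
            scrambled[i] = reorder_byte(data[i], patterns[p]);

        uLongf out_size = bound;
        if (compress(packed, &out_size, scrambled, (uLong)len) == Z_OK
            && out_size < best_size) {
            best_size = out_size;
            best = p;
        }
    }
    free(scrambled);
    free(packed);
    return best;   /* store this index in a header so the receiver can
                      undo the reordering after unzipping */
}

Note that the counting argument further down the thread still applies: on average, whatever you save by picking a pattern has to be paid back by the bytes needed to record which pattern you picked.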


Lack

Christianity, Creation, metric, Dvorak, and BeOS for all!
It's amazing how much of a dick jealousy can make people. That's amazing man. I totally believe you, I just get this sense of sincerity from how you write. Congratulations.
Y'know, just because his claims may not be true doesn't mean he's lying. I get a sense of sincerity as well. However, I've also got this feeling that he's sincerely wrong. Unfortunately, math seems to be against him being correct. When an entire field of mathematicians has spent years studying something like this (with full understanding of the possible rewards should they make a big discovery), it carries some weight. I simply can't throw all that out the window because someone makes a post to a message board. I'll be a skeptic until I see it happen.

-Lutrosis
Well, I'm no mathematician, but here's how I see the problem:

I will simplify an uncompressed file to 4 bytes, and the compressed file to 2 bytes (50% compression). I will then generalise that the same rule applies for n-byte files being compressed.

4 bytes yields 2^32 (roughly 4 billion) possible combinations.
2 bytes yields 2^16 = 65536 possible combinations.

Each one of the 65536 combinations can only represent ONE expanded combination (or the decompression would be non-deterministic).

Thus, the only way to account for all 2^32 combinations is to have... 4 bytes representing them (which would be the same size as the original file anyway).

Yes, one can choose which of the 2^32 combinations you wish to represent in 65536 combinations... but if you did not have a fixed set within the 2^32 that you were representing, you would need somewhere to store a value determining which set of 65536 you were representing.

I haven't done the maths, but I'm willing to bet that any number big enough to tell which set of 65536 you are representing (allowing all 2^32 combinations to be represented) would take the space required to store the compressed file back up to the same size as the original.
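The maths does come out the way GeniX suspects. To cover all 2^32 possible 4-byte files with 2-byte outputs you need 2^32 / 2^16 = 2^16 distinct sets, and a selector that can name one of 2^16 sets needs 16 bits all by itself:

    possible 4-byte files:      2^32 = 4,294,967,296
    possible 2-byte outputs:    2^16 = 65,536
    sets needed to cover them:  2^32 / 2^16 = 2^16
    bits to name one set:       log2(2^16) = 16 bits = 2 bytes
    total stored:               2 bytes (output) + 2 bytes (selector) = 4 bytes

which is exactly the size of the original file - the pigeonhole argument in miniature.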





regards,

GeniX
www.cryo-genix.net

This topic is closed to new replies.
