Yes, as others have said, what sits "inside" the ZLIB stream is a PNG-filtered representation of the image. Filtering is a preprocessing step that PNG applies to the color data so that DEFLATE compresses it better than it would the raw color data.
IDAT chunk { ZLIB { PNG Filters (each scanline starts with the filter ID that the scanline is using) { Color data } } }
2. Filter the image data according to the filtering method specified by the IHDR chunk. (Note that with filter method 0, the only one currently defined, this implies prepending a filter-type byte to each scanline.)
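To make the nesting concrete, here is a minimal sketch of peeling off the first two layers: inflating the ZLIB stream and splitting the result into scanlines, each led by its filter-type byte. The function name `split_scanlines` and the `stride` parameter are my own (hypothetical), and the sketch assumes a non-interlaced image whose IDAT payloads have already been concatenated, with `stride` computed from IHDR as the number of bytes per scanline excluding the filter byte.

```python
import zlib


def split_scanlines(idat_data: bytes, stride: int):
    """Inflate concatenated IDAT payloads and split them into scanlines.

    Returns a list of (filter_type, filtered_bytes) pairs.
    Assumption: non-interlaced image; stride is the scanline length in
    bytes WITHOUT the leading filter-type byte.
    """
    inflated = zlib.decompress(idat_data)
    row_len = stride + 1  # one filter-type byte precedes every scanline
    if len(inflated) % row_len != 0:
        raise ValueError("inflated size is not a whole number of scanlines")
    rows = []
    for off in range(0, len(inflated), row_len):
        filter_type = inflated[off]
        filtered = inflated[off + 1 : off + row_len]
        rows.append((filter_type, filtered))
    return rows
```

For a 1-pixel-wide, 8-bit RGB image the stride is 3, so every 4 bytes of inflated data are one filter-type byte followed by three still-filtered color bytes.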
And the logic for encoding/decoding with the filters is here:
http://www.libpng.org/pub/png/spec/1.2/PNG-Filters.html
Filtering algorithms are applied to bytes, not to pixels, regardless of the bit depth or color type of the image. The filtering algorithms work on the byte sequence formed by a scanline that has been represented as described in Image layout. If the image includes an alpha channel, the alpha data is filtered in the same way as the image data.
For all filters, the bytes 'to the left of' the first pixel in a scanline must be treated as being zero. For filters that refer to the prior scanline, the entire prior scanline must be treated as being zeroes for the first scanline of an image (or of a pass of an interlaced image).
Unsigned arithmetic modulo 256 is used, so that both the inputs and outputs fit into bytes.
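In other words, if you expect your final image data to start with the byte 255, and the scanline uses filter type 1 (Sub), then the byte right after the filter-type byte should also be 255. The byte "to the left of" the first pixel is treated as 0, and Sub stores the difference Raw(x) - Raw(x-bpp), so the stored value is 255 - 0 = 255 (modulo 256). Decoding adds the left byte back: 255 + 0 = 255.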
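The rules quoted above (bytes to the left treated as zero, a zeroed prior scanline for the first row, arithmetic modulo 256) can be sketched roughly like this. The names `paeth`, `unfilter_scanline` and `sub_filter` are my own, not from any library, and `bpp` here means bytes per complete pixel rounded up to 1, as in the linked filter spec. Treat it as an illustration of filter method 0, not a tuned decoder.

```python
def paeth(a: int, b: int, c: int) -> int:
    """Paeth predictor: of left (a), above (b), upper-left (c),
    return the one closest to a + b - c (ties go a, then b)."""
    p = a + b - c
    pa, pb, pc = abs(p - a), abs(p - b), abs(p - c)
    if pa <= pb and pa <= pc:
        return a
    if pb <= pc:
        return b
    return c


def unfilter_scanline(ftype: int, filt: bytes, prev: bytes, bpp: int) -> bytes:
    """Undo one filter on a single scanline (filter-type byte already removed).

    prev is the reconstructed previous scanline, or all zeroes for the
    first scanline of the image.
    """
    recon = bytearray(len(filt))
    for i, x in enumerate(filt):
        a = recon[i - bpp] if i >= bpp else 0  # left byte, zero before first pixel
        b = prev[i]                            # byte above
        c = prev[i - bpp] if i >= bpp else 0   # upper-left byte
        if ftype == 0:      # None
            pred = 0
        elif ftype == 1:    # Sub
            pred = a
        elif ftype == 2:    # Up
            pred = b
        elif ftype == 3:    # Average (floor of the unwrapped sum)
            pred = (a + b) // 2
        elif ftype == 4:    # Paeth
            pred = paeth(a, b, c)
        else:
            raise ValueError(f"unknown filter type {ftype}")
        recon[i] = (x + pred) & 0xFF  # arithmetic modulo 256
    return bytes(recon)


def sub_filter(raw: bytes, bpp: int) -> bytes:
    """Encoder side of Sub: store each byte minus the byte bpp positions back."""
    return bytes(
        (raw[i] - (raw[i - bpp] if i >= bpp else 0)) & 0xFF
        for i in range(len(raw))
    )
```

With one byte per pixel, the raw scanline 255, 0, 1 is stored under Sub as 255, 1, 1: the first byte is unchanged because its left neighbor counts as 0, and the second wraps around since 0 - 255 is 1 modulo 256. Feeding 255, 1, 1 back through `unfilter_scanline` with filter type 1 gives 255, 0, 1 again.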