freading/writing structs - Dangerous?

Started by April 18, 2000 09:39 PM
4 comments, last by C=64 24 years, 6 months ago
I've been going through some source recently, and I find it 'fread'ing in structs with alarming frequency (BMP and MD2 headers in these cases). Isn't that making some rather dangerous assumptions, considering compiler optimizations, language/tool variations (including between versions), and platform particulars? Or am I just being paranoid? p.s. And for the record, there aren't any pragmas setting the packing alignment in any of these various bits of code, either (not that *that* is a very portable mechanism).
I suppose it depends on what they're fread-/fwrite-ing. Some of the things that could go wrong:

* pointers! (eeek!)
* byte-swapped data (big- vs little-endian issues)
* compiler padding (what's the "natural" alignment? 2-byte, 4-byte, 8-byte?)
* what else???

Of course, if they have no intention of cross-platform compilation, or using different compilers, or different versions of either platforms or compilers, then they may be safe.

So you're not paranoid; just understand how they're using the data, and fix it if it's broke.
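
To make the padding point concrete, here is a minimal sketch in C. The struct name bmp_file_header and the helper function are made up for illustration; the field order follows the 14-byte BMP file header. The function only reports whether this particular compiler's in-memory layout happens to match the on-disk layout, which is what a bare fread into the struct silently assumes.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical struct mirroring the fields of the BMP file header.
   On disk these fields occupy 2 + 4 + 2 + 2 + 4 = 14 bytes. */
struct bmp_file_header {
    uint16_t type;       /* 'B','M' signature */
    uint32_t size;       /* file size in bytes */
    uint16_t reserved1;
    uint16_t reserved2;
    uint32_t off_bits;   /* offset to the pixel data */
};

enum { BMP_FILE_HEADER_ON_DISK = 14 };

/* Returns nonzero only if the compiler laid the struct out exactly
   like the file: no padding after 'type', total size of 14 bytes. */
int bmp_header_layout_matches_file(void)
{
    return sizeof(struct bmp_file_header) == BMP_FILE_HEADER_ON_DISK
        && offsetof(struct bmp_file_header, size) == 2
        && offsetof(struct bmp_file_header, off_bits) == 10;
}
```

With default settings many compilers align the 4-byte size field to a 4-byte boundary, so the struct ends up larger than 14 bytes and the check fails. That is the case where fread(&hdr, sizeof hdr, 1, fp) reads the wrong bytes into the wrong fields.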


---- --- -- -
Blue programmer needs food badly. Blue programmer is about to die!
Yes, I think you're being a little paranoid in these given cases. Generally, publicly available graphics formats have standardised sizes for their headers, etc. For instance, PCX has a 128-byte header. Provided you set up your struct correctly (i.e. according to the specification), you can just read into it. Padding issues shouldn't really matter - it will either always work (you read in the right number of bytes), or it never will (you didn't). So the developer will tweak it until it does. It may not work on another compiler and/or machine, but that doesn't matter - source doesn't have to be portable. If it works on the developer's setup, the padding won't change when he or she distributes the executable, therefore it should always work.

As for byte-orders... this is exactly one reason why you should -not- write out individual data members... as writing out an int on an Intel machine will put different bytes into the file than writing out an int on a Motorola machine would. This means you would need to know what platform a file was saved on to be able to read it. Better to just read and write out raw bytes, and let the program dealing with the graphic work out which is the most significant byte and so on. Besides (and, yes, it is an assumption, but since people usually distribute their own graphics with their own .exes, it usually doesn't matter), BMP, PCX, and many other common formats are only widely used on 32-bit Windows/Intel machines anyway, so these potential issues will almost never arise.

Kylotan (currently trying to understand the basic-but-badly-worded PCX format)
First off, remember that PCXs are padded to an even scanline length if they’ve got an odd width; 4 years ago when I was playing around with them, that issue plagued me to no end. I had borrowed a friend’s (very bad) Guru’s book to use as a reference on PCXs. You’d think LaMothe would have mentioned SOMETHING about that in his code. Just goes to show you: when in doubt, go for the official specs.
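
The even-length padding can be sketched as a small calculation. The function name and parameters below are hypothetical, and it assumes a scanline plane is simply the pixel bits rounded up to whole bytes and then to an even byte count.

```c
#include <stdint.h>

/* Hypothetical helper: bytes occupied by one scanline plane,
   rounded up to an even count as described above. */
uint32_t pcx_padded_row_bytes(uint32_t width_pixels, uint32_t bits_per_pixel)
{
    /* Round the bit count up to whole bytes. */
    uint32_t raw = (width_pixels * bits_per_pixel + 7u) / 8u;
    /* Round the byte count up to the next even number. */
    return (raw + 1u) & ~1u;
}
```

So a 5-pixel-wide, 8-bit image takes 6 bytes per row, not 5. A real reader would probably do better to use the bytes-per-line value stored in the header than to recompute it like this.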

Also, I hope I didn’t sound like a complete newbie here, as the responses are tilted in that direction. I’ve been coding for over (pulling out slide-rule) 10 years, since the Commodore 64 was high end!

I am WELL aware that graphic formats have standardized header sizes. I am well aware that they always store information in a pre-defined structure. What I was iffy on was how to structure code such that I need not worry about byte alignment in my own code if I were to fwrite/fread my structs for quick parsing without resorting to less-than-portable pragmas.

I am well aware that documentation generally covers such issues as how to define a structure and the order information is stored. But sometimes documentation is not an option. For MD2 files, for example, I am unable to find any documentation on the format. The only information I have to work with is the source for an MD2 viewer that Chumbalum so graciously released.

I am well aware that once a file reader is working, padding is not an issue (well, for most binary files, anyhow). But getting to that oh-so-dandy state is always the struggle, isn’t it? I pulled an all-nighter working on my MD2 reader for my graphics engine instead of working on my web server due a few days from now, and the *(#@ thing STILL isn’t working!

I’m REALLY uncomfortable with your “source doesn’t have to be portable” statement. First, considering I’m using other source as a reference to a file format, I sure hope it *is* portable! That, and I’m planning on porting my app to Linux after I’m done, as a simple experiment in graphics code and portable coding style. Heck, it’s not like I don’t have any Unix experience (Purdue’s CS department just LOVES Solaris).

Another rather dangerous claim you appear to make is that NOT writing out individual data members will somehow keep endian demons at bay. I really don’t see the sense in that statement; I mean, I AM writing the program that’s forced to deal with them, so I am dealing with them in the end!

In summation, I just wanted to defend my coding honor. I hope I didn’t sound rude or angry. My original post was worded rather plainly, and I failed to mention my curiosity on how to safely achieve similar behavior for my own structures. Next time, I’ll be more blunt.

Good luck with your PCX code Kylotan. If you need, I can always dig up my old Watcom code (it’s around here somewhere…)

C=64

p.s. Anyone know where I can find a good reference on MD2 files? I’ve reached the limit of what OBJ files can do for me…
quote: Original post by C=64

I am WELL aware that graphic formats have standardized header sizes. I am well aware that they always store information in a pre-defined structure. What I was iffy on was how to structure code such that I need not worry about byte alignment in my own code if I were to fwrite/fread my structs for quick parsing without resorting to less-than-portable pragmas.


If you read and write things a byte at a time, then there will be no reordering. It's when you try writing ints that you may find things reordered for you.
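
One way to read "a byte at a time" in practice: pull the raw bytes into an unsigned char buffer and assemble each integer by hand. This is only a sketch; the helper names are made up, and it assumes the file format stores integers little-endian (least significant byte first), as BMP does.

```c
#include <stdint.h>

/* Hypothetical helpers: build integers from bytes stored least
   significant byte first. The result is the same on any host,
   whatever its native byte order or struct padding. */
uint16_t load_u16_le(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

uint32_t load_u32_le(const unsigned char *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}
```

For example, after fread(buf, 1, 14, fp) on a BMP file, load_u32_le(buf + 2) would give the file size field without any struct being involved.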

quote: I am well aware that documentation generally covers such issues as how to define a structure and the order information is stored. But sometimes documentation is not an option. For MD2 files, for example, I am unable to find any documentation on the format. The only information I have to work with is the source for an MD2 viewer that Chumbalum so graciously released.

Then I guess it's a case of trial and error. Check a 'valid' file against the expected results, and that is how to do it. If the same code produces erroneous results on a different platform, then you know to change it. Sounds stupid and obvious, I know, but essentially, it works until it doesn't. OK, so if the specs say to write out integers, then you could have some issues - basically, you assume that it is written in the byte order of that format's predominant platform, and readers on less common platforms will have to be able to read that info and swap bytes internally.

quote: I’m REALLY uncomfortable with your “source doesn’t have to be portable” statement. First, considering I’m using other source as a reference to a file format, I sure hope it *is* portable!


Sure, but 99% of source isn't made to be passed around, and especially not to be used on different architectures. Generally source code is there to be compiled and run. Portability is a cool thing to have, but it's a waste of time to pursue it when you know you are only releasing for 1 platform and do not anticipate releasing source code.

quote: Another rather dangerous claim you appear to make is that NOT writing out individual data members will somehow keep endian demons at bay. I really don’t see the sense in that statement; I mean, I AM writing the program that’s forced to deal with them, so I am dealing with them in the end!


Let me rephrase - not writing out individual members, and instead writing headers out byte by byte, eliminates endian issues -in the file-. You will still have to deal with it once you've read it in. But if you write out individual members directly, and those members include integers, you will find them getting swapped around, thus technically producing invalid files.
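
For the writing side, a rough sketch of what "byte by byte" could look like. The helper names are hypothetical, and little-endian is assumed as the file's byte order; the point is that the byte order in the file is fixed by the code, not by whichever CPU runs it.

```c
#include <stdint.h>

/* Hypothetical helpers: split integers into bytes, least significant
   byte first, so the bytes written are identical on every host. */
void store_u16_le(unsigned char *p, uint16_t v)
{
    p[0] = (unsigned char)(v & 0xFFu);
    p[1] = (unsigned char)(v >> 8);
}

void store_u32_le(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v & 0xFFu);
    p[1] = (unsigned char)((v >> 8) & 0xFFu);
    p[2] = (unsigned char)((v >> 16) & 0xFFu);
    p[3] = (unsigned char)(v >> 24);
}
```

A header would then be filled into an unsigned char array field by field and written with a single fwrite(buf, 1, sizeof buf, fp), which sidesteps both padding and byte order.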

quote: Good luck with your PCX code Kylotan. If you need, I can always dig up my old Watcom code (it’s around here somewhere…)


The specs are badly written! And in ancient C code. And make no mention of endian-ness either, heh. So I studied the relevant field of a PCX that I knew the size of.
While pragmas are not standardized, I've yet to find a modern compiler that didn't support the pragma pack() feature. All optimizing compilers need to have some way to turn off padding.
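
For reference, a sketch of the pragma approach using the push/pop form that MSVC, GCC and Clang accept. It is not standard C, and the struct and function names here are invented for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Non-standard: ask the compiler for 1-byte packing, i.e. no padding
   between members, for the declarations up to the matching pop. */
#pragma pack(push, 1)
struct packed_bmp_file_header {
    uint16_t type;
    uint32_t size;
    uint16_t reserved1;
    uint16_t reserved2;
    uint32_t off_bits;
};
#pragma pack(pop)

/* Hypothetical helper: expected to be 14 when the pragma is honoured. */
size_t packed_bmp_file_header_size(void)
{
    return sizeof(struct packed_bmp_file_header);
}
```

Packing only removes the padding. The integers read into such a struct are still in the file's byte order, so a big-endian host would need to swap them afterwards.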

Rock
