
Binary Random Access Files

Started by August 27, 2001 07:25 AM
5 comments, last by PhilHalf 23 years, 5 months ago
I am reading through C++: How to Program, and I am up to the chapter on File I/O. The examples it gives for sequential files work fine, but the random access files are causing me problems. I have coded the examples given and I can create a file. When I try to fill that file with data from a struct, the file pulls in information from other places. So far I have had my file paths for MS Office included in the file, information from an ini file and, most recently, a gif header!!

The file is being created with the line:

ofstream outCredit("credit.dat", ios::binary);

and the information is being entered from the clientData struct with the lines:

outCredit.seekp((client.accountNumber - 1) * sizeof(clientData));
outCredit.write(reinterpret_cast<const char *>(&client), sizeof(clientData));

I have tried compiling on both VC++ 6 and DJGPP with similar results. Has anyone come across this problem before?

Thanks for any help.

PhilHalf
I have tried again without using ios::binary and it is still doing the same.
Please can someone give advice on what might be doing this?
Surely it's not meant to do it?

Thanks for any help.

PhilHalf
Man, that is definitely not right. I'm at a bit of a loss to explain why it's happening though. I use almost identical code, and it works fine for me. Try using a different API, _lwrite() for instance.
If at first you don't succeed, redefine success.
I've had this problem before. It seems to happen when you write at, say, the 30th byte in a file that's 1 byte long. It just makes the file bigger, as it should, and fills in the gap with random data (it looks like cached data or something; it's usually from programs I have running).
Well, the only thing I can think of is the structure packing that the compiler performs. See, the compiler will try to align the structure in memory for more efficient access. That means that while your structure may appear to have a size of, say, 18 bytes, the sizeof() operator might return 20 bytes or more, to account for compiler padding. There are a couple of ways to get around this. First, you can use a #pragma around your structure definition, like so:

#pragma pack ( push, 1 )

struct clientData {
// put clientData members here
};

#pragma pack ( pop )

The first #pragma tells the compiler to take the current packing value, save it on the stack, and then align to 1 byte, which will not pad your structure at all. The second #pragma pops the old packing value off the compiler's internal stack to restore it.
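
To make the push/pop behaviour concrete, here is a small sketch. PackedRecord, DefaultRecord and the two helper functions are made-up names for illustration only (they are not from the book), and it assumes a compiler that accepts #pragma pack, which Visual C++, GCC and Clang all do:

```cpp
#include <cstddef>

// Hypothetical record types, used only to illustrate the pragma.
#pragma pack(push, 1)   // save the current packing value, then pack to 1 byte
struct PackedRecord {
    char tag;    // 1 byte
    int  value;  // follows immediately, with no padding in front of it
};
#pragma pack(pop)       // restore the packing value that was saved above

struct DefaultRecord {
    char tag;    // 1 byte, normally followed by padding
    int  value;  // placed on its natural alignment boundary
};

// Sizes as the compiler sees them.
constexpr std::size_t packedSize()  { return sizeof(PackedRecord); }
constexpr std::size_t defaultSize() { return sizeof(DefaultRecord); }
```

With a 4-byte int you would typically see packedSize() come out as 5 and defaultSize() as 8, which shows that the pop really does put the default alignment back for anything declared after it.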

The second thing you can do is to manually calculate the size, in bytes, of the clientData structure, and then write out each member individually:

struct clientData {
int accountNumber; // 4 bytes
char name[64 + 1]; // 65 bytes
int balance; // 4 bytes
};

...

outCredit.seekp( (client.accountNumber - 1) * 73 );
outCredit.write( (const char*)(&client.accountNumber),
sizeof(client.accountNumber) );
outCredit.write( (const char*)(&client.name[0]),
sizeof(client.name) );
outCredit.write( (const char*)(&client.balance),
sizeof(client.balance) );

The danger with this approach is found in the seek statement: you've hardcoded the size of your structure, so if the size of your struct changes, then you have to go through all of your code and hope you don't miss an instance. Slightly better would be to

#define SIZEOF_CLIENTDATA 73

and then use

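outCredit.seekp( (client.accountNumber - 1) * SIZEOF_CLIENTDATA );
// writing the whole struct in one call is only correct if the struct is
// packed, so that sizeof(clientData) really is SIZEOF_CLIENTDATA
outCredit.write( (const char*)&client, SIZEOF_CLIENTDATA );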

but I would recommend using the #pragma directive, if your compiler supports it, since it's less of a headache.
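
A variation on the #define idea, sketched here rather than taken from the book: let the compiler add up the member sizes so the record size can't drift out of step with the struct. kRecordSize, writeRecord and demoBytes are names I've made up for this example, and it assumes a C++11 compiler (for constexpr and sizeof on a member name). The demo uses an in-memory stringstream standing in for credit.dat.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>

struct clientData {
    int  accountNumber;
    char name[64 + 1];
    int  balance;
};

// On-disk size of one record: the sum of the member sizes, with no padding.
constexpr std::size_t kRecordSize = sizeof(clientData::accountNumber) +
                                    sizeof(clientData::name) +
                                    sizeof(clientData::balance);

// Seek to the record's slot, then write each member individually.
void writeRecord(std::ostream& out, const clientData& client) {
    assert(client.accountNumber >= 1);  // account numbers start at 1
    out.seekp(static_cast<std::streamoff>(client.accountNumber - 1) *
              static_cast<std::streamoff>(kRecordSize));
    out.write(reinterpret_cast<const char*>(&client.accountNumber),
              sizeof(client.accountNumber));
    out.write(client.name, sizeof(client.name));
    out.write(reinterpret_cast<const char*>(&client.balance),
              sizeof(client.balance));
}

// Example use: write account 2 into a zero-filled buffer with room for two records.
std::string demoBytes() {
    std::stringstream file(std::string(2 * kRecordSize, '\0'));
    clientData c = {2, "Test", 500};
    writeRecord(file, c);
    return file.str();
}
```

If a member is added or resized later, kRecordSize follows automatically; only writeRecord needs the extra write call.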

Hope that helps.

-liquid
Taz: It does seem to be cached data. I've just experimented with it to be a bit more specific, and it filled it with a book review from Amazon!
You said you've had the problem before. Did you manage to sort it out?

Liquid: I've just tried some different variations of input and if I don't put anything in account number 1, it gives me problems.
Also, if I do fill account number 1, I can fill up to (and including) account number 14 without any problems.
In the structure, there is 1 int for the account number, 1 character array with a length of 10, 1 with a length of 15 and a float for the balance. Unfortunately I don't know the type sizes, but does anything jump out at you as to why it does this with these numbers and types?
Is there any way of filling in the gaps with a blank structure?
I don't know, maybe fill a file with blank structures and then just overwrite the ones you use? I'm only going to use up to 100 accounts, so is this possible/practical?

Thanks for all your help so far.

PhilHalf
Make sure you're initializing the data members of the clientData structure, perhaps creating a function like so (memset() needs <cstring> or <string.h>):

void InitClientData( clientData& cd ) {
cd.accountNumber=1;
cd.balance=0.f;
memset( cd.charArray1,0,sizeof(cd.charArray1) );
memset( cd.charArray2,0,sizeof(cd.charArray2) );
}

Also, make sure that you aren't trying to store more data in those char arrays than will fit. In charArray1, you can only store 9 characters (charArray1[0]...charArray1[8]); the last character must always be a terminating null character, that is, charArray1[9]='\0'; Likewise, charArray2 can hold only 14 characters (charArray2[0]...charArray2[13]), with charArray2[14]='\0'; In debug builds Visual C++ may put some extra bytes around your arrays, which can hide an out-of-bounds write instead of crashing, so don't rely on it.
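
On PhilHalf's question about filling the gaps: pre-writing blank records and then overwriting the ones in use is practical for 100 accounts. Below is a rough sketch of that idea, not the book's listing. The helper names (createBlankFile, writeClient, readClient) are made up, and the struct layout is assumed from the description in the thread.

```cpp
#include <cassert>
#include <cstring>
#include <fstream>
#include <string>

struct clientData {
    int   accountNumber;
    char  charArray1[10];
    char  charArray2[15];
    float balance;
};

// Byte offset of a 1-based account number within the file.
static std::streamoff slotOffset(int accountNumber) {
    return static_cast<std::streamoff>(accountNumber - 1) *
           static_cast<std::streamoff>(sizeof(clientData));
}

// Create (or overwrite) the file with `count` zeroed records,
// so every slot has defined contents before any real data is written.
bool createBlankFile(const std::string& path, int count) {
    std::ofstream out(path.c_str(), std::ios::binary | std::ios::trunc);
    if (!out) return false;
    clientData blank;
    std::memset(&blank, 0, sizeof(blank));  // zeroes the members and any padding
    for (int i = 0; i < count; ++i)
        out.write(reinterpret_cast<const char*>(&blank), sizeof(blank));
    return out.good();
}

// Overwrite one slot in place. Opening with in | out keeps the existing
// contents; a plain ofstream would truncate the file on open.
bool writeClient(const std::string& path, const clientData& client) {
    assert(client.accountNumber >= 1);
    std::fstream file(path.c_str(),
                      std::ios::in | std::ios::out | std::ios::binary);
    if (!file) return false;
    file.seekp(slotOffset(client.accountNumber), std::ios::beg);
    file.write(reinterpret_cast<const char*>(&client), sizeof(client));
    return file.good();
}

// Read one slot back; returns false if the slot is past the end of the file.
bool readClient(const std::string& path, int accountNumber, clientData& result) {
    std::ifstream in(path.c_str(), std::ios::binary);
    if (!in) return false;
    in.seekg(slotOffset(accountNumber), std::ios::beg);
    in.read(reinterpret_cast<char*>(&result), sizeof(result));
    return in.good();
}
```

Call createBlankFile("credit.dat", 100) once, then writeClient() for each account as it is entered. A slot that was never written reads back with accountNumber 0, which makes empty accounts easy to spot.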

Here's another thing. The two-argument form of seekp() takes an offset and a starting point: it seeks FORWARD if you give it a positive offset, and backwards if the offset is negative. You control the position from which the seek is performed by specifying one of the following flags:

ios::cur - seek from the current position
ios::beg - seek from the beginning of the file
ios::end - seek from the end of the file

It seems like you'd want to use the following:

outCredit.seekp( (client.accountNumber - 1)*sizeof(client),
ios::beg );

The version that you're calling (with the single parameter) takes an absolute position, so it behaves the same as seeking from ios::beg rather than from the current position. Your seeks should therefore already land on the right record, and the explicit ios::beg mainly makes the intent obvious. Obviously, you would also need to make sure that there was no accountNumber of zero.
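
A quick way to check how the two forms of seekp() behave, using an in-memory stream instead of a real file. The two function names are made up for this sketch:

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Two single-argument seeks in a row. Each one is an absolute position,
// so the second simply replaces the first.
std::streamoff afterTwoAbsoluteSeeks() {
    std::stringstream ss(std::string(100, '\0'));  // 100-byte stand-in for a file
    ss.seekp(10);
    ss.seekp(20);
    assert(ss.good());
    return ss.tellp();
}

// An absolute seek followed by a relative one: the offset is added
// to the current position.
std::streamoff afterRelativeSeek() {
    std::stringstream ss(std::string(100, '\0'));
    ss.seekp(10);
    ss.seekp(20, std::ios::cur);
    assert(ss.good());
    return ss.tellp();
}
```

afterTwoAbsoluteSeeks() gives 20 and afterRelativeSeek() gives 30, so only the two-argument form with ios::cur accumulates.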

Just for reference, here are the sizes of the atomic data types on a 32-bit machine, using Microsoft Visual C++:

sizeof(unsigned char) - 1 byte
sizeof(char) - 1 byte
sizeof(unsigned short) - 2 bytes
sizeof(short) - 2 bytes
sizeof(unsigned int) - 4 bytes
sizeof(int) - 4 bytes
sizeof(unsigned long) - 4 bytes
sizeof(long) - 4 bytes
sizeof(float) - 4 bytes
sizeof(double) - 8 bytes (I think)

therefore, the size of your clientData structure, without the padding that the compiler adds by default, is:

sizeof(int) * 1 = 4 bytes +
sizeof(char)*10 = 10 bytes +
sizeof(char)*15 = 15 bytes +
sizeof(float)*1 = 4 bytes

total: 4 + 10 + 15 + 4 = 33 bytes.

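Visual C++ has a default packing of 8 bytes, but each member is only aligned to its own size, so the float is moved up to the next multiple of 4 (offset 32) and, without the #pragma pack directives, the sizeof() operator will probably return 36 bytes rather than 33. Writing the whole structure in one go still works as long as the seek and the write both use sizeof(clientData); use the #pragma directives if you want each record on disk to be exactly 33 bytes.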
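
Rather than working the layout out by hand, you can ask the compiler. This sketch assumes the struct is declared as described earlier in the thread (the member names are taken from the InitClientData example) and uses constexpr, so it needs a C++11 compiler; the k... constants are names I've invented here.

```cpp
#include <cstddef>

struct clientData {
    int   accountNumber;
    char  charArray1[10];
    char  charArray2[15];
    float balance;
};

// Bytes taken up by the members themselves (33 when int and float are 4 bytes).
constexpr std::size_t kMemberBytes =
    sizeof(int) + sizeof(char) * 10 + sizeof(char) * 15 + sizeof(float);

// Where the compiler actually put the float, and how much padding it added overall.
constexpr std::size_t kBalanceOffset = offsetof(clientData, balance);
constexpr std::size_t kPaddingBytes  = sizeof(clientData) - kMemberBytes;
```

On a typical compiler with default settings I would expect kBalanceOffset to be 32 and kPaddingBytes to be 3, but print the values on your own setup instead of trusting my arithmetic.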
