
Delimited files - need help

Started by August 22, 2001 02:45 AM
8 comments, last by HairBear 23 years, 6 months ago
Has anyone got a good idea how I can read a delimited file into internal storage without having to analyse each byte for the delimiter? I only know C at present and I'm hoping to get the information from a text file into a C struct. Cheers, Hair
Hair
What OS are you using?
If Windows, use this pseudocode:

     HANDLE file = CreateFile(...);
     DWORD dwSize = GetFileSize(file, NULL);
     char *lpFile = new char[dwSize];
     DWORD dwRead = 0;
     ReadFile(file, lpFile, dwSize, &dwRead, NULL);


For stdio:

     FILE *f = fopen(..);
     fseek(f, 0, SEEK_END);
     long length = ftell(f);
     fseek(f, 0, SEEK_SET);
     char *lp = new char[length];
     fread(lp, 1, length, f);

Well, this pseudocode seems to work. I use the Windows version.
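
For anyone who wants the stdio version in plain C rather than pseudocode, here is a minimal sketch. The function name read_whole_stream is made up for illustration, and it assumes the file was opened in binary mode ("rb") and is small enough to fit in memory.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Read an entire open stream into a freshly malloc'd, NUL-terminated buffer.
   Returns NULL on failure; on success stores the byte count in *out_len.
   The stream should be opened in binary mode ("rb") so that the size
   reported by ftell matches the number of bytes fread delivers. */
char *read_whole_stream(FILE *f, size_t *out_len)
{
    assert(f != NULL && out_len != NULL);

    if (fseek(f, 0, SEEK_END) != 0)
        return NULL;
    long length = ftell(f);
    if (length < 0 || fseek(f, 0, SEEK_SET) != 0)
        return NULL;

    char *buf = malloc((size_t)length + 1);   /* +1 for the terminator */
    if (buf == NULL)
        return NULL;

    size_t got = fread(buf, 1, (size_t)length, f);
    if (got != (size_t)length) {
        free(buf);
        return NULL;
    }
    buf[got] = '\0';
    *out_len = got;
    return buf;
}
```

Typical use would be to open the file with fopen(name, "rb"), call read_whole_stream(f, &len), and free() the buffer when finished with it.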

{ Stating the obvious never helped any situation !! }
I'm using Windoze 98 and the pseudocode you gave (thanks BTW) is what I have to read the data into a temporary buffer.

However, I need to get it from there into a struct
e.g.

struct {
char key_field[3];
char some_text[];
} my_struct, *ptr_mystr ;

The problem I have is that the second field is variable length (I assume this is allowed).

Am I going to have to compare each string with the delimiter and use a pointer to place each byte into storage ?
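
One way to avoid a hand-written byte loop is to let strchr find the delimiter. The sketch below assumes each record is a single line of the form key, delimiter, text (for example K01|some text). The struct record layout, KEY_LEN and parse_record are hypothetical names, not anything from the posts above. It relies on a C99 flexible array member, so the struct has to be allocated with malloc rather than declared as a plain variable.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define KEY_LEN 3

/* Hypothetical record layout: a fixed-width key plus variable-length text.
   The flexible array member (C99) must be the last member. */
struct record {
    char key[KEY_LEN + 1];   /* +1 so the key can be NUL-terminated */
    char text[];             /* storage is allocated along with the struct */
};

/* Parse one "key<delim>text" line into a heap-allocated record.
   strchr locates the delimiter, so no hand-written byte loop is needed.
   Returns NULL if the delimiter is missing, the key is too long,
   or allocation fails. The caller releases the result with free(). */
struct record *parse_record(const char *line, char delim)
{
    assert(line != NULL);

    const char *sep = strchr(line, delim);
    if (sep == NULL)
        return NULL;

    size_t key_len = (size_t)(sep - line);
    if (key_len > KEY_LEN)
        return NULL;

    const char *text = sep + 1;
    size_t text_len = strcspn(text, "\r\n");   /* drop a trailing newline */

    struct record *r = malloc(sizeof *r + text_len + 1);
    if (r == NULL)
        return NULL;

    memcpy(r->key, line, key_len);
    r->key[key_len] = '\0';
    memcpy(r->text, text, text_len);
    r->text[text_len] = '\0';
    return r;
}
```

The library still inspects each byte internally, but the code stays short and the variable-length field gets exactly as much storage as it needs.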


Hair
Hair
Hmm, that's tricky.
I think you have to check for the delimiter because of the variable length. If you just read the struct directly, it will not reflect the correct length.

I have a suggestion:
change the char some_text[] into a string using the STL:
std::string some_text.

Although this may seem more complex, the STL string can be serialised into a file and read back as though it's a normal char array.

{ Stating the obvious never helped any situation !! }
Thanks, I'll try that.

Hair
Hair
quote:
Original post by jwalker
i have a suggestion.
change the char some_text[] into a string using stl
std::string some_text.

although this may seem more complex, but the stl string can be serialised into a file and read back as though its a normal char array.

Actually, it can't, and this is something I was complaining about a few weeks back. If you do this in one part of your program:
someFile << myString;
and then this later in the program:
someFile >> myString;
using the same file, same variable etc, you are not guaranteed to get the same string back. This is because the extraction operator stops reading when it hits whitespace. So if you had any spaces or tabs in your string, it gets broken up. Way to go.

If you're reading in binary mode, you can just store the terminating '\0' at the end of the string, and use that as the delimiter with fscanf (or whatever it's called: I don't use C stdio) or getline in C++. If you need it to be in text mode, I can only suggest that, before writing every string, you also write an int first indicating how long the string is. This lets your loader know how many characters to expect.
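
The length-prefix idea could look roughly like this in C. This is only a sketch: write_string and read_string are invented names, and the "<length>:<bytes>" layout is one arbitrary choice of format. It makes no attempt to defend against a corrupt or hostile length value.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Write s as "<length>:<bytes>", e.g. "11:hello world".
   Returns 0 on success, -1 on a write error. */
int write_string(FILE *f, const char *s)
{
    assert(f != NULL && s != NULL);
    size_t len = strlen(s);
    if (fprintf(f, "%lu:", (unsigned long)len) < 0)
        return -1;
    return fwrite(s, 1, len, f) == len ? 0 : -1;
}

/* Read one string written by write_string. Returns a malloc'd,
   NUL-terminated copy, or NULL on malformed input, end of file,
   or allocation failure. */
char *read_string(FILE *f)
{
    assert(f != NULL);
    unsigned long len;
    if (fscanf(f, "%lu", &len) != 1 || fgetc(f) != ':')
        return NULL;

    char *s = malloc(len + 1);
    if (s == NULL)
        return NULL;
    if (fread(s, 1, len, f) != len) {   /* spaces and tabs are read as-is */
        free(s);
        return NULL;
    }
    s[len] = '\0';
    return s;
}
```

Because the reader takes exactly the stated number of bytes, whitespace inside the string survives the round trip, which is the case the extraction operator gets wrong.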
quote:
Original post by jwalker
i have a suggestion.
change the char some_text[] into a string using stl
std::string some_text.

although this may seem more complex, but the stl string can be serialised into a file and read back as though its a normal char array.


quote:
Original post by Kylotan
Actually, it can't, and this is something I was complaining about a few weeks back. If you do this in one part of your program:
someFile << myString;
and then this later in the program:
someFile >> myString;
using the same file, same variable etc, you are not guaranteed to get the same string back. This is because the extraction operator stops reading when it hits whitespace.

If you're reading in binary mode, you can just store the terminating '\0' at the end of the string, and use that as the delimiter with fscanf (or whatever it's called: I don't use C stdio) or getline in C++. If you need it to be in text mode, I can only suggest that, before writing every string, you also write an int first indicating how long the string is. This lets your loader know how many characters to expect.



If you want to use std::string, and you do, and you want a line, then you can use getline(istream, std::string); it's overloaded. This still leaves you with the problem that you lose random access to your file; in other words, no seekg calls.

There is no good way around this. The best approach is to have two files: a file of fixed-length records, where each record holds an offset and length into another file that contains the variable-length strings. It is generally worth your time to grab as many of the fixed-length records as possible before attaching the variable-length "memo" fields. Seeks in general, but particularly on alternating files, require moving the heads. This can really start to suck if you are reading from DVDs or CD-ROMs.

The whole technique can also be used to create an index into a text file. Have a file of fixed-length records that consists solely of offset,length pairs. Then you can seekg to a record, read a large, fixed-length chunk out of your text file, and then point a stringstream at that chunk. The rough C equivalent would be sscanf on the chunk. It's also relatively trivial to have your program compare the dates of your text file and index, and regenerate the index automatically.
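
A rough C sketch of the index idea follows. The struct index_entry layout and the build_index and fetch_line names are assumptions made for illustration. The index here holds one entry per line of the text file, and both files are assumed to be opened in binary mode so that byte offsets are stable. The entries are written with fwrite as raw structs, so the index is only meant to be read back by the same build of the program.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* One fixed-length index record: where a line starts and how long it is. */
struct index_entry {
    long offset;   /* byte offset of the line in the text file */
    long length;   /* length in bytes, excluding the newline */
};

/* Scan text from the start and append one index_entry per line to index.
   Returns the number of entries written, or -1 on a write error. */
long build_index(FILE *text, FILE *index)
{
    assert(text != NULL && index != NULL);
    rewind(text);

    struct index_entry e = { 0, 0 };
    long count = 0;
    int c;
    while ((c = fgetc(text)) != EOF) {
        if (c == '\n') {
            if (fwrite(&e, sizeof e, 1, index) != 1)
                return -1;
            count++;
            e.offset += e.length + 1;   /* skip past the newline */
            e.length = 0;
        } else {
            e.length++;
        }
    }
    if (e.length > 0) {                 /* final line without a newline */
        if (fwrite(&e, sizeof e, 1, index) != 1)
            return -1;
        count++;
    }
    return count;
}

/* Fetch line number n (0-based) using two seeks: one into the index,
   one into the text. Returns a malloc'd string, or NULL on failure. */
char *fetch_line(FILE *text, FILE *index, long n)
{
    assert(text != NULL && index != NULL && n >= 0);
    struct index_entry e;
    if (fseek(index, n * (long)sizeof e, SEEK_SET) != 0 ||
        fread(&e, sizeof e, 1, index) != 1)
        return NULL;

    char *line = malloc((size_t)e.length + 1);
    if (line == NULL)
        return NULL;
    if (fseek(text, e.offset, SEEK_SET) != 0 ||
        fread(line, 1, (size_t)e.length, text) != (size_t)e.length) {
        free(line);
        return NULL;
    }
    line[e.length] = '\0';
    return line;
}
```

Building the index still scans the text once, but after that any record can be fetched without reading the ones before it.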

Optimizing load times is generally the last thing you want to do when you are making a game. Not that I don't wish people would spend more time on it. You will save much more programmer time while developing by having your data in a simple form that you can edit with the exact same program you use to edit your source files.

Edited by - grib on August 23, 2001 2:46:26 PM
Yup yup.

I think getline(istream, std::string) will work.
I never used variable-size strings with files because of this problem. I always use fixed-size ones, which makes things a whole lot easier, but it wastes memory sometimes. I think HairBear will be mighty confused now .. hehehe



{ Stating the obvious never helped any situation !! }
Have you looked at the strtok() function?
If your file is text, then strtok() will parse your buffer and return a token each time it encounters a delimiter character.
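
Here is a small sketch of how strtok might be used for this. The helper name split_fields and the | delimiter in the usage note are assumptions. Two things are worth knowing first: strtok writes '\0' over the delimiters, so the buffer must be writable, and it collapses consecutive delimiters, so an empty field in the middle of a line silently disappears.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Split buffer in place on any character in delims, storing up to
   max_tokens pointers in tokens. Returns the number of tokens stored.
   The pointers refer into buffer, so buffer must outlive them. */
size_t split_fields(char *buffer, const char *delims,
                    char *tokens[], size_t max_tokens)
{
    assert(buffer != NULL && delims != NULL && tokens != NULL);

    size_t count = 0;
    char *tok = strtok(buffer, delims);
    while (tok != NULL && count < max_tokens) {
        tokens[count++] = tok;
        tok = strtok(NULL, delims);   /* NULL continues with the same buffer */
    }
    return count;
}
```

Called on a writable copy of "K01|hello world|42" with "|" as the delimiter set, this stores three pointers: "K01", "hello world" and "42".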

Shawn
When using the Windows calculator program, always remember to clear any values from memory before exiting to prevent burn-in.
Thanks for all the advice. I am currently looking at strtok() as an option. However, someone has also suggested I include the file I need to read into my .exe as a header file. That way I don't even need to worry about reading the information in - I can just assign each item of data directly to the relevant field.

i.e.

struct {
char field1[8]; /* "woohooo" is 7 characters plus the terminator */
char field2[];
} my_struct = { "woohooo","this works" };

Crude, but effective
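
If the data is compiled in, a table of structs is one way to lay it out. This is a sketch under assumptions: struct entry, entries and lookup are invented names, and the text field is a const char * rather than char field2[], because standard C does not allow a flexible array member to be initialized (GCC accepts it as an extension).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical entry: a fixed-size key and a pointer to the text.
   A pointer is used because a string literal of any length can be
   pointed at, so every entry has the same size. */
struct entry {
    char key[4];          /* three characters plus the terminator */
    const char *text;
};

/* In a real project this table might live in a generated header. */
static const struct entry entries[] = {
    { "K01", "this works" },
    { "K02", "so does a much longer piece of text" },
};

static const size_t entry_count = sizeof entries / sizeof entries[0];

/* Return the text for key, or NULL if the key is not in the table. */
const char *lookup(const char *key)
{
    assert(key != NULL);
    for (size_t i = 0; i < entry_count; i++) {
        if (strcmp(entries[i].key, key) == 0)
            return entries[i].text;
    }
    return NULL;
}
```

The trade-off is that changing the data means recompiling, which may or may not matter for a small project.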


Hair
Hair

This topic is closed to new replies.
