File handling in Delphi Object Pascal

With new users purchasing Delphi every single day, it’s not uncommon for me to meet users that are new to the Object Pascal language. One such new user contacted me recently with questions about reading and writing structured data to files on disk.
In actual fact, this customer was quite specific about the file formats of interest.

  1. Flat files of fixed length records with fixed length fields.
  2. Variable length fields / records where the file contains the size of a field, and then its data.
  3. Character delimited files such as CSV (comma separated values).

*A warning to advanced readers: this post is not for you.

None of these file formats is especially common anymore. Modern applications tend to use a known standard such as XML or JSON, for which classes are provided with Delphi. I can still see the value in using the older file types, however, for example for interoperability with older systems. There are also a few lessons about file handling to be learnt along the way. So let’s take a look at a solution to each of these file types.

Flat files of fixed length records.

In order to answer this question, I turned to Delphi Basics: http://www.delphibasics.co.uk/
Delphi Basics is an excellent resource for users new to Object Pascal. It functions as a great reference to the fundamental syntax features and available units and classes. While it is not a tutorial website, I would recommend that every new user bookmark it in their browser!

This article http://www.delphibasics.co.uk/Article.asp?Name=Files contains a section entitled “Reading and writing to typed binary files” which contains an example of working with flat files of fixed length records. I modified the sample slightly to run as a command-line program:

program structuredbinary;
{$APPTYPE CONSOLE}
{$R *.res}
uses
  System.SysUtils;

type
  TCustomer = record
    name : string[20];
    age  : Integer;
    male : Boolean;
  end;

var
  myFile   : File of TCustomer;  // A file of customer records
  customer : TCustomer;          // A customer record variable

begin
  // Try to open the Test.cus binary file for writing to
  AssignFile(myFile, 'Test.cus');
  ReWrite(myFile);

  // Write a couple of customer records to the file
  customer.name := 'Fred Bloggs';
  customer.age  := 21;
  customer.male := true;
  Write(myFile, customer);

  customer.name := 'Jane Turner';
  customer.age  := 45;
  customer.male := false;
  Write(myFile, customer);

  // Close the file
  CloseFile(myFile);

  // Reopen the file in read only mode
  FileMode := fmOpenRead;
  Reset(myFile);

  // Display the file contents
  while not Eof(myFile) do begin
   Read(myFile, customer);
   if customer.male then begin
     Writeln('Man with name '+customer.name+' is '+IntToStr(customer.age));
   end else begin
     Writeln('Lady with name '+customer.name+' is '+IntToStr(customer.age));
   end;
  end;

  // Close the file for the last time
  CloseFile(myFile);
  Readln;
end.

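In this example you can see that the file ‘myFile’ uses the datatype ‘File of TCustomer’ where ‘TCustomer’ is a record with a fixed number of bytes. The ‘name’ field is declared as ‘string[20]’, which is a short string: one length byte followed by up to twenty single-byte (ANSI) characters. It therefore occupies 21 bytes, and does not use the UTF-16 encoding of the default ‘string’ type. This is followed by a 32-bit integer for the field ‘age’ and a single byte for the boolean field ‘male’ to represent gender. With the compiler’s default record alignment, padding is added so that ‘age’ starts on a four-byte boundary, and SizeOf(TCustomer) typically reports 32 bytes.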

When using the ‘File of…’ data types, the compiler will assume you are referring to a flat binary file containing nothing but repetitions of the data type which you specify. This is convenient, and particularly useful for records of fixed length which are to be read sequentially.
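Because every record occupies the same number of bytes, a typed file can also be read out of order. The following is a minimal sketch of that idea, assuming the same ‘TCustomer’ record as above; the program name and the file name ‘Seek.cus’ are made up for illustration. It prints the record size the compiler actually uses, then jumps straight to the second record with Seek().

```pascal
program typedfileseek;
{$APPTYPE CONSOLE}
uses
  System.SysUtils;

type
  TCustomer = record
    name : string[20];
    age  : Integer;
    male : Boolean;
  end;

var
  myFile   : File of TCustomer;
  customer : TCustomer;

begin
  // The size of one record on disk, as the compiler lays it out
  Writeln('Record size in bytes: ', SizeOf(TCustomer));

  // Write two records ('Seek.cus' is a hypothetical file name)
  AssignFile(myFile, 'Seek.cus');
  Rewrite(myFile);
  try
    customer.name := 'Fred Bloggs';
    customer.age  := 21;
    customer.male := True;
    Write(myFile, customer);

    customer.name := 'Jane Turner';
    customer.age  := 45;
    customer.male := False;
    Write(myFile, customer);
  finally
    CloseFile(myFile);
  end;

  // Reopen the file and read the second record only
  Reset(myFile);
  try
    // For typed files, FileSize and Seek count records, not bytes
    Writeln('Records in file: ', FileSize(myFile));
    Seek(myFile, 1);           // records are numbered from 0
    Read(myFile, customer);
    Writeln(customer.name, ' is ', customer.age);
  finally
    CloseFile(myFile);
  end;
end.
```

Printing SizeOf(TCustomer) is a quick way to check the layout on your own compiler settings before exchanging such files with another system.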

Files with variable length fields.

The second type of file of interest is a file with variable length fields. This gives us an opportunity to look at a more modern method of storing data to files, using streams. I took the example from the first file type above, and rewrote it as follows…

program structuredbinarystream;
{$APPTYPE CONSOLE}
{$R *.res}
uses
  System.Classes,
  System.SysUtils;

type
  TCustomer = record
    name : string;
    age  : Integer;
    male : Boolean;
  end;

procedure WriteCustomerToStream( customer: TCustomer; FS: TStream );
var
  strLength: integer;
  idx: integer;
  ch: char;
begin
  // get the length of the name field.
  strLength := Length(customer.name);
  // write the length
  FS.Write(strLength,sizeof(strLength));
  // write the string a character at a time
  for idx := 1 to strLength do begin
    ch := customer.name[idx];
    FS.Write(ch,sizeof(ch));
  end;
  // write the age and gender
  FS.Write(customer.age,sizeof(customer.age));
  FS.Write(customer.male,sizeof(customer.male));
end;

procedure ReadCustomerFromStream( var customer: TCustomer; FS: TStream );
var
  strLength: integer;
  idx: integer;
  ch: char;
begin
  // read length of name field.
  FS.Read(strLength,sizeof(strLength));
  //reading back string a character at a time...
  customer.name := '';
  for idx := 1 to strLength do begin
    FS.Read(ch,sizeof(ch));
    customer.name := customer.name + ch;
  end;
  // reading back age and gender.
  FS.Read(customer.age,sizeof(customer.age));
  FS.Read(customer.male,sizeof(customer.male));
end;


var
  FS: TFileStream;
  customer : TCustomer;          // A customer record variable

begin
  // Try to open the Test.cus binary file for writing to
  FS := TFileStream.Create('Test.cus',fmCreate);
  try
    // Write a couple of customer records to the file
    customer.name := 'Fred Bloggs';
    customer.age  := 21;
    customer.male := true;
    WriteCustomerToStream(customer,FS);
    customer.name := 'Jane Turner';
    customer.age  := 45;
    customer.male := false;
    WriteCustomerToStream(customer,FS);
  finally
    FS.Free;
  end;


  // Reopen the file in read only mode
  FS := TFileStream.Create('Test.cus',fmOpenRead);
  try
    while FS.Position<FS.Size do begin
      ReadCustomerFromStream( customer, FS );
      if customer.male then begin
        Writeln('Man with name '+customer.name+' is '+IntToStr(customer.age));
      end else begin
        Writeln('Lady with name '+customer.name+' is '+IntToStr(customer.age));
      end;
    end;
  finally
    FS.Free;
  end;

  // key to finish
  Readln;
end.

In this program I’m using the ‘TFileStream’ class to write to, and then read from, the file sequentially. The ‘TCustomer’ data type now has a variable length string field for ‘name’. I’ve added two procedures, one for writing a ‘TCustomer’ record to a stream, and another to read a ‘TCustomer’ from a stream. In each of them, the name field is handled using a loop to read or write one character (two bytes) at a time.

In the WriteCustomerToStream() procedure, I first measure the length of the string (in characters) and write that value to the stream, followed immediately by each individual character. In ReadCustomerFromStream() I am reading the number of characters back from the stream first, and then immediately loading that number of characters from the stream. This is how we allow for the varying length of data for this field.
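The character-at-a-time loop is easy to follow, but the same length-prefixed layout can be produced with a single call per string. Here is a minimal sketch of that variation, assuming the default two-byte ‘Char’; the helper names WriteStringToStream() and ReadStringFromStream() are hypothetical and not part of the original sample. It uses WriteBuffer() and ReadBuffer(), which raise an exception if the requested number of bytes cannot be transferred.

```pascal
program bulkstringstream;
{$APPTYPE CONSOLE}
uses
  System.Classes,
  System.SysUtils;

// Hypothetical helper: writes the character count, then all characters at once.
procedure WriteStringToStream(const Value: string; Stream: TStream);
var
  strLength: Integer;
begin
  strLength := Length(Value);
  Stream.WriteBuffer(strLength, SizeOf(strLength));
  if strLength > 0 then
    Stream.WriteBuffer(Pointer(Value)^, strLength * SizeOf(Char));
end;

// Hypothetical helper: reads the character count, then all characters at once.
function ReadStringFromStream(Stream: TStream): string;
var
  strLength: Integer;
begin
  Stream.ReadBuffer(strLength, SizeOf(strLength));
  if strLength < 0 then
    raise EReadError.Create('Invalid string length in stream');
  SetLength(Result, strLength);
  if strLength > 0 then
    Stream.ReadBuffer(Pointer(Result)^, strLength * SizeOf(Char));
end;

var
  MS: TMemoryStream;

begin
  // A memory stream keeps the sketch self-contained; a TFileStream works the same way
  MS := TMemoryStream.Create;
  try
    WriteStringToStream('Fred Bloggs', MS);
    MS.Position := 0;                    // rewind before reading back
    Writeln(ReadStringFromStream(MS));
  finally
    MS.Free;
  end;
end.
```

The bytes written are the same as those produced by the loop, so files written one way can be read the other way.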

Using streams to read and write data is a good modern way to handle reading and writing files. Here are some of the reasons why you *should* use streams:

  1. TFileStream is descended from TStream. In my example code above you’ll notice that the procedures WriteCustomerToStream() and ReadCustomerFromStream() take a TStream parameter, not a TFileStream. This allows any descendant of TStream to be used. Instead of writing data to a file, what if you wanted to write it to a database blob field using a TBlobStream class? Well, because those procedures work on the base class TStream, you can simply pass your blob stream class to them. Similarly you might send the data over a network using a network stream class.
  2. The TFileStream class abstracts you from the underlying operating system calls for reading and writing files. This code is therefore portable to other platforms without change (provided the correct implementation of TFileStream is available for that platform).
  3. In the example the TCustomer record could have been a class, and the WriteCustomerToStream() and ReadCustomerFromStream() procedures could have been methods of that class. In fact, renaming these to SaveToStream() and LoadFromStream() respectively, and then adding these methods to a base class, permits some great structured data nesting options. A similar system is used by the Delphi IDE to save forms to, and load forms from, files, in processes named ‘serialization’ (structured data to stream) and ‘deserialization’ (structured data from stream).
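To make the first and third points concrete, here is a rough sketch of ‘TCustomer’ as a class with SaveToStream() and LoadFromStream() methods. The class layout is my own assumption for illustration, and a TMemoryStream stands in for the file to show that only the TStream base class is required.

```pascal
program customerclassstream;
{$APPTYPE CONSOLE}
uses
  System.Classes,
  System.SysUtils;

type
  // Hypothetical class version of the TCustomer record
  TCustomer = class
  public
    Name: string;
    Age: Integer;
    Male: Boolean;
    procedure SaveToStream(Stream: TStream);
    procedure LoadFromStream(Stream: TStream);
  end;

procedure TCustomer.SaveToStream(Stream: TStream);
var
  strLength: Integer;
begin
  // Length prefix first, then the characters, then the fixed-size fields
  strLength := Length(Name);
  Stream.WriteBuffer(strLength, SizeOf(strLength));
  if strLength > 0 then
    Stream.WriteBuffer(Pointer(Name)^, strLength * SizeOf(Char));
  Stream.WriteBuffer(Age, SizeOf(Age));
  Stream.WriteBuffer(Male, SizeOf(Male));
end;

procedure TCustomer.LoadFromStream(Stream: TStream);
var
  strLength: Integer;
begin
  Stream.ReadBuffer(strLength, SizeOf(strLength));
  if strLength < 0 then
    raise EReadError.Create('Invalid name length in stream');
  SetLength(Name, strLength);
  if strLength > 0 then
    Stream.ReadBuffer(Pointer(Name)^, strLength * SizeOf(Char));
  Stream.ReadBuffer(Age, SizeOf(Age));
  Stream.ReadBuffer(Male, SizeOf(Male));
end;

var
  MS: TMemoryStream;
  saved, loaded: TCustomer;

begin
  MS := TMemoryStream.Create;
  saved := TCustomer.Create;
  loaded := TCustomer.Create;
  try
    saved.Name := 'Jane Turner';
    saved.Age  := 45;
    saved.Male := False;
    saved.SaveToStream(MS);    // any TStream descendant can be passed here

    MS.Position := 0;          // rewind before reading back
    loaded.LoadFromStream(MS);
    Writeln(loaded.Name, ' is ', loaded.Age);
  finally
    loaded.Free;
    saved.Free;
    MS.Free;
  end;
end.
```

Replacing the TMemoryStream with a TFileStream needs no change to the class itself, which is the point of writing against TStream.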

CSV files

Handling CSV files correctly should be done using streams as in the above example, combined with a simple parser to ensure the CSV format is adhered to. For example, many CSV formats permit commas inside content data, provided that the content data is surrounded by quotation characters. Some intelligence in the form of a parser is necessary to handle such situations. Having already provided the streaming example above however, parsing the data structure really is another exercise. So for this file type I provided the following ‘hack’ method (of course, explaining that it is such)…

program stringlists;
{$APPTYPE CONSOLE}
{$R *.res}
uses
  System.Classes,
  System.SysUtils;

const
  CRLF = #13 + #10; //- CR and LF characters, ASCII 13, 10 in decimal
  TAB = #09; // TAB character

  // content of the file...
  cFileContent = 'a,b,c' + CRLF +
                 '1,2,3' + CRLF +
                 '4,5,6' + CRLF +
                 '7,8,9' + CRLF;


var
  FileContent: TStringList;
  Fields: TStringList;
  idx: longint;
  idy: longint;

begin
  // First we'll save the CSV content from cFileContent into a file 'testfile.csv'
  FileContent := TStringList.Create;
  try
    FileContent.Text := cFileContent;
    FileContent.SaveToFile('testfile.csv');
  finally
    FileContent.Free;
  end;

  // Now load the file back into memory...
  FileContent := TStringList.Create;
  try
    FileContent.LoadFromFile('testfile.csv');
    // Now let's parse the field content of each line of the file...
    for idx := 0 to pred(FileContent.Count) do begin
      Fields := TStringList.Create;
      try
        Fields.Delimiter := ','; // <--- We're using a comma to delimit fields.
        Fields.DelimitedText := FileContent.Strings[idx]; // <-- parse one line of file.
        for idy := 0 to pred(Fields.Count) do begin
          Write(Fields[idy]);
          // if it's not the last field, add a comma..
          if idy<pred(Fields.Count) then begin
            Write(TAB + ',' + TAB);
          end;
        end;
        Writeln; // new line
      finally
        Fields.Free;
      end;
    end;

  finally
    FileContent.Free;
  end;
  // key to finish
  Readln;
end.

This method really isn’t good code!

What I’ve done in this program is to use the properties and methods of the TStringList class to handle saving data to a file, and loading it back. I’ve also used the TStringList class to parse each record using the ‘DelimitedText’ property, which separates the string at each ‘Delimiter’, in this case a comma. The reason why I call this bad code is that it doesn’t reliably handle the parsing scenarios that I mentioned above. For example, unless the ‘StrictDelimiter’ property is set to True, ‘DelimitedText’ also splits fields at spaces, so a value such as ‘Fred Bloggs’ would become two fields. And because the file is split into lines first, a quoted field that contains a line break is broken apart. That being said, if you have a very simple CSV format file such as the one used in this sample, this quick-trick method can save you some time doing the parsing yourself.
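For the quoted-field case, a small hand-written splitter is one way to go. What follows is only a sketch under stated assumptions: the helper SplitCsvLine() is hypothetical, it assumes double quotes as the quotation character with a doubled quote meaning a literal quote, it assumes one-based string indexing, and it does not handle quoted fields that span several lines.

```pascal
program csvsplit;
{$APPTYPE CONSOLE}
uses
  System.Classes,
  System.SysUtils;

// Hypothetical helper: splits one CSV line into Fields.
// Commas inside a double-quoted field are kept as data, and a doubled
// quote ("") inside a quoted field is read as one literal quote.
procedure SplitCsvLine(const Line: string; Fields: TStrings);
var
  idx: Integer;
  ch: Char;
  current: string;
  inQuotes: Boolean;
begin
  Fields.Clear;
  current := '';
  inQuotes := False;
  idx := 1;
  while idx <= Length(Line) do begin
    ch := Line[idx];
    if inQuotes then begin
      if ch = '"' then begin
        if (idx < Length(Line)) and (Line[idx + 1] = '"') then begin
          current := current + '"';   // escaped quote inside a quoted field
          Inc(idx);                   // skip the second quote of the pair
        end else begin
          inQuotes := False;          // closing quote
        end;
      end else begin
        current := current + ch;      // commas land here while quoted
      end;
    end else begin
      if ch = '"' then begin
        inQuotes := True;             // opening quote
      end else if ch = ',' then begin
        Fields.Add(current);          // end of field
        current := '';
      end else begin
        current := current + ch;
      end;
    end;
    Inc(idx);
  end;
  Fields.Add(current);                // last field on the line
end;

var
  Fields: TStringList;
  idx: Integer;

begin
  Fields := TStringList.Create;
  try
    SplitCsvLine('1,"Bloggs, Fred","said ""hi"""', Fields);
    for idx := 0 to pred(Fields.Count) do begin
      Writeln(idx, ': ', Fields[idx]);
    end;
  finally
    Fields.Free;
  end;
  // Expected fields: 1 / Bloggs, Fred / said "hi"
end.
```

A character-by-character state machine like this keeps spaces inside fields intact and makes the quoting rules explicit, at the cost of a little more code than ‘DelimitedText’.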

For beginners to Object Pascal, the above samples should work if you copy and paste the code into a new “Command-Line” project. I didn’t go into every detail, and leave it as an exercise for you to try out, and to study the examples. *hint* Be sure to check out Delphi Basics as a reference! http://www.delphibasics.co.uk/

Thanks for reading!
