Sequential access files

Sequential access files were the first digital storage solution. These files impose a certain structure on the stored data, but they are still a long way from databases.

Let’s put ourselves at the beginning of the second half of the twentieth century, when our protagonist Antonio begins to digitize the data of his business, the SuperGades store. The only option available then was to copy the data from a paper file to a digital file, using a very simple word processor with very few features, in which we would write the data without any formatting. This file is also called an archive, taking its name from its analogue predecessor, a physical filing cabinet with drawers where we kept the folders with the data.

If we were storing supplier data, the file might look similar to:

Smith fruits
Peter Smith
607454545
Southern vegetables
Williams Doors
652854874
Green way Sugar
Elisabeth Roberts
622525885

In this file, we have written the data of our suppliers one after the other, as we would in a notebook. We have generated a sequential file, which gets its name because access to the data is sequential: we have to go through the file from the beginning until we reach the data we want to read.

The EOF mark (End of File)

This way of working forced us to include some kind of mark to indicate that the end of the file had been reached. Otherwise, we would get an error when trying to read data beyond the end of the file. This mark is the end-of-file character.

Writing data to the file and closing it caused this character to be added automatically at the end of the file.

Programming languages use the EOF character in data reading processes. They go through the file sequentially, line by line, inside a loop until they read the EOF mark.
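As an illustration, such a reading loop could look like the following sketch in VBA, the language used in the exercise at the end of this post. The procedure name PrintFileLineByLine and the path C:\test\Suppliers.txt are assumptions made for the example; any existing text file would do.

```vba
Sub PrintFileLineByLine()
    ' Hypothetical path: adjust it to a text file that exists on your machine
    Const FILE_PATH As String = "C:\test\Suppliers.txt"

    Dim fileNum As Integer
    Dim currentLine As String

    fileNum = FreeFile                      ' next free file number
    Open FILE_PATH For Input As #fileNum

    ' EOF returns True once the end of the file has been reached
    Do While Not EOF(fileNum)
        Line Input #fileNum, currentLine    ' reads one line, without the line break
        Debug.Print currentLine
    Loop

    Close #fileNum
End Sub
```

The loop only ever moves forward: each Line Input call returns the next line, and the EOF check stops it before it tries to read past the end of the file.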

Limitations of sequential files

The use of sequential files for data handling has many limitations:

  1. Sequential access: as mentioned before, access to the data was sequential. We had to go through the entire file from its first line until we reached the data we wanted to retrieve. This was the only way to move around the file: we always advanced to the next line and could not go back. If we wanted to retrieve data that we had already passed, we had to start the sequential reading again from the first line.
  2. Opening modes: when accessing the file, we have to open it in either write or read mode, depending on the operation we want to perform. Thus, we cannot read and write data at the same time; each opening allows only one type of operation. To change operation we have to close the file and open it again in the corresponding mode.

Reads can be partial, since the file is not modified. However, when we open a sequential file in write mode we delete all its previous content and generate it again from the beginning. For example, to delete a record, we rewrite the entire file except for the record we want to delete.

Some programming languages also allow writing after the end of the existing content, without deleting it. A very common name for this opening mode is Append.

  3. Exclusive access: only one user can work on the file at a given time. For a new user to access the file, the current user has to close it first. In other words, multiple users cannot access the file simultaneously.
  4. Fixed structure of fields: working with this type of file forces us to handle a fixed data structure. In the example of the supplier file, we first saved the name of the company, then the name of the contact person and finally the phone number.

When we go through the file, we simply read data, but we do not know what data we are reading. So if we accidentally changed the order of the fields when recording a supplier, for example by placing the contact first and then the company name, the contact would be taken as the name of the company and the company name as the contact.
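The rewrite-to-delete approach described under opening modes can be sketched in VBA as follows. This is only an illustration under stated assumptions: the procedure name DeleteSupplier and both paths are hypothetical, and the file is assumed to hold complete three-line records exactly as in the listing above (company, contact, phone), with no other lines in between.

```vba
Sub DeleteSupplier(ByVal companyName As String)
    ' Hypothetical paths; every record is assumed to be exactly three lines long
    Const SOURCE_PATH As String = "C:\test\Suppliers.txt"
    Const TEMP_PATH As String = "C:\test\Suppliers.tmp"

    Dim inFile As Integer, outFile As Integer
    Dim company As String, contact As String, phone As String

    ' The original file is opened only for reading...
    inFile = FreeFile
    Open SOURCE_PATH For Input As #inFile
    ' ...and a second file is opened only for writing
    outFile = FreeFile
    Open TEMP_PATH For Output As #outFile

    Do While Not EOF(inFile)
        ' Read one complete record (three consecutive lines)
        Line Input #inFile, company
        Line Input #inFile, contact
        Line Input #inFile, phone
        ' Copy every record except the one to be deleted
        If company <> companyName Then
            Print #outFile, company
            Print #outFile, contact
            Print #outFile, phone
        End If
    Loop

    Close #inFile
    Close #outFile

    ' Replace the original file with the rewritten copy
    Kill SOURCE_PATH
    Name TEMP_PATH As SOURCE_PATH
End Sub
```

Even to remove a single supplier, every other record has to be read and written again, which is exactly the cost the limitation describes. If a record were incomplete, one of the Line Input calls would fail with a run-time error when it reached the end of the file.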

Fields and records

The data storage structure can be defined in terms of fields and records. A field is a single piece of data of a given kind, and a record is the set of fields that define an element. Continuing with the example of the supplier file, the fields of our structure would be the name of the supplier, the contact person at the supplier and the telephone number. A record, in our case, would be a supplier.

To avoid errors with the data, our friend Antonio decides to make some improvements to the sequential files. He introduces a synchronism mark that indicates the end of a record. This mark can be anything we define, but a commonly used one is: <END>

With this improvement, the supplier file would be:

Smith fruits
Peter Smith
607454545
<END>
Southern vegetables
Williams Doors
652854874
<END>
Green way Sugar
Elisabeth Roberts
622525885
<END>

Logically, this mark could never be used as the value of a field, since that would lead to an error: we would interpret that field as the end of the record.
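To show how the mark helps to detect errors, here is a minimal VBA sketch that counts the fields found before each <END> mark and reports any record that does not have exactly three. The procedure name CheckRecordStructure, the constant names and the path are assumptions made for this example.

```vba
Sub CheckRecordStructure()
    Const FILE_PATH As String = "C:\test\Suppliers.txt"   ' hypothetical path
    Const END_MARK As String = "<END>"
    Const FIELDS_PER_RECORD As Integer = 3

    Dim fileNum As Integer
    Dim currentLine As String
    Dim fieldCount As Integer      ' fields read since the last mark
    Dim recordNumber As Integer    ' position of the current record in the file

    fileNum = FreeFile
    Open FILE_PATH For Input As #fileNum

    fieldCount = 0
    recordNumber = 1

    Do While Not EOF(fileNum)
        Line Input #fileNum, currentLine
        If currentLine = END_MARK Then
            ' End of a record: check how many fields it contained
            If fieldCount <> FIELDS_PER_RECORD Then
                Debug.Print "Record " & recordNumber & " has " & fieldCount & " fields"
            End If
            fieldCount = 0
            recordNumber = recordNumber + 1
        Else
            fieldCount = fieldCount + 1
        End If
    Loop

    ' Lines after the last mark belong to a record that was never closed
    If fieldCount > 0 Then
        Debug.Print "The last record is not closed with " & END_MARK
    End If

    Close #fileNum
End Sub
```

A check like this can find missing or extra fields, but it cannot tell whether two fields of a record were written in the wrong order.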

Physical support

At that time, Antonio would use magnetic tapes to record the data: a tape wound on a reel that contained magnetic particles to store the data. The tape was unwound and passed over a head for reading or writing, the same system used in the famous music cassettes. The way this system operated forced access to be sequential, running through the tape from beginning to end.

[Figure: Magnetic tapes]

Sequential File Example (with Visual Basic)

To better understand how sequential access files work, the best approach is to try them out ourselves. To do this, we are going to create a sequential file and then write and read data in it.

To keep it simple, and so that we do not have to install an IDE for any programming language, we are going to use VBA (Visual Basic for Applications), which is available on the classroom computers in the Data Architecture classes. It is included in Access, within the Microsoft Office Professional package.

The exercise includes 7 steps:

Step 1: Create a database

Create a new database and give it the name you want, for example: BBDD

Step 2: Create a new Module

Go to the tab “DATABASE TOOLS”, click on “Visual Basic”.

Then, in the Insert/Module menu, create a new module. In my case, I leave it with the default name: “Module1”, since we are only going to use it for practice purposes.

Step 3: Program a procedure to record the data

The first thing to do is create a new file with the data. For example, with the supplier data that Antonio handled. The procedure that would perform this task would be:

Sub RecordingInSequentialAccessFile()

On Error GoTo e

ChDrive ("C")
ChDir "C:\test"

Open "Suppliers.txt" For Output As #1

'recording suppliers data
Print #1, "Smith fruits"
Print #1, "Peter Smith"
Print #1, "607454545"
Print #1, "<END>"
Print #1, "Southern vegetables"
Print #1, "Williams Doors"
Print #1, "652854874"
Print #1, "<END>"
Print #1, "Green way Sugar"
Print #1, "Elisabeth Roberts"
Print #1, "622525885"
Print #1, "<END>"

'Closing the file:
Close #1
MsgBox ("Saved 3 supplier records")
Exit Sub
e:
MsgBox (Err.Description)
End Sub

I will not explain in detail how the procedure works, since this is not a programming course. It is enough to say that the procedure creates a file in the specified path and then writes the data to it sequentially. The folder C:\test must already exist; otherwise the procedure stops and shows an error message.

The order of the fields has to be respected, and each record has to end with the synchronism mark: “<END>”
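Since opening the file For Output regenerates it from scratch, adding one more supplier later needs a different opening mode. The following sketch uses the Append mode of the VBA Open statement for that. The procedure name AppendSupplier and its parameters are made up for this example, and the path assumes the same C:\test folder used above.

```vba
Sub AppendSupplier(ByVal company As String, ByVal contact As String, ByVal phone As String)
    ' Assumed location of the file created by RecordingInSequentialAccessFile
    Const FILE_PATH As String = "C:\test\Suppliers.txt"

    Dim fileNum As Integer
    fileNum = FreeFile

    ' For Append keeps the existing content and writes after it
    Open FILE_PATH For Append As #fileNum

    ' Same field order as the other records, closed by the synchronism mark
    Print #fileNum, company
    Print #fileNum, contact
    Print #fileNum, phone
    Print #fileNum, "<END>"

    Close #fileNum
End Sub
```

From the Immediate window it could be called, for example, as AppendSupplier "Northern Dairy", "Anna Lee", "600000000" (invented sample values).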

Step 4: Run the procedure

From the Immediate window, we run the procedure.

As a result, the suppliers file: “Suppliers.txt” is generated in the indicated path. If we open the file we will see how the data has been recorded sequentially.

[Figure: Suppliers data]

Step 5: Program a procedure for reading the data

In my case, I call it “Reading” and it would look like this:

Sub Reading()

On Error GoTo e

ChDrive ("C")
ChDir "C:\test"

Dim line As String
Dim MyChar As String
Dim DrawLine As Boolean

Open "Suppliers.txt" For Input As #1

'String to store the line read
line = ""
'Boolean that indicates whether the line has to be printed or not.
'The line is built up character by character. DrawLine becomes True when the end of the line is reached
DrawLine = False

'Loop until the end of the file
Do While Not EOF(1)
    MyChar = Input(1, #1)
    If MyChar = Chr(13) Then
        'Carriage return: the line is complete
        DrawLine = True
    ElseIf (MyChar <> Chr(10)) Then
        'Any character other than line feed is added to the line
        line = line & MyChar
    End If
    If DrawLine Then
        Debug.Print line
        line = ""
        DrawLine = False
    End If
Loop

Close #1

Exit Sub
e:
MsgBox (Err.Description)

End Sub

When we execute the procedure from the Immediate window, all the data in the file is printed there. We have read the file sequentially, character by character, and printed each of its lines.

Step 6: Program a procedure to search for a supplier and retrieve its data

I call the procedure “SearchingForSupplier”, and its code would be:

Sub SearchingForSupplier()

On Error GoTo e

ChDrive ("C")
ChDir "C:\test"

'Each variable needs its own type; otherwise VBA declares it as Variant
Dim line As String, MyChar As String, supplier As String
Dim DrawLine As Boolean, found As Boolean
Dim numFieldsDrawn As Integer

'save in supplier the name of the supplier introduced by the user
supplier = InputBox("Introduce the name of the supplier", "Search")

Open "Suppliers.txt" For Input As #1

line = ""
DrawLine = False
found = False
numFieldsDrawn = 0

'Going through the file, character by character until EOF mark
Do While Not EOF(1)
    MyChar = Input(1, #1)
    If MyChar = Chr(13) Then
        DrawLine = True
    ElseIf (MyChar <> Chr(10)) Then
        line = line & MyChar
    End If
    'DrawLine is true once a complete line has been read
    If DrawLine Then
        'Compare only complete lines with the name we are looking for,
        'so that a partial match such as "Smith" is not accepted
        If line = supplier Then
            found = True
        End If
        'Once found, the record is printed. It has three fields, three lines.
        If found Then
            Debug.Print line
            numFieldsDrawn = numFieldsDrawn + 1
        End If
        If numFieldsDrawn = 3 Then
            found = False
        End If
        line = ""
        DrawLine = False
    End If
Loop

Close #1

'If the supplier that the user is looking for is not found, it is indicated by a message
If numFieldsDrawn = 0 Then
    MsgBox "Supplier not found"
End If

Exit Sub
e:
MsgBox (Err.Description)

End Sub

When we execute it, a dialogue box appears asking for the name of the supplier we are looking for.

The operation is similar to the previous procedure, in which we read the lines of the file. In this case, however, the procedure compares each line read with the name of the supplier entered by the user and, if they match, prints that line and the next two, since the complete record of a supplier has three fields.

For example, looking for the supplier: “Southern vegetables”, the result would be:

[Figure: Searching for supplier. The Immediate window shows the lines Southern vegetables, Williams Doors and 652854874.]

Step 7: Forcing an error

As I have already mentioned, the structure of sequential files is very strict. In this example, if we made a mistake and entered “Williams Doors” as the supplier instead of “Southern vegetables”, we would not obtain the complete record but a different set of lines, because we did not start reading at the first field of the record. In this case, the result would be:

[Figure: Forcing an error. The Immediate window shows the lines Williams Doors, 652854874 and <END>.]
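One way to avoid this kind of mismatch is to compare the name only with the first field of each record, using the <END> mark to know where each record starts. The following is a minimal sketch of that idea and not part of the original exercise. The procedure name SearchingByCompanyName and its variable names are made up for illustration, and it assumes the file generated in Step 3.

```vba
Sub SearchingByCompanyName()
    Const FILE_PATH As String = "C:\test\Suppliers.txt"   ' file from Step 3
    Const END_MARK As String = "<END>"

    Dim fileNum As Integer
    Dim currentLine As String, supplier As String
    Dim fieldIndex As Integer      ' position of the current line inside its record
    Dim printing As Boolean, found As Boolean

    supplier = InputBox("Introduce the name of the supplier", "Search")

    fileNum = FreeFile
    Open FILE_PATH For Input As #fileNum

    fieldIndex = 0
    Do While Not EOF(fileNum)
        Line Input #fileNum, currentLine
        If currentLine = END_MARK Then
            ' A mark closes the record: the next line is a first field again
            fieldIndex = 0
            printing = False
        Else
            fieldIndex = fieldIndex + 1
            ' Only the first field (the company name) is compared
            If fieldIndex = 1 And currentLine = supplier Then
                printing = True
                found = True
            End If
            If printing Then Debug.Print currentLine
        End If
    Loop

    Close #fileNum

    If Not found Then MsgBox "Supplier not found"
End Sub
```

With this version, entering “Williams Doors” shows the “Supplier not found” message, because that value only appears as a contact and never as the first field of a record.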

After completing the 7 steps of this exercise, I’m sure you have an idea of how sequential access files work. This was the first solution for storing digital data, and its limitations were overcome by random access files, which will be the subject of another post.

NOTE:

This post is part of the collection “Data Access and Storage Systems”. You can see the index of this collection here
