Sequential access files were the first digital storage solution. They give the stored data a certain structure, but they are still a long way from databases.
Let’s put ourselves at the beginning of the second half of the twentieth century, when our protagonist Antonio begins to digitize the data of his business, the SuperGades store. The only option available then was to copy the data from a paper file to a digital file, using a very simple word processor with very few features, in which we would write the data without any formatting. This file is also called an archive, a name taken from its analogue predecessor: the physical filing cabinet with drawers where the paper records were kept.
If we were storing supplier data, the file might look similar to:
Smith fruits
Peter Smith
607454545
Southern vegetables
Williams Doors
652854874
Green way Sugar
Elisabeth Roberts
622525885
In this file, we have written the data of our suppliers one after the other, as we would in a notebook. We have generated a sequential file, so called because access to the data is sequential: to read a given item, we have to go through the file from the beginning until we reach it.
The EOF mark (End of File)
This way of working forced us to include some kind of mark to indicate that the end of the file had been reached. Otherwise, we would get an error when trying to read data beyond the end of the file. This mark is the end-of-file character.
Writing data to the file and closing it caused this character to be added automatically at the end of the file.
Programming languages use the EOF mark when reading data. They go through the file sequentially, line by line, inside a loop, until they reach the EOF mark. On most current systems the end of the file is detected from the file's recorded length rather than from a stored character, but the reading loop looks the same.
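The reading loop described above can be sketched in VBA, the language used later in this post. This is only a minimal illustration: the function name CountLines and its filePath parameter are assumptions made for the example, while EOF, FreeFile and Line Input are standard VBA.

```vba
Option Explicit

' Counts the lines of a text file by reading until EOF reports the end.
' filePath is a hypothetical parameter, e.g. "C:\test\Suppliers.txt".
Function CountLines(ByVal filePath As String) As Long
    Dim fileNum As Integer
    Dim textLine As String
    Dim total As Long

    fileNum = FreeFile                   ' next unused file number
    Open filePath For Input As #fileNum  ' read mode

    Do While Not EOF(fileNum)            ' stop when nothing is left to read
        Line Input #fileNum, textLine    ' read one whole line (line break removed)
        total = total + 1
    Loop

    Close #fileNum
    CountLines = total
End Function
```

Note that the loop tests EOF before each read. Reading first and testing afterwards would raise the "Input past end of file" error on an empty file.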
Limitations of sequential files
The use of sequential files for data handling has many limitations:
- Sequential access: as mentioned before, access to the data was sequential. We had to go through the file from its first line until we reached the data we wanted to retrieve. This was the only way to move around the file: we could only advance to the next line, never go back. If we wanted data that we had already passed, we had to restart the sequential reading from the first line.
- Opening modes: when accessing the file, we have to open it in read or write mode, depending on the operation we want to perform. We cannot read and write data at the same time; each opening allows only one type of operation. To change operations, we have to close the file and open it again in the corresponding mode.
Reads can be partial, since the file is not modified. However, when we open a sequential file in write mode, we delete all its previous content and generate it again from the beginning. For example, to delete a record we rewrite the entire file except for the record we want to delete.
Some programming languages also allow writing after the end of the existing content. A very common name for this mode is Append.
- Exclusive access: only one user can work on the file at a given time. For a new user to access the file, the current user has to close it first. In other words, multiple users can’t access the file simultaneously.
- Fixed structure of fields: working with this type of file forces us to handle a fixed data structure. In the supplier file example, we first saved the name of the company, then the name of the contact person and finally the phone number.
When we go through the file, we simply read data, without knowing what each item means. So if we accidentally changed the order when recording, for example by placing the contact first and then the company name, the contact would be taken as the company name and the company name as the contact.
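The opening modes in the list above can be sketched in VBA as follows. This is a rough illustration under stated assumptions: every record occupies exactly three lines (company, contact, phone) with no end-of-record mark, and the names AppendSupplier, DeleteSupplier, filePath and the ".tmp" suffix are invented for the example. Open ... For Append, Kill and Name are standard VBA statements.

```vba
Option Explicit

' Adds one supplier record at the end of the file, keeping what is there.
Sub AppendSupplier(ByVal filePath As String, ByVal company As String, _
                   ByVal contact As String, ByVal phone As String)
    Dim f As Integer
    f = FreeFile
    Open filePath For Append As #f       ' Append does not erase the file
    Print #f, company
    Print #f, contact
    Print #f, phone
    Close #f
End Sub

' Deletes every record whose first field equals company. A sequential file
' cannot be edited in place, so all other records are copied to a
' temporary file that then replaces the original.
Sub DeleteSupplier(ByVal filePath As String, ByVal company As String)
    Dim src As Integer, dst As Integer
    Dim tmpPath As String
    Dim textLine As String
    Dim fieldIndex As Long               ' 0 = company, 1 = contact, 2 = phone
    Dim skipping As Boolean

    tmpPath = filePath & ".tmp"          ' hypothetical temporary name
    src = FreeFile
    Open filePath For Input As #src
    dst = FreeFile                       ' asked after src is open, so it differs
    Open tmpPath For Output As #dst

    Do While Not EOF(src)
        Line Input #src, textLine
        If fieldIndex = 0 Then skipping = (textLine = company)
        If Not skipping Then Print #dst, textLine
        fieldIndex = (fieldIndex + 1) Mod 3
    Loop

    Close #src
    Close #dst
    Kill filePath                        ' remove the old file
    Name tmpPath As filePath             ' the copy takes its place
End Sub
```

Because the comparison is made only when fieldIndex is 0, a contact who happens to share a name with a company is not mistaken for one, as long as the three-line structure is respected.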
Fields and records
The data storage structure can be defined in terms of fields and records. A field is a single item of data of a given kind, and a record is the set of fields that define an element. Continuing with the supplier file, the fields of our structure are the name of the supplier, the contact person at the supplier and the telephone number, and each record describes one supplier.
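As a rough illustration of the two terms, a record can be modelled in VBA as a user-defined type whose members are the fields. The names Supplier, Company, Contact, Phone and MakeSupplier are assumptions chosen for this sketch, and the type has to be declared at the top of a standard module, before any procedure.

```vba
Option Explicit

' One record: the set of fields that describe a supplier.
Public Type Supplier
    Company As String    ' field 1: name of the supplier
    Contact As String    ' field 2: contact person
    Phone As String      ' field 3: kept as text so no digit is lost
End Type

' Builds a record from its three fields, in the fixed order of the file.
Function MakeSupplier(ByVal company As String, ByVal contact As String, _
                      ByVal phone As String) As Supplier
    Dim s As Supplier
    s.Company = company
    s.Contact = contact
    s.Phone = phone
    MakeSupplier = s
End Function
```

Giving each field a name removes the guesswork inside the program, although the file itself still stores only a sequence of values.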
To avoid errors with the data, our friend Antonio decides to make some improvements to the sequential files. He introduces a synchronization mark that indicates the end of a record. This mark can be anything we define, but a commonly used one is: <END>
With this improvement, the supplier file would be:
Smith fruits
Peter Smith
607454545
<END>
Southern vegetables
Williams Doors
652854874
<END>
Green way Sugar
Elisabeth Roberts
622525885
<END>
Logically, this mark can never be used as the value of a field, since that would lead to an error: we would interpret that field as the end of the record.
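The end-of-record mark also makes it possible to check the structure of the file while reading it. The following sketch assumes one field per line, three fields per record and the mark <END> on its own line. The function name CountRecords, the constants and the error number 1000 are hypothetical choices for this example.

```vba
Option Explicit

Const END_MARK As String = "<END>"
Const FIELDS_PER_RECORD As Long = 3

' Returns the number of well-formed records in the file.
' Raises a custom error at the first record that does not have
' exactly FIELDS_PER_RECORD fields before its end mark.
Function CountRecords(ByVal filePath As String) As Long
    Dim f As Integer
    Dim textLine As String
    Dim fieldsSeen As Long
    Dim records As Long

    f = FreeFile
    Open filePath For Input As #f

    Do While Not EOF(f)
        Line Input #f, textLine
        If textLine = END_MARK Then
            If fieldsSeen <> FIELDS_PER_RECORD Then
                Close #f
                Err.Raise vbObjectError + 1000, "CountRecords", _
                    "Record " & (records + 1) & " has " & fieldsSeen & " fields"
            End If
            records = records + 1        ' a complete record was read
            fieldsSeen = 0
        Else
            fieldsSeen = fieldsSeen + 1  ' an ordinary field line
        End If
    Loop

    Close #f
    CountRecords = records
End Function
```

Lines that follow the last mark are not counted as a record, so a file cut off in the middle of a record simply reports the records completed before it.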
At that time, Antonio would have used magnetic tapes to record the data: a tape wound on a reel and coated with magnetic particles that stored the data. The tape was unwound and passed over a head for reading or writing, the same system used by the famous music cassettes. The way this system operated forced access to be sequential, running through the tape from beginning to end.
Sequential File Example (with Visual Basic)
To better understand how sequential access files work, the best approach is to try them ourselves. To do this, we are going to create a sequential file and then write and read data in it.
To keep it simple and avoid installing an IDE for any programming language, we are going to use VBA (Visual Basic for Applications), which is available on the classroom computers in the Data Architecture classes. It can be found in Access, within the Microsoft Office Professional package.
The exercise includes 7 steps:
Step 1: Create a database
Create a new database and give it the name you want, for example: BBDD
Step 2: Create a new Module
Go to the tab “DATABASE TOOLS”, click on “Visual Basic”.
Then, in the Insert/Module menu, create a new module. In my case, I leave it with the default name: “Module1”, since we are only going to use it for practice purposes.
Step 3: Program a procedure to record the data
The first thing to do is create a new file with the data. For example, with the supplier data that Antonio handled. The procedure that would perform this task would be:
Sub RecordingInSequentialAccessFile()
    On Error GoTo e
    ChDrive ("C")
    ChDir "C:\test"
    Open "Suppliers.txt" For Output As #1
    'recording suppliers data
    Print #1, "Smith fruits"
    Print #1, "Peter Smith"
    Print #1, "607454545"
    Print #1, "<END>"
    Print #1, "Southern vegetables"
    Print #1, "Williams Doors"
    Print #1, "652854874"
    Print #1, "<END>"
    Print #1, "Green way Sugar"
    Print #1, "Elisabeth Roberts"
    Print #1, "622525885"
    Print #1, "<END>"
    'Closing the file:
    Close #1
    MsgBox ("Saved 3 supplier records")
    Exit Sub
e:
    MsgBox (Err.Description)
End Sub
I won't explain in detail how the procedure works, since this is not a programming course. In short, the procedure creates a file in the specified path and then writes the data to it sequentially.
The order of the fields has to be respected, and each record has to end with the end-of-record mark: “<END>”
Step 4: Run the procedure
From the Immediate window, we run the procedure.
As a result, the suppliers file: “Suppliers.txt” is generated in the indicated path. If we open the file we will see how the data has been recorded sequentially.
Step 5: Program a procedure for reading the data
In my case, I call it “reading” and it would look like this:
Sub Reading()
    On Error GoTo e
    ChDrive ("C")
    ChDir "C:\test"
    Dim line As String
    Dim MyChar As String
    Dim DrawLine As Boolean
    Open "Suppliers.txt" For Input As #1
    'String to store the line read
    line = ""
    'Boolean that indicates if the line has to be drawn or not.
    'Line is built up character by character. DrawLine becomes True when the end of the line is reached
    DrawLine = False
    'Loop till the end of the file
    Do While Not EOF(1)
        MyChar = Input(1, #1)
        If MyChar = Chr(13) Then
            DrawLine = True
        ElseIf (MyChar <> Chr(10)) Then
            line = line + MyChar
        End If
        If DrawLine Then
            Debug.Print line
            line = ""
            DrawLine = False
        End If
    Loop
    Close #1
    Exit Sub
e:
    MsgBox (Err.Description)
End Sub
Running the procedure from the Immediate window, we will see all the data of the file printed there. We have read the file sequentially, character by character, printing each line as soon as it was complete.
Step 6: Program a procedure to search for a supplier and retrieve its data
I call the procedure “SearchingForSupplier”, and its code would be:
Sub SearchingForSupplier()
    ChDrive ("C")
    ChDir "C:\test"
    'Each variable needs its own type; "Dim a, b As String" would make a a Variant
    Dim line As String, MyChar As String, supplier As String
    Dim DrawLine As Boolean, found As Boolean
    Dim numFieldsDrawn As Integer
    'save in supplier the name of the supplier introduced by the user
    supplier = InputBox("Introduce the name of the supplier", "Search")
    Open "Suppliers.txt" For Input As #1
    line = ""
    DrawLine = False
    found = False
    numFieldsDrawn = 0
    'Going through the file, character by character until EOF mark
    Do While Not EOF(1)
        MyChar = Input(1, #1)
        If MyChar = Chr(13) Then
            DrawLine = True
        ElseIf (MyChar <> Chr(10)) Then
            line = line + MyChar
        End If
        'If I have a complete line to paint, DrawLine is true
        If DrawLine Then
            'Checking if supplier has been found (whole line only, so a
            'partial match at the start of a line does not count)
            If line = supplier Then
                found = True
            End If
            'Once found, the record is painted. It has three fields, three lines.
            If found Then
                Debug.Print line
                numFieldsDrawn = numFieldsDrawn + 1
            End If
            If numFieldsDrawn = 3 Then
                found = False
            End If
            line = ""
            DrawLine = False
        End If
    Loop
    Close #1
    'If the supplier that the user is looking for is not found, it is indicated by a message
    If numFieldsDrawn = 0 Then
        MsgBox "Provider not found"
    End If
End Sub
When we run it, a dialogue box appears asking for the name of the supplier we are looking for.
The operation is similar to the previous procedure, in which we read the lines of the file. In this case, however, each line read is compared with the supplier name entered by the user and, if they match, the procedure prints that line and the next two, since the complete record of a supplier has three fields.
For example, looking for the supplier “Southern vegetables”, the result would be:
Southern vegetables
Williams Doors
652854874
Step 7: Forcing an error
As I have already mentioned, the structure of sequential files is very strict. In this example, if we made a mistake and entered “Williams Doors” as the supplier instead of “Southern vegetables”, we would not obtain the complete record but a different set of lines, because the reading did not start at the first field of the record. In this case, the result would be:
Williams Doors
652854874
<END>
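One possible way to guard against this kind of mismatch is to compare the search text only with the first field of each record, using the <END> mark to know where a record begins. The following is a sketch rather than part of the exercise: the function name FindSupplierByCompany and its parameters are assumptions, and it reads whole lines with Line Input instead of character by character.

```vba
Option Explicit

' Prints the record whose first field (company name) equals supplier.
' Returns True when such a record exists, False otherwise.
Function FindSupplierByCompany(ByVal filePath As String, _
                               ByVal supplier As String) As Boolean
    Dim f As Integer
    Dim textLine As String
    Dim atRecordStart As Boolean
    Dim printing As Boolean

    atRecordStart = True                 ' the first line of the file starts a record
    f = FreeFile
    Open filePath For Input As #f

    Do While Not EOF(f)
        Line Input #f, textLine
        If textLine = "<END>" Then
            printing = False             ' the current record is finished
            atRecordStart = True         ' the next line is a company name
        Else
            If atRecordStart And textLine = supplier Then
                printing = True
                FindSupplierByCompany = True
            End If
            If printing Then Debug.Print textLine
            atRecordStart = False
        End If
    Loop

    Close #f
End Function
```

With this version, searching for “Williams Doors” prints nothing and returns False, because that text never appears in the first position of a record.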
After completing the 7 steps of this exercise, I’m sure you have an idea of how sequential access files work. They were the first solution for storing digital data, and their limitations were overcome by random access files, which will be the subject of another post.
This post is part of the collection “Data Access and Storage Systems”. You can see the index of this collection here