File organization defines how data records are stored in a database file, impacting how data is retrieved, inserted, updated, or deleted. Proper file organization is essential for optimizing database performance.
This lesson explores the following types of file organization, illustrating their features, advantages, disadvantages, and how records are managed.
In sequential file organization, records are stored in a sequential order based on a specific key field (e.g., primary key). When new records are added, they are placed in the correct order, maintaining the sequence.
In the diagram below, the records are stored in the sequence R1, R3, R5, R4.
To insert R2, we must place it between R1 and R3 so that the key order is preserved.
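The ordered insert described above can be sketched in a few lines. This is only an illustration, not how a real database engine lays out blocks on disk: the file is modeled as an in-memory Python list that is assumed to be already sorted, integer keys 1, 3, 5 stand in for records R1, R3, R5, and the function name `insert_sequential` is hypothetical.

```python
import bisect

def insert_sequential(records, key):
    """Insert key into an already sorted list and return its position.

    Assumption: `records` is sorted in ascending key order.
    """
    # Binary search for the first position where key can go without
    # breaking the ordering.
    pos = bisect.bisect_left(records, key)
    # Shifting the later records to make room is what makes inserts
    # expensive in a sequential file.
    records.insert(pos, key)
    return pos

records = [1, 3, 5]            # stands in for R1, R3, R5
position = insert_sequential(records, 2)
print(position, records)       # 1 [1, 2, 3, 5]
```

Finding the position is cheap thanks to binary search, but every record after the insertion point has to move, which is the main cost of keeping the file in order.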
Heap files are unordered, meaning new records are inserted wherever space is available. No sorting or indexing is applied, making this the simplest form of file organization.
In the diagram below, the first empty slot is in the first block.
To insert the new record R2, we can place it in that slot, as shown in the diagram below.
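A minimal sketch of this "first free slot" behavior follows. It assumes each block is a fixed-size Python list in which `None` marks an empty slot; the block contents, the `block_size` parameter, and the function name `insert_heap` are hypothetical choices made for illustration.

```python
def insert_heap(blocks, record, block_size=3):
    """Store record in the first empty slot and return (block, slot).

    Assumption: each block is a list of length block_size and None
    marks a free slot.
    """
    for b, block in enumerate(blocks):
        for s, slot in enumerate(block):
            if slot is None:
                block[s] = record
                return b, s
    # Every slot is taken, so allocate a new block at the end of the file.
    new_block = [None] * block_size
    new_block[0] = record
    blocks.append(new_block)
    return len(blocks) - 1, 0

blocks = [["R1", None, "R3"], ["R4", "R5", None]]
print(insert_heap(blocks, "R2"))   # (0, 1)
print(blocks)                      # [['R1', 'R2', 'R3'], ['R4', 'R5', None]]
```

No ordering is maintained, so the insert is fast, but finding a particular record later means scanning the blocks one by one.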
In hash file organization, a hash function is used to calculate the address of a record based on a key field. Records are stored in buckets corresponding to hash values.
Hash function: Key MOD 5
Key | Hash Value | Bucket |
---|---|---|
101 | 1 | 1 |
102 | 2 | 2 |
103 | 3 | 3 |
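The mapping in the table can be reproduced with a short sketch. It assumes five in-memory buckets, each modeled as a Python list so that keys sharing a hash value can sit in the same bucket (a simple form of chaining); the names `NUM_BUCKETS`, `bucket_for`, and `insert_hashed` are hypothetical.

```python
NUM_BUCKETS = 5  # matches the hash function Key MOD 5

def bucket_for(key):
    """Return the bucket number for an integer key."""
    return key % NUM_BUCKETS

def insert_hashed(buckets, key):
    """Append key to its bucket's list and return the bucket number."""
    b = bucket_for(key)
    # Each bucket is a list, so keys with the same hash value are
    # chained together instead of overwriting each other.
    buckets[b].append(key)
    return b

buckets = [[] for _ in range(NUM_BUCKETS)]
for key in (101, 102, 103):
    insert_hashed(buckets, key)

print(buckets)   # [[], [101], [102], [103], []]
```

Looking up a key only requires computing its bucket number and searching that one bucket, which is why hashing suits exact-match queries.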
Now consider a new key, 106: 106 MOD 5 = 1 → placed in Bucket 1. A collision occurs here because keys 101 and 106 have the same hash value. Methods such as chaining can resolve the collision; we will learn how to resolve collisions in upcoming chapters of this course.
In clustered file organization, records with similar values are stored together physically. This organization is often based on clustering indexes.
In the diagram below, cluster A contains Alice and Alex, cluster B contains Bob and Ben, and cluster C contains the Charlie record.
If we add a new record, "Anna", to cluster A, the result looks as shown below.
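The grouping in this example can be sketched as follows. The clustering rule used here, the first letter of the name, is an assumption inferred from the cluster labels A, B, and C; the function names `cluster_key` and `insert_clustered` are hypothetical, and a real system would keep each cluster in physically adjacent blocks rather than in a Python dictionary.

```python
from collections import defaultdict

def cluster_key(name):
    """Assumed clustering rule: the first letter of the name."""
    return name[0].upper()

def insert_clustered(clusters, name):
    """Add name to the cluster it belongs to and return the cluster key."""
    key = cluster_key(name)
    clusters[key].append(name)
    return key

clusters = defaultdict(list)
for name in ["Alice", "Alex", "Bob", "Ben", "Charlie"]:
    insert_clustered(clusters, name)

insert_clustered(clusters, "Anna")
print(dict(clusters))
# {'A': ['Alice', 'Alex', 'Anna'], 'B': ['Bob', 'Ben'], 'C': ['Charlie']}
```

Because related records sit together, a query for everything in cluster A touches only that cluster instead of the whole file.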
We will cover B+ Tree File Organization and ISAM (Indexed Sequential Access Method) methods in the upcoming lessons of this chapter.
In the next lesson, we will explore Access Methods, covering topics like clustered vs. non-clustered storage and hashing mechanisms in greater detail.