Next: Multiple Disk Spindles Up: PostgreSQL Hardware Performance Tuning Previous: Cache Size and Sort

Disk Locality

The physical nature of disk drives makes their performance characteristics different from the other storage areas mentioned in this article. The other storage areas can access any byte with equal speed. Disk drives, with their spinning platters and moving heads, access data near the head's current position much faster than data farther away.

Moving the disk head to another cylinder on the platter takes quite a bit of time. Unix kernel developers know this. When storing a large file on disk, they try to place the pieces of the file near each other. For example, suppose a file requires ten blocks on disk. The operating system may place blocks 1-5 on one cylinder and blocks 6-10 on another cylinder. If the file is read from beginning to end, only two head movements are required: one to get to the cylinder holding blocks 1-5, and another to get to the cylinder holding blocks 6-10. However, if the file is read non-sequentially, e.g. in the order 1, 6, 2, 7, 3, 8, 4, 9, 5, 10, ten head movements are required. As you can see, with disks, sequential access is much faster than random access. This is why PostgreSQL prefers sequential scans to index scans if a significant portion of the table needs to be read. This also highlights the value of the cache: a block already held in memory requires no head movement at all.
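
The ten-block example can be checked with a short Python sketch. It is only an illustration of the counting argument, not a model of a real drive: the function names and the cylinder numbering (0 and 1) are made up for this example, and it assumes the head position is unknown before the first read, so the first read always counts as one movement.

```python
# Hypothetical layout taken from the example in the text:
# blocks 1-5 live on cylinder 0, blocks 6-10 on cylinder 1.
def cylinder_of(block):
    return 0 if block <= 5 else 1


def count_head_movements(read_order):
    """Count how often the head must move to a different cylinder."""
    movements = 0
    current = None  # assumption: head position is unknown before the first read
    for block in read_order:
        target = cylinder_of(block)
        if target != current:
            movements += 1
            current = target
    return movements


sequential = list(range(1, 11))
interleaved = [1, 6, 2, 7, 3, 8, 4, 9, 5, 10]

print(count_head_movements(sequential))   # 2
print(count_head_movements(interleaved))  # 10
```

The sketch counts every cylinder change equally. On a real disk the cost also depends on how far the head travels and on rotational delay, so the counts show the direction of the effect rather than actual timings.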


Bruce Momjian
2003-01-27