More data files / forks

From: Chris Cleveland <ccleve+github(at)dieselpoint(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: More data files / forks
Date: 2022-01-11 18:39:06
Message-ID: CABSN6VfbVsvZnkDZgM=_PaNru-bAVbSV8NGBjuw9g=PEfz+SSg@mail.gmail.com
Lists: pgsql-hackers

I'm working on a table access method that stores indexes in a structure
that looks like an LSM tree. Changes get written to small segment files,
which then get merged into larger segment files.
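
To make the structure concrete, here is a minimal sketch of the merge step in Python. It is only an illustration, not code from the access method itself: the function name `merge_segments`, the in-memory `(key, value)` representation, and the "newest segment wins" rule for duplicate keys are all assumptions.

```python
import heapq
from itertools import groupby
from operator import itemgetter


def merge_segments(segments):
    """Merge several sorted segments into one larger sorted segment.

    Each segment is a list of (key, value) pairs sorted by key, with
    unique keys inside a segment. `segments` is ordered oldest first;
    for a key present in several segments, the newest value is kept
    (an assumed policy, typical for LSM-style structures).
    """
    # Tag every entry with the negated segment position so that, for
    # equal keys, the entry from the newest segment sorts first.
    tagged = (
        [(key, -age, value) for key, value in segment]
        for age, segment in enumerate(segments)
    )
    # heapq.merge streams the already-sorted inputs in sorted order.
    merged = heapq.merge(*tagged)
    # Keep only the first (newest) entry for each key.
    return [
        (key, next(group)[2])
        for key, group in groupby(merged, key=itemgetter(0))
    ]
```

Because the inputs are already sorted, the merge reads each small segment once, front to back, and writes the larger segment sequentially.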

It's really tough to manage these segments using the existing fork/buffer/page
machinery, because deleting a large segment leaves a lot of empty space in
the file. It's a lot easier to write each segment to its own file on disk
and then delete those files as needed.
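
As a rough sketch of the separate-files approach, again in Python and under stated assumptions: the helper names `write_segment` and `replace_segments`, the `.tmp` suffix, and the write-then-rename pattern are illustrative choices, not anything Postgres provides.

```python
import os


def write_segment(directory, name, data):
    """Write one segment to its own file and flush it to disk."""
    path = os.path.join(directory, name)
    tmp_path = path + ".tmp"  # hypothetical temp-file naming
    with open(tmp_path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())
    # Rename into place so readers never see a half-written segment.
    os.replace(tmp_path, path)
    return path


def replace_segments(directory, old_paths, merged_name, merged_data):
    """Write the merged segment, then delete the segments it replaces."""
    new_path = write_segment(directory, merged_name, merged_data)
    for old_path in old_paths:
        # The space is returned to the filesystem immediately,
        # with no hole left behind inside a larger file.
        os.unlink(old_path)
    return new_path
```

Deleting a segment is then a single `unlink` call per file.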

I could do that, but then I lose the advantages of having data in native
Postgres files, including support for buffering and locking.

It's important to have the segments stored contiguously on disk. I've
benchmarked it; it makes a huge performance difference.

Questions:

1. Are there any other disadvantages to storing data in my own files on
disk, instead of in files managed by Postgres?

2. Is it possible to increase the number of forks? I could store each level
of the LSM tree in its own fork very efficiently. Forks could get truncated
as needed. A dozen forks would handle it nicely.
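
For reference, a small sketch of how the existing forks map to files in the default tablespace, and what a per-level fork might look like. The four fork names come from the Postgres source (`forkNames` in `src/common/relpath.c`); the `_lsmN` suffix and the `lsm_level_fork_path` function are purely hypothetical and do not exist in Postgres.

```python
# Suffixes of the four forks that exist today (MAX_FORKNUM is INIT_FORKNUM).
FORK_SUFFIX = {
    "main": "",       # MAIN_FORKNUM: the relation data itself, no suffix
    "fsm": "_fsm",    # FSM_FORKNUM: free space map
    "vm": "_vm",      # VISIBILITYMAP_FORKNUM: visibility map
    "init": "_init",  # INIT_FORKNUM: init fork of unlogged relations
}


def fork_path(db_oid, relfilenode, fork):
    """Path of a fork's first file, relative to the data directory."""
    return f"base/{db_oid}/{relfilenode}{FORK_SUFFIX[fork]}"


def lsm_level_fork_path(db_oid, relfilenode, level):
    """HYPOTHETICAL: one extra fork per LSM level, as in question 2."""
    return f"base/{db_oid}/{relfilenode}_lsm{level}"
```

With a scheme like this, dropping a whole level would amount to truncating one fork, which is what makes the idea attractive.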
