On Wed, 2010-12-01 at 23:22 -0500, Robert Haas wrote:
> Well, let's think about what we'd need to do to make CRCs work
> reliably. There are two problems.
> 1. [...] If we CRC the entire page, the torn pages are never
> acceptable, so every action that modifies the page must be WAL-logged.
> 2. Currently, we allow hint bits on a page to be updated while holding
The way I see it, here are the rules we are breaking, and why:
* We don't get an exclusive lock when dirtying a page with hint bits
- Why: we write while reading, and we want good concurrency.
- Why': because after a bulk load, we don't have any hint bits, and the
only way to get them set without VACUUM is to write while reading. I've
never been entirely sure why VACUUM isn't good enough in this case,
aside from the fact that a user might not run VACUUM (and autovacuum
might not either, if it was only a bulk load and no updates/deletes).
* We don't WAL log setting hint bits (which dirties a page)
- Why: because after a bulk load, we don't want to write the data a 4th
Hypothetically, if we had a bulk loading strategy, these problems would
go away, and we could follow the rules. Right? Is there a case other
than bulk loading which demands that we break these rules?
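
To make the CRC side of this concrete, here is a toy sketch (not PostgreSQL code) of why a whole-page CRC and un-WAL-logged hint bits don't mix. It assumes an 8192-byte page, a 512-byte atomic sector write, and a checksum stored in the first sector; the helper names (page_crc, set_hint_bit, torn_write) and the byte offset of the "hint bit" are made up for illustration.

```python
import zlib

PAGE_SIZE = 8192     # assumed page size for this toy model
SECTOR_SIZE = 512    # assumed atomic write unit of the disk


def page_crc(page: bytes) -> int:
    """CRC-32 over the whole page image."""
    return zlib.crc32(page) & 0xFFFFFFFF


def set_hint_bit(page: bytes, offset: int, mask: int = 0x01) -> bytes:
    """Return a copy of the page with one (hypothetical) hint bit set."""
    buf = bytearray(page)
    buf[offset] |= mask
    return bytes(buf)


def torn_write(old: bytes, new: bytes, sectors_written: int) -> bytes:
    """Model a crash after only the first `sectors_written` sectors hit disk."""
    cut = sectors_written * SECTOR_SIZE
    return new[:cut] + old[cut:]


# An all-zero page stands in for a freshly loaded page with no hint bits.
old_page = bytes(PAGE_SIZE)
old_crc = page_crc(old_page)

# A reader sets a hint bit somewhere in a late sector of the page.
new_page = set_hint_bit(old_page, offset=6000)
new_crc = page_crc(new_page)

# Crash after the first sector only: under the assumption that the checksum
# lives in the first sector, the new checksum reaches disk but the sector
# holding the hint bit does not.
on_disk = torn_write(old_page, new_page, sectors_written=1)

hint_changes_crc = old_crc != new_crc            # one bit changes the CRC
torn_page_fails = page_crc(on_disk) != new_crc   # torn page fails verification

print("hint bit changes CRC:", hint_changes_crc)
print("torn page fails CRC check:", torn_page_fails)
```

In this model the torn page is perfectly valid data (only a hint bit is missing), yet it fails the check, and with no WAL record for the hint-bit write there is nothing to replay to repair it.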
And, if we had a bulk loading path, we could probably get away with
writing the data only twice (today, we write it 3 times including the
hint bits) or maybe once if WAL archiving is off.
If there is no such case, perhaps we should treat bulk loading as a
separate problem and simplify the design of all of these other features
(and allow new storage-touching features, like CRCs, to come about
without exponentially increasing the complexity with each one).