Re: crash-safe visibility map, take three

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: crash-safe visibility map, take three
Date: 2010-12-02 22:00:18
Message-ID: AANLkTikNcbySP_HDS0ZoaEWmaA=JBRWhssstD7xTSmNc@mail.gmail.com
Lists: pgsql-hackers

On Thu, Dec 2, 2010 at 2:01 PM, Jeff Davis <pgsql(at)j-davis(dot)com> wrote:
> * We don't get an exclusive lock when dirtying a page with hint bits
> - Why: we write while reading, and we want good concurrency.
> - Why': because after a bulk load, we don't have any hint bits, and the
> only way to get them set without VACUUM is to write while reading. I've
> never been entirely sure why VACUUM isn't good enough in this case,
> aside from the fact that a user might not run VACUUM (and autovacuum
> might not either, if it was only a bulk load and no updates/deletes).
>
> * We don't WAL log setting hint bits (which dirties a page)
> - Why: because after a bulk load, we don't want to write the data a 4th
> time
>
> Hypothetically, if we had a bulk loading strategy, these problems would
> go away, and we could follow the rules. Right? Is there a case other
> than bulk loading which demands that we break these rules?

I'm not really convinced that this problem is confined to bulk
loading. Every INSERT or UPDATE results in a new tuple that may need
hint bits set and eventually to be frozen. A bulk load is just a time
when you do lots of inserts all at once; it seems to me that a large
update would cause all the same problems, plus bloat. The triple I/O
problem exists for small transactions as well (and isn't desirable
there either); it's just less noticeable because the second and third
writes are, like the first one, small.
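
To put rough numbers on that, here is a toy accounting sketch. It is
not PostgreSQL code: the function name and parameters are made up for
illustration, and it assumes each pass (WAL, heap, hint-bit rewrite)
writes about as many bytes as the new tuples themselves, which ignores
page granularity.

```python
def total_bytes_written(new_tuple_bytes, wal_logged=True, hint_bits_rewritten=True):
    """Toy model of write amplification for newly inserted tuples.

    Hypothetical helper: assumes every pass writes roughly
    new_tuple_bytes, which is a simplification.
    """
    passes = 1  # the heap pages themselves
    if wal_logged:
        passes += 1  # the same data goes to WAL
    if hint_bits_rewritten:
        passes += 1  # pages dirtied again when hint bits get set
    return passes * new_tuple_bytes


# The multiplier is the same for a one-row insert and a bulk load;
# only the absolute volume differs.
small = total_bytes_written(100)
large = total_bytes_written(10**9)
```

Under these assumptions the ratio is 3x in both cases, so the cost is
there for small transactions too, just harder to see.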

> And, if we had a bulk loading path, we could probably get away with
> writing the data only twice (today, we write it 3 times including the
> hint bits) or maybe once if WAL archiving is off.

It seems to me that a COPY command executed in a transaction with no
other open snapshots writing to a table created or truncated within
the same transaction should be able to write frozen tuples from the
get-go, regardless of anything else we do.
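
As a rough sketch of the condition I have in mind, something like the
following. This is only a model of the rule, not a patch: CopyContext,
its fields, and can_write_frozen are all hypothetical names, and
nothing here corresponds to existing backend code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CopyContext:
    """Hypothetical summary of the state a COPY would need to inspect."""
    current_xid: int                      # transaction running the COPY
    table_created_xid: Optional[int]      # xid that created the table, if known
    table_truncated_xid: Optional[int]    # xid that last truncated it, if known
    other_open_snapshots: int             # other snapshots open in this transaction


def can_write_frozen(ctx: CopyContext) -> bool:
    """True if the COPY could, under the rule above, write frozen tuples."""
    # The table's storage must have been created or truncated by this
    # same transaction, so nobody else can have seen its old contents.
    new_storage = ctx.current_xid in (ctx.table_created_xid, ctx.table_truncated_xid)
    # And no other open snapshot may exist that could see the new rows
    # too early.
    return new_storage and ctx.other_open_snapshots == 0
```

Both tests have to pass; a table created in the same transaction but
with another snapshot still open would not qualify.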

> So, is there a case other than bulk loading for which we need to break
> these rules? If not, perhaps we should consider bulk loading a different
> problem, and simplify the design of all of these other features (and
> allow new storage-touching features to come about, like CRCs, without
> exponentially increasing the complexity with each one).

I don't think we're exponentially increasing complexity - I think
we're incrementally improving our algorithms. If you want to propose
a bulk loading path, great. Propose away! But without something a
bit more concrete, I don't think it would be appropriate to hold off
making the visibility map crash-safe, on the off chance that our
design for so doing might complicate something else we want to do
later.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
