Re: extending relations more efficiently

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: extending relations more efficiently
Date: 2012-05-01 15:06:11
Message-ID: CA+TgmobrJwyvOwztdK0mGkzw1wNv1Zqo3ykW7YpkuXHDNvtCwA@mail.gmail.com
Lists: pgsql-hackers

On Tue, May 1, 2012 at 10:31 AM, Andres Freund <andres(at)anarazel(dot)de> wrote:
>> efficient than our current method - I'm guessing that it actually
>> writes the updated metadata back to disk, where write() does not (this
>> makes one wonder how safe it is to count on write to have the behavior
>> we need here in the first place).
> Currently the write() doesn't need to be crashsafe because it will be repeated
> on crash-recovery and a checkpoint will fsync the file.

That's not what I'm worried about. If the write() succeeds but a
subsequent close() on the file descriptor then fails with ENOSPC,
meaning the data was never actually written, I am concerned that we
might not handle that cleanly.
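
To make the failure mode above concrete, here is a minimal sketch (not PostgreSQL's actual extension code) that appends one zeroed block and treats an error from close() as seriously as one from write(). The function name append_zero_block is hypothetical, and mapping a short write to ENOSPC is an assumption made for this illustration.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

#define BLCKSZ 8192             /* PostgreSQL's default block size */

/*
 * Append one zeroed block to the file at 'path'.
 * Returns 0 on success, or an errno-style code on failure.
 * Hypothetical helper for illustration only.
 */
int
append_zero_block(const char *path)
{
    char        block[BLCKSZ];
    int         fd;
    int         saved_errno;
    ssize_t     n;

    assert(path != NULL);
    memset(block, 0, sizeof(block));

    fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0600);
    if (fd < 0)
        return errno;

    n = write(fd, block, sizeof(block));
    if (n != (ssize_t) sizeof(block))
    {
        /* Assumption: a short write without an errno means out of space. */
        saved_errno = (n < 0) ? errno : ENOSPC;
        close(fd);
        return saved_errno;
    }

    /*
     * A successful write() is not the end of the story: some filesystems
     * report a deferred error only here, so the result must be checked.
     */
    if (close(fd) != 0)
        return errno;

    return 0;
}
```

A caller that ignores the return value of close() would believe the block exists even when the error was only surfaced at that point, which is the situation described above.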

> I don't really see why it would need to compare in the 8kb case. What reason
> would there be to further extend in such small increments?

In previous discussions, the concern has been that holding the
relation extension lock across a multi-block extension would cause
latency spikes for both the process doing the extensions and any other
concurrent processes that need the lock. Obviously if it were
possible to extend by 64kB in the same time it takes to extend by 8kB
that would be awesome, but if it takes eight times longer then things
don't look so good.
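
Whether a 64kB extension really costs eight times as much as an 8kB one is something that can be measured. The sketch below is a rough, hypothetical timing helper (timed_extend is not an existing function) that writes a given number of zeroed blocks in one write() call and returns the elapsed wall-clock time. It assumes the descriptor was opened with O_APPEND, and since write() often only reaches the OS cache, the numbers it produces are indicative at best.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <time.h>
#include <unistd.h>

#define BLCKSZ      8192        /* PostgreSQL's default block size */
#define MAX_BLOCKS  8           /* 8 blocks = 64kB */

/*
 * Extend the file behind 'fd' by 'nblocks' zeroed blocks using a single
 * write() and return the elapsed time in seconds, or -1.0 on failure.
 * Hypothetical helper for illustration only.
 */
double
timed_extend(int fd, int nblocks)
{
    static const char zeros[MAX_BLOCKS * BLCKSZ] = {0};
    struct timespec start;
    struct timespec end;
    size_t      len;

    assert(nblocks >= 1 && nblocks <= MAX_BLOCKS);
    len = (size_t) nblocks * BLCKSZ;

    if (clock_gettime(CLOCK_MONOTONIC, &start) != 0)
        return -1.0;
    if (write(fd, zeros, len) != (ssize_t) len)
        return -1.0;
    if (clock_gettime(CLOCK_MONOTONIC, &end) != 0)
        return -1.0;

    return (double) (end.tv_sec - start.tv_sec) +
        (double) (end.tv_nsec - start.tv_nsec) / 1e9;
}
```

Comparing timed_extend(fd, 1) against timed_extend(fd, 8) over many iterations, on the filesystem of interest, would give a first estimate of how much longer the extension lock would be held.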

> There is the question whether this should be done in the background though, so
> the relation extension lock is never hit in anything time-critical...

Yeah, although I'm fuzzy on how and whether that can be made to work,
which is not to say that it can't.

It might also be interesting to provide a mechanism to pre-extend a
relation to a certain number of blocks, though if we did that we'd
have to make sure that autovac got the memo not to truncate those
pages away again.
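
As a rough illustration of the pre-extension idea, the sketch below assumes posix_fallocate() as the allocation primitive and grows a file so that it holds at least a given number of blocks. The function name preextend_to_blocks is hypothetical, and the sketch says nothing about the harder part, which is keeping autovacuum from truncating the new pages.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

#define BLCKSZ 8192             /* PostgreSQL's default block size */

/*
 * Make sure the file behind 'fd' has space allocated for at least
 * 'nblocks' blocks. A file that is already larger is left unchanged.
 * Returns 0 on success or an error number (posix_fallocate() returns
 * the error directly and does not set errno).
 * Hypothetical helper for illustration only.
 */
int
preextend_to_blocks(int fd, unsigned int nblocks)
{
    assert(fd >= 0);

    /* posix_fallocate() rejects a zero length, so handle it here. */
    if (nblocks == 0)
        return 0;

    return posix_fallocate(fd, 0, (off_t) nblocks * BLCKSZ);
}
```

Calling preextend_to_blocks(fd, 16) on an empty file leaves it 16 blocks long; a later call with a smaller count does not shrink it.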

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
