Re: 9.3: load path to mitigate load penalty for checksums

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: 9.3: load path to mitigate load penalty for checksums
Date: 2012-06-12 18:06:49
Message-ID: CA+TgmoYsE44JbBYP8+=kuuQ-=UF_6r7GvbPzDV_svW7jO5ppKg@mail.gmail.com
Lists: pgsql-hackers

On Wed, Jun 6, 2012 at 8:42 PM, Jeff Davis <pgsql(at)j-davis(dot)com> wrote:
> On Wed, 2012-06-06 at 15:08 -0400, Robert Haas wrote:
>> On Mon, Jun 4, 2012 at 9:26 PM, Jeff Davis <pgsql(at)j-davis(dot)com> wrote:
>> > Thoughts?
>>
>> Simon already proposed a way of doing this that doesn't require
>> explicit user action, which seems preferable to a method that does
>> require explicit user action, even though it's a little harder to
>> implement.  His idea was to store the XID of the process creating the
>> table in the pg_class row, which I think is *probably* better than
>> your idea of having a process that waits and then flips the flag.
>> There are some finicky details though - see previous thread for
>> discussion of some of the issues.
>
> My goals include:
>
> * The ability to load into existing tables with existing data
> * The ability to load concurrently
>
> My understanding was that the proposal to which you're referring can't
> do those things, which seem like major limitations. Did I miss
> something?

No, you're correct. I misread your original email, sorry.

I'm just thinking about this a little more. It strikes me that the
core trade-off here is between doing more post-commit work and doing
more post-abort work. For example, in your proposal, we've got to run
a lazy vacuum before exiting bulk load mode, because if a transaction
has aborted, we've got to get rid of its tuples before letting anyone
trust hint bits again. That is, abort cleanup gets harder. OTOH,
commit cleanup gets easier, because all the hint bits and visibility
map bits are already set: the only thing left is to freeze. In
general, I think that's a good trade-off. Commits are much more
common than aborts, and so we ought to be optimizing for the commit
case.
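
A toy expected-cost model makes the trade-off concrete. This is only a
sketch: the function name, the abort rate, and the unit costs are all
made-up assumptions for illustration, not measurements of anything.

```python
def expected_cleanup_cost(p_abort, commit_cost, abort_cost):
    """Expected per-transaction cleanup cost for a given abort probability."""
    if not 0.0 <= p_abort <= 1.0:
        raise ValueError("p_abort must be between 0 and 1")
    return (1.0 - p_abort) * commit_cost + p_abort * abort_cost


# Hypothetical unit costs (arbitrary units), chosen only to show the shape
# of the trade-off.
P_ABORT = 0.01  # assumed: 1 in 100 load transactions aborts

# Scheme A: cheap abort cleanup, but hint bits and visibility map bits
# still have to be set after commit.
cost_a = expected_cleanup_cost(P_ABORT, commit_cost=10.0, abort_cost=1.0)

# Scheme B (bulk load mode): only freezing is left after commit, but an
# abort forces a lazy vacuum before hint bits can be trusted again.
cost_b = expected_cleanup_cost(P_ABORT, commit_cost=2.0, abort_cost=50.0)

print(f"scheme A: {cost_a:.2f}, scheme B: {cost_b:.2f}")
```

With these assumed numbers the scheme that makes aborts expensive still
wins on average, because the abort term is weighted by a small
probability. The conclusion flips only if aborts become common or abort
cleanup becomes far more expensive than assumed here.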

But maybe there are other ways of doing it, besides what you've
proposed here. Not sure exactly what, but it might be worth thinking
about.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
