Re: getting rid of freezing

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: getting rid of freezing
Date: 2013-05-28 23:51:57
Message-ID: CA+TgmoZcq9+C6FAD_R-GdTHbyoNUwMk7LaJqt7ZK8iZLAu6gLw@mail.gmail.com
Lists: pgsql-hackers

On Tue, May 28, 2013 at 12:29 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
> On 05/28/2013 07:17 AM, Andres Freund wrote:
>> On 2013-05-26 16:58:58 -0700, Josh Berkus wrote:
>>> I was talking this over with Jeff on the plane, and we wanted to be
>>> clear on your goals here: are you looking to eliminate the *write* cost
>>> of freezing, or just the *read* cost of re-reading already frozen pages?
>>
>> Both. The latter is what I have seen causing more hurt, but the former
>> alone is painful enough.
>
> I guess I don't see how your proposal is reducing the write cost for
> most users then?
>
> - for users with frequently, randomly updated data, pdallvisible would
> not be ever set, so they still need to be rewritten to freeze

Do these users never run vacuum? As of 9.3, vacuum phase 2 will
typically set PD_ALL_VISIBLE on each relevant page. The only time
that this WON'T happen is if an insert, update, or delete hits the
page after phase 1 of vacuum and before phase 2 of vacuum. I don't
think that's going to be the common case.

> - for users with append-only tables, allvisible would never be set since
> those pages don't get vacuumed

There's no good solution for append-only tables. Eventually, they
will get vacuumed, and when that happens, PD_ALL_VISIBLE will be set,
and freezing will also happen. I don't think anything that is being
proposed here is going to make that a whole lot better, but it
shouldn't make it any worse than it is now, either. Since it's
probably not solvable without a rewrite of the heap AM, I'm not going
to feel too bad about that.

> - it would prevent us from getting rid of allvisible, which has a
> documented and known write overhead

Again, I think this is going to be much less of an issue with 9.3, for
the reason explained above. In 9.2 and prior, we'd scan a page with
dead tuples, prune them to line pointers, vacuum the indexes, and then
mark the dead pointers as unused. Then, the NEXT vacuum would revisit
the same page and dirty it again ONLY to mark it all-visible. But in
9.3, the first vacuum will mark the page all-visible at the same time
it marks the dead line pointers unused. So the write overhead of
PD_ALL_VISIBLE should basically be gone. If it's not, it would be
good to know why.
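
To make the comparison concrete, here is a toy model of the two behaviors. It is only a sketch, not PostgreSQL code: the Page class, the vacuum() function, and the set_all_visible_in_phase2 flag are all made up for illustration, and it counts "dirtying events" rather than actual page writes, which also depend on when the buffer gets flushed.

```python
from dataclasses import dataclass


@dataclass
class Page:
    """Hypothetical stand-in for a heap page; not a real PostgreSQL structure."""
    dead_tuples: int = 0        # dead tuples not yet pruned
    dead_pointers: int = 0      # line pointers marked dead, awaiting index cleanup
    all_visible: bool = False   # stands in for PD_ALL_VISIBLE
    times_dirtied: int = 0      # how many times a vacuum modified the page


def vacuum(page: Page, set_all_visible_in_phase2: bool) -> None:
    """One vacuum pass over a single page (toy model).

    set_all_visible_in_phase2=False models the 9.2-and-prior behavior
    described above; True models the 9.3 behavior.
    """
    # Phase 1: prune dead tuples down to dead line pointers.
    if page.dead_tuples > 0:
        page.dead_pointers += page.dead_tuples
        page.dead_tuples = 0
        page.times_dirtied += 1
    elif page.dead_pointers == 0 and not page.all_visible:
        # Nothing to clean up, so the page can be marked all-visible now.
        page.all_visible = True
        page.times_dirtied += 1

    # (Index vacuuming would happen here.)

    # Phase 2: mark the dead line pointers unused.
    if page.dead_pointers > 0:
        page.dead_pointers = 0
        if set_all_visible_in_phase2:
            page.all_visible = True
        page.times_dirtied += 1


def vacuums_until_all_visible(set_all_visible_in_phase2: bool):
    """Return (vacuum passes, dirtying events) until the page is all-visible."""
    page = Page(dead_tuples=3)
    passes = 0
    while not page.all_visible:
        vacuum(page, set_all_visible_in_phase2)
        passes += 1
    return passes, page.times_dirtied


print(vacuums_until_all_visible(False))  # 9.2-style: (2, 3)
print(vacuums_until_all_visible(True))   # 9.3-style: (1, 2)
```

In this model, and assuming no concurrent insert, update, or delete touches the page, the 9.2-style run needs a second vacuum pass that dirties the page solely to set the all-visible flag, while the 9.3-style run sets it during phase 2 of the first pass.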

> If we just wanted to reduce read cost, why not just take a simpler
> approach and give the visibility map a "isfrozen" bit? Then we'd know
> which pages didn't need rescanning without nearly as much complexity.

That would break pg_upgrade, which would have to remove visibility map
forks when upgrading. More importantly, it would require another
round of complex changes to the write-ahead logging in this area.
It's not obvious to me that we'd end up ahead of where we are today,
although perhaps I am a pessimist.

> That would also make it more effective to do precautionary vacuum freezing.

But wouldn't it be a whole lot nicer if we just didn't have to do
vacuum freezing AT ALL? The point here is to absorb freezing into
some other operation that we already have to do.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
