Re: new autovacuum criterion for visible pages

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: new autovacuum criterion for visible pages
Date: 2017-01-22 15:45:49
Message-ID: 20170122154549.GF18360@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Amit,

* Amit Kapila (amit(dot)kapila16(at)gmail(dot)com) wrote:
> On Sun, Jan 22, 2017 at 3:27 AM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> > * Simon Riggs (simon(at)2ndquadrant(dot)com) wrote:
> >> On 12 August 2016 at 01:01, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >> > Michael Paquier <michael(dot)paquier(at)gmail(dot)com> writes:
> >> >> In short, autovacuum will need to scan by itself the VM of each
> >> >> relation and decide based on that.
> >> >
> >> > That seems like a worthwhile approach to pursue. The VM is supposed to be
> >> > small, and if you're worried it isn't, you could sample a few pages of it.
> >> > I do not think any of the ideas proposed so far for tracking the
> >> > visibility percentage on-the-fly are very tenable.
> >>
> >> Sounds good, but we can't scan the VM for every table, every minute.
> >> We need to record something that will tell us how many VM bits have
> >> been cleared, which will then allow autovac to do a simple SELECT to
> >> decide what needs vacuuming.
> >>
> >> Vik's proposal to keep track of the rows inserted seems like the best
> >> approach to this issue.
> >
> > I tend to agree with Simon on this. I'm also worried that an approach
> > which was based off of a metric like "% of table not all-visible" might
> > result in VACUUM running over and over on a table because it isn't able
> > to actually make any progress towards improving that percentage. We'd
> > have to have some kind of "cool-off" period or something.
> >
> > Tracking INSERTs and then kicking off a VACUUM based on them seems to
> > address that in a natural way and also seems like something that users
> > would generally understand as it's very similar to what we do for
> > UPDATEs and DELETEs.
> >
> > Tracking the INSERTs as a reason to VACUUM is also very natural when you
> > consider the need to update BRIN indexes.
>
> Another possible advantage of tracking INSERTs is for hash indexes
> where after split we need to remove tuples from buckets that underwent
> split recently.

That's a good point also, and for clearing GIN pending lists too, now
that I think about it.

We really need to get this fixed for PG10.

Thanks!

Stephen

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Thom Brown 2017-01-22 17:50:28 Logical replication failing when foreign key present
Previous Message Magnus Hagander 2017-01-22 11:13:38 Online enabling of page level checksums