Re: [HACKERS] More stats about skipped vacuums

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, Sawada Masahiko <sawada(dot)mshk(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] More stats about skipped vacuums
Date: 2017-11-25 16:10:30
Message-ID: CA+TgmoYZTHCFZzPrz8tG9TQknAd=MsacTy7aT2ttia=dRB2eGQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Nov 25, 2017 at 10:34 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> If we could get rid of the copy-to-a-temporary-file technology for
> transferring the stats collector's data to backends, then this problem
> would probably vanish or at least get a lot less severe. But that seems
> like a nontrivial project. With the infrastructure we have today, we
> could probably keep the stats tables in a DSM segment; but how would
> a backend get a consistent snapshot of them?

I suppose the obvious approach is to have a big lock around the
statistics data proper; this could be taken in shared mode to take a
snapshot or in exclusive mode to update statistics. In addition,
create one or more queues where statistics messages can be enqueued in
lieu of updating the main statistics data directly. If that doesn't
perform well enough, you could keep two copies of the statistics, A
and B. At any given time, one copy is quiescent and the other copy is
being updated. Periodically, at a time when we know that nobody is
taking a snapshot of the statistics, they reverse roles.

Of course, the other obvious question is whether we really need a
consistent snapshot, because that's bound to be pretty expensive even
if you eliminate the I/O cost. Taking a consistent snapshot across
all 100,000 tables in the database even if we're only ever going to
access 5 of those tables doesn't seem like a good or scalable design.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Mark Dilger 2017-11-25 16:15:05 Re: [HACKERS] PATCH: multivariate histograms and MCV lists
Previous Message Robert Haas 2017-11-25 15:43:24 Re: [HACKERS] [POC] Faster processing at Gather node