Re: Intermittent regression test failures from index-only plan changes

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Intermittent regression test failures from index-only plan changes
Date: 2012-01-27 18:45:28
Message-ID: CA+TgmobU+oM1-irmpAH0DyFbG+q=hpfgG-fFOVTaV+Nm3Y3Mqw@mail.gmail.com
Lists: pgsql-hackers

On Sat, Jan 7, 2012 at 12:30 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> I feel like this is a trick question, but I'll ask anyway: Can't we
>> just ignore ANALYZE?
>
> AFAICS, no.  ANALYZE will run user-defined code: not only user-supplied
> stats collection functions, but user-defined index expressions.  We
> cannot assume that none of that ever requires a snapshot.

The question is: Why would it matter if we expunged tuples from table
A while ANALYZE was running on table B? I guess the problem is that
the index on B might involve a user-defined function which (under the
covers) peeks at table A, possibly now seeing an inconsistent view of
the database.

It's pretty unfortunate to have to cater to that situation, though,
because most of the time an ANALYZE on table A is only going to look
at table A and the system catalogs. In fact, it wouldn't even be
disastrous (in most cases) if we removed tuples from the table being
analyzed - we're engaged in an inherently statistical process anyway,
so who really cares if things change on us in medias res?

Could we easily detect the cases where user code is being run and
ignore ANALYZE when none is?

A probably crazy idea is to add an option to vacuum that would cause
it, upon discovering that it can't set PD_ALL_VISIBLE on a page
because the global xmin is too old, to wait for all of the virtual
transaction IDs that might not be able to see every tuple on the page.
This would allow us to get into a state where all the PD_ALL_VISIBLE
bits are known to be set. But that seems a bit complex for something
that we probably don't care about much outside of the regression
tests.

If none of the above is feasible (and I suspect it isn't), we might
just want to tweak the queries to do something that will preclude
using an index-only scan, like including tableoid::regclass in the
target list.
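
As a rough sketch of that last idea (the table name and data below are
made up for illustration and are not taken from the actual regression
tests):

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE t (id integer PRIMARY KEY, payload text);
INSERT INTO t SELECT g, 'x' FROM generate_series(1, 1000) g;
VACUUM ANALYZE t;

-- Every referenced column is in the primary key index, so an index-only
-- scan is possible. Whether the planner picks one depends on how many
-- pages VACUUM was able to mark all-visible, which is the source of the
-- intermittent plan changes.
EXPLAIN (COSTS OFF) SELECT id FROM t WHERE id < 10;

-- tableoid is a system column and is not stored in the index, so the
-- heap has to be visited and an index-only scan is not a candidate.
EXPLAIN (COSTS OFF) SELECT tableoid::regclass, id FROM t WHERE id < 10;
```

The second form should produce the same plan shape regardless of the
state of the visibility map, at the cost of an extra output column in
the expected results.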

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
