Cause of intermittent rangetypes regression test failures

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Jeff Davis <pgsql(at)j-davis(dot)com>, Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Cause of intermittent rangetypes regression test failures
Date: 2011-11-13 20:38:52
Message-ID: 23765.1321216732@sss.pgh.pa.us
Lists: pgsql-hackers

Well, I was overthinking the question of why the rangetypes regression
test sometimes fails with

select count(*) from test_range_gist where ir << int4range(100,500);
! ERROR: input range is empty

Turns out that happens whenever auto-analyze has managed to process
test_range_gist before we get to this part of the test. The buildfarm
member jaguar is more likely to see this because CLOBBER_CACHE_ALWAYS
slows down the rangetypes code to a really staggering extent, but
obviously it can happen anywhere. If the table has been analyzed, then
the most_common_vals array for column ir will consist of
{empty}
which is entirely correct since that value accounts for 16% of the
table. And then, when mcv_selectivity tries to estimate the selectivity
of the << condition, it applies range_before to the empty range along
with the int4range(100,500) value, and range_before spits up.
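
To make the failure path concrete, here is a minimal Python model of it.
This is only a sketch, not the PostgreSQL source: IntRange,
range_before_strict, range_before_total and this simplified
mcv_selectivity are hypothetical stand-ins, and the 0.16 frequency is
taken from the 16% figure above. The "total" variant shows just one
conceivable non-throwing behavior, not a decision about what
range_before should do.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class IntRange:
    """Hypothetical stand-in for int4range, half-open [lower, upper)."""
    lower: int = 0
    upper: int = 0
    empty: bool = False


EMPTY = IntRange(empty=True)


def range_before_strict(a: IntRange, b: IntRange) -> bool:
    """Models the behavior described above: raise on an empty input."""
    if a.empty or b.empty:
        raise ValueError("input range is empty")
    return a.upper <= b.lower


def range_before_total(a: IntRange, b: IntRange) -> bool:
    """Assumed alternative: an empty range is never 'before' anything."""
    if a.empty or b.empty:
        return False
    return a.upper <= b.lower


def mcv_selectivity(
    mcvs: List[Tuple[IntRange, float]],
    op: Callable[[IntRange, IntRange], bool],
    const: IntRange,
) -> float:
    """Simplified model: apply op to every MCV entry and sum the
    frequencies of the entries that satisfy it."""
    return sum(freq for value, freq in mcvs if op(value, const))


# Statistics as they would look once the table has been analyzed.
stats = [(EMPTY, 0.16)]
probe = IntRange(100, 500)

try:
    mcv_selectivity(stats, range_before_strict, probe)
except ValueError as err:
    print("planner-time failure:", err)

print("total variant estimate:", mcv_selectivity(stats, range_before_total, probe))
```

The point of the model is that the error is raised during estimation,
before any row of the table is examined, so whether the query fails
depends only on whether statistics exist yet.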

I think this demonstrates that the current definition of range_before is
broken. It is not reasonable for it to throw an error on a perfectly
valid input ... at least, not unless you'd like to mark it VOLATILE so
that the planner will not risk calling it.

What shall we have it do instead?

regards, tom lane
