Re: BUG #15184: Planner overestimates number of rows in empty table

From: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
To: alexey(dot)ermakov(at)dataegret(dot)com
Cc: pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #15184: Planner overestimates number of rows in empty table
Date: 2018-05-03 15:40:42
Message-ID: 87k1sk95f2.fsf@news-spur.riddles.org.uk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

>>>>> "PG" == PG Bug reporting form <noreply(at)postgresql(dot)org> writes:

PG> if table is really empty then in first condition we set curpages =
PG> 10 and second condition doesn't apply. so we estimate that empty
PG> table has 10 pages and 2550 rows (for table with one int column)
PG> which doesn't look good. is it intended behavior?

As the large comment immediately above explains, it is indeed intended
behavior. The reason is that over-estimating usually doesn't cause too
much harm, but under-estimating tends to blow things up in certain
critical cases (such as causing foreign-key checks to do sequential
scans on tables during a data-loading transaction).

It's actually still possible to trigger those kinds of pathological
cases, but in between the estimation hacks and the plan cache, you have
to work a lot harder at it. Consider for example:

create table tree (id integer primary key,
parent_id integer references tree);

insert into tree values (1, null);
vacuum analyze tree; -- now relpages=1 reltuples=1
begin;
insert into tree select i, i-1 from generate_series(2,10) i;
insert into tree select i, i-1 from generate_series(11,100000) i;
commit;

That last insert could take maybe half an hour to run, because the FK
check has a query plan - established as a generic plan since the middle
insert ran it more than 5 times - with the small table size leading to a
sequential scan.

Without the vacuum analyze that I stuck in there, the code in plancat.c
avoids this problem by treating the table as large enough to require an
indexscan from the start.

As the comment says, this does mean we don't handle the case when the
table really is empty and stays empty. But this should be very rare
compared to the case where the table starts out empty but then has rows
added.

--
Andrew (irc:RhodiumToad)

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Heikki Linnakangas 2018-05-04 07:05:41 Re: [HACKERS] [BUGS] Bug in Physical Replication Slots (at least 9.5)?
Previous Message Alvaro Herrera 2018-05-03 15:17:43 Re: BUG #15185: pg_dump doesn't include statistics when specifying a table