Re: Row estimates for empty tables

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Christophe Pettus <xof(at)thebuild(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>
Subject: Re: Row estimates for empty tables
Date: 2020-07-25 03:40:08
Message-ID: CAFj8pRCK3oeZLBLr=8Z3VXpyyk1gashW1_7Cg2sr9FX9v_0OmQ@mail.gmail.com
Lists: pgsql-general pgsql-hackers

On Sat, 25 Jul 2020 at 0:34, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> [ redirecting to -hackers ]
>
> I wrote:
> > The core issue here is "how do we know whether the table is likely to
> > stay empty?". I can think of a couple of more or less klugy solutions:
>

For these special cases it is probably possible to ensure that ANALYZE runs
before any SELECT. The table is created, then analyzed, and only after that
is it published and used for SELECT. Usually such a table is never modified
again.

Because it is a special case, the solution does not need to be very
sophisticated. A built-in solution, however, could be designed to be more
general.

> > 1. Arrange to send out a relcache inval when adding the first page to
> > a table, and then remove the planner hack for disbelieving relpages = 0.
> > I fear this'd be a mess from a system structural standpoint, but it might
> > work fairly transparently.
>
> I experimented with doing this. It's not hard to code, if you don't mind
> having RelationGetBufferForTuple calling CacheInvalidateRelcache. I'm not
> sure whether that code path might cause any long-term problems, but it
> seems to work OK right now. However, this solution causes massive
> "failures" in the regression tests as a result of plans changing. I'm
> sure that's partly because we use so many small tables in the tests.
> Nonetheless, it's not promising from the standpoint of not causing
> unexpected problems in the real world.
>
> > 2. Establish the convention that vacuuming or analyzing an empty table
> > is what you do to tell the system that this state is going to persist.
> > That's more or less what the existing comments in plancat.c envision,
> > but we never made a definition for how the occurrence of that event
> > would be recorded in the catalogs, other than setting relpages > 0.
> > Rather than adding another pg_class column, I'm tempted to say that
> > vacuum/analyze should set relpages to a minimum of 1, even if the
> > relation has zero pages.
>
> I also tried this, and it seems a lot more promising: no existing
> regression test cases change. So perhaps we should do the attached
> or something like it.
>
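
As a rough illustration of option 2 (this is neither the attached patch nor
the real plancat.c code), the interaction between relpages and the planner's
page guess could be modelled as below. The struct, the function names, and the
10-page floor are assumptions made for this sketch only.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed floor the planner applies when it distrusts relpages = 0. */
#define MIN_ASSUMED_PAGES 10u

/* Hypothetical, simplified stand-in for the relevant per-table state. */
typedef struct {
    unsigned actual_pages; /* pages physically present in the relation */
    unsigned relpages;     /* value recorded in pg_class.relpages */
} RelStats;

/*
 * Option 2: VACUUM/ANALYZE records a minimum of 1 page, even when the
 * relation has zero pages, so that "analyzed and empty" can be told
 * apart from "never analyzed" (relpages = 0).
 */
static void record_analyze(RelStats *rel)
{
    assert(rel != NULL);
    rel->relpages = rel->actual_pages > 0 ? rel->actual_pages : 1;
}

/*
 * Planner side: relpages = 0 is read as "never vacuumed or analyzed",
 * so a tiny or empty table is assumed to hold MIN_ASSUMED_PAGES pages.
 * Once relpages > 0, the actual page count is believed.
 */
static unsigned planner_pages(const RelStats *rel)
{
    assert(rel != NULL);
    unsigned pages = rel->actual_pages;
    if (rel->relpages == 0 && pages < MIN_ASSUMED_PAGES)
        pages = MIN_ASSUMED_PAGES;
    return pages;
}
```

In this model a freshly created empty table is planned as if it had 10 pages,
while the same table after record_analyze() is planned as truly empty.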

I am sending a patch that has been used at GoodData for years.

I am not sure if the company uses 0 or 1, but I can ask.

Regards

Pavel

> regards, tom lane
>
>

Attachment: fakepages.patch (text/x-patch, 1.9 KB)
