Re: tableam: abstracting relation sizing code

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Daniel Gustafsson <daniel(at)yesql(dot)se>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: tableam: abstracting relation sizing code
Date: 2019-06-07 12:43:21
Message-ID: CA+TgmoZKe713aWGiKkx96LFyQQ1qSb4LMXAyREN34ZNwK68rEA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Jun 7, 2019 at 4:11 AM Daniel Gustafsson <daniel(at)yesql(dot)se> wrote:
> Makes sense. Regarding one of the hacks:
>
> + * HACK: if the relation has never yet been vacuumed, use a minimum size
> + * estimate of 10 pages. The idea here is to avoid assuming a
> + * newly-created table is really small, even if it currently is, because
> + * that may not be true once some data gets loaded into it.
>
> When this is a generic function for AM’s, would it make sense to make the
> minimum estimate a passed in value rather than hardcoded at 10? I don’t have a
> case in mind, but I also don’t think that assumptions made for heap necessarily
> makes sense for all AM’s. Just thinking out loud.

I think that's probably going in the wrong direction. It's arguable,
of course. However, it seems to me that there's nothing heap-specific
about the number 10. It's not computed based on the format of a heap
page or a heap tuple. It's just somebody's guess (likely Tom's) about
how to plan with empty relations. If somebody finds that another
number works better for their AM, it's probably also better for heap,
at least on that person's workload. It seems more likely to me that
this needs to be a GUC or reloption than that it needs to be an
AM-specific property. It also seems to me that if the parameters of
the hack randomly vary from one AM to another, it's likely to create
confusing plan differences that have nothing to do with the actual
merits of what the AMs are doing. But all that being said, I'm not
blocking anybody from fooling around with this; nobody's obliged to
use the helper function. It's just there for people who want the same
AM-independent logic that the heap uses.

> > Here is a patch that tries to improve the situation. I don't know
> > whether there is some better approach; this seemed like the obvious
> > thing to do.
>
> A small nitpick on the patch:
>
> + * estimate_rel_size callback, because it has a few additional paramters.
>
> s/paramters/parameters/

Good catch, and now I notice that the callback is not called
estimate_rel_size but relation_estimate_size. Updated patch attached;
thanks for the review.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment Content-Type Size
v2-0001-tableam-Provide-helper-functions-for-relation-siz.patch application/octet-stream 14.4 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2019-06-07 12:51:56 Re: hyrax vs. RelationBuildPartitionDesc
Previous Message Robert Haas 2019-06-07 12:32:37 Re: tableam: abstracting relation sizing code