RelOptInfo -> Relation

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: RelOptInfo -> Relation
Date: 2018-02-02 21:21:53
Message-ID: CA+TgmoYKToP4-adCFFRNrO21OGuH=phx-fiB1dYoqksNYX6YHQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Is there some good reason why get_relation_info() copies all of the
data that will be needed later in planning into the RelOptInfo instead
of just storing a pointer to the Relation directly into the
RelOptInfo? Of course, if we did that, then it would have to leave
the relation open instead of closing it, and similarly for the
indexes, but so what? As far as I understand it, we're keeping a lock
on the relation, so none of this stuff should be able to change under
it, and pointing to data is faster than copying it.

I think that this problem has gradually gotten worse as we've added
features to the system. Between 7.4 and master, we've added the
following stuff to what this function needs to propagate into the
RelOptInfo: tablespace, attr_needed, attr_width, allvisfrac,
rel_parallel_workers, statlist, serverid, fdwroutine, fkey_list,
part_scheme, boundinfo, nparts, partexprs. That's a fair amount of
stuff, and IndexOptInfo includes more things now, too. The whole
thing seems pretty inefficient. get_relation_statistics() stores a
list of statistics object OIDs in the relcache and then, in
get_relation_statistics(), builds a StaExtInfo for each one. That,
however, means that the StaExtInfo is getting rebuilt for every query.
If the RelOptInfo could point directly into the relcache, then we
could build that stuff once when the relcache entry was created, or
maybe on demand, and then keep it forever. But if we did that today
it wouldn't work too well: we'd still have to copy the result from the
relcache into the RelOptInfo. The just-copy-it approach is what
set_relation_partition_info does for rel->boundinfo, and that
similarly seems like a waste of cycles. I have a feeling we're all
cargo-culting each new feature into a system that we may have
outgrown, but maybe there's some reason behind this that I'm missing.

Thoughts?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2018-02-02 21:35:59 Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)
Previous Message Robert Haas 2018-02-02 21:05:25 Re: [HACKERS] path toward faster partition pruning