Re: Unhappy about API changes in the no-fsm-for-small-rels patch

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: John Naylor <john(dot)naylor(at)2ndquadrant(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Unhappy about API changes in the no-fsm-for-small-rels patch
Date: 2019-05-06 15:10:15
Message-ID: CA+TgmoagQedgMzJ=jHMLh_y2_1Sf925rruL=vLiOJ6AoBGPMXg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, May 5, 2019 at 9:25 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> I understand that we have to take a call here shortly, but as there is
> a weekend so I would like to wait for another day to see if anyone
> else wants to share his opinion.

I haven't looked deeply into the issues with this patch, but it seems
to me that if two other committers is saying that this should be
reverted, and the original author of the patch is agreeing, and your
patch to try to fix it still has demonstrable bugs ... it's time to
give up. We're well past feature freeze at this point.

Some other random comments:

I'm really surprised that the original design of this patch involved
storing state in global variables. That seems like a pretty poor
decision. This is properly per-relation information, and any approach
like that isn't going to work well when there are multiple relations
involved, unless the information is only being used for a single
attempt to find a free page, in which case it should use parameters
and function-local variables, not globals.

I think it's legitimate to question whether sending additional
invalidation messages as part of the design of this feature is a good
idea. If it happens frequently, it could trigger expensive sinval
resets more often. I don't understand the various proposals well
enough to know whether that's really a problem, but if you've got a
lot of relations for which this optimization is in use, I'm not sure I
see why it couldn't be.

I think at some point it was proposed that, since an FSM access
involves touching 3 blocks, it ought to be fine for any relation of 4
or fewer blocks to just check all the others. I don't really
understand why we drifted off that design principle, because it seems
like a reasonable theory. Such an approach doesn't require anything
in the relcache, any global variables, or an every-other-page
algorithm.

If we wanted to avoid creating the FSM for relation with >4 blocks, we
might want to take a step back and think about a whole different
approach. For instance, we could have a cache in shared memory that
can store N entries, and not bother creating FSM forks until things no
longer fit in that cache. Or, what might be better, we could put FSM
data for many relations into a single FSM file, instead of having a
separate fork for each relation. I think that would get at what's
really driving this work: having a zillion tiny little FSM files
sucks. Of course, those kinds of changes are far too big to
contemplate for v12, but they might be worth some thought in the
future.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2019-05-06 15:27:32 Re: Unhappy about API changes in the no-fsm-for-small-rels patch
Previous Message Tom Lane 2019-05-06 15:03:54 Re: Unhappy about API changes in the no-fsm-for-small-rels patch