Re: Speed up transaction completion faster after many relations are accessed in a transaction

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: "Tsunakawa, Takayuki" <tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Speed up transaction completion faster after many relations are accessed in a transaction
Date: 2019-02-19 01:29:29
Message-ID: 15654.1550539769@sss.pgh.pa.us
Lists: pgsql-hackers

Andres Freund <andres(at)anarazel(dot)de> writes:
> On 2019-02-18 19:24:54 -0500, Tom Lane wrote:
>> Yeah, but if we want to rearrange the members into an illogical order
>> to save some space, we should do that independently of this patch ---

> Sure, we should do that. I don't buy the "illogical" bit, just moving
> hashcode up to after tag isn't more or less logical, and saves most of
> the padding, and moving the booleans to the end isn't better/worse
> either.

I hadn't looked at the details closely, but if we can squeeze out the
padding space without any loss of intelligibility, sure let's do so.
I still say that's independent of whether to adopt this patch though.
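
To make the padding point concrete, here is a minimal sketch in C. The two structs are hypothetical stand-ins with made-up member names and types, not the real LOCALLOCK definition. The only thing it is meant to show is that moving a 4-byte hash code up to directly follow a 20-byte tag, and moving the booleans to the end, can shrink the struct on common 64-bit ABIs while leaving some trailing padding.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical entry in "logical" declaration order (NOT the real LOCALLOCK). */
typedef struct EntryDeclOrder
{
    char        tag[20];        /* assumed 20-byte key */
    void       *lock;           /* pointer member: alignment hole before it */
    void       *proclock;
    uint32_t    hashcode;       /* 4 bytes, then a hole before the int64 */
    int64_t     nLocks;
    bool        flagA;
    bool        flagB;          /* two bools, then a hole before the ints */
    int         numOwners;
    int         maxOwners;
    void       *owners;
} EntryDeclOrder;

/* Same members, with hashcode right after tag and the bools moved last. */
typedef struct EntryPacked
{
    char        tag[20];
    uint32_t    hashcode;       /* fills the hole that followed tag */
    void       *lock;
    void       *proclock;
    int64_t     nLocks;
    int         numOwners;
    int         maxOwners;
    void       *owners;
    bool        flagA;
    bool        flagB;          /* only trailing padding remains after these */
} EntryPacked;

/* Bytes used by n entries stored back to back in an array. */
static size_t
array_bytes(size_t entry_size, size_t n)
{
    return entry_size * n;
}

/* Print per-entry size and the footprint of n entries for both layouts. */
static void
report_sizes(size_t n)
{
    printf("decl order: %zu bytes/entry, %zu bytes for %zu entries\n",
           sizeof(EntryDeclOrder), array_bytes(sizeof(EntryDeclOrder), n), n);
    printf("packed:     %zu bytes/entry, %zu bytes for %zu entries\n",
           sizeof(EntryPacked), array_bytes(sizeof(EntryPacked), n), n);
}
```

Calling report_sizes() with the expected number of entries shows how a per-entry saving multiplies when entries sit in arrays. The exact sizes depend on the platform ABI.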

> but it's smaller (although there's plenty trailing space).

I think you're supposing that these things are independently palloc'd, but
they aren't --- dynahash lays them out in arrays without palloc padding.

> IDK, we, including you, very often make largely independent improvements
> to make the cost of something else more palpable. Why's that not OK
> here?

When we do that, we aren't normally talking about overheads as high as
25% (even more, if it's measured as I think it ought to be). What I'm
concerned about is that the patch is being advocated for cases where
there are lots of LOCALLOCK entries --- which is exactly where the
space overhead is going to hurt the most.

> Especially because we're not comparing to an alternative where no
> cost is added, keeping track of e.g. a running average of the hashtable
> size isn't free either; nor does it help in the intermittent cases.

What I was hoping for --- though perhaps it's not achievable --- was
statistical overhead amounting to just a few more instructions per
transaction. Adding dlist linking would add more instructions per
hashtable entry/removal, which seems like it'd be a substantially
bigger time penalty. As for the intermittent-usage issue, that largely
depends on the details of the when-to-reset heuristic, which we don't
have a concrete design for yet. But I could certainly imagine it waiting
for a few transactions before deciding to chomp.
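
Since no concrete design exists yet, the following is only one possible shape for such a heuristic, sketched in C under stated assumptions. Every name and constant here (ResetHeuristic, BLOAT_FACTOR, PATIENCE_XACTS, AVG_WEIGHT) is hypothetical and does not exist in PostgreSQL. It keeps a running average of entries used per transaction and recommends rebuilding the table only after it has looked oversized for a few consecutive transactions, so the added cost is a handful of instructions per transaction rather than per hashtable entry.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bookkeeping; none of these names exist in PostgreSQL. */
typedef struct ResetHeuristic
{
    double      avg_entries;        /* running average of entries per xact */
    size_t      table_capacity;     /* current allocated size of the table */
    int         oversized_xacts;    /* consecutive xacts the table looked bloated */
} ResetHeuristic;

/* Assumed tuning knobs, chosen only for illustration. */
#define BLOAT_FACTOR    8.0     /* "oversized" = capacity > 8x average use */
#define PATIENCE_XACTS  3       /* wait this many xacts before chomping */
#define AVG_WEIGHT      0.25    /* weight given to the newest sample */

/*
 * Call once at end of transaction with the number of entries it used.
 * Returns true when the caller should rebuild the table at a smaller size
 * (the caller is expected to update table_capacity afterwards).
 */
static bool
reset_heuristic_end_of_xact(ResetHeuristic *h, size_t entries_used)
{
    /* Exponentially weighted running average: a few instructions per xact. */
    h->avg_entries += AVG_WEIGHT * ((double) entries_used - h->avg_entries);

    /* Count consecutive transactions in which the table is far too big. */
    if ((double) h->table_capacity > BLOAT_FACTOR * (h->avg_entries + 1.0))
        h->oversized_xacts++;
    else
        h->oversized_xacts = 0;

    if (h->oversized_xacts >= PATIENCE_XACTS)
    {
        h->oversized_xacts = 0;
        return true;
    }
    return false;
}
```

With these assumed constants, a backend that keeps using a large number of entries never triggers a reset, while one that goes back to small transactions triggers it after a few of them. Whether that delay handles intermittent usage well enough is exactly the open question.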

Anyway, I'm not trying to veto the patch in this form, just suggesting
that there are alternatives worth thinking about.

regards, tom lane
