Re: Patch: fix lock contention for HASHHDR.mutex

From: Aleksander Alekseev <a(dot)alekseev(at)postgrespro(dot)ru>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>
Subject: Re: Patch: fix lock contention for HASHHDR.mutex
Date: 2016-03-23 09:49:50
Message-ID: 20160323124950.7a7e48a5@fujitsu
Lists: pgsql-hackers

> > I have a strong feeling that we are just wasting our time here.
>
> That is possible. However, I would like it if you would give me the
> benefit of the doubt and assume that, if I seem to be more cautious
> than you would be were you a committer, there might possibly be some
> good reasons for that. The fact is that, despite being more cautious
> than some people think I should be, I still manage to introduce quite
> a number of bugs via the patches I commit - see the thread 'Missing
> rows with index scan when collation is not "C"' on pgsql-bugs for just
> the very latest example of that. Nobody thinks that will happen with
> *their* patch, of course, but it does all the same.

Oh, that explains a lot! You see, I couldn't figure out what was happening.
The patch was discussed and reviewed a long time ago, everyone seemed to be
reasonably happy with it, etc. But for some reason it has been ignored for
weeks and no one says why. Now it makes sense.

I should probably mention that this patch was merged into the PostgresPro
build a few months ago. Several of our clients are already using this code
in production environments (guess who discovered this issue and wanted
it fixed). There have been no complaints so far.

> I'd still like an answer to the question of why this helps so much
> when there must be huge amounts of false sharing between the different
> mutexes. Maybe it doesn't matter, though.

Well, the reason is that this code is no longer a bottleneck. Roughly 1% of
the execution time is spent here and 99% somewhere else. There is false
sharing, but its cost is small compared to the other 99% of the code. Thus
optimizing this 1% any further doesn't give a noticeable performance
improvement.

I still believe that optimizing that 1% blindly, without considering the
possible side effects the optimization can bring (different data alignment,
some additional machine instructions, just to name a few) and without any
way to measure those side effects, is a bad idea. But I also admit the
possibility that I may be wrong about this. So I rewrote the
patch once again :'( the way you suggested (without that alignment-related
hack I tried, it's just too ugly). I also attached the original
hashhdr-rmh.patch so that both patches are in one message and
easier to find in this thread.

If there are any other questions or doubts regarding these patches
please don't hesitate to ask.

--
Best regards,
Aleksander Alekseev
http://eax.me/

Attachment               Content-Type    Size
hashhdr-rmh.patch        text/x-patch    10.9 KB
hashhdr-rmh-new.patch    text/x-patch    11.7 KB
