Re: Use simplehash.h instead of dynahash in SMgr

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Jakub Wartak <Jakub(dot)Wartak(at)tomtom(dot)com>
Cc: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Subject: Re: Use simplehash.h instead of dynahash in SMgr
Date: 2021-05-05 12:32:00
Message-ID: CAApHDvp36Mtu3Kb9cdZT3rGVNWqUxQd3LFmCwSG7h53Re5sxqg@mail.gmail.com
Lists: pgsql-hackers

Hi Jakub,

On Wed, 5 May 2021 at 20:16, Jakub Wartak <Jakub(dot)Wartak(at)tomtom(dot)com> wrote:
> I might be a little bit out of the loop, but as Alvaro stated - Thomas did plenty of excellent job related to recovery performance in that thread. In my humble opinion and if I'm not mistaken (I'm speculating here) it might be *not* how Smgr hash works, but how often it is being exercised and that would also explain relatively lower than expected(?) gains here. There are at least two very important emails from him that I'm aware that are touching the topic of reordering/compacting/batching calls to Smgr:
> https://www.postgresql.org/message-id/CA%2BhUKG%2B2Vw3UAVNJSfz5_zhRcHUWEBDrpB7pyQ85Yroep0AKbw%40mail.gmail.com
> https://www.postgresql.org/message-id/flat/CA%2BhUKGK4StQ%3DeXGZ-5hTdYCmSfJ37yzLp9yW9U5uH6526H%2BUeg%40mail.gmail.com

I'm not much of an expert here and I didn't follow the recovery
prefetching stuff closely. So, with that in mind, I think there's a
lot that could be done along the lines of what Thomas is mentioning,
for example batching WAL records up by filenode, then replaying each
filenode one by one when our batching buffer is full. There could be
some sort of parallel option there too, where workers replay a
filenode each. However, that wouldn't really work for recovery on a
hot standby. We'd need to ensure we replay the commit record for each
transaction last. I think you'd have to batch by filenode and
transaction in that case. Each batch might be pretty small on a
typical OLTP workload, so it might not help much there, or it might
even hinder.
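
To make the batching idea concrete, here is a minimal standalone sketch
in C. It is not PostgreSQL code: DemoRecord, DemoBatch, demo_add(),
demo_flush() and demo_replay() are all hypothetical names invented for
illustration, and "replay" only logs the record's position. It buffers
records until the batch is full, then replays them grouped by filenode
while keeping the original order within each filenode. It deliberately
ignores the hot-standby concern above (commit records are not held back
until last).

```c
#include <assert.h>
#include <stddef.h>

#define BATCH_CAPACITY 8
#define REPLAY_LOG_MAX 64

/* Hypothetical, simplified stand-in for a WAL record. */
typedef struct DemoRecord
{
    unsigned filenode;  /* relation file the record touches */
    unsigned lsn;       /* position in the original WAL stream */
} DemoRecord;

/* Fixed-size batching buffer. */
typedef struct DemoBatch
{
    DemoRecord recs[BATCH_CAPACITY];
    size_t     n;
} DemoBatch;

/* Records the order in which "replay" happened, for inspection. */
static unsigned replay_log[REPLAY_LOG_MAX];
static size_t   replay_count = 0;

/* Stand-in for applying a record; a real redo routine would go here. */
static void
demo_replay(const DemoRecord *rec)
{
    assert(replay_count < REPLAY_LOG_MAX);
    replay_log[replay_count++] = rec->lsn;
}

/*
 * Replay the buffered records one filenode at a time.  Filenodes are
 * visited in order of first appearance, and records for the same
 * filenode keep their original relative order.
 */
static void
demo_flush(DemoBatch *b)
{
    int done[BATCH_CAPACITY] = {0};

    for (size_t i = 0; i < b->n; i++)
    {
        unsigned fn;

        if (done[i])
            continue;
        fn = b->recs[i].filenode;

        for (size_t j = i; j < b->n; j++)
        {
            if (!done[j] && b->recs[j].filenode == fn)
            {
                demo_replay(&b->recs[j]);
                done[j] = 1;
            }
        }
    }
    b->n = 0;
}

/* Buffer one record, flushing first if the batch is already full. */
static void
demo_add(DemoBatch *b, DemoRecord rec)
{
    if (b->n == BATCH_CAPACITY)
        demo_flush(b);
    b->recs[b->n++] = rec;
}
```

The quadratic scan in demo_flush() is only there to keep the sketch
short; a real implementation would presumably sort or hash by filenode
instead.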

But having said that, I don't think any of those possibilities should
stop us speeding up smgropen().

> Another potential option that we've discussed is that the redo generation itself is likely a brake of efficient recovery performance today (e.g. INSERT-SELECT on table with indexes, generates interleaved WAL records that touch often limited set of blocks that usually put Smgr into spotlight).

I'm not quite sure I understand what you mean here. Is this
queuing WAL records up during transactions and flushing them out to
WAL every so often, after rearranging them into an order that's more
optimal for replay?

David
