Re: Making all nbtree entries unique by having heap TIDs participate in comparisons

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: "Andrey V(dot) Lepikhov" <a(dot)lepikhov(at)postgrespro(dot)ru>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Claudio Freire <klaussfreire(at)gmail(dot)com>, Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>
Subject: Re: Making all nbtree entries unique by having heap TIDs participate in comparisons
Date: 2018-10-18 21:10:02
Message-ID: CAH2-Wz=03oLFYfKBFJtER2wsySXtKMaB8MHKCsUNBd8C5mD0qg@mail.gmail.com
Lists: pgsql-hackers

On Thu, Oct 18, 2018 at 1:44 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> What kind of backend_flush_after values were you trying?
> backend_flush_after=0 obviously is the default, so I'm not clear on
> that. How large is the database here, and how high is shared_buffers?

I *was* trying backend_flush_after=512kB, but it's
backend_flush_after=0 in the benchmark I posted. See the
"postgres*settings" files.

On the master branch, things looked like this after the last run:

pg(at)tpcc_oltpbench[15547]=# \dt+
                       List of relations
 Schema │    Name    │ Type  │ Owner │   Size   │ Description
────────┼────────────┼───────┼───────┼──────────┼─────────────
 public │ customer   │ table │ pg    │ 4757 MB  │
 public │ district   │ table │ pg    │ 5240 kB  │
 public │ history    │ table │ pg    │ 1442 MB  │
 public │ item       │ table │ pg    │ 10192 kB │
 public │ new_order  │ table │ pg    │ 140 MB   │
 public │ oorder     │ table │ pg    │ 1185 MB  │
 public │ order_line │ table │ pg    │ 19 GB    │
 public │ stock      │ table │ pg    │ 9008 MB  │
 public │ warehouse  │ table │ pg    │ 4216 kB  │
(9 rows)

pg(at)tpcc_oltpbench[15547]=# \di+
                                         List of relations
 Schema │                 Name                 │ Type  │ Owner │   Table    │  Size   │ Description
────────┼──────────────────────────────────────┼───────┼───────┼────────────┼─────────┼─────────────
 public │ customer_pkey                        │ index │ pg    │ customer   │ 367 MB  │
 public │ district_pkey                        │ index │ pg    │ district   │ 600 kB  │
 public │ idx_customer_name                    │ index │ pg    │ customer   │ 564 MB  │
 public │ idx_order                            │ index │ pg    │ oorder     │ 715 MB  │
 public │ item_pkey                            │ index │ pg    │ item       │ 2208 kB │
 public │ new_order_pkey                       │ index │ pg    │ new_order  │ 188 MB  │
 public │ oorder_o_w_id_o_d_id_o_c_id_o_id_key │ index │ pg    │ oorder     │ 715 MB  │
 public │ oorder_pkey                          │ index │ pg    │ oorder     │ 958 MB  │
 public │ order_line_pkey                      │ index │ pg    │ order_line │ 9624 MB │
 public │ stock_pkey                           │ index │ pg    │ stock      │ 904 MB  │
 public │ warehouse_pkey                       │ index │ pg    │ warehouse  │ 56 kB   │
(11 rows)

> Is it possible that there's new / prolonged cases where a buffer is read
> from disk after the patch? Because that might require doing *write* IO
> when evicting the previous contents of the victim buffer, and obviously
> that can take longer if you're running with backend_flush_after > 0.

Yes, I suppose that's possible, because the buffer
popularity/usage_count will be affected in ways that cannot easily be
predicted. However, I'm not running with "backend_flush_after > 0"
here -- that was before.
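
To illustrate why it's hard to predict: victim selection is the clock
sweep in freelist.c, which only evicts a buffer once its usage_count
has decayed to zero, so any change in which index pages get touched
(and how often) shifts which buffers -- including dirty ones that
force a write before the read -- end up being evicted. Roughly (a
simplified sketch of StrategyGetBuffer(), not the actual code):

    /* Simplified sketch of clock-sweep victim selection (freelist.c) */
    for (;;)
    {
        buf = GetBufferDescriptor(ClockSweepTick());
        state = LockBufHdr(buf);

        if (BUF_STATE_GET_REFCOUNT(state) == 0)
        {
            if (BUF_STATE_GET_USAGECOUNT(state) == 0)
                return buf;     /* victim; if dirty, it must be flushed first */
            state -= BUF_USAGECOUNT_ONE;    /* popular buffer: decay, move on */
        }
        UnlockBufHdr(buf, state);
    }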

> I wonder if it'd make sense to hack up a patch that logs when evicting a
> buffer while already holding another lwlock. That shouldn't be too hard.

I'll look into this.
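
Off the top of my head, something like this ought to be enough (an
untested sketch -- LWLockHeldCount() is a hypothetical one-line helper
exposing lwlock.c's static num_held_lwlocks counter):

    /* lwlock.c: hypothetical helper exposing the held-lock counter */
    int
    LWLockHeldCount(void)
    {
        return num_held_lwlocks;
    }

    /* bufmgr.c, BufferAlloc(): just before FlushBuffer() on a dirty victim */
    if (LWLockHeldCount() > 0)
        elog(LOG, "evicting dirty buffer %d while holding %d LWLock(s)",
             BufferDescriptorGetBuffer(buf), LWLockHeldCount());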

Thanks
--
Peter Geoghegan
