Re: Faster inserts with mostly-monotonically increasing values

From: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
To: Claudio Freire <klaussfreire(at)gmail(dot)com>
Cc: Peter Geoghegan <pg(at)bowt(dot)ie>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Simon Riggs <simon(dot)riggs(at)2ndquadrant(dot)com>
Subject: Re: Faster inserts with mostly-monotonically increasing values
Date: 2018-03-14 04:36:07
Message-ID: CABOikdP5wnqOaUOjzDmvnf8qtXEbrEx2cWLUP-s2HE9dD6cjzg@mail.gmail.com
Lists: pgsql-hackers

On Sun, Mar 11, 2018 at 9:18 PM, Claudio Freire <klaussfreire(at)gmail(dot)com> wrote:

> On Sun, Mar 11, 2018 at 2:27 AM, Pavan Deolasee wrote:
> >
> > Yes, I will try that next - it seems like a good idea. So the idea would be:
> > check if the block is still the rightmost block and the insertion-key is
> > greater than the first key in the page. If those conditions are satisfied,
> > then we do a regular binary search within the page to find the correct
> > location. This might add an overhead of binary search when keys are
> > strictly ordered and a single client is inserting the data. If that
> > becomes a concern, we might be able to look for that special case too and
> > optimise for it too.
>
> Yeah, pretty much that's the idea. Beware, if the new item doesn't
> fall in the rightmost place, you still need to check for serialization
> conflicts.

So I've been toying with this idea since yesterday and I am quite puzzled
by the results. See the attached patch, which compares the insertion key
with the last key inserted by this backend, provided the cached block is
still the rightmost block in the tree. I initially compared only with the
first key in the page, but I tried this version because of the strange
performance regression for which I still have no answer.
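
To make the check concrete, the fastpath in _bt_doinsert() looks roughly like
the sketch below. This is a simplified illustration and not the exact code in
the attached patch: itemsz, natts, itup_scankey and fastpath stand in for the
usual _bt_doinsert locals, and for brevity the sketch compares against the
last key on the page rather than the per-backend cached key.

/*
 * Fastpath sketch (simplified): if a target block is cached from a
 * previous insert, try to reuse it without descending the tree.
 */
if (RelationGetTargetBlock(rel) != InvalidBlockNumber)
{
    Buffer      buf = ReadBuffer(rel, RelationGetTargetBlock(rel));

    if (ConditionalLockBuffer(buf))
    {
        Page            page = BufferGetPage(buf);
        BTPageOpaque    opaque = (BTPageOpaque) PageGetSpecialPointer(page);

        /*
         * Reuse the cached page only if it is still the rightmost leaf,
         * has room for the new tuple, and the new key sorts after the
         * last key currently on the page.
         */
        if (P_ISLEAF(opaque) && P_RIGHTMOST(opaque) && !P_IGNORE(opaque) &&
            PageGetFreeSpace(page) > itemsz &&
            PageGetMaxOffsetNumber(page) >= P_FIRSTDATAKEY(opaque) &&
            _bt_compare(rel, natts, itup_scankey, page,
                        PageGetMaxOffsetNumber(page)) > 0)
            fastpath = true;
        else
            UnlockReleaseBuffer(buf);   /* fall back to a normal descent */
    }
    else
        ReleaseBuffer(buf);             /* lock not available; fall back */
}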

For a small number of clients, the patched version does better. But as the
number of clients goes up, the patched version significantly underperforms
master. I roughly counted the number of times the fastpath is taken and
noticed that almost 98% of inserts take it. I first thought that the
"firstkey" location in the page might be becoming a hot spot for concurrent
processes, so I changed the patch to track the per-backend last offset and
compare against that on the next insert. But that did not help much.

+---------+-------------------------------+--------------------------------+
| clients | Master - Avg load time in sec | Patched - Avg load time in sec |
+---------+-------------------------------+--------------------------------+
|       1 |                   500.0725203 |                     347.632079 |
+---------+-------------------------------+--------------------------------+
|       2 |                   308.4580771 |                    263.9120163 |
+---------+-------------------------------+--------------------------------+
|       4 |                   359.4851779 |                    514.7187444 |
+---------+-------------------------------+--------------------------------+
|       8 |                   476.4062592 |                    780.2540855 |
+---------+-------------------------------+--------------------------------+

The perf data does not show anything interesting either. There is a
reduction in CPU time spent in btree-related code in the patched version,
but the overall execution time to insert the same number of records goes up
significantly.

Perf (master):
==============

+   72.59%   1.81%  postgres  postgres  [.] ExecInsert
+   47.55%   1.27%  postgres  postgres  [.] ExecInsertIndexTuples
+   44.24%   0.48%  postgres  postgres  [.] btinsert
-   42.40%   0.87%  postgres  postgres  [.] _bt_doinsert
   - 41.52% _bt_doinsert
      + 21.14% _bt_search
      + 12.57% _bt_insertonpg
      + 2.03% _bt_binsrch
        1.60% _bt_mkscankey
        1.20% LWLockAcquire
      + 1.03% _bt_freestack
        0.67% LWLockRelease
        0.57% _bt_check_unique
   + 0.87% _start
+   26.03%   0.95%  postgres  postgres  [.] ExecScan
+   21.14%   0.82%  postgres  postgres  [.] _bt_search
+   20.70%   1.31%  postgres  postgres  [.] ExecInterpExpr
+   19.05%   1.14%  postgres  postgres  [.] heap_insert
+   18.84%   1.16%  postgres  postgres  [.] nextval_internal
+   18.08%   0.84%  postgres  postgres  [.] ReadBufferExtended
+   17.24%   2.03%  postgres  postgres  [.] ReadBuffer_common
+   12.57%   0.59%  postgres  postgres  [.] _bt_insertonpg
+   11.12%   1.63%  postgres  postgres  [.] XLogInsert
+    9.90%   0.10%  postgres  postgres  [.] _bt_relandgetbuf
+    8.97%   1.16%  postgres  postgres  [.] LWLockAcquire
+    8.42%   2.03%  postgres  postgres  [.] XLogInsertRecord
+    7.26%   1.01%  postgres  postgres  [.] _bt_binsrch
+    7.07%   1.20%  postgres  postgres  [.] RelationGetBufferForTuple
+    6.27%   4.92%  postgres  postgres  [.] _bt_compare
+    5.97%   0.63%  postgres  postgres  [.] read_seq_tuple.isra.3
+    5.70%   4.89%  postgres  postgres  [.] hash_search_with_hash_value
+    5.44%   5.44%  postgres  postgres  [.] LWLockAttemptLock

Perf (Patched):
===============

+   69.33%   2.36%  postgres  postgres  [.] ExecInsert
+   35.21%   0.64%  postgres  postgres  [.] ExecInsertIndexTuples
-   32.14%   0.45%  postgres  postgres  [.] btinsert
   - 31.69% btinsert
      - 30.35% _bt_doinsert
         + 13.10% _bt_insertonpg
         + 5.11% _bt_getbuf
         + 2.75% _bt_binsrch
         + 2.49% _bt_mkscankey
         + 2.43% _bt_search
         + 0.96% _bt_compare
           0.70% CheckForSerializableConflictIn
      + 1.34% index_form_tuple
+   30.35%   1.53%  postgres  postgres  [.] _bt_doinsert
+   28.31%   1.53%  postgres  postgres  [.] ExecScan
+   26.52%   0.77%  postgres  postgres  [.] heap_insert
+   22.11%   1.21%  postgres  postgres  [.] ExecInterpExpr
+   20.51%   0.51%  postgres  postgres  [.] nextval_internal
+   16.55%   1.15%  postgres  postgres  [.] ReadBufferExtended
+   15.40%   3.77%  postgres  postgres  [.] ReadBuffer_common
+   14.25%   2.36%  postgres  postgres  [.] XLogInsert
+   13.10%   0.77%  postgres  postgres  [.] _bt_insertonpg
+   10.93%   3.07%  postgres  postgres  [.] XLogInsertRecord
+   10.80%   0.96%  postgres  postgres  [.] RelationGetBufferForTuple
+    7.41%   0.38%  postgres  postgres  [.] _bt_getbuf
+    5.18%   0.58%  postgres  postgres  [.] read_seq_tuple.isra.3
+    5.18%   0.19%  postgres  postgres  [.] pg_class_aclcheck

I am testing this on an r3.2xlarge AWS instance.

Is there something wrong with the patch? What can cause such a slowdown as
the number of clients increases, given that the patch is still doing what
it's supposed to do? I wonder if the shortened code path actually leads to
heavier contention for the EXCLUSIVE lock on the rightmost page, which in
turn causes the slowdown. But that's just a theory. Any ideas on how to
prove or disprove that theory, or any other pointers?
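
One crude thing I can try myself to test that theory (a throwaway diagnostic,
not part of the attached patch; the counter names below are made up): count
how often the conditional lock on the cached rightmost buffer cannot be
obtained, along these lines:

/* Throwaway diagnostic: count lock misses on the cached rightmost block */
static uint64 fastpath_tries = 0;
static uint64 fastpath_lock_misses = 0;

    /* ... inside the fastpath, when trying to lock the cached buffer ... */
    fastpath_tries++;
    if (!ConditionalLockBuffer(buf))
    {
        fastpath_lock_misses++;
        ReleaseBuffer(buf);
        /* fall back to the regular tree descent */
    }

    if ((fastpath_tries % 100000) == 0)
        elog(LOG, "fastpath tries " UINT64_FORMAT ", lock misses " UINT64_FORMAT,
             fastpath_tries, fastpath_lock_misses);

If the miss ratio climbs with the client count, that would point at the
rightmost page's buffer lock; a build with LWLOCK_STATS might also help
confirm which lock is being fought over.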

Thanks,
Pavan

--
Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
pg_btree_target_block_v2.patch application/octet-stream 22.8 KB
