Re: Patch: Write Amplification Reduction Method (WARM)

From: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Jaime Casanova <jaime(dot)casanova(at)2ndquadrant(dot)com>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Patch: Write Amplification Reduction Method (WARM)
Date: 2017-03-21 13:17:18
Message-ID: CABOikdP1yeicUPH0NByjrg2Sv3ZtJXWyFPSqwppid8G3kLVKjw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Mar 14, 2017 at 10:47 PM, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
wrote:

> After looking at how index_fetch_heap and heap_hot_search_buffer
> interact, I can't say I'm in love with the idea. I started thinking
> that we should not have index_fetch_heap release the buffer lock only to
> re-acquire it five lines later, so it should keep the buffer lock, do
> the recheck and only release it afterwards (I realize that this means
> there'd be need for two additional "else release buffer lock" branches);
> but then this got me thinking that perhaps it would be better to have
> another routine that does both call heap_hot_search_buffer and then call
> recheck -- it occurs to me that what we're doing here is essentially
> heap_warm_search_buffer.
>
> Does that make sense?
>
> Another thing is BuildIndexInfo being called over and over for each
> recheck(). Surely we need to cache the indexinfo for each indexscan.
>
>
Please find attached rebased patches. There are a few changes in this
version, so let me mention them here instead of trying to reply in-line to
various points on various emails:

1. The patch now has support for hash redo recovery since that was added to
the master (it might be broken since a bug was reported in the original
code itself)

2. Based on Robert's comments and my discussion with him in person, I
removed the Blue/Red naming and instead now using CLEAR and WARM to
identify the parts of the chain and the index pointers. This also resulted
in changes to the way heap tuple header bits are named. So
HEAP_WARM_UPDATED is now used to mark the old tuple which gets WARM updated
and the same flag is copied to all subsequent versions of the tuple, until
a non-HOT updates happens. The new version and all subsequent versions are
marked with HEAP_WARM_TUPLE flag (in the earlier versions this was used for
marking old and the new versions. This might cause confusion, but looks a
more accurate naming to me.

3. IndexInfo is now cached inside IndexScanDescData, which should address
your comment above.

4. I realised that we don't really need to ever compare expression
attributes in the index since WARM is never used when one of those columns
is updated. Hence I've now created a new version of FormIndexDatum which
only returns plain attributes and hence recheck routine does not need
access to any executor stuff.

5. We don't release the lock of the buffer if we are going to apply
recheck. This should address part of the your comment. I haven't though put
them inside a single wrapper function because there is just one caller to
amrecheck function and after this change, it looked ok. But if you don't
still like, I'll make that change.

6. Unnecessary header files included at various places have been removed.

7. Some comments have been updated and rewritten. Hopefully they look
better than before now.

8. I merged the main WARM patch and the chain conversion code in a single
patch since I don't think we will apply them separately. But if it helps
with review, let me know and I can split that again.

9. I realised that we don't really need xs_tuple_recheck in the scan
descriptor and hence removed that and used a stack variable to get that
info.

10. Accidentally WARM was disabled on the system relations during one of
the earlier rebases. So restored that back and made a slight change to
regression expected output.

All tests pass with the patch set. I am now writing TAP tests for WARM and
will submit that separately. Per your suggestion, I am first turning the
stress tests I'd used earlier to use TAP tests and then add more tests,
especially around recovery and index addition/deletion.

Thanks,
Pavan

--
Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
0004_freeup_3bits_ip_posid_v18.patch application/octet-stream 6.6 KB
0003_clear_ip_posid_blkid_refs_v18.patch application/octet-stream 11.0 KB
0002_track_root_lp_v18.patch application/octet-stream 38.4 KB
0001_interesting_attrs_v18.patch application/octet-stream 11.4 KB
0005_warm_updates_v18.patch application/octet-stream 212.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2017-03-21 13:25:49 Re: Patch: Write Amplification Reduction Method (WARM)
Previous Message Stephen Frost 2017-03-21 13:04:09 Re: increasing the default WAL segment size