Re: Patch: Write Amplification Reduction Method (WARM)

From: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Jaime Casanova <jaime(dot)casanova(at)2ndquadrant(dot)com>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Patch: Write Amplification Reduction Method (WARM)
Date: 2017-03-27 08:49:03
Message-ID: CABOikdM_sVU9aCJ29NR=2ZxkrvMcUmd4ueM3=c7eeupJ2Z-2Cw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox
Thread:
Lists: pgsql-hackers

On Fri, Mar 24, 2017 at 11:49 PM, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
wrote:

>
>
> On Fri, Mar 24, 2017 at 6:46 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
> wrote:
>
>>
>>
>> I was worried for the case if the index is created non-default
>> collation, will the datumIsEqual() suffice the need. Now again
>> thinking about it, I think it will because in the index tuple we are
>> storing the value as in heap tuple. However today it occurred to me
>> how will this work for toasted index values (index value >
>> TOAST_INDEX_TARGET). It is mentioned on top of datumIsEqual() that it
>> probably won't work for toasted values. Have you considered that
>> point?
>>
>>
> No, I haven't and thanks for bringing that up. And now that I think more
> about it and see the code, I think the naive way of just comparing index
> attribute value against heap values is probably wrong. The example of
> TOAST_INDEX_TARGET is one such case, but I wonder if there are other
> varlena attributes that we might store differently in heap and index. Like
> index_form_tuple() ->heap_fill_tuple seem to some churning for varlena.
> It's not clear to me if index_get_attr will return the values which are
> binary comparable to heap values.. I wonder if calling index_form_tuple on
> the heap values, fetching attributes via index_get_attr on both index
> tuples and then doing a binary compare is a more robust idea. Or may be
> that's just duplicating efforts.
>
> While looking at this problem, it occurred to me that the assumptions made
> for hash indexes are also wrong :-( Hash index has the same problem as
> expression indexes have. A change in heap value may not necessarily cause a
> change in the hash key. If we don't detect that, we will end up having two
> hash identical hash keys with the same TID pointer. This will cause the
> duplicate key scans problem since hashrecheck will return true for both the
> hash entries. That's a bummer as far as supporting WARM for hash indexes is
> concerned, unless we find a way to avoid duplicate index entries.
>
>
Revised patches are attached. I've added a few more regression tests which
demonstrates the problems with compressed and toasted attributes. I've now
implemented the idea of creating index tuple from heap values before doing
binary comparison using datumIsEqual. This seems to work ok and I see no
reason this should not be robust. But if there are things which could still
be problematic, please let me know.

Seeing the problem that hash indexes will have, I've removed support for
it. It's probably a good decision anyways since hash indexes are being
hacked around very actively and it might take it some time to settle down
fully. It'll be a good idea to keep WARM away from it to avoid more
complication. I've a few ideas about how to make it work, but we can
address those later.

Other than that, I've now converted the stress test framework used earlier
to test WARM into TAP tests and those tests are attached too.

Finally, I've implemented complete pg_stat support for tracking amount of
WARM chains in the table. AV can use that to trigger clean-up only when the
fraction of warm chains goes beyond configured scale. Similarly, the patch
also adds an index-level scale factor and cleanup is triggered on an index
only if the number of WARM pointers in the index are beyond the set
fraction. This should greatly help us to avoid second index scans on
indexes which are either not updated at all or updated rarely. The best
case scenario where out of N indexes only one index receives update, WARM
will avoid updates to N-1 indexes and these N-1 indexes need not be scanned
twice during WARM cleanup. OTOH if most indexes on a table receive updates,
then probably neither WARM nor cleanup will be efficient for such
workloads. I wonder if we should provide a table-level knob to turn WARM
completely off on such workloads, however rare they might be. I think this
patch requires some more work and documentation changes are completely
missing.

Thanks,
Pavan

--
Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
0005_warm_updates_v21.patch application/octet-stream 253.1 KB
0002_track_root_lp_v21.patch application/octet-stream 38.4 KB
0003_clear_ip_posid_blkid_refs_v21.patch application/octet-stream 11.0 KB
0004_freeup_3bits_ip_posid_v21.patch application/octet-stream 6.6 KB
0001_interesting_attrs_v21.patch application/octet-stream 12.5 KB
0007_vacuum_enhancements_v21.patch application/octet-stream 55.9 KB
0006_warm_taptests_v21.patch application/octet-stream 19.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Dilip Kumar 2017-03-27 09:02:03 Re: Parallel bitmap heap scan
Previous Message Pavel Stehule 2017-03-27 08:48:56 Re: New CORRESPONDING clause design