Re: Patch: Write Amplification Reduction Method (WARM)

From:	Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
To:	Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc:	Jaime Casanova <jaime(dot)casanova(at)2ndquadrant(dot)com>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Patch: Write Amplification Reduction Method (WARM)
Date:	2017-01-24 17:42:44
Message-ID:	CABOikdNigDQ59DyAk1hQh6PpDJwaVqs3VV4ZqqFtDHZiz9-2-Q@mail.gmail.com
Lists:	pgsql-hackers

On Thu, Jan 19, 2017 at 6:35 PM, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
wrote:

>
>
> Revised patch is attached.
>

I've now also rebased the main WARM patch against the current master
(3eaf03b5d331b7a06d79 to be precise). I'm attaching Alvaro's patch to get
interesting attributes (prefixed with 0000 since the other two patches are
based on that). The changes to support system tables are now merged with
the main patch. I could separate them if it helps in review.

I am also including a stress test workload that I am currently running to
test WARM's correctness, since Robert raised a valid concern about that. The
idea is to add a few more columns to the pgbench_accounts table, each with
its own index. The additional indexed columns are tied to the "aid" column,
but instead of holding a fixed value, each one can vary within a fixed,
non-overlapping range. For example, for aid = 1, aid1's original value will
be 10 and it can vary between 8 and 12. Similarly, aid2's original value
will be 20 and it can vary between 16 and 24. This setup allows us to update
these additional columns (thus forcing WARM updates) while still being able
to run sanity checks on the results.
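
To make the range idea concrete, here is a minimal Python sketch. It is not
taken from the attached test scripts. The generalization beyond aid = 1 (a
base value of 10 * aid or 20 * aid with a fixed drift of 2 or 4) is an
assumption chosen only so that it reproduces the two examples above and keeps
the ranges non-overlapping; the helper names are hypothetical.

```python
# Hypothetical generalization of the aid = 1 example; the real test may differ.
BASE_STEP = {"aid1": 10, "aid2": 20}   # assumed: original value = step * aid
HALF_WIDTH = {"aid1": 2, "aid2": 4}    # assumed: allowed drift around it


def allowed_range(column, aid):
    """Return the inclusive (low, high) range a column may take for this aid."""
    base = BASE_STEP[column] * aid
    return base - HALF_WIDTH[column], base + HALF_WIDTH[column]


def owner_aid(column, value):
    """Map a drifted column value back to the only aid it can belong to."""
    step = BASE_STEP[column]
    aid = (value + step // 2) // step  # nearest multiple of step
    low, high = allowed_range(column, aid)
    if not low <= value <= high:
        raise ValueError(f"{column} = {value} is outside every allowed range")
    return aid


if __name__ == "__main__":
    print(allowed_range("aid1", 1))
    print(allowed_range("aid2", 1))
    print(owner_aid("aid1", 12))
```

Because the ranges never overlap, any value seen in aid1 or aid2 identifies
exactly one aid, which is what makes the later sanity checks possible.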

The test contains a mix of UPDATE, FOR UPDATE and FOR SHARE transactions.
Some of these transactions commit and some roll back. Checks are in place
to ensure that we always find exactly one tuple irrespective of which column
we use to fetch the row. Of course, when the aid[1-4] columns are used to
fetch tuples, we need to scan with a range instead of an equality. While the
tests are running, we also do a bunch of operations such as CREATE INDEX,
DROP INDEX, CREATE INDEX CONCURRENTLY (CIC), long-running transactions,
VACUUM FULL etc., and ensure that the sanity checks always pass. We could do
a few other things, such as marking these indexes as UNIQUE or keeping a
long transaction open while doing updates and other operations. I'll add
some of those to the test, but suggestions are welcome.
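
The "exactly one tuple per access path" check can be sketched as follows.
This is an in-memory Python model, not the SQL used by the attached workload:
rows are plain dictionaries standing in for what a query would return, and
the function names and the sample ranges are assumptions made for
illustration.

```python
def fetch_by_aid(rows, aid):
    """Equality lookup, standing in for a scan on the aid index."""
    return [r for r in rows if r["aid"] == aid]


def fetch_by_range(rows, column, low, high):
    """Range lookup, standing in for a scan on one of the extra indexes."""
    return [r for r in rows if low <= r[column] <= high]


def check_row(rows, aid, ranges):
    """Return the access paths that did not find exactly one tuple.

    `ranges` maps a column name to the inclusive (low, high) range that
    column is allowed to take for this aid.
    """
    problems = []
    if len(fetch_by_aid(rows, aid)) != 1:
        problems.append("aid")
    for column, (low, high) in ranges.items():
        if len(fetch_by_range(rows, column, low, high)) != 1:
            problems.append(column)
    return problems


if __name__ == "__main__":
    visible = [
        {"aid": 1, "aid1": 11, "aid2": 17},
        {"aid": 2, "aid1": 20, "aid2": 40},
    ]
    print(check_row(visible, 1, {"aid1": (8, 12), "aid2": (16, 24)}))
```

An empty result means every access path agreed. A duplicate or missing index
entry would show up as the name of the column whose lookup returned zero or
more than one row.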

I do see a problem with CREATE INDEX CONCURRENTLY in these tests, though
everything else has run OK so far. (I have yet to do very long-running
tests; I'll probably just run tests for a few hours today.)

I'm trying to understand why CIC fails to build a consistent index, and I
think I now have some clue as to why it could be happening. With HOT, we
don't need to worry about broken chains: we add the index tuple at the very
beginning, and all subsequent updates honour the new index while deciding on
HOT updates, i.e. we won't create any new broken HOT chains once we start
building the index. Later, during the validation phase, we only need to
insert tuples that are not already in the index. But with WARM, I think the
check needs to be more elaborate. Even if the TID (we always look at its
root line pointer etc.) exists in the index, we will need to ensure that the
index key matches the heap tuple we are dealing with. That looks a bit
tricky. Maybe we can look up the index using the key from the current heap
tuple and then see if we get a tuple with the same TID back. Of course, we
need to do this only if the tuple is a WARM tuple. The other option is to
collect not only TIDs but also keys while scanning the index. That might
increase the size of the state information for very wide indexes. Or maybe
we just turn WARM off while an index build is in progress.
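
The difference between the two validation checks can be illustrated with a
toy Python model. It is only a sketch of the hypothesis above, not PostgreSQL
code: the index is modelled as a list of (key, root TID) pairs, heap tuples
as the (key, root TID) pairs visible at validation time, and both function
names are hypothetical.

```python
def missing_by_tid(index_entries, heap_tuples):
    """TID-only check: a tuple counts as indexed if its root TID appears
    anywhere in the index, whatever key that entry carries."""
    indexed_tids = {tid for _key, tid in index_entries}
    return [(key, tid) for key, tid in heap_tuples if tid not in indexed_tids]


def missing_by_key_and_tid(index_entries, heap_tuples):
    """Stricter check: the index must hold this exact (key, root TID) pair.
    This corresponds to collecting keys as well as TIDs from the index."""
    entries = set(index_entries)
    return [(key, tid) for key, tid in heap_tuples if (key, tid) not in entries]


if __name__ == "__main__":
    # The build phase indexed key 10 for the chain rooted at TID (0, 1).
    index_entries = [(10, (0, 1))]
    # Assumed scenario: an update then changed the key to 11 while keeping
    # the same root line pointer.
    heap_tuples = [(11, (0, 1))]
    print(missing_by_tid(index_entries, heap_tuples))
    print(missing_by_key_and_tid(index_entries, heap_tuples))
```

In this model the TID-only check reports nothing to insert, while the
key-aware check reports that the pair (11, (0, 1)) is absent. The cost of the
second approach is the extra state noted above: the set holds keys as well as
TIDs.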

Suggestions/reviews/tests welcome.

Thanks,
Pavan

--
Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment	Content-Type	Size
warm_stress_test.tar.gz	application/x-gzip	3.0 KB
0000_interesting_attrs.patch	application/octet-stream	11.6 KB
0001_track_root_lp_v9.patch	application/octet-stream	37.3 KB
0002_warm_updates_v9.patch	application/octet-stream	253.2 KB

In response to

Re: Patch: Write Amplification Reduction Method (WARM) at 2017-01-19 13:05:21 from Pavan Deolasee

Responses

Re: Patch: Write Amplification Reduction Method (WARM) at 2017-01-25 16:36:48 from Alvaro Herrera
Re: Patch: Write Amplification Reduction Method (WARM) at 2017-01-25 21:08:32 from Alvaro Herrera
