Re: WARM and indirect indexes

From: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WARM and indirect indexes
Date: 2017-01-12 04:09:34
Message-ID: CABOikdOxe0yDiRk6GTY3ystZROyEpap7ZNL4qkXVEatynX3KPQ@mail.gmail.com
Lists: pgsql-hackers

On Thu, Jan 12, 2017 at 3:08 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> On Tue, Jan 10, 2017 at 2:24 PM, Alvaro Herrera
> <alvherre(at)2ndquadrant(dot)com> wrote:
> > The big advantage of WARM is that it works automatically, like HOT: the
> > user doesn't need to do anything different than today to get the
> > benefit. With indirect indexes, the user needs to create the index as
> > indirect explicitly.
>
> However, this cuts both ways. If the WARM implementation has bugs --
> either data-corrupting bugs or crash bugs or returns-wrong-answer bugs
> or performance-in-corner-cases bugs -- everyone will be exposed to
> them.

IMHO WARM is way less complicated and intrusive than HOT was. It doesn't
change any of the MVCC mechanics, nor does it change when and how tuples
are marked dead or when and how dead tuples are removed. What it changes is
how tuples are indexed and accessed via index methods. So I believe bugs in
this area could possibly corrupt indexes or return wrong results, which is
bad, but the same could have happened with many other patches we have done
in the recent past.
The other thing the patch changes is how the update chain is maintained. In
order to quickly find the root offset while updating a tuple, we now store
the root offset in the t_ctid field of the last tuple in the chain and use
a separate bit to mark end-of-chain (instead of relying on the t_ctid =
t_self check). That can lead to problems if chains are not maintained or
followed correctly. These changes are in the first patch of the patch
series, and if you have any suggestions on how to improve that or solidify
chain following, please let me know. I was looking for some way to hide
the t_ctid field to ensure that the links are only accessed via some
standard API.
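
To make the chain change easier to picture, here is a toy model of it. This
is only a sketch under assumptions: it models a single page as a dict of
offsets, and the names ToyTuple, is_last, follow_chain, root_offset and
append_version are hypothetical, not identifiers from the patch or from
PostgreSQL. It shows the last tuple carrying the root offset in t_ctid, a
separate flag marking end-of-chain, and a follower that fails loudly on a
broken chain.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ToyTuple:
    # Hypothetical stand-in for a heap tuple header, not the real layout.
    t_ctid: int            # next offset in the chain, or the root offset if is_last
    is_last: bool = False  # the "separate bit" that marks end-of-chain


def follow_chain(page: Dict[int, ToyTuple], start: int) -> List[int]:
    """Return the offsets visited from start up to the last tuple in the chain."""
    visited: List[int] = []
    offset = start
    while True:
        # A missing offset or a revisited one means the chain was not maintained.
        if offset not in page or offset in visited:
            raise ValueError(f"broken chain at offset {offset}")
        visited.append(offset)
        tup = page[offset]
        if tup.is_last:
            return visited
        offset = tup.t_ctid


def root_offset(page: Dict[int, ToyTuple], start: int) -> int:
    """Find the root by walking to the last tuple and reading its t_ctid."""
    last = follow_chain(page, start)[-1]
    return page[last].t_ctid


def append_version(page: Dict[int, ToyTuple], old_last: int, new_offset: int) -> None:
    """Add a new tuple version at new_offset after the current last tuple."""
    old = page[old_last]
    if not old.is_last:
        raise ValueError("can only append after the last tuple in the chain")
    root = old.t_ctid                  # the old last tuple was holding the root offset
    old.t_ctid = new_offset            # it now links forward like any other member
    old.is_last = False
    page[new_offset] = ToyTuple(t_ctid=root, is_last=True)


# A single-version chain: the tuple is both root and last, so it stores its own offset.
page = {1: ToyTuple(t_ctid=1, is_last=True)}
append_version(page, 1, 2)
append_version(page, 2, 3)
```

Funnelling every access through functions like these is the kind of
"standard API" hiding mentioned above: callers never interpret t_ctid
themselves, so they cannot mistake a root offset for a forward link.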

As the developer of the patch, what I would like to know is: what can we
do to address the concerns you raised? What kind of tests would you like
done to gain confidence in the patch? What I've done so far is to rely on
the existing tests such as regression, isolation and pgbench. After adding
support for system tables, the code gets exercised even more during
regression tests, which is good. I also ran a few tests where I turned
sequential scans off, ran "make installcheck", and compared the regression
diffs between master and the patched code. That helps because the index
access paths are used even more often. I did not find any bugs in those
tests.

My favourite test during HOT development was to run pgbench with a large
number of clients and periodically check for data consistency while the
tests are running, by comparing the sum(tbalance), sum(bbalance) and
sum(abalance) values. I have yet to do that kind of test with WARM because
it would require a slightly different test setup (more indexes and more
update statements), but I intend to do those tests too. I have also started
writing regression test cases that exercise some corner cases, and I will
share them for inclusion irrespective of WARM.

Please share your thoughts on what more can be and should be done.

Thanks,
Pavan
--
Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
