Re: Zedstore - compressed in-core columnar storage

From: Soumyadeep Chakraborty <soumyadeep2007(at)gmail(dot)com>
To: Alexandra Wang <lewang(at)pivotal(dot)io>
Cc: Taylor Vesely <tvesely(at)pivotal(dot)io>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Ashwin Agrawal <aagrawal(at)pivotal(dot)io>, DEV_OPS <devops(at)ww-it(dot)cn>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Zedstore - compressed in-core columnar storage
Date: 2020-11-11 00:13:17
Message-ID: CAE-ML+-HwY4X4uTzBesLhOotHF7rUvP2Ur-rvEpqz2PUgK4K3g@mail.gmail.com
Lists: pgsql-hackers

Hello,

We (Jacob and I) have an update for this thread.

1. We recently made some improvements to the table AM APIs for fetching
a single row (tuple_fetch_row_version()) and locking (and fetching) a
tuple (tuple_lock()), so that they can take a set of columns. We
extract these columns at plan time and, in some cases, at executor time.
The changes are in the same spirit as some column-oriented changes that
are already a part of Zedstore, namely the ability to pass a set of
columns to sequential and index scans, among other operations.

We observed that the two table AM functions are called in contexts
which don't need the entire set of columns to be populated in the
output TupleTableSlots associated with these APIs. For instance, in
DELETE RETURNING, we don't need to fetch all of the columns, just the
ones in the RETURNING clause.
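
To illustrate the idea, here is a minimal standalone sketch. It is not
code from the patch: ColumnarTable, Slot and fetch_row_version_cols()
are hypothetical names, and a plain bitmask stands in for the column
set. It only shows how a fetch that is given a projection can skip the
columns the caller does not need.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NCOLS 4

/* Hypothetical column-oriented table: one value array per column. */
typedef struct ColumnarTable
{
    const int64_t *cols[NCOLS];
    size_t         nrows;
} ColumnarTable;

/* Hypothetical output slot; valid[c] says whether values[c] was fetched. */
typedef struct Slot
{
    int64_t values[NCOLS];
    bool    valid[NCOLS];
} Slot;

/*
 * Fetch one row, reading only the columns whose bit is set in 'proj'
 * (bit c corresponds to column c). Returns the number of columns
 * actually read, or -1 if 'row' is out of range.
 */
int
fetch_row_version_cols(const ColumnarTable *table, size_t row,
                       uint32_t proj, Slot *slot)
{
    int fetched = 0;

    if (row >= table->nrows)
        return -1;

    for (int c = 0; c < NCOLS; c++)
    {
        slot->valid[c] = ((proj >> c) & 1u) != 0;
        if (slot->valid[c])
        {
            /* Only projected columns touch the column's storage. */
            slot->values[c] = table->cols[c][row];
            fetched++;
        }
        else
            slot->values[c] = 0;
    }
    return fetched;
}
```

In this sketch, a DELETE ... RETURNING that returns only columns 1 and 3
would pass proj = (1u << 1) | (1u << 3), so just two of the four column
arrays are read for the row.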

We saw improvements (see the attached results) across a variety of
tests; we added a number of tests to our storageperf test suite to
cover these cases. We don't see a performance improvement for UPSERT
and ON CONFLICT DO NOTHING, because in both cases an index lookup pulls
in the entire row before the call to table_tuple_lock(). We do see
significant improvements (~3x) for DELETE RETURNING and row-level
locking, and a roughly 25x improvement in TidScan runtime.
Please refer to src/test/storageperf for the storageperf test suite.

2. We absorbed the scanCols patch [1], replacing some of the existing
executor-level column extraction for scans with the scanCols populated
during planning as in [1].

3. We also merged Zedstore up to PG 14 commit efc5dcfd8a.

Please find attached the latest version of the Zedstore patch.

Regards,

Jacob and Soumyadeep

[1] https://www.postgresql.org/message-id/flat/CAAKRu_YxyYOCCO2e83UmHb51sky1hXgeRzQw-PoqT1iHj2ZKVg%40mail.gmail.com#681a254981e915805aec2aea9ea9caf4

Attachment            Content-Type              Size
storagerperf_results  application/octet-stream  9.9 KB
v5-zedstore.patch     text/x-patch              2.0 MB
