Re: [HACKERS] Proposal: generic WAL compression

From: Arthur Silva <arthurprs(at)gmail(dot)com>
To: Oleg Ivanov <o(dot)ivanov(at)postgrespro(dot)ru>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Proposal: generic WAL compression
Date: 2017-11-17 12:02:43
Message-ID: CAO_YK0VEAMm1ChYidO+XkbEPWDdOkDdnAJR1j3QVgJdtkrP_yg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I wonder if the algorithm could be changed to operate with units of 2 or 4
bytes (instead of 1).
Since the algorithm speed is essentially doubled/quadrupled it could yield
a better runtime/compression trade-off.

Does that make sense?

--
Arthur Silva

On Wed, Nov 1, 2017 at 12:43 AM, Oleg Ivanov <o(dot)ivanov(at)postgrespro(dot)ru>
wrote:

> Hackers,
>
> a few years ago generic WAL was proposed by Alexander Korotkov (
> https://www.postgresql.org/message-id/flat/CAPpHfdsXwZmojm6
> Dx%2BTJnpYk27kT4o7Ri6X_4OSWcByu1Rm%2BVA%40mail.gmail.com#CAP
> pHfdsXwZmojm6Dx+TJnpYk27kT4o7Ri6X_4OSWcByu1Rm+VA(at)mail(dot)gmail(dot)com). and was
> committed into PostgreSQL 9.6.
> One of the generic WAL advantages is the common interface for safe
> interaction with WAL for modules and extensions. The interface allows
> module to register the page, then change it, and then generic WAL generates
> and writes into xlog the algorithm of changing the old version of page into
> the new one. In the case of crash and recovery this algorithm may be
> applied.
>
> Bloom and RUM indexes use generic WAL. The issue is that the generic
> algorithm of turning the old page into the new one is not optimal in the
> sense of record size in xlog. Now the main idea of the approach is to find
> all positions in the page where new version is different from the original
> one and write these changes into xlog. It works well if the page was
> rewritten completely or if only a few bytes have been changed.
> Nevertheless, this approach produces too large WAL record in the case of
> inserting or deleting a few bytes into the page. In this case there are
> almost no position where the old version and the new one are equal, so the
> record size is near the page size, but actual amount of changes in the page
> is small. This is exactly what happens often in RUM indexes.
>
> In order to overcome that issue, I would like to propose the patch, which
> provides possibility to use another approach of the WAL record
> construction. If another approach fails to make a short enough record, it
> rolls back to the original approach. The main idea of another approach is
> not to perform bytewise comparison of pages, but finding the minimal
> editing distance between two pages and the corresponding editing algorithm.
> In the general case, finding editing distance takes O(N^2) time and memory.
> But there is also an algorithm which requires only O(ND) time and O(D^2)
> memory, where D is the editing distance between two sequences. Also for
> given D algorithm may show that the minimal editing distance between two
> sequences is more than D in the same amount of time and memory.
>
> The special case of this algorithm which does not consider replace
> operations is described in the paper (http://www.xmailserver.org/diff2.pdf).
> The version of this algorithm which consumes O(ND) time and O(N) memory is
> used in diff console command, but for our purposes we don't need to
> increase the constant for the time in order to decrease memory complexity.
> For RUM indexes we usually have small enough editing distance (less than
> 64), so D^2 is not too much to store.
>
> The results of experiments:
>
> +------------------------------+----------------+-----------
> -----+----------------+----------------+
> | Name | WAL_diff(MB) | WAL_orig(MB) |
> Time_diff(s) | Time_orig(s) |
> +------------------------------+----------------+-----------
> -----+----------------+----------------+
> | rum: make installcheck | 38.9 | 82.5 | 4.37 |
> 4.16 |
> +------------------------------+----------------+-----------
> -----+----------------+----------------+
> | bloom: make installcheck | 1.0 | 1.0 | 0.66 |
> 0.53 |
> +------------------------------+----------------+-----------
> -----+----------------+----------------+
> | test00.sql | 20.5 | 51.0 | 1.86 |
> 1.41 |
> +------------------------------+----------------+-----------
> -----+----------------+----------------+
> | test01.sql | 207.1 | 219.7 | 8.06 | 6.89
> |
> +------------------------------+----------------+-----------
> -----+----------------+----------------+
>
> We can see that the patch provides a little slowdown, but compresses
> generic WAL efficiently for RUM index. Also I'm going to do a few more
> experiments on this patch with another data.
>
> The patch was tested on Lubuntu 14.04, but should not contain any
> platform-specific items. The patch, the files and scripts for doing the
> experiments and performance tests are attached.
>
> Oleg Ivanov
> Postgres Professional
> The Russian PostgreSQL Company
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>
>

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Ashutosh Bapat 2017-11-17 12:24:59 Re: [HACKERS] Partition-wise aggregation/grouping
Previous Message David Rowley 2017-11-17 11:58:04 Re: [HACKERS] Proposal: Local indexes for partitioned table