Re: pglz compression performance, take two

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>, Andrey Borodin <amborodin86(at)gmail(dot)com>
Cc: Ian Lawrence Barwick <barwick(at)gmail(dot)com>, Andrey Borodin <x4mmm(at)yandex-team(dot)ru>, Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: pglz compression performance, take two
Date: 2023-02-08 10:16:47
Message-ID: e04e4097-4cac-445b-f7bf-53edf305eccc@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2/7/23 21:18, Andres Freund wrote:
> Hi,
>
> On 2023-02-05 10:36:39 -0800, Andrey Borodin wrote:
>> On Fri, Jan 6, 2023 at 10:02 PM Andrey Borodin <amborodin86(at)gmail(dot)com> wrote:
>>>
>>> Hello! Please find attached v8.
>>
>> I got some interesting feedback from some patch users.
>> There was an oversight that frequently yielded results that are 1,2 or
>> 3 bytes longer than expected.
>> Looking closer I found that the correctness of the last 3-byte tail is
>> checked in two places. PFA fix for this. Previously compressed data
>> was correct, however in some cases few bytes longer than the result of
>> current pglz implementation.
>
> This version fails on cfbot, due to address sanitizer:
>
> https://cirrus-ci.com/task/4921632586727424
> https://api.cirrus-ci.com/v1/artifact/task/4921632586727424/log/src/test/regress/log/initdb.log
>
>
> performing post-bootstrap initialization ... =================================================================
> ==15991==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x61e000002ee0 at pc 0x558e1b847b16 bp 0x7ffd35782f70 sp 0x7ffd35782f68
> READ of size 1 at 0x61e000002ee0 thread T0
> #0 0x558e1b847b15 in pglz_hist_add /tmp/cirrus-ci-build/src/common/pg_lzcompress.c:310
> #1 0x558e1b847b15 in pglz_compress /tmp/cirrus-ci-build/src/common/pg_lzcompress.c:680
> #2 0x558e1aa86ef0 in pglz_compress_datum /tmp/cirrus-ci-build/src/backend/access/common/toast_compression.c:65
> #3 0x558e1aa87af2 in toast_compress_datum /tmp/cirrus-ci-build/src/backend/access/common/toast_internals.c:68
> #4 0x558e1ac22989 in toast_tuple_try_compression /tmp/cirrus-ci-build/src/backend/access/table/toast_helper.c:234
> #5 0x558e1ab6af24 in heap_toast_insert_or_update /tmp/cirrus-ci-build/src/backend/access/heap/heaptoast.c:197
> #6 0x558e1ab4a2a6 in heap_update /tmp/cirrus-ci-build/src/backend/access/heap/heapam.c:3533
> ...
>

Yeah, and valgrind seems to hit the same issue (it's not labeled as
buffer overflow, but it seems to be exactly the same place):

==380682== Invalid read of size 1
==380682== at 0xBCEAAB: pglz_hist_add (pg_lzcompress.c:310)
==380682== by 0xBCF130: pglz_compress (pg_lzcompress.c:670)
==380682== by 0x4A911F: pglz_compress_datum (toast_compression.c:65)
==380682== by 0x4A97E2: toast_compress_datum (toast_internals.c:68)
==380682== by 0x54CCA4: toast_tuple_try_compression (toast_helper.c:234)
==380682== by 0x4FFC33: heap_toast_insert_or_update (heaptoast.c:197)
==380682== by 0x4ED498: heap_update (heapam.c:3624)
==380682== by 0x4EE023: simple_heap_update (heapam.c:4060)
==380682== by 0x5B1B2B: CatalogTupleUpdateWithInfo (indexing.c:329)
==380682== by 0x65C3AB: update_attstats (analyze.c:1741)
==380682== by 0x65A054: do_analyze_rel (analyze.c:602)
==380682== by 0x659405: analyze_rel (analyze.c:261)
==380682== by 0x70A162: vacuum (vacuum.c:523)
==380682== by 0x8DF8F7: autovacuum_do_vac_analyze (autovacuum.c:3155)
==380682== by 0x8DE74A: do_autovacuum (autovacuum.c:2473)
==380682== by 0x8DD49E: AutoVacWorkerMain (autovacuum.c:1716)
==380682== by 0x8DD097: StartAutoVacWorker (autovacuum.c:1494)
==380682== by 0x8EA5B2: StartAutovacuumWorker (postmaster.c:5481)
==380682== by 0x8EA10A: process_pm_pmsignal (postmaster.c:5192)
==380682== by 0x8E6121: ServerLoop (postmaster.c:1770)
==380682== Address 0xe722c78 is 103,368 bytes inside a recently
re-allocated block of size 131,072 alloc'd
==380682== at 0x48457AB: malloc (vg_replace_malloc.c:393)
==380682== by 0xB95423: AllocSetAlloc (aset.c:929)
==380682== by 0xBA2B6C: palloc (mcxt.c:1224)
==380682== by 0x4A0962: heap_copytuple (heaptuple.c:687)
==380682== by 0x73A2BB: tts_buffer_heap_copy_heap_tuple
(execTuples.c:842)
==380682== by 0x658E42: ExecCopySlotHeapTuple (tuptable.h:464)
==380682== by 0x65B288: acquire_sample_rows (analyze.c:1261)
==380682== by 0x659E42: do_analyze_rel (analyze.c:536)
==380682== by 0x659405: analyze_rel (analyze.c:261)
==380682== by 0x70A162: vacuum (vacuum.c:523)
==380682== by 0x8DF8F7: autovacuum_do_vac_analyze (autovacuum.c:3155)
==380682== by 0x8DE74A: do_autovacuum (autovacuum.c:2473)
==380682== by 0x8DD49E: AutoVacWorkerMain (autovacuum.c:1716)
==380682== by 0x8DD097: StartAutoVacWorker (autovacuum.c:1494)
==380682== by 0x8EA5B2: StartAutovacuumWorker (postmaster.c:5481)
==380682== by 0x8EA10A: process_pm_pmsignal (postmaster.c:5192)
==380682== by 0x8E6121: ServerLoop (postmaster.c:1770)
==380682== by 0x8E5B54: PostmasterMain (postmaster.c:1463)
==380682== by 0x7A806C: main (main.c:200)
==380682==

The place allocating the buffer changes over time, but the first part
(invalid read) seems to be exactly the same.

FWIW I did run previous versions using valgrind, so this gotta be due
some recent change.

>
> Independent of this failure, I'm worried about the cost/benefit analysis of a
> pglz change that changes this much at once. It's quite hard to review.
>

I agree.

I think I managed to understand what the patch does during the review,
but it's so much harder - it'd definitely be better to have this split
into smaller parts, somehow. Interestingly enough the commit message
actually says this:

This patch accumulates several changes to pglz compression:
1. Convert macro-functions to regular functions for readability
2. Use more compact hash table with uint16 indexes instead of pointers
3. Avoid prev pointer in hash table
4. Use 4-byte comparisons during a search instead of 1-byte
comparisons

Which I think is a pretty good recipe how to split the patch. (And we
also need a better commit message, or at least a proposal.)

This'd probably also help when investigating the extra byte issue,
discussed yesterday. (Assuming it's not related to the invalid access
reported by valgrind / asan).

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Laurenz Albe 2023-02-08 10:24:33 Re: Why cann't simplify stable function in planning phase?
Previous Message Hayato Kuroda (Fujitsu) 2023-02-08 09:47:08 RE: Exit walsender before confirming remote flush in logical replication