| From: | Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> | 
|---|---|
| To: | Binguo Bao <djydewang(at)gmail(dot)com> | 
| Cc: | Andrey Borodin <x4mmm(at)yandex-team(dot)ru>, Paul Ramsey <pramsey(at)cleverelephant(dot)ca>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org> | 
| Subject: | Re: Optimize partial TOAST decompression | 
| Date: | 2019-07-06 15:23:37 | 
| Message-ID: | 20190706152337.l64zhoj3xs3qkfc4@development | 
| Views: | Whole Thread | Raw Message | Download mbox | Resend email | 
| Thread: | |
| Lists: | pgsql-hackers | 
On Sat, Jul 06, 2019 at 02:27:56AM +0800, Binguo Bao wrote:
>Hi, Tomas!
>Thanks for your testing and the suggestion.
>
>That's quite bizarre behavior - it does work with a prefix, but not with
>> suffix. And the exact ERROR changes after the prefix query.
>
>
>I think bug is caused by "#2  0x00000000004c3b08 in
>heap_tuple_untoast_attr_slice (attr=<optimized out>, sliceoffset=0,
>slicelength=-1) at tuptoaster.c:315",
>since I ignore the case where slicelength is negative, and I've appended
>some comments for heap_tuple_untoast_attr_slice for the case.
>
>FWIW the new code added to this
>> function does not adhere to our code style, and would deserve some
>> additional explanation of what it's doing/why. Same for the
>> heap_tuple_untoast_attr_slice, BTW.
>
>
>I've added more comments to explain the code's behavior.
>Besides, I also modified the macro "TOAST_COMPRESS_RAWDATA" to
>"TOAST_COMPRESS_DATA" since
>it is used to get toast compressed data rather than raw data.
>
Thanks, this seems to address the issue - I can no longer reproduce the
crashes, allowing the benchmark to complete. I'm attaching the script I
used and spreadsheet with a summary of results.
For the cases I've tested (100k - 10M values, different compressibility,
different prefix/length values), the results are kinda mixed - many
cases got much faster (~2x), but other cases got slower too. We're
however talking about queries taking a couple of miliseconds, so in
absolute numbers the differences are small.
That does not mean the optimization is useless - but the example shared
at the beginning of this thread is quite extreme, as the values are
extremely compressible. Each value is ~38MB (in plaintext), but a table
with 100 such values has only ~40MB. Tha's 100:1 compression ratio,
which I think is not typical for real-world data sets.
The data I've used are less extreme, depending on the fraction of random
data in values.
I went through the code too. I've reworder a couple of comments and code
style issues, but there are a couple of more serious issues.
1) Why rename TOAST_COMPRESS_RAWDATA to TOAST_COMPRESS_DATA?
This seems unnecessary, and it discards the clear hint that it's about
accessing the *raw* data, and the relation to TOAST_COMPRESS_RAWSIZE.
IMHO we should keep the original naming.
2) pglz_maximum_compressed_size signatures are confusing
There are two places with pglz_maximum_compressed_size signature, and
those places are kinda out of sync when it comes to parameter names:
  int32
  pglz_maximum_compressed_size(int32 raw_slice_size,
                               int32 total_compressed_size)
  extern
  int32 pglz_maximum_compressed_size(int32 raw_slice_size,
                                     int32 raw_size);
Also, pg_lzcompress.c has no concept of a "slice" because it only deals
with simple compression, slicing is responsibility of the tuptoaster. So
we should not mix those two, not even in comments.
I propose tweaks per the attached patch - I think it makes the code
clearer, and it's mostly cosmetic stuff. But I haven't tested the
changes beyond "it compiles".
regards
-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 
| Attachment | Content-Type | Size | 
|---|---|---|
| 0001-Optimize-partial-TOAST-decompression.patch | text/plain | 6.9 KB | 
| 0002-review-reworks.patch | text/plain | 7.4 KB | 
| random-bench2.sh | application/x-sh | 1.6 KB | 
| comparison.ods | application/vnd.oasis.opendocument.spreadsheet | 48.6 KB | 
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Bruce Momjian | 2019-07-06 16:05:14 | Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS) | 
| Previous Message | Dilip Kumar | 2019-07-06 14:56:29 | Re: POC: Cleaning up orphaned files using undo logs |