Re: Optimize partial TOAST decompression

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Binguo Bao <djydewang(at)gmail(dot)com>
Cc: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>, Paul Ramsey <pramsey(at)cleverelephant(dot)ca>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Optimize partial TOAST decompression
Date: 2019-07-10 14:47:25
Message-ID: 20190710144725.kaswinyw2bdgb6tz@development
Lists: pgsql-hackers

On Wed, Jul 10, 2019 at 01:35:25PM +0800, Binguo Bao wrote:
>Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> wrote on Wed, Jul 10, 2019 at 5:12 AM:
>
>> On Sat, Jul 06, 2019 at 05:23:37PM +0200, Tomas Vondra wrote:
>> >On Sat, Jul 06, 2019 at 02:27:56AM +0800, Binguo Bao wrote:
>> >>Hi, Tomas!
>> >>Thanks for your testing and the suggestion.
>> >>
>> >>>That's quite bizarre behavior - it does work with a prefix, but not with
>> >>>suffix. And the exact ERROR changes after the prefix query.
>> >>
>> >>
>> >>I think the bug is caused by "#2 0x00000000004c3b08 in
>> >>heap_tuple_untoast_attr_slice (attr=<optimized out>, sliceoffset=0,
>> >>slicelength=-1) at tuptoaster.c:315",
>> >>since I had ignored the case where slicelength is negative, and I've added
>> >>some comments to heap_tuple_untoast_attr_slice covering that case.
>> >>
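FWIW, for anyone following along: a negative slicelength here conventionally
means "everything from sliceoffset to the end of the value". A minimal sketch
of the kind of guard involved (simplified variable names - rawsize and
total_compressed_size are placeholders here - not the exact patch code) could
look like this:

    /*
     * Simplified sketch: only limit the amount of compressed data fetched
     * when the caller asked for a bounded prefix.  A negative slicelength
     * means "to the end of the value", so in that case we have to fetch
     * and decompress the whole thing.
     */
    int32       max_compressed_size;

    if (slicelength < 0 || sliceoffset + slicelength > rawsize)
        max_compressed_size = total_compressed_size;    /* need it all */
    else
        max_compressed_size =
            pglz_maximum_compressed_size(sliceoffset + slicelength,
                                         total_compressed_size);
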
>> >>>FWIW the new code added to this
>> >>>function does not adhere to our code style, and would deserve some
>> >>>additional explanation of what it's doing/why. Same for the
>> >>>heap_tuple_untoast_attr_slice, BTW.
>> >>
>> >>
>> >>I've added more comments to explain the code's behavior.
>> >>Besides, I also renamed the macro "TOAST_COMPRESS_RAWDATA" to
>> >>"TOAST_COMPRESS_DATA", since it is used to get the TOAST compressed data
>> >>rather than the raw data.
>> >>
>> >
>> >Thanks, this seems to address the issue - I can no longer reproduce the
>> >crashes, allowing the benchmark to complete. I'm attaching the script I
>> >used and a spreadsheet with a summary of the results.
>> >
>> >For the cases I've tested (100k - 10M values, different compressibility,
>> >different prefix/length values), the results are kinda mixed - many
>> >cases got much faster (~2x), but other cases got slower too. We're
>> >however talking about queries taking a couple of milliseconds, so in
>> >absolute numbers the differences are small.
>> >
>> >That does not mean the optimization is useless - but the example shared
>> >at the beginning of this thread is quite extreme, as the values are
>> >extremely compressible. Each value is ~38MB (in plaintext), but a table
>> >with 100 such values has only ~40MB. That's a 100:1 compression ratio,
>> >which I think is not typical for real-world data sets.
>> >
>> >The data I've used are less extreme, with compressibility depending on
>> >the fraction of random data in the values.
>> >
>> >I went through the code too. I've reworded a couple of comments and fixed
>> >some code style issues, but there are a couple of more serious issues.
>> >
>> >
>> >1) Why rename TOAST_COMPRESS_RAWDATA to TOAST_COMPRESS_DATA?
>> >
>> >This seems unnecessary, and it discards the clear hint that it's about
>> >accessing the *raw* data, and the relation to TOAST_COMPRESS_RAWSIZE.
>> >IMHO we should keep the original naming.
>> >
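IIRC the existing definitions in tuptoaster.c look roughly like this (quoting
from memory, so double-check the source), which is why the RAWSIZE / RAWDATA
pairing matters:

    /* header of a compressed-in-line TOAST value, followed by the data */
    typedef struct toast_compress_header
    {
        int32       vl_len_;    /* varlena header (do not touch directly!) */
        int32       rawsize;    /* size of the uncompressed data */
    } toast_compress_header;

    #define TOAST_COMPRESS_HDRSZ        ((int32) sizeof(toast_compress_header))
    #define TOAST_COMPRESS_RAWSIZE(ptr) (((toast_compress_header *) (ptr))->rawsize)
    #define TOAST_COMPRESS_RAWDATA(ptr) \
        (((char *) (ptr)) + TOAST_COMPRESS_HDRSZ)
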
>> >
>> >2) pglz_maximum_compressed_size signatures are confusing
>> >
>> >There are two places with the pglz_maximum_compressed_size signature, and
>> >those places are kinda out of sync when it comes to parameter names:
>> >
>> >  int32
>> >  pglz_maximum_compressed_size(int32 raw_slice_size,
>> >                               int32 total_compressed_size)
>> >
>> >  extern
>> >  int32 pglz_maximum_compressed_size(int32 raw_slice_size,
>> >                                     int32 raw_size);
>> >
>> >Also, pg_lzcompress.c has no concept of a "slice", because it only deals
>> >with simple compression; slicing is the responsibility of the tuptoaster.
>> >So we should not mix the two, not even in comments.
>> >
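To make that concrete: the function is meant to return an upper bound on how
many compressed bytes can possibly be needed to reconstruct the first
raw_slice_size bytes of the raw data. A rough sketch of such a computation,
assuming pglz's worst case of one control bit per literal byte (an
illustration only, not necessarily the exact code in the patch):

    /*
     * Rough sketch: upper bound on the number of compressed bytes needed
     * to reconstruct the first raw_slice_size bytes of the original data.
     * In the worst case pglz emits one literal byte per input byte plus
     * one control bit per byte, i.e. 9 bits of output per raw byte.  The
     * bound can never exceed the size of the whole compressed value.
     */
    int32
    pglz_maximum_compressed_size(int32 raw_slice_size,
                                 int32 total_compressed_size)
    {
        int32       compressed_size;

        compressed_size = (int32) (((int64) raw_slice_size * 9 + 7) / 8);

        if (compressed_size > total_compressed_size)
            compressed_size = total_compressed_size;

        return compressed_size;
    }
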
>> >
>> >I propose tweaks per the attached patch - I think it makes the code
>> >clearer, and it's mostly cosmetic stuff. But I haven't tested the
>> >changes beyond "it compiles".
>> >
>> >
>> >regards
>> >
>>
>> FWIW I've done another round of tests with larger values (up to ~10MB), and
>> the larger the values, the better the speedup (a bit as expected).
>> Attached is the script I used and a spreadsheet with a summary.
>>
>> We still need a patch addressing the review comments, though.
>>
>>
>> regards
>>
>> --
>> Tomas Vondra http://www.2ndQuadrant.com
>> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>>
>

>Hi, Tomas! Sorry for the late reply. Thank you for the further testing. I
>am trying to reproduce your first test summary, since I think the patch
>should not cause a performance regression in almost any case.
>

I've made some changes to the test script since the first benchmark, aiming
to make the results more stable:

1) it uses a larger amount of data (10x more)

2) the data set is recreated for each run (to rule out random differences in
the generated data affecting the runs differently)

3) minor configuration changes (larger shared_buffers etc.)

I don't think we need to worry about small differences (within ~5%), which
can easily be due to changes in binary layout. Otherwise, the results from
the second benchmark round seem much more stable.

>Besides, if a value is composed of random characters, it will be hard to
>compress, because pglz by default requires at least a 25% compression ratio,
>otherwise compression is considered not worth it.
>This means that querying such a value will not trigger the patch. But the
>first test results show that the patch is slower than master when the value
>is composed of random characters, which is confusing.
>

Yes, I know. But the values have a compressible and an incompressible
(random) part, so in most cases the value should be compressible, although
with varying compression ratios. I have not tracked the size of the loaded
data, so I don't know which cases happened to be compressed or not. I'll
rerun the test and include this information.
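
As a side note on why purely random values never hit the new path: pglz's
default strategy only keeps a compressed result if it saves at least 25%
(min_comp_rate), otherwise the value is stored uncompressed and slicing works
on plain data. A simplified illustration of that acceptance test (the helper
name is made up here; this is not the actual pg_lzcompress.c code):

    /*
     * Simplified illustration of pglz's default acceptance test: with
     * min_comp_rate = 25, compression is kept only if the output is no
     * larger than ~75% of the input.  Values that fail this test are
     * stored uncompressed, so the partial-decompression optimization
     * never applies to them.
     */
    static bool
    compression_is_worthwhile(int32 source_len, int32 compressed_len)
    {
        int32       need_rate = 25;     /* default min_comp_rate */
        int32       result_max;

        result_max = (int32) (((int64) source_len * (100 - need_rate)) / 100);

        return compressed_len <= result_max;
    }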

>From the second test results, we can infer that the first test results were
>indeed affected by random disturbances, since the query times were so small.
>

Yes, I agree.

>> We still need a patch addressing the review comments, though.
>done:)

Thanks.

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
