Re: Re: A separate table level option to control compression

From: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Shaun Thomas <shaun(dot)thomas(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>, David Steele <david(at)pgmasters(dot)net>
Subject: Re: Re: A separate table level option to control compression
Date: 2019-04-03 06:10:57
Message-ID: CABOikdO5LpuHXPZjm4tHUHRrV0oa4VNvW8y23rxxiHyg47Jm+w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Apr 3, 2019 at 10:19 AM Michael Paquier <michael(at)paquier(dot)xyz> wrote:

>
>
> I have been looking at this patch more, and here are some notes:
> - The tests can be really simplified using directly reltoastrelid, so
> I changed the queries this way. I am aware that the surroundings
> hardcode directly the relation name, but that's not really elegant in
> my opinion. And I am really tempted to adjust these as well to
> directly use reltoastrelid.
>

I agree using reltoastrelid is a better way. I just followed the
surrounding tests, but +1 to change all of these use reltoastrelid.

> - The docs had a weird indentation.
>

Sorry about that. I was under the impression that it doesn't matter since
the doc builder ultimately chooses the correct indentation. I'd a patch
ready after Sawada's review, but since you already fixed that, I am not
sending it.

> - I think that we should be careful with the portability of
> pg_column_size(), so I have added comparisons instead of the direct
> values in a way which does not change the meaning of the tests nor
> their coverage.
>

Ok, understood.

> - Having RelationGetCompressTupleTarget use directly
> toast_tuple_threshold as default argument is I think kind of
> confusing, so let's use a different static value, named
> COMPRESS_TUPLE_TARGET in the attached. This is similar to
> TOAST_TUPLE_TARGET for the toast tuple threshold.
>

Sounds good. This also takes care of the other issue you brought about
"hoff" getting subtracted twice.

> - The comments in tuptoaster.h need to be updated to outline the
> difference between the compression invocation and the toast invocation
> thresholds. The wording could be better though.
>

I see that you've done this already. But please let me if more is needed.

> - Better to avoid comments in the middle of the else/if blocks in my
> opinion.
>

This is probably a personal preference and I've seen many other places
where we do that (see e.g. lazy_scan_heap()). But I don't have any strong
views on this. I see that you've updated comments in tuptoaster.h, so no
problem if you want to remove the comments from the code block.

>
> Also, the previous versions of the patch do that when doing a heap
> insertion (heapam.c and rewriteheap.c):
> + toast_tuple_threshold = RelationGetToastTupleTarget(relation,
> + TOAST_TUPLE_THRESHOLD);
> + compress_tuple_threshold =
> RelationGetCompressTupleTarget(relation,
> + toast_tuple_threshold);
> + compress_tuple_threshold = Min(compress_tuple_threshold,
> + toast_tuple_threshold);
> [...]
> need_toast = (HeapTupleHasExternal(&oldtup) ||
> HeapTupleHasExternal(newtup) ||
> - newtup->t_len > TOAST_TUPLE_THRESHOLD);
> + newtup->t_len > compress_tuple_threshold);
>
> This means that the original code always uses the compilation-time,
> default value of toast_tuple_target for all relations. But this gets
> changed so as we would use the value set at relation level for
> toast_tuple_target if the reloption is changed, without touching
> compress_tuple_threshold. This is out of the scope of this patch, but
> shouldn't we always use the relation-level value instead of the
> compiled one? Perhaps there is something I am missing?
>

Yeah, this is an issue with the existing code. Even though we allow setting
toast_tuple_target to a value less than compile-time TOAST_TUPLE_THRESHOLD,
the code doesn't really honour a value less than TOAST_TUPLE_THRESHOLD. In
other words, setting toast_tuple_target lesser than TOAST_TUPLE_THRESHOLD
doesn't have any effect. We don't even create a toast table if the
estimated length of tuple is not greater than TOAST_TUPLE_THRESHOLD.

The change introduced by this patch will now trigger the tuptoaster code
when the compression or toast threshold is set to a value lower than
TOAST_TUPLE_THRESHOLD. But as far as I can see, that doesn't have any bad
effect on the toasting since toast_insert_or_update() is capable of
handling the case when the toast table is missing. There will be a
behavioural change though. e.g.

CREATE TABLE testtab (a text) WITH (toast_tuple_target=256);
INSERT INTO testtab VALUES <some value larger than 256 bytes but less than
TOAST_TUPLE_THRESHOLD>;

Earlier this tuple would not have been toasted, even though
toast_tuple_target is set to 256. But with the new code, the tuple will get
toasted. So that's a change, but in the right direction as far as I can
see. Also, this is a bit unrelated to what this patch is trying to achieve.

Thanks,
Pavan

--
Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2019-04-03 06:23:33 Re: Re: A separate table level option to control compression
Previous Message Andres Freund 2019-04-03 05:56:46 Re: COPY FROM WHEN condition