From: | Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | James Finnerty <jfinnert(at)amazon(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: postgres_fdw: using TABLESAMPLE to collect remote sample |
Date: | 2023-01-05 21:47:23 |
Message-ID: | 3e8cca01-f102-d4b9-abe9-ed7b5e938323@enterprisedb.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 1/5/23 22:05, Tom Lane wrote:
> Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com> writes:
>> The second patch adds the relkind check, so that we only issue
>> TABLESAMPLE on valid relation types (tables, matviews). But I'm not sure
>> we actually need it - the example presented earlier was foreign table
>> pointing to a sequence. But that actually works fine even without this
>> patch, because fore sequences we have reltuples=1, which disables
>> sampling due to the check mentioned above (because targrows >= 1).
>
> Seems pretty accidental still.
>
>> The other issue with this patch is that it seems wrong to check the
>> relkind value locally - instead we should check this on the remote
>> server, because that's where we'll run TABLESAMPLE. Currently there are
>> no differences between versions, but what if we implement TABLESAMPLE
>> for another relkind in the future? Then we'll either fail to use
>> sampling or (worse) we'll try using TABLESAMPLE for unsupported relkind,
>> depending on which of the servers has newer version.
>
> We don't have a way to do that, and I don't think we need one. The
> worst case is that we don't try to use TABLESAMPLE when the remote
> server is a newer version that would allow it. If we do extend the
> set of supported relkinds, we'd need a version-specific test in
> postgres_fdw so that we don't wrongly use TABLESAMPLE with an older
> remote server ... but that's hardly complicated, or a new concept.
>
> On the other hand, if we let the code stand, the worst case is that
> ANALYZE fails because we try to apply TABLESAMPLE in an unsupported
> case, presumably with some remote relkind that doesn't exist today.
> I think that's clearly worse than not exploiting TABLESAMPLE
> although we could (with some other new remote relkind).
>
OK
> As far as the patch details go: I'd write 0001 as
>
> + Assert(sample_frac >= 0.0 && sample_frac <= 1.0);
>
> The way you have it seems a bit contorted.
Yeah, that's certainly cleaner.
> As for 0002, personally
> I'd rename the affected functions to reflect their expanded duties,
> and they *definitely* require adjustments to their header comments.
> Functionally it looks fine though.
>
No argument about the header comments, ofc - I just haven't done that in
the WIP patch. As for the renaming, any suggestions for the new names?
The patch tweaks two functions in a way that affects what they return:
1) deparseAnalyzeTuplesSql - adds relkind to the "reltuples" SQL
2) postgresCountTuplesForForeignTable - adds can_sample flag
There are no callers outside postgresAcquireSampleRowsFunc, so what
about renaming them like this?
deparseAnalyzeTuplesSql
-> deparseAnalyzeInfoSql
postgresCountTuplesForForeignTable
-> postgresGetAnalyzeInfoForForeignTable
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
From | Date | Subject | |
---|---|---|---|
Next Message | Matthias van de Meent | 2023-01-05 21:49:23 | Re: BUG: Postgres 14 + vacuum_defer_cleanup_age + FOR UPDATE + UPDATE |
Previous Message | Thomas Munro | 2023-01-05 21:39:59 | Re: Cygwin cleanup |