From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de> |
Cc: | pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: Experimenting with hash tables inside pg_dump |
Date: | 2021-10-22 14:53:31 |
Message-ID: | 2709766.1634914411@sss.pgh.pa.us |
Lists: | pgsql-hackers |
Andres Freund <andres(at)anarazel(dot)de> writes:
> On 2021-10-21 22:13:22 -0400, Tom Lane wrote:
>> I've thought about doing something like
>> SELECT unsafe-functions FROM pg_class WHERE oid IN (someoid, someoid, ...)
>> but in cases with tens of thousands of tables, it seems unlikely that
>> that's going to behave all that nicely.
> That's kinda what I'm doing in the quick hack. But instead of using IN(...) I
> made it unnest('{oid, oid, ...}'), that scales much better.
I'm skeptical of that, mainly because it doesn't work in old servers,
and I really don't want to maintain two fundamentally different
versions of getTableAttrs(). I don't think you actually need the
multi-array form of unnest() here --- we know the TableInfo array
is in OID order --- but even the single-array form only works
back to 8.4.
However ... looking through getTableAttrs' main query, it seems
like the only thing there that's potentially unsafe is the
"format_type(t.oid, a.atttypmod)" call. I wonder if it could be
sane to convert it into a single query that just scans all of
pg_attribute, and then deal with creating the formatted type names
separately, perhaps with an improved version of getFormattedTypeName
that could cache the results for non-default typmods. The main
knock on this approach is the temptation for somebody to stick some
unsafe function into the query in future. We could stick a big fat
warning comment into the code, but lately I despair of people reading
comments.
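
A minimal sketch of the caching idea above, not pg_dump's actual code: formatted type names are remembered per (type OID, typmod) pair, so each distinct pair costs at most one lookup. Everything here is hypothetical, including `get_formatted_type_name_cached()`, `fetch_formatted_type()` (a stand-in for asking the server for `format_type(oid, typmod)`), the fixed table size, and the placeholder names it produces.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef uint32_t Oid;

/* One cache slot: the key is (type OID, typmod), the value is the name. */
typedef struct TypeNameEntry
{
    Oid         typid;
    int32_t     typmod;
    char       *name;           /* NULL marks an empty slot */
} TypeNameEntry;

#define CACHE_SLOTS 1024        /* hypothetical fixed size, power of two */

static TypeNameEntry type_name_cache[CACHE_SLOTS];
static int  cache_used = 0;
static int  fetch_count = 0;    /* how many times we "asked the server" */

/*
 * Hypothetical stand-in for a server round trip such as
 * SELECT pg_catalog.format_type(oid, typmod).  It builds a placeholder
 * string so that the sketch runs without a database connection.
 */
static char *
fetch_formatted_type(Oid typid, int32_t typmod)
{
    char        buf[64];
    char       *result;
    size_t      len;

    fetch_count++;
    if (typmod < 0)
        snprintf(buf, sizeof(buf), "type_%u", (unsigned) typid);
    else
        snprintf(buf, sizeof(buf), "type_%u(%d)", (unsigned) typid, (int) typmod);

    len = strlen(buf) + 1;
    result = malloc(len);
    assert(result != NULL);
    memcpy(result, buf, len);
    return result;
}

/*
 * Return the formatted name for (typid, typmod), fetching it only on the
 * first request.  Collisions are resolved by linear probing.
 */
static const char *
get_formatted_type_name_cached(Oid typid, int32_t typmod)
{
    uint32_t    hash = (typid * 2654435761u) ^ ((uint32_t) typmod * 40503u);
    uint32_t    slot = hash & (CACHE_SLOTS - 1);

    while (type_name_cache[slot].name != NULL)
    {
        if (type_name_cache[slot].typid == typid &&
            type_name_cache[slot].typmod == typmod)
            return type_name_cache[slot].name;  /* cache hit */
        slot = (slot + 1) & (CACHE_SLOTS - 1);
    }

    /* Cache miss.  A real version would grow the table instead of asserting. */
    assert(cache_used < CACHE_SLOTS / 2);
    type_name_cache[slot].typid = typid;
    type_name_cache[slot].typmod = typmod;
    type_name_cache[slot].name = fetch_formatted_type(typid, typmod);
    cache_used++;
    return type_name_cache[slot].name;
}
```

Calling `get_formatted_type_name_cached()` twice with the same OID and typmod returns the same stored string and bumps `fetch_count` only once. A different typmod for the same OID gets its own entry.
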
> To see where it's worth putting in time it'd be useful if getSchemaData() in
> verbose mode printed timing information...
I've been running test cases with log_min_duration_statement = 0,
which serves well enough.
regards, tom lane