Re: BUG #16045: vacuum_db crash and illegal memory alloc after pg_upgrade from PG11 to PG12

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andres Freund <andres(at)anarazel(dot)de>, buschmann(at)nidsa(dot)net, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #16045: vacuum_db crash and illegal memory alloc after pg_upgrade from PG11 to PG12
Date: 2019-10-10 20:40:22
Message-ID: 20191010204022.6hiiubm2cueu3ghk@development
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs pgsql-hackers

On Thu, Oct 10, 2019 at 04:14:20PM -0400, Tom Lane wrote:
>Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> writes:
>> On Thu, Oct 10, 2019 at 10:19:12AM -0400, Tom Lane wrote:
>>> To identify such domains, I think we'd need something like
>>> WHERE attypid IN (recursive-WITH-query), which makes me nervous.
>>> We did support those starting with 8.4, which is as far back as
>>> pg_upgrade will go, so in theory it should work. But I think we
>>> had bugs with such cases in old releases. Do we want to assume
>>> that the source server has been updated enough to avoid any such
>>> bugs? The expense of such a query might be daunting, too.
>
>> For the query cost, I think we can assume the domain hierarchies are not
>> particularly deep (in practice I'd expect just domains directly on the
>> sql_identifier type). And I doubt people are using that very widely,
>> it's probably more like this report - ad-hoc CTAS, so just a couple of
>> items. So I wouldn't expect it to be a huge deal in most cases. But even
>> if it takes a second or two, it's a one-time cost.
>
>What I was worried about was the planner possibly trying to apply the
>atttypid restriction as a scan qual using a subplan, which might be rather
>awful. But it doesn't look like that happens.

OK.

> I get a hash semijoin to
>the CTE output, in all branches back to 8.4, on this trial query:
>
>explain
>with recursive sqlidoids(toid) as (
>select 'information_schema.sql_identifier'::pg_catalog.regtype as toid
>union
>select oid from pg_catalog.pg_type, sqlidoids
> where typtype = 'd' and typbasetype = sqlidoids.toid
>)
>SELECT n.nspname, c.relname, a.attname
>FROM pg_catalog.pg_class c,
> pg_catalog.pg_namespace n,
> pg_catalog.pg_attribute a
>WHERE c.oid = a.attrelid AND
> NOT a.attisdropped AND
> a.atttypid in (select toid from sqlidoids) AND
> c.relkind IN ('r','v','i') and
> c.relnamespace = n.oid AND
> n.nspname !~ '^pg_temp_' AND
> n.nspname !~ '^pg_toast_temp_' AND
> n.nspname NOT IN ('pg_catalog', 'information_schema');
>

I think that's not quite sufficient - the problem is that we can have
domains and composite types on sql_identifier, in some arbitrary order.
And the recursive CTE won't handle that the way it's written - it will
miss domains on composite types containing sql_identifier. And we have
quite a few of them in the information schema, so maybe someone created
a domain on one of those (however unlikely it may seem).

I think this recursive CTE does it correctly:

WITH RECURSIVE oids AS (
-- type itself
SELECT 'information_schema.sql_identifier'::regtype AS oid
UNION ALL
SELECT * FROM (
-- domains on the type
WITH x AS (SELECT oid FROM oids)
SELECT t.oid FROM pg_catalog.pg_type t, x WHERE typbasetype = x.oid AND typtype = 'd'
UNION
-- composite types containing the type
SELECT t.oid FROM pg_catalog.pg_type t, pg_catalog.pg_class c, pg_catalog.pg_attribute a, x
WHERE t.typtype = 'c' AND
t.oid = c.reltype AND
c.oid = a.attrelid AND
a.atttypid = x.oid
) foo
)

I had to use CTE within CTE, because the 'oids' can be referenced only
once, but we have two subqueries there. Maybe there's a better solution.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Michael Paquier 2019-10-11 00:28:14 Re: BUG #16039: PANIC when activating replication slots in Postgres 12.0 64bit under Windows
Previous Message Tom Lane 2019-10-10 20:14:20 Re: BUG #16045: vacuum_db crash and illegal memory alloc after pg_upgrade from PG11 to PG12

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2019-10-10 21:34:51 Re: stress test for parallel workers
Previous Message Peter Geoghegan 2019-10-10 20:26:18 Re: BTP_DELETED leaf still in tree