Re: [HACKERS] MERGE SQL Statement for PG11

From: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>, Pavan Deolasee <pavan(dot)deolasee(at)2ndquadrant(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] MERGE SQL Statement for PG11
Date: 2018-03-23 11:56:42
Message-ID: CABOikdOSsdiJLnrU6UyU-G1hergOhd74t1jsKUM4f9Fq2DZQ0w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Mar 23, 2018 at 4:43 AM, Peter Geoghegan <pg(at)bowt(dot)ie> wrote:

> On Thu, Mar 22, 2018 at 11:42 AM, Pavan Deolasee
> <pavan(dot)deolasee(at)gmail(dot)com> wrote:
> > A slightly improved version attached.
>
> You still need to remove this change:
>
> > diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
> > index a4574cd533..dbfb5d2a1a 100644
> > --- a/src/include/miscadmin.h
> > +++ b/src/include/miscadmin.h
> > @@ -444,5 +444,6 @@ extern bool has_rolreplication(Oid roleid);
> > /* in access/transam/xlog.c */
> > extern bool BackupInProgress(void);
> > extern void CancelBackup(void);
> > +extern int64 GetXactWALBytes(void);
>

Sigh. Fixed in the recent version.

>
> I see that we're still using two target RTEs in this latest revision,
> v23e -- the new approach to partitioning, which I haven't had time to
> study in more detail, has not produced a change there.

Yes, we continue to use two RTEs because I don't have any brighter idea
than that to handle the case of partitioned table and right outer join. As
I explained sometime back, this is necessary to ensure that we don't
produce duplicate rows when a partition is joined with the source and then
a second partition is joined again with the source.

Now I don't know if we can run a join query and still have a single RTE,
but that looks improbable and wrong.

This causes
> weird effects, such as the following:
>
> """
> pg(at)~[20658]=# create table foo(bar int4);
> CREATE TABLE
> pg(at)~[20658]=# merge into foo f using (select 1 col) dd on f.bar=dd.col
> when matched then update set bar = f.barr + 1;
> ERROR: column f.barr does not exist
> LINE 1: ...n f.bar=dd.col when matched then update set bar = f.barr + 1...
> ^
> HINT: Perhaps you meant to reference the column "f.bar" or the column
> "f.bar".
>
> """
>
> While I think this this particular HINT buglet is pretty harmless, I
> continue to be concerned about the unintended consequences of having
> multiple RTEs for MERGE's target table. Each RTE comes from a
> different lookup path -- the first one goes through setTargetTable()'s
> parserOpenTable() + addRangeTableEntryForRelation() calls. The second
> one goes through transformFromClauseItem(), for the join, which
> ultimately ends up calling transformTableEntry()/addRangeTableEntry().
>

How's it different than running a INSERT query with the target table again
specified in a subquery producing the rows to be inserted? For example,

postgres=# insert into target as t select sid from source s join target t
on t.ttid = s.sid;
ERROR: column t.ttid does not exist
LINE 1: ...rget as t select sid from source join target t on t.ttid = s...
^
HINT: Perhaps you meant to reference the column "t.tid" or the column
"t.tid".
postgres=#

This produces a very similar looking HINT as your test above. I am certain
that "target" table gets two RTEs, exactly via the same code paths as you
discussed above. So if this is not a problem for INSERT, why it would be a
problem for MERGE? May be I am missing a point here.

> Consider commit 5f173040, which fixed a privilege escalation security
> bug around multiple name lookup. Could the approach taken by MERGE
> here introduce a similar security issue?
>
> Using GDB, I see two calls to RangeVarGetRelidExtended() when a simple
> MERGE is executed. They both have identical relation arguments, that
> look like this:
>
> (gdb) p *relation
> $4 = {
> type = T_RangeVar,
> catalogname = 0x0,
> schemaname = 0x0,
> relname = 0x5600ebdcafb0 "foo",
> inh = 1 '\001',
> relpersistence = 112 'p',
> alias = 0x5600ebdcb048,
> location = 11
> }
>
> This seems like something that needs to be explained, at a minimum.
> Even if I'm completely wrong about there being a security hazard,
> maybe the suggestion that there might be still gives you some idea of
> what I mean about unintended consequences.
>

Ok. I will try to explain it better and also think about the security
hazards.

Thanks,
Pavan

--
Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Daniel Verite 2018-03-23 11:59:25 Re: Re: csv format for psql
Previous Message Etsuro Fujita 2018-03-23 11:55:17 Re: [HACKERS] Add support for tuple routing to foreign partitions