Re: [SQL] MAX() of 0 records.

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Paul McGarry <paulm(at)opentec(dot)com(dot)au>
Cc: pgsql-bugs(at)postgresql(dot)org, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [SQL] MAX() of 0 records.
Date: 2000-07-07 07:06:03
Message-ID: 9124.962953563@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs pgsql-hackers pgsql-sql

Paul McGarry <paulm(at)opentec(dot)com(dot)au> writes:
> CREATE TABLE entry_stats
> (
> entry_id INT4 NOT NULL REFERENCES entry ON DELETE CASCADE,
> entry_minprice INT4 NOT NULL DEFAULT 0
> );
>
> CREATE TABLE item(
> item_id INT4 PRIMARY KEY,
> item_entry_id INT4 NOT NULL REFERENCES entry ON DELETE NO ACTION,
> item_price INT4 NOT NULL,
> item_live bool NOT NULL DEFAULT 'n'
> );
>
> [trigger using]
>
> UPDATE entry_stats
> SET entry_minprice=min(item_price)
> FROM item where item_entry_id=NEW.item_entry_id AND item_live='t';
>
> ERROR: ExecutePlan: (junk) `ctid' is NULL!

Hmm. There are several things going on here, but one thing that needs
clarification is whether this UPDATE is written correctly. Since it
has no constraint on entry_stats, it seems to me that *every* row of
entry_stats will have entry_minprice set to the same value, namely
the minimum item_price over those item rows that satisfy the WHERE
condition. Surely that wasn't what you wanted? Shouldn't there be an
additional WHERE clause like entry_id = item_entry_id?

Anyway, the proximate cause of the error message is as follows.
A cross-table UPDATE like this is actually implemented as if it were
a SELECT:
SELECT entry_stats.ctid, min(item_price)
FROM entry_stats, item WHERE ...;
For each row emitted by this underlying SELECT, the executor takes
the ctid result column (which identifies the particular target tuple
in the target table) and updates that tuple by stuffing the additional
SELECT result column(s) into the specified fields of that tuple.

Now, if you try a SELECT like the above in a situation where there are
no tuples matching the WHERE clause, what you get out is a row of all
NULLs --- because that's what you get from SELECT if there's an
aggregate function with no GROUP BY and no input rows. The executor
gets this dummy row, tries to do a tuple update using it, and chokes
because the ctid is NULL. So that explains why the error message is
what it is. Next question is what if anything should be done
differently. We could just have the executor ignore result rows where
ctid is NULL, but that seems like patching around the problem not fixing
it.

The thing that jumps out at me is that if you actually try the SELECT
illustrated above, you do not get any row, null or otherwise; you get
ERROR: Attribute entry_stats.ctid must be GROUPed or used in an
aggregate function
which is a mighty valid complaint. If you are aggregating rows to get
the MIN() then you don't have a unique ctid to deliver, so which row
ought to be updated? This is the system's way of expressing the same
concern I started with: this query doesn't seem to be well-posed.

You don't see this complaint when you try the UPDATE, because ctid
is added to the implicit select result in a back-door way that doesn't
get checked for GROUP-BY validity. I wonder whether that is the bug.
If so, we'd basically be saying that no query like this is valid
(since UPDATE doesn't have a GROUP BY option, there'd be no way to
pass the grouping check).

Another way to look at it is that perhaps an UPDATE involving aggregate
functions ought to be implicitly treated as GROUP BY targetTable.ctid.
In other words, the MIN() or other aggregate function is implicitly
evaluated over only those join tuples that are formed for a single
target tuple. Intuitively that seems to make sense, and it solves the
problem you're complaining of, because no matching tuples = no groups =
no result tuples = update does nothing = no problem. But I have a
sneaking suspicion that I'm missing some nasty problem with this idea
too.

Comments anyone? What exactly *should* be the behavior of an UPDATE
that uses an aggregate function and a join to another table? Over what
set of tuples should the aggregate be evaluated?

regards, tom lane

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Jan Ehrhardt 2000-07-07 08:54:13 Re: Bug??: getattproperties fails !!
Previous Message Matthew Altus 2000-07-07 06:53:30 ps_status.h on FreeBSD 4.0 problems and fix

Browse pgsql-hackers by date

  From Date Subject
Next Message Karel Zak 2000-07-07 07:17:51 Re: build system
Previous Message Malcontent 2000-07-07 06:19:52 FYI

Browse pgsql-sql by date

  From Date Subject
Next Message Antti Linno 2000-07-07 07:10:50 Ancient postgres+EXCEPT
Previous Message Paul McGarry 2000-07-07 05:34:13 Re: [SQL] MAX() of 0 records.