From: | Andreas Tille <tillea(at)rki(dot)de> |
---|---|
To: | PostgreSQL Hacker Liste <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Suboptimal evaluation of CASE expressions |
Date: | 2006-04-11 14:43:33 |
Message-ID: | Pine.LNX.4.62.0604111635430.13764@wr-linux02 |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
when dealing with a bug in the postgresql-plr interface I think
I found a suboptimal method to process CASE statements. First
to the problem:
I'm using the Debian packaged version of PLR version 0.6.2-2 (Debian
testing) and found a problem calculating median from a set of
values that contain only NULL values. The problem becomes clear
if you look at the following example:
-------------------------------------------------------------------------
$ psql -t test
create table plrtest( nonnull numeric not null, mightbenull numeric, flag int);
insert into plrtest values(42.0, 42.0, 1);
insert into plrtest values(17.0, 17.0, 1);
insert into plrtest values(23.0, NULL, 0 );
insert into plrtest values(4711.0, 4711.0, 1);
insert into plrtest values(174.0, NULL, 0);
CREATE OR REPLACE FUNCTION plr_call_handler()
RETURNS LANGUAGE_HANDLER
AS '\$libdir/plr' LANGUAGE C;
CREATE LANGUAGE plr HANDLER plr_call_handler;
create or replace function r_median(_numeric) returns numeric as 'median(arg1)'
language 'plr';
CREATE OR REPLACE FUNCTION r_median(_numeric) returns numeric as '
median(arg1)
' language 'plr';
CREATE OR REPLACE FUNCTION plr_array_accum (_numeric, numeric)
RETURNS numeric[]
AS '\$libdir/plr','plr_array_accum'
LANGUAGE 'C';
CREATE AGGREGATE median (
sfunc = plr_array_accum,
basetype = numeric,
stype = _numeric,
finalfunc = r_median
);
SELECT median(nonnull) from plrtest;
42
SELECT median(mightbenull) from plrtest;
42
SELECT median(nonnull), median(mightbenull) from plrtest where flag = 0;
ERROR: R interpreter expression evaluation error
DETAIL: Error in median(arg1) : need numeric data
CONTEXT: In PL/R function r_median
-------------------------------------------------------------------------
I would expect NULL as result of the last query.
So I thought I will verify in a CASE statement whether there
are only NULL values in the column by max(mightbenull) like this:
# SELECT CASE WHEN max(mightbenull) IS NULL THEN 0 ELSE median(mightbenull) END from plrtest where flag = 0;
ERROR: R interpreter expression evaluation error
DETAIL: Error in median(arg1) : need numeric data
CONTEXT: In PL/R function r_median
The problem I want to discuss here is the following: Usually in
programming languages only one branch of the IF-THEN-ELSE statement
will be calculated. But here *both* branches are calculated
(obviousely because of the error that occures). If we just forget
that my goal was to circumvent the error by some hack, I think
if there is some kind of complex query in the ELSE branche that
calculation would just cost extra processing time with no need.
I would regard this as a bug.
Kind regards
Andreas.
PS: Please CC me. I'm not subscribed.
--
http://fam-tille.de
From | Date | Subject | |
---|---|---|---|
Next Message | Myron Scott | 2006-04-11 14:47:14 | Re: Support Parallel Query Execution in Executor |
Previous Message | Thomas Sondag | 2006-04-11 14:03:18 | pg_contrib default schema |