Suboptimal evaluation of CASE expressions

From: Andreas Tille <tillea(at)rki(dot)de>
To: PostgreSQL Hacker Liste <pgsql-hackers(at)postgresql(dot)org>
Subject: Suboptimal evaluation of CASE expressions
Date: 2006-04-11 14:43:33
Message-ID: Pine.LNX.4.62.0604111635430.13764@wr-linux02
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

when dealing with a bug in the postgresql-plr interface I think
I found a suboptimal method to process CASE statements. First
to the problem:

I'm using the Debian packaged version of PLR version 0.6.2-2 (Debian
testing) and found a problem calculating median from a set of
values that contain only NULL values. The problem becomes clear
if you look at the following example:

-------------------------------------------------------------------------
$ psql -t test

create table plrtest( nonnull numeric not null, mightbenull numeric, flag int);
insert into plrtest values(42.0, 42.0, 1);
insert into plrtest values(17.0, 17.0, 1);
insert into plrtest values(23.0, NULL, 0 );
insert into plrtest values(4711.0, 4711.0, 1);
insert into plrtest values(174.0, NULL, 0);

CREATE OR REPLACE FUNCTION plr_call_handler()
RETURNS LANGUAGE_HANDLER
AS '\$libdir/plr' LANGUAGE C;

CREATE LANGUAGE plr HANDLER plr_call_handler;

create or replace function r_median(_numeric) returns numeric as 'median(arg1)'
language 'plr';

CREATE OR REPLACE FUNCTION r_median(_numeric) returns numeric as '
median(arg1)
' language 'plr';

CREATE OR REPLACE FUNCTION plr_array_accum (_numeric, numeric)
RETURNS numeric[]
AS '\$libdir/plr','plr_array_accum'
LANGUAGE 'C';

CREATE AGGREGATE median (
sfunc = plr_array_accum,
basetype = numeric,
stype = _numeric,
finalfunc = r_median
);

SELECT median(nonnull) from plrtest;
42

SELECT median(mightbenull) from plrtest;
42

SELECT median(nonnull), median(mightbenull) from plrtest where flag = 0;
ERROR: R interpreter expression evaluation error
DETAIL: Error in median(arg1) : need numeric data
CONTEXT: In PL/R function r_median

-------------------------------------------------------------------------

I would expect NULL as result of the last query.

So I thought I will verify in a CASE statement whether there
are only NULL values in the column by max(mightbenull) like this:

# SELECT CASE WHEN max(mightbenull) IS NULL THEN 0 ELSE median(mightbenull) END from plrtest where flag = 0;
ERROR: R interpreter expression evaluation error
DETAIL: Error in median(arg1) : need numeric data
CONTEXT: In PL/R function r_median

The problem I want to discuss here is the following: Usually in
programming languages only one branch of the IF-THEN-ELSE statement
will be calculated. But here *both* branches are calculated
(obviousely because of the error that occures). If we just forget
that my goal was to circumvent the error by some hack, I think
if there is some kind of complex query in the ELSE branche that
calculation would just cost extra processing time with no need.
I would regard this as a bug.

Kind regards

Andreas.

PS: Please CC me. I'm not subscribed.
--
http://fam-tille.de

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Myron Scott 2006-04-11 14:47:14 Re: Support Parallel Query Execution in Executor
Previous Message Thomas Sondag 2006-04-11 14:03:18 pg_contrib default schema