GROUP BY checks inadequate when set returning functions in column list

From: Chris Travers <chris(at)metatrontech(dot)com>
To: pgsql-bugs(at)postgresql(dot)org
Subject: GROUP BY checks inadequate when set returning functions in column list
Date: 2012-08-22 14:51:47
Message-ID: CAPKNUtf=0XFxYuy79qmoSQz_Q=663yuZTCZWhF-MXSr_UUXS+Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Hi all;

In some of my tests regarding set-returning functions I came across
some very strange behavior. Group by seems to have very strange (and
inconsistent) behavior when connected to the use of a set-returning
function in a column list.

Consider the following function which returns a list of rows from a table:

mtech_test=# select * from account_heading__list();
id | accno | parent_id | description
-------+-------+-----------+-----------------------------------
10001 | 1000 | | CURRENT ASSETS
10006 | 1500 | | INVENTORY ASSETS
10010 | 1800 | | CAPITAL ASSETS
10015 | 2000 | | CURRENT LIABILITIES
10027 | 2600 | | LONG TERM LIABILITIES
10451 | 2700 | | Expense Accounts for Individuals
10225 | 3000 | | CAPITAL
10030 | 3300 | | SHARE CAPITAL
10032 | 4000 | | SALES REVENUE
10036 | 4300 | | CONSULTING REVENUE
10039 | 4400 | | OTHER REVENUE
10043 | 5000 | | COST OF GOODS SOLD
10049 | 5400 | | PAYROLL EXPENSES
10055 | 5600 | | GENERAL & ADMINISTRATIVE EXPENSES
(14 rows)

(Source code for function will be included below but I dont think this
is a function issue).

The above results are expected. Similarly if I run it in the column
list, I get tuple representations of the same data:

mtech_test=# select account_heading__list();
account_heading__list
---------------------------------------------------
(10001,1000,,"CURRENT ASSETS")
(10006,1500,,"INVENTORY ASSETS")
(10010,1800,,"CAPITAL ASSETS")
(10015,2000,,"CURRENT LIABILITIES")
(10027,2600,,"LONG TERM LIABILITIES")
(10451,2700,,"Expense Accounts for Individuals")
(10225,3000,,CAPITAL)
(10030,3300,,"SHARE CAPITAL")
(10032,4000,,"SALES REVENUE")
(10036,4300,,"CONSULTING REVENUE")
(10039,4400,,"OTHER REVENUE")
(10043,5000,,"COST OF GOODS SOLD")
(10049,5400,,"PAYROLL EXPENSES")
(10055,5600,,"GENERAL & ADMINISTRATIVE EXPENSES")
(14 rows)

It's when we add group by that things appear broken. Note it starts
returning 196 (14 x 14) records, which suggests a cross join against
itself.

mtech_test=# explain analyze select (account_heading__list()).* group by accno
mtech_test-# ;

QUERY PLAN

---------------------------------------------------------------------------------
------------
HashAggregate (cost=0.26..1.27 rows=1 width=0) (actual
time=0.456..1.986 rows=1
96 loops=1)
-> Result (cost=0.00..0.26 rows=1 width=0) (actual
time=0.170..0.194 rows=14
loops=1)
Total runtime: 2.076 ms
(3 rows)

My guess from looking at this deeper is that this is likely just
behavior that is prevented by group by column checks absent set
returning functions. The behavior goes away when the return columns
are brought back in line with the group by:

mtech_test=# select count(*) from (select
(account_heading__list()).accno group by accno) c;
count
-------
14
(1 row)

Is this something we should be checking for and throwing exceptions based on?

mtech_test=# select version()
mtech_test-# ;
version

---------------------------------------------------------------------------------
--------------------------
PostgreSQL 9.1.4 on i386-redhat-linux-gnu, compiled by gcc (GCC)
4.7.0 20120507
(Red Hat 4.7.0-5), 32-bit
(1 row)

mtech_test=# \df+ account_heading__list

List of functions
Schema | Name | Result data type | Argument
data types |
Type | Volatility | Owner | Language | Source
code
| Description

--------+-----------------------+-----------------------+---------------------+--
------+------------+----------+----------+---------------------------------------
----------+----------------------------------------------------------------------
---------
public | account_heading__list | SETOF account_heading |
| n
ormal | stable | postgres | sql | SELECT * FROM
account_heading order b
y accno; | Returns a list of all account headings, currently ordered
by account
number.+
| | |
|
| | | |
|
(1 row)

Best Wishes,
Chris Travers

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Tom Lane 2012-08-22 15:04:39 Re: GROUP BY checks inadequate when set returning functions in column list
Previous Message Andres Freund 2012-08-22 12:40:11 Re: BUG #7500: hot-standby replica crash after an initial rsync