Quick Links

Processing of subqueries in union

From:	Markus Hervén <markus(dot)herven(at)gmail(dot)com>
To:	pgsql-performance(at)postgresql(dot)org
Subject:	Processing of subqueries in union
Date:	2013-03-01 06:32:01
Message-ID:	51304B61.5030103@gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Thread:
Lists:	pgsql-performance

I am currently looking at a performance issue where a subquery is
executed even though I cannot see the result ever being asked for or
used. Not even google has helped me find any information about this one.
I have created a simple test case to demonstrate this.

CREATE TABLE test1 (t1 INT);
INSERT INTO test1 VALUES(1);
CREATE TABLE test2 (t2 INT);
INSERT INTO test2 VALUES(1);
CREATE TABLE test3 (t1 INT, cnt INT);
INSERT INTO test3 VALUES(3, 3);
CREATE VIEW tv AS SELECT t1, (SELECT COUNT(*) FROM test2 WHERE t1=t2) AS
CNT FROM test1;
CREATE VIEW tv2 AS SELECT t1, (SELECT COUNT(*) FROM test2 WHERE t1=t2)
AS CNT FROM test1 UNION ALL SELECT t1, cnt FROM test3;
EXPLAIN ANALYZE SELECT t1 FROM tv;
QUERY PLAN
---------------------------------------------------------------------------------------------------
Seq Scan on test1 (cost=0.00..34.00 rows=2400 width=4) (actual
time=0.005..0.006 rows=1 loops=1)
Total runtime: 0.033 ms
(2 rows)
EXPLAIN ANALYZE SELECT t1 FROM tv2;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------
Subquery Scan on tv2 (cost=0.00..96252.20 rows=4540 width=4) (actual
time=0.026..0.042 rows=2 loops=1)
-> Append (cost=0.00..96206.80 rows=4540 width=6) (actual
time=0.024..0.036 rows=2 loops=1)
-> Seq Scan on test1 (cost=0.00..96130.00 rows=2400 width=4)
(actual time=0.023..0.024 rows=1 loops=1)
SubPlan 1
-> Aggregate (cost=40.03..40.04 rows=1 width=0)
(actual time=0.013..0.014 rows=1 loops=1)
-> Seq Scan on test2 (cost=0.00..40.00 rows=12
width=0) (actual time=0.005..0.006 rows=1 loops=1)
Filter: (test1.t1 = t2)
-> Subquery Scan on "*SELECT* 2" (cost=0.00..52.80 rows=2140
width=8) (actual time=0.005..0.007 rows=1 loops=1)
-> Seq Scan on test3 (cost=0.00..31.40 rows=2140
width=8) (actual time=0.002..0.004 rows=1 loops=1)
Total runtime: 0.089 ms
(10 rows)

As can be seen in the last explain it does a sequential scan of table
test2 even though this is not needed. It will perform the scan once for
each row in table test1. No scan of test2 is done if there is not a
union in the view. I cannot see any reason for this happening and am
guessing that the query planner does not know that it can get rid of the
subquery. Is there anything I can do to rid of the subquery myself?

The postgresql version I am using is:
opsspace=# select version();
version
-------------------------------------------------------------------------------------------------------------
PostgreSQL 9.2.3 on x86_64-redhat-linux-gnu, compiled by gcc (GCC)
4.7.2 20121109 (Red Hat 4.7.2-8), 64-bit
(1 row)

Many thanks
Markus

Browse pgsql-performance by date

	From	Date	Subject
Next Message	Steven Crandell	2013-03-01 09:52:02	hardware upgrade, performance degrade?
Previous Message	Ao Jianwang	2013-03-01 00:30:09	Re: SELECT is slow on smaller table?