Quick Links

Optimizer on sort aggregate

From:	Feng Tian <fengttt(at)gmail(dot)com>
To:	pgsql-hackers(at)postgresql(dot)org
Subject:	Optimizer on sort aggregate
Date:	2014-10-17 16:10:13
Message-ID:	CAFjtmHU3Obf5aSpWY7i18diapvjg-418hYySdqUuYhXZtjChhg@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

Hi,

Consider the following queries.

create table t(i int, j int, k int, t text);
insert into t select i, i % 100, i % 1000, 'AABBCCDD' || i from
generate_series(1, 1000000) i;

ftian=# explain select count(distinct j) from t group by t, i;
QUERY PLAN
------------------------------------------------------------------------
GroupAggregate (cost=158029.84..178029.84 rows=1000000 width=22)
-> Sort (cost=158029.84..160529.84 rows=1000000 width=22)
Sort Key: t, i
-> Seq Scan on t (cost=0.00..17352.00 rows=1000000 width=22)
(4 rows)

The query,
select count(distinct j) from t group by t, i;

runs for 35 seconds. However, if I change the query to,
select count(distinct j) from t group by i, t; -- note switching the
ordering
select count(distinct j) from t group by decode(t, 'escape'), i; -- convert
t to bytea

Run times are just about 5 and 6.5 seconds. The reason is clear, compare a
string with collation is slow, which is well understood by pg hackers.
However, here, the sorting order is forced by the planner, not user.
Planner can do the following optimizations,

1. for the sort we generated for sort agg, planner can switch column
ordering, put int before string,
2. for the sort we generated for sort agg, use bytea compare instead of
string compare.

They will bring big improvement to this common query. Is this something
reasonable?

Thanks,

Responses

Re: Optimizer on sort aggregate at 2014-10-17 23:35:16 from Tatsuo Ishii
Re: Optimizer on sort aggregate at 2014-10-17 23:36:30 from David Rowley

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Bruce Momjian	2014-10-17 16:22:19	Hash index creation warning
Previous Message	Marti Raudsepp	2014-10-17 16:05:09	Re: Support UPDATE table SET(*)=...