ERROR: out of memory when using aggregates over a partitioned table

From: Matteo Beccati <php(at)beccati(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: ERROR: out of memory when using aggregates over a partitioned table
Date: 2009-05-05 00:35:59
Message-ID: 49FF89EF.7060404@beccati.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi everyone,

I'm unexpectedly getting out of memory error both with 8.3.3 and
8.4beta1 when doing something as simple as:

SELECT id, COUNT(*) AS counter, MAX(last_modified) AS last_modified FROM
foo GROUP BY id;

where foo is a partitioned table and id is a uuid column.

It looks like the HashAggregate estimate is set to a default of 200 even
though ndistinct in each partition is averaging at -0.59. As RhodiumToad
pointed out there's a comment explaining the behaviour:

* XXX This means the Var represents a column of an append
* relation. Later add code to look at the member relations and
* try to derive some kind of combined statistics?

I just wanted to raise it as something that might happen to those using
partitions as it's very likely that a partitioned table is bigger than
the available RAM. However I don't think it happens very often that one
needs to run an aggregate query on it. I just needed it to populate a
separate table that will be kept up to date via triggers.

Here's the EXPLAIN output:

HashAggregate (cost=1344802.32..1344805.32 rows=200 width=24)
-> Append (cost=0.00..969044.47 rows=50101047 width=24)
-> Seq Scan on foo (cost=0.00..16.60 rows=660 width=24)
-> Seq Scan on part_0 foo (cost=0.00..60523.89 rows=3129289
width=24)
-> Seq Scan on part_1 foo (cost=0.00..60555.37 rows=3130937
width=24)
-> Seq Scan on part_2 foo (cost=0.00..60532.17 rows=3129717
width=24)
-> Seq Scan on part_3 foo (cost=0.00..60550.86 rows=3130686
width=24)
-> Seq Scan on part_4 foo (cost=0.00..60545.07 rows=3130407
width=24)
-> Seq Scan on part_5 foo (cost=0.00..60579.93 rows=3131393
width=24)
-> Seq Scan on part_6 foo (cost=0.00..60566.70 rows=3131470
width=24)
-> Seq Scan on part_7 foo (cost=0.00..60610.66 rows=3133766
width=24)
-> Seq Scan on part_8 foo (cost=0.00..60546.67 rows=3129667
width=24)
-> Seq Scan on part_9 foo (cost=0.00..60509.92 rows=3128592
width=24)
-> Seq Scan on part_a foo (cost=0.00..60581.25 rows=3132225
width=24)
-> Seq Scan on part_b foo (cost=0.00..60552.81 rows=3130781
width=24)
-> Seq Scan on part_c foo (cost=0.00..60621.15 rows=3134315
width=24)
-> Seq Scan on part_d foo (cost=0.00..60714.26 rows=3139126
width=24)
-> Seq Scan on part_e foo (cost=0.00..60552.85 rows=3130785
width=24)
-> Seq Scan on part_f foo (cost=0.00..60484.31 rows=3127231
width=24)

Cheers
--
Matteo Beccati
http://www.openx.org/

Browse pgsql-hackers by date

  From Date Subject
Next Message Mark Wong 2009-05-05 00:49:02 community equipment
Previous Message Tom Lane 2009-05-05 00:01:05 Re: Unicode string literals versus the world