OOM Errors as a result of table inheritance and a bad plan(?)

From: Arctic Toucan <arctic_toucan(at)hotmail(dot)com>
To: <pgsql-performance(at)postgresql(dot)org>
Subject: OOM Errors as a result of table inheritance and a bad plan(?)
Date: 2007-09-28 17:50:32
Message-ID: BAY126-W604E273FFED65685953D679CB20@phx.gbl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

In keeping with some of the recent threads regarding the planner...

I have a fair sized data warehouse in which I am trying to perform an aggregation, but getting OOM errors in Postgres(8.2.4).
I believe the reason for the OOM is that Postgres is attempting to do a hash aggregation, but it has grossly underestimated the rows resulting from the aggregation.
The data in the database is very uniformly distributed so I don't believe that the table stats are the cause of the problem.
This may be related to table inheritance, and can be demonstrated pretty easily.

CREATE TABLE foo(a INT);ANALYZE foo;
CREATE TABLE foo_1() INHERITS(foo);insert into foo_1 select generate_series(1,100000);insert into foo_1 select generate_series(1,100000);insert into foo_1 select generate_series(1,100000);ANALYZE foo_1;
CREATE TABLE foo_2() INHERITS(foo);insert into foo_2 select generate_series(1,100000);insert into foo_2 select generate_series(1,100000);insert into foo_2 select generate_series(1,100000);ANALYZE foo_2;
-- If I query a particular partition, the plan estimate for the hash aggregate is good
EXPLAIN ANALYZE SELECT a,COUNT(*) from foo_1 group by a;
HashAggregate (cost=5822.00..7061.01 rows=99121 width=4) (actual time=554.556..657.121 rows=100000 loops=1) -> Seq Scan on foo_1 (cost=0.00..4322.00 rows=300000 width=4) (actual time=0.014..203.290 rows=300000 loops=1) Total runtime: 712.211 ms
-- If I query the base table, the plan estimate for the hash aggregate is off by several orders of magnitude
EXPLAIN ANALYZE SELECT a,COUNT(*) from foo group by a;
HashAggregate (cost=11686.10..11688.60 rows=200 width=4) (actual time=1724.188..1826.630 rows=100000 loops=1) -> Append (cost=0.00..8675.40 rows=602140 width=4) (actual time=0.016..1045.134 rows=600000 loops=1) -> Seq Scan on foo (cost=0.00..31.40 rows=2140 width=4) (actual time=0.001..0.001 rows=0 loops=1) -> Seq Scan on foo_1 foo (cost=0.00..4322.00 rows=300000 width=4) (actual time=0.012..205.130 rows=300000 loops=1) -> Seq Scan on foo_2 foo (cost=0.00..4322.00 rows=300000 width=4) (actual time=0.011..203.542 rows=300000 loops=1) Total runtime: 1879.550 ms(6 rows)
-- Is there something magical about the hash aggregate estimate of 200 rows?
-- I can have 30,000 or 300,000 rows in each child partition table and multiple partition's with different values of "a" and yet it always come up with 200.
-- eg.
create table foo_3() inherits(foo);insert into foo_3 select generate_series(100000,300000);analyze foo_3;
EXPLAIN ANALYZE SELECT a,COUNT(*) from foo group by a;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------ HashAggregate (cost=15568.12..15570.62 rows=200 width=4) (actual time=2386.813..2691.254 rows=300000 loops=1) -> Append (cost=0.00..11557.41 rows=802141 width=4) (actual time=0.016..1403.121 rows=800001 loops=1) -> Seq Scan on foo (cost=0.00..31.40 rows=2140 width=4) (actual time=0.002..0.002 rows=0 loops=1) -> Seq Scan on foo_1 foo (cost=0.00..4322.00 rows=300000 width=4) (actual time=0.013..201.549 rows=300000 loops=1) -> Seq Scan on foo_2 foo (cost=0.00..4322.00 rows=300000 width=4) (actual time=0.010..211.332 rows=300000 loops=1) -> Seq Scan on foo_3 foo (cost=0.00..2882.01 rows=200001 width=4) (actual time=0.011..137.262 rows=200001 loops=1) Total runtime: 2851.990 ms
Is this a bug, or some subtlety of the Postgres query planner?

In my particular case, I am doing a join out to another small table as part of the aggregation and using constraint exclusion on the partitions, but I believe the cause of my problem is the same.

I am running on 64bit FreeBSD/Postgres 8.2.4 on a machine with 8GB of memory.

An explain of the query and resulting OOM diagnostics follow:

The aggregation will result in 5,000,000 rows, not 5,000.In the stats_dtl table there are 12 observations(hourly) for each customerThere are 125 different policy_id and 25 different policy_group_id'sPolicy's and policy_groups are even distributed across all customers

userquery Scan table_4760 (cost=2897243.60..2897418.60 rows=5000 width=152) -> HashAggregate (cost=2897243.60..2897368.60 rows=5000 width=40) -> Hash Join (cost=7.81..2241002.00 rows=37499520 width=40) Hash Cond: (public.customer_stats_dtl.policy_id = policy.policy_id) -> Append (cost=0.00..1641001.87 rows=59999232 width=40) -> Seq Scan on customer_stats_dtl (cost=0.00..22.45 rows=4 width=40) Filter: ((period_start >= '2007-09-08 20:00:00-04'::timestamp with time zone) AND (period_start < '2007-09-09 08:00:00-04'::timestamp with time zone)) -> Seq Scan on customer_stats_dtl_027 customer_stats_dtl (cost=0.00..1640979.42 rows=59999228 width=40) Filter: ((period_start >= '2007-09-08 20:00:00-04'::timestamp with time zone) AND (period_start < '2007-09-09 08:00:00-04'::timestamp with time zone)) -> Hash (cost=6.25..6.25 rows=125 width=8) -> Seq Scan on policy (cost=0.00..6.25 rows=125 width=8) TopMemoryContext: 268400 total in 32 blocks; 20912 free (43 chunks); 247488 usedunnamed prepared statement: 2097152 total in 8 blocks; 721456 free (2 chunks); 1375696 usedTopTransactionContext: 8192 total in 1 blocks; 7648 free (0 chunks); 544 usedSPI Plan: 3072 total in 2 blocks; 1320 free (0 chunks); 1752 usedSPI Plan: 7168 total in 3 blocks; 1312 free (0 chunks); 5856 usedSPI Plan: 3072 total in 2 blocks; 1320 free (0 chunks); 1752 usedSPI Plan: 3072 total in 2 blocks; 928 free (0 chunks); 2144 usedSPI Plan: 7168 total in 3 blocks; 3920 free (0 chunks); 3248 usedSPI Plan: 3072 total in 2 blocks; 1224 free (0 chunks); 1848 usedPL/PgSQL function context: 24576 total in 2 blocks; 12008 free (9 chunks); 12568 usedCFuncHash: 8192 total in 1 blocks; 1680 free (0 chunks); 6512 usedRendezvous variable hash: 8192 total in 1 blocks; 1680 free (0 chunks); 6512 usedPLpgSQL function cache: 24224 total in 2 blocks; 3744 free (0 chunks); 20480 usedType information cache: 24576 total in 2 blocks; 11888 free (5 chunks); 12688 usedS_2: 1024 total in 1 blocks; 488 free (0 chunks); 536 usedS_1: 1024 total in 1 blocks; 488 free (0 chunks); 536 usedPrepared Queries: 24576 total in 2 blocks; 11888 free (5 chunks); 12688 usedRecord information cache: 24576 total in 2 blocks; 15984 free (5 chunks); 8592 usedMessageContext: 8192 total in 1 blocks; 7744 free (0 chunks); 448 usedOperator class cache: 8192 total in 1 blocks; 1680 free (0 chunks); 6512 usedsmgr relation table: 57344 total in 3 blocks; 21872 free (8 chunks); 35472 usedTransactionAbortContext: 32768 total in 1 blocks; 32736 free (0 chunks); 32 usedPortal hash: 8192 total in 1 blocks; 1680 free (0 chunks); 6512 usedPortalMemory: 8192 total in 1 blocks; 7888 free (1 chunks); 304 usedPortalHeapMemory: 1024 total in 1 blocks; 768 free (0 chunks); 256 usedExecutorState: 122880 total in 4 blocks; 68728 free (2 chunks); 54152 usedHashTableContext: 0 total in 0 blocks; 0 free (0 chunks); 0 usedHashBatchContext: 32888 total in 3 blocks; 14512 free (0 chunks); 18376 usedExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 usedExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 usedExprContext: 8192 total in 1 blocks; 8160 free (0 chunks); 32 usedExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 usedExprContext: 8192 total in 1 blocks; 8160 free (0 chunks); 32 usedAggContext: 1033887744 total in 133 blocks; 688 free (0 chunks); 1033887056 used **TupleHashTable: 566485040 total in 78 blocks; 2064896 free (304 chunks); 564420144 used **ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 usedExprContext: 8192 total in 1 blocks; 8040 free (1 chunks); 152 usedExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 usedRelcache by OID: 57344 total in 3 blocks; 31024 free (6 chunks); 26320 usedCacheMemoryContext: 4679720 total in 23 blocks; 404104 free (1 chunks); 4275616 used

If you have actually read this far, the following wrinkle is where I am currently stuck.

I set ENABLE_HASHAGG=OFF and still get an OOM :-(
In this situation Postgres runs for much longer before getting the OOM and I believe it has actually written some records to the aggregationI base this on seeing autovacuum kick off against the aggregation table after the query fails, which may not be valid.
userquery Scan table_4760 (cost=8771233.77..9521474.17 rows=5000 width=152) -> GroupAggregate (cost=8771233.77..9521349.17 rows=5000 width=40) -> Sort (cost=8771233.77..8864982.57 rows=37499520 width=40) Sort Key: public.customer_stats_dtl.user_id, policy.policy_group_id -> Hash Join (cost=7.81..2241002.00 rows=37499520 width=40) Hash Cond: (public.customer_stats_dtl.policy_id = policy.policy_id) -> Append (cost=0.00..1641001.87 rows=59999232 width=40) -> Seq Scan on customer_stats_dtl (cost=0.00..22.45 rows=4 width=40) Filter: ((period_start >= '2007-09-08 20:00:00-04'::timestamp with time zone) AND (period_start < '2007-09-09 08:00:00-04'::timestamp with time zone)) -> Seq Scan on customer_stats_dtl_027 customer_stats_dtl (cost=0.00..1640979.42 rows=59999228 width=40) Filter: ((period_start >= '2007-09-08 20:00:00-04'::timestamp with time zone) AND (period_start < '2007-09-09 08:00:00-04'::timestamp with time zone)) -> Hash (cost=6.25..6.25 rows=125 width=8) -> Seq Scan on policy (cost=0.00..6.25 rows=125 width=8)

TopMemoryContext: 874608 total in 106 blocks; 14240 free (45 chunks); 860368 usedunnamed prepared statement: 2097152 total in 8 blocks; 720544 free (2 chunks); 1376608 usedTopTransactionContext: 8192 total in 1 blocks; 7648 free (0 chunks); 544 usedAfterTriggerEvents: 758112256 total in 102 blocks; 1792 free (8 chunks); 758110464 usedS_2: 1024 total in 1 blocks; 488 free (0 chunks); 536 usedType information cache: 24576 total in 2 blocks; 11888 free (5 chunks); 12688 usedSPI Plan: 3072 total in 2 blocks; 1320 free (0 chunks); 1752 usedSPI Plan: 7168 total in 3 blocks; 1312 free (0 chunks); 5856 usedSPI Plan: 3072 total in 2 blocks; 1320 free (0 chunks); 1752 usedSPI Plan: 3072 total in 2 blocks; 928 free (0 chunks); 2144 usedSPI Plan: 7168 total in 3 blocks; 3920 free (0 chunks); 3248 usedSPI Plan: 3072 total in 2 blocks; 1224 free (0 chunks); 1848 usedPL/PgSQL function context: 24576 total in 2 blocks; 12008 free (9 chunks); 12568 usedCFuncHash: 8192 total in 1 blocks; 1680 free (0 chunks); 6512 usedRendezvous variable hash: 8192 total in 1 blocks; 1680 free (0 chunks); 6512 usedPLpgSQL function cache: 24224 total in 2 blocks; 3744 free (0 chunks); 20480 usedS_1: 1024 total in 1 blocks; 488 free (0 chunks); 536 usedPrepared Queries: 24576 total in 2 blocks; 11888 free (5 chunks); 12688 usedRecord information cache: 24576 total in 2 blocks; 15984 free (5 chunks); 8592 usedMessageContext: 8192 total in 1 blocks; 7744 free (0 chunks); 448 usedOperator class cache: 8192 total in 1 blocks; 1680 free (0 chunks); 6512 usedsmgr relation table: 253952 total in 5 blocks; 40912 free (16 chunks); 213040 usedTransactionAbortContext: 32768 total in 1 blocks; 32736 free (0 chunks); 32 usedPortal hash: 8192 total in 1 blocks; 1680 free (0 chunks); 6512 usedPortalMemory: 8192 total in 1 blocks; 7888 free (1 chunks); 304 usedPortalHeapMemory: 1024 total in 1 blocks; 768 free (0 chunks); 256 usedExecutorState: 122880 total in 4 blocks; 61720 free (15 chunks); 61160 usedExprContext: 8192 total in 1 blocks; 8160 free (0 chunks); 32 usedHashTableContext: 0 total in 0 blocks; 0 free (0 chunks); 0 usedHashBatchContext: 32888 total in 3 blocks; 14512 free (0 chunks); 18376 usedTupleSort: 822437960 total in 105 blocks; 424201336 free (5267294 chunks); 398236624 usedExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 usedExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 usedExprContext: 8192 total in 1 blocks; 8160 free (0 chunks); 32 usedExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 usedExprContext: 8192 total in 1 blocks; 8160 free (0 chunks); 32 usedAggContext: 8192 total in 1 blocks; 8000 free (3 chunks); 192 usedExprContext: 8192 total in 1 blocks; 8000 free (0 chunks); 192 usedExprContext: 8192 total in 1 blocks; 8160 free (0 chunks); 32 usedExprContext: 8192 total in 1 blocks; 8040 free (0 chunks); 152 usedRelcache by OID: 253952 total in 5 blocks; 117488 free (8 chunks); 136464 usedCacheMemoryContext: 17262632 total in 25 blocks; 3573056 free (0 chunks); 13689576 used
_________________________________________________________________
Explore the seven wonders of the world
http://search.msn.com/results.aspx?q=7+wonders+world&mkt=en-US&form=QBRE

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Peter Schuller 2007-09-28 18:06:50 Non-blocking vacuum full
Previous Message Peter Koczan 2007-09-28 17:33:35 Re: sequence query performance issues