Re: Performance of aggregates over set-returning functions

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "John Smith" <sodgodofall(at)gmail(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Performance of aggregates over set-returning functions
Date: 2008-01-09 03:07:23
Message-ID: 25214.1199848043@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

"John Smith" <sodgodofall(at)gmail(dot)com> writes:
> Consider the following query:
> postgres=# select count(*) from generate_series(1,100000000000000000);
> A vmstat while this query was running seems to suggest that the
> generate_series() was being materialized to disk first and then the
> count computed

Yeah, that's what I'd expect given the way nodeFunctionscan.c works.

> A small variation in the query (i.e. placing the generate_series in
> the select list without a from clause) exhibited a different
> behaviour:
> postgres=# select count(*) from (select
> generate_series(1,100000000000000000)) as A;
> This time, Postgres seemed to be trying to build the entire
> generate_series() result set in memory and eventually started to swap
> out until the swap space limit after which it crashed.

Hmm, at worst you should get an "out of memory" error --- that's what
I get. I think you're suffering from the dreaded "OOM kill" disease.

> Interestingly though, when the range in the generate_series() was
> small enough to fit in 4 bytes of memory (e.g.
> generate_series(1,1000000000) ), the above query completed consuming
> only negligible amount of memory. So, it looked like the aggregate
> computation was being pipelined with the tuples returned from
> generate_series().

It's pipelined either way. But int8 is a pass-by-reference data type,
and it sounds like we have a memory leak for this case.

> Third question: more generally, is there a way I can ask Postgres to
> report an Out-of-memory the moment it tries to consume greater than a
> certain percentage of memory (instead of letting it hit the swap and
> eventually die or thrash) ?

First, you want to turn off memory overcommit in your kernel;
second, you might find that running the postmaster under conservative
ulimit settings would make the behavior nicer.

regards, tom lane

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message John Smith 2008-01-09 03:33:33 Re: Performance of aggregates over set-returning functions
Previous Message John Smith 2008-01-09 02:21:46 Performance of aggregates over set-returning functions