Re: Bug? Query plans / EXPLAIN using gigabytes of memory

From: Toby Corkindale <toby(dot)corkindale(at)strategicdata(dot)com(dot)au>
To: Willy-Bas Loos <willybas(at)gmail(dot)com>
Cc: pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: Bug? Query plans / EXPLAIN using gigabytes of memory
Date: 2012-04-25 09:42:39
Message-ID: 54e69913-576d-4fef-af95-232b3e3e5cee@dmz03.strategicdata.internal
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

----- Original Message -----
> From: "Willy-Bas Loos" <willybas(at)gmail(dot)com>
> To: "Toby Corkindale" <toby(dot)corkindale(at)strategicdata(dot)com(dot)au>
> Cc: "pgsql-general" <pgsql-general(at)postgresql(dot)org>
> Sent: Wednesday, 25 April, 2012 7:16:50 PM
> Subject: Re: [GENERAL] Bug? Query plans / EXPLAIN using gigabytes of memory
>
> would it be possible to reproduce the same query without using any
> views?
> you could see the difference in memory usage.

It is possible to do it in stages instead, and if done that way, memory usage (and explain plans) are quite normal.
However to change everything would be very time consuming to re-code so I would rather avoid that.

> if that doesn't explain, try also without inheritance, by using the
> ONLY keyword (and UNION ALL).

Have tried something similar to that (accessing partition slices directly). It didn't change anything.

> If it's really only a couple of rows, you might as well post a dump
> somewhere? Then i could reproduce.

Well, the data touched by the query is only a handful of rows, but it's a handful of rows in 20GB of *other* rows.
Although the query plan correctly estimates it's only going to touch on a few.
I'm pretty sure we didn't see this crazy memory usage in earlier testing, before the database was starting to get populated.

Toby

>
> WBL
>
>
>
> On Wed, Apr 25, 2012 at 10:51 AM, Toby Corkindale <
> toby(dot)corkindale(at)strategicdata(dot)com(dot)au > wrote:
>
>
> Hi Willy-Bas,
> Thanks for your reply.
>
> I realise that stacking the views up like this complicates matters,
> but the actual views are fairly simple queries, and each one
> individually is only looking at a few dozen rows. (Eg. selecting
> min, max or average value from a small set, grouped by one column)
> From the point of view of creating reporting queries, it's a nice and
> logical way to build up a query, and we didn't think it would
> present any problems.. and even on a well-populated database, the
> query runs very fast. It's just the astounding amount of memory used
> that presents difficulties.
>
> Looking at the postgresql.conf for non-default settings, the notable
> ones are:
>
> max_connections = 200
> ssl = false
> shared_buffers = 256MB
> max_prepared_transactions = 16
> # although they aren't used for the group of queries in question
> maintenance_work_mem = 128MB
> # work_mem is left at default of 1MB
> effective_io_concurrency = 2
> random_page_cost = 3.0
> effective_cache_size = 512MB
> geqo = on
> geqo_threshold = 12
> geqo_effort = 7
>
> Some other things are non-default, like checkpoints,
> streaming-replication stuff, but those shouldn't have any effect.
>
> The memory settings (shared buffers, effective cache) might seem to
> be set quite conservatively at the moment, given the memory
> available in the machine -- but since we can exhaust that memory
> with just a few connections, it seems fair.
>
> Cheers,
> Toby
>
>
> ----- Original Message -----
> From: "Willy-Bas Loos" < willybas(at)gmail(dot)com >
> To: "Toby Corkindale" < toby(dot)corkindale(at)strategicdata(dot)com(dot)au >
> Cc: "pgsql-general" < pgsql-general(at)postgresql(dot)org >
> Sent: Wednesday, 25 April, 2012 6:05:37 PM
> Subject: Re: [GENERAL] Bug? Query plans / EXPLAIN using gigabytes of
> memory
>
>
> Stacking views is a bad practice. It usually means that you are
> making the db do a lot of unnecessary work, scanning tables more
> than once when you don't even need them.
> According to your description, you have 3 layers of views on
> partitioned tables.
> I can imagine that that leaves the planner with a lot of possible
> query plans, a lot of interaction and a lot of statistics to read.
>
> do you have any special settings for the statistics on these tables?
> and could you please post the non-default settings in your
> postgresql.conf file?
> $ grep ^[^#] /etc/ postgresql /9.1/main/ postgresql . conf | grep -e
> ^[^[:space:]]
>
>
>
> Would be helpful to see if you have any statistics or planner stuff
> altered.
>
> Cheers,
>
> WBL
>
>
>
> On Wed, Apr 25, 2012 at 8:18 AM, Toby Corkindale <
> toby(dot)corkindale(at)strategicdata(dot)com(dot)au > wrote:
>
>
> Hi,
> I'm hitting some peculiar behaviour. I'm currently on Pg 9.1.3 on a
> 64bit Debian system.
>
> I have a database which is moderately large - 20 GByte or so - and
> contains that data split up over dozens of tables, which are
> themselves partitioned.
> Queries are usually only run against fairly small, partitioned, sets
> of data.
>
> These queries generally run fairly fast. Performance is not a
> problem.
>
> However Postgres is chewing up huge amounts of memory just to create
> the query plan!
>
> For example, even if I just run
> EXPLAIN SELECT a_column FROM a_view
> WHERE partition_id = 1;
>
> Then the postgres backend process takes several seconds to return,
> and in the worst example here, is hogging more than 3Gbyte once it
> comes back. (It doesn't free that up until you close the connection)
>
> The query plan that comes back does seem quite convoluted, but then,
> the view is a query run over about eight other views, each of which
> is pulling data from a few other views. The actual underlying data
> being touched is only *a few dozen* small rows.
>
> As I said, the query runs fast enough.. however we only need a
> handful of these queries to get run in separate connections, and the
> database server will be exhausted of memory. Especially since the
> memory isn't returned until the end of the connection, yet these
> connections typically stay up for a while.
>
> I wondered if there's anything I can do to reduce this memory usage?
> And, is this a bug?
>
> I've posted the output of the query plan here:
> https://gist.github.com/2487097
>
>
> Thanks in advance,
> Toby
>
> --
> Sent via pgsql-general mailing list ( pgsql-general(at)postgresql(dot)org )
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>
>
>
> --
> "Quality comes from focus and clarity of purpose" -- Mark
> Shuttleworth
>
>
>
>
> --
> "Quality comes from focus and clarity of purpose" -- Mark
> Shuttleworth
>
>

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Vincenzo Romano 2012-04-25 09:43:50 Re: Fractions of seconds in timestamps
Previous Message Jasen Betts 2012-04-25 09:25:32 Re: Fractions of seconds in timestamps