Re: Bug? Query plans / EXPLAIN using gigabytes of memory

From: Willy-Bas Loos <willybas(at)gmail(dot)com>
To: Toby Corkindale <toby(dot)corkindale(at)strategicdata(dot)com(dot)au>
Cc: pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: Bug? Query plans / EXPLAIN using gigabytes of memory
Date: 2012-04-25 09:16:50
Message-ID: CAHnozTgYhYWrY8PZ3N2pMpaqgpJT3nbDxxHrQAz_YrCnfavQmw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

would it be possible to reproduce the same query without using any views?
you could see the difference in memory usage.

if that doesn't explain, try also without inheritance, by using the ONLY
keyword (and UNION ALL).

If it's really only a couple of rows, you might as well post a dump
somewhere? Then i could reproduce.

WBL

On Wed, Apr 25, 2012 at 10:51 AM, Toby Corkindale <
toby(dot)corkindale(at)strategicdata(dot)com(dot)au> wrote:

> Hi Willy-Bas,
> Thanks for your reply.
>
> I realise that stacking the views up like this complicates matters, but
> the actual views are fairly simple queries, and each one individually is
> only looking at a few dozen rows. (Eg. selecting min, max or average value
> from a small set, grouped by one column)
> From the point of view of creating reporting queries, it's a nice and
> logical way to build up a query, and we didn't think it would present any
> problems.. and even on a well-populated database, the query runs very fast.
> It's just the astounding amount of memory used that presents difficulties.
>
> Looking at the postgresql.conf for non-default settings, the notable ones
> are:
>
> max_connections = 200
> ssl = false
> shared_buffers = 256MB
> max_prepared_transactions = 16
> # although they aren't used for the group of queries in question
> maintenance_work_mem = 128MB
> # work_mem is left at default of 1MB
> effective_io_concurrency = 2
> random_page_cost = 3.0
> effective_cache_size = 512MB
> geqo = on
> geqo_threshold = 12
> geqo_effort = 7
>
> Some other things are non-default, like checkpoints, streaming-replication
> stuff, but those shouldn't have any effect.
>
> The memory settings (shared buffers, effective cache) might seem to be set
> quite conservatively at the moment, given the memory available in the
> machine -- but since we can exhaust that memory with just a few
> connections, it seems fair.
>
> Cheers,
> Toby
>
> ----- Original Message -----
> From: "Willy-Bas Loos" <willybas(at)gmail(dot)com>
> To: "Toby Corkindale" <toby(dot)corkindale(at)strategicdata(dot)com(dot)au>
> Cc: "pgsql-general" <pgsql-general(at)postgresql(dot)org>
> Sent: Wednesday, 25 April, 2012 6:05:37 PM
> Subject: Re: [GENERAL] Bug? Query plans / EXPLAIN using gigabytes of memory
>
>
> Stacking views is a bad practice. It usually means that you are making the
> db do a lot of unnecessary work, scanning tables more than once when you
> don't even need them.
> According to your description, you have 3 layers of views on partitioned
> tables.
> I can imagine that that leaves the planner with a lot of possible query
> plans, a lot of interaction and a lot of statistics to read.
>
> do you have any special settings for the statistics on these tables?
> and could you please post the non-default settings in your postgresql.conf
> file?
> $ grep ^[^#] /etc/ postgresql /9.1/main/ postgresql . conf | grep -e
> ^[^[:space:]]
>
> Would be helpful to see if you have any statistics or planner stuff
> altered.
>
> Cheers,
>
> WBL
>
>
>
> On Wed, Apr 25, 2012 at 8:18 AM, Toby Corkindale <
> toby(dot)corkindale(at)strategicdata(dot)com(dot)au > wrote:
>
>
> Hi,
> I'm hitting some peculiar behaviour. I'm currently on Pg 9.1.3 on a 64bit
> Debian system.
>
> I have a database which is moderately large - 20 GByte or so - and
> contains that data split up over dozens of tables, which are themselves
> partitioned.
> Queries are usually only run against fairly small, partitioned, sets of
> data.
>
> These queries generally run fairly fast. Performance is not a problem.
>
> However Postgres is chewing up huge amounts of memory just to create the
> query plan!
>
> For example, even if I just run
> EXPLAIN SELECT a_column FROM a_view
> WHERE partition_id = 1;
>
> Then the postgres backend process takes several seconds to return, and in
> the worst example here, is hogging more than 3Gbyte once it comes back. (It
> doesn't free that up until you close the connection)
>
> The query plan that comes back does seem quite convoluted, but then, the
> view is a query run over about eight other views, each of which is pulling
> data from a few other views. The actual underlying data being touched is
> only *a few dozen* small rows.
>
> As I said, the query runs fast enough.. however we only need a handful of
> these queries to get run in separate connections, and the database server
> will be exhausted of memory. Especially since the memory isn't returned
> until the end of the connection, yet these connections typically stay up
> for a while.
>
> I wondered if there's anything I can do to reduce this memory usage? And,
> is this a bug?
>
> I've posted the output of the query plan here:
> https://gist.github.com/2487097
>
>
> Thanks in advance,
> Toby
>
> --
> Sent via pgsql-general mailing list ( pgsql-general(at)postgresql(dot)org )
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>
>
>
> --
> "Quality comes from focus and clarity of purpose" -- Mark Shuttleworth
>
>

--
"Quality comes from focus and clarity of purpose" -- Mark Shuttleworth

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Jasen Betts 2012-04-25 09:25:32 Re: Fractions of seconds in timestamps
Previous Message Abbas 2012-04-25 09:01:55 Fwd: FW: Really heart touching.........