Re: BUG #7571: Query high memory usage

From: Radovan Jablonovsky <radovan(dot)jablonovsky(at)replicon(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Melese Tesfaye <mtesfaye(at)gmail(dot)com>, pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #7571: Query high memory usage
Date: 2012-09-27 14:59:07
Message-ID: CAJYcdTtYu4K+N87LK3JjkiWNQZHt3OEamMapRz7sLJBD+SJhCA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Hi Pavel,

Here are the test data with set enable_hashagg to off. It does not looks
like improvement. Query was running for 30min without returning result set.

db=> set enable_hashagg=off;
SET
db=> explain
db-> SELECT
db-> schema_name,
db-> sum(table_size)
db-> FROM
db-> (SELECT
db(> pg_catalog.pg_namespace.nspname as schema_name,
db(> pg_relation_size(pg_catalog.pg_class.oid) as table_size,
db(> sum(pg_relation_size(pg_catalog.pg_class.oid)) over () as
database_size
db(> FROM pg_catalog.pg_class
db(> JOIN pg_catalog.pg_namespace
db(> ON relnamespace = pg_catalog.pg_namespace.oid
db(> ) t
db-> GROUP BY schema_name, database_size;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------
GroupAggregate (cost=725540.59..756658.18 rows=40000 width=104)
-> Sort (cost=725540.59..733219.99 rows=3071759 width=104)
Sort Key: pg_namespace.nspname,
(sum(pg_relation_size((pg_class.oid)::regclass, 'main'::text)) OVER (?))
-> WindowAgg (cost=120.98..243838.73 rows=3071759 width=68)
-> Hash Join (cost=120.98..190082.95 rows=3071759 width=68)
Hash Cond: (pg_class.relnamespace = pg_namespace.oid)
-> Seq Scan on pg_class (cost=0.00..143885.59
rows=3071759 width=8)
-> Hash (cost=90.99..90.99 rows=2399 width=68)
-> Seq Scan on pg_namespace (cost=0.00..90.99
rows=2399 width=68)
(9 rows)

Data from top after 30 min of query run with hashagg set off:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
2235 postgres 25 0 27.5g 23g 4.6g R 95.1 75.2 31:39.81
postgres: aspuser aspdata 10.0.2.67(52716) SELECT

Radovan

On Wed, Sep 26, 2012 at 10:15 PM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>wrote:

> Hello
>
> you should to run this query on real data - and if it works now, then
> send EXPLAIN ANALYZE result, please
>
> Pavel
>
> 2012/9/27 Melese Tesfaye <mtesfaye(at)gmail(dot)com>:
> > Thanks Pavel,
> > Setting enable_hashagg to off didn't resolve the issue.
> > Please find the explain as well as query results after "set
> > enable_hashagg=off;"
> >
> > mtesfaye(at)[local](test_db)=# EXPLAIN SELECT DISTINCT(A.*)
> > test_db-# FROM table1_t A LEFT JOIN table2_v B
> > test_db-# ON A.pnr_id=B.pnr_id
> > test_db-# WHERE A.pnr_id IN(1801,2056) AND
> > B.departure_date_time>=DATE('2012-09-26')
> > test_db-# ORDER BY pnr_id ASC,nam_id ASC;
> >
> +-----------------------------------------------------------------------------------------------------------+
> > | QUERY PLAN
> > |
> >
> +-----------------------------------------------------------------------------------------------------------+
> > | Unique (cost=1354.62..1354.66 rows=4 width=13)
> > |
> > | -> Sort (cost=1354.62..1354.63 rows=4 width=13)
> > |
> > | Sort Key: a.pnr_id, a.nam_id, a.pty_num
> > |
> > | -> Merge Join (cost=1084.06..1354.58 rows=4 width=13)
> > |
> > | Merge Cond: (table2_t.pnr_id = a.pnr_id)
> > |
> > | -> Unique (cost=1084.06..1198.67 rows=11461 width=16)
> > |
> > | -> Sort (cost=1084.06..1112.72 rows=11461
> width=16)
> > |
> > | Sort Key: table2_t.pnr_id, table2_t.itn_id,
> > table2_t.departure_date_time |
> > | -> Seq Scan on table2_t (cost=0.00..311.34
> > rows=11461 width=16) |
> > | Filter: (departure_date_time >=
> > '2012-09-26'::date) |
> > | -> Index Scan using table1_t_pnr_id_idx1 on table1_t a
> > (cost=0.00..12.60 rows=4 width=13) |
> > | Index Cond: (pnr_id = ANY
> ('{1801,2056}'::integer[]))
> > |
> >
> +-----------------------------------------------------------------------------------------------------------+
> > (12 rows)
> >
> > Time: 5.889 ms
> >
> > mtesfaye(at)[local](test_db)=# show enable_hashagg;
> > +----------------+
> > | enable_hashagg |
> > +----------------+
> > | on |
> > +----------------+
> > (1 row)
> >
> > Time: 0.136 ms
> >
> > mtesfaye(at)[local](test_db)=# set enable_hashagg=off;
> > SET
> > Time: 0.203 ms
> > mtesfaye(at)[local](test_db)=# show enable_hashagg;
> > +----------------+
> > | enable_hashagg |
> > +----------------+
> > | off |
> > +----------------+
> > (1 row)
> >
> > Time: 0.131 ms
> >
> >
> > mtesfaye(at)[local](test_db)=# SELECT DISTINCT(A.*)
> > test_db-# FROM table1_t A LEFT JOIN table2_v B
> > test_db-# ON A.pnr_id=B.pnr_id
> > test_db-# WHERE A.pnr_id IN(1801,2056) AND
> > B.departure_date_time>=DATE('2012-09-26')
> > test_db-# ORDER BY pnr_id ASC,nam_id ASC;
> > +--------+--------+---------+
> > | pnr_id | nam_id | pty_num |
> > +--------+--------+---------+
> > | 1801 | 3359 | 1 |
> > | 1801 | 3360 | 1 |
> > | 1801 | 3361 | 1 |
> > | 1801 | 3362 | 1 |
> > +--------+--------+---------+
> > (4 rows)
> >
> > Time: 8.452 ms
> >
> >
> > On Thu, Sep 27, 2012 at 3:54 AM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
> > wrote:
> >>
> >> Hello
> >>
> >> this situation is possible, when optimizer use HashAgg where should not
> >> use it.
> >>
> >> Please, try to disable HashAgg - set enable_hashagg to off;
> >>
> >> please, send EXPLAIN result
> >>
> >> Regards
> >>
> >> Pavel Stehule
> >>
> >> 2012/9/26 <radovan(dot)jablonovsky(at)replicon(dot)com>:
> >> > The following bug has been logged on the website:
> >> >
> >> > Bug reference: 7571
> >> > Logged by: Radovan Jablonovsky
> >> > Email address: radovan(dot)jablonovsky(at)replicon(dot)com
> >> > PostgreSQL version: 9.1.5
> >> > Operating system: CentOs 5.8 Linux 2.6.18-308.el5 x86_64
> >> > Description:
> >> >
> >> > During checking our company database size we used query, which was not
> >> > the
> >> > best to find out the tables/db size but should do the job. The query
> was
> >> > tested on server with 32GB of RAM, 2 CPU with 4 cores and it was
> running
> >> > alone without other activity. It consumed almost all RAM forced server
> >> > to
> >> > use swap and after 1hour it was still running. The simplified version
> of
> >> > query used 20% of memory and finished after 1hour 8min.
> >> >
> >> > The size of pg_class is 3mil rows/objects and pg_namespace has 3000
> >> > rows/schemata.
> >> >
> >> > query:
> >> > SELECT
> >> > schema_name,
> >> > sum(table_size)
> >> > FROM
> >> > (SELECT
> >> > pg_catalog.pg_namespace.nspname as schema_name,
> >> > pg_relation_size(pg_catalog.pg_class.oid) as table_size,
> >> > sum(pg_relation_size(pg_catalog.pg_class.oid)) over () as
> >> > database_size
> >> > FROM pg_catalog.pg_class
> >> > JOIN pg_catalog.pg_namespace
> >> > ON relnamespace = pg_catalog.pg_namespace.oid
> >> > ) t
> >> > GROUP BY schema_name, database_size;
> >> >
> >> >
> >> > top - 10:50:44 up 20 days, 19:00, 1 user, load average: 1.15, 1.10,
> >> > 0.84
> >> > Tasks: 239 total, 3 running, 236 sleeping, 0 stopped, 0 zombie
> >> > Cpu(s): 15.1%us, 1.5%sy, 0.0%ni, 83.0%id, 0.5%wa, 0.0%hi, 0.0%si,
> >> > 0.0%st
> >> > Mem: 32946260k total, 32599908k used, 346352k free, 141924k
> buffers
> >> > Swap: 55043952k total, 85216k used, 54958736k free, 14036516k
> cached
> >> >
> >> > Info from top:
> >> > PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
> >> > 2016 postgres 25 0 22.8g 17g 3.2g R 96.1 56.0 19:17.01 postgres:
> >> > postgres db 10.0.1.10(49928) SELECT
> >> >
> >> > Simplified version of query uses pg_tables. It has 0.5mil rows/tables.
> >> > Simplified version of query:
> >> > SELECT
> >> > schemaname,
> >> > sum(pg_relation_size(schemaname || '.' || tablename))::bigint
> >> > FROM pg_tables
> >> > GROUP BY schemaname;
> >> >
> >> >
> >> >
> >> >
> >> >
> >> >
> >> > --
> >> > Sent via pgsql-bugs mailing list (pgsql-bugs(at)postgresql(dot)org)
> >> > To make changes to your subscription:
> >> > http://www.postgresql.org/mailpref/pgsql-bugs
> >>
> >>
> >> --
> >> Sent via pgsql-bugs mailing list (pgsql-bugs(at)postgresql(dot)org)
> >> To make changes to your subscription:
> >> http://www.postgresql.org/mailpref/pgsql-bugs
> >
> >
>

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Pavel Stehule 2012-09-27 16:32:27 Re: BUG #7571: Query high memory usage
Previous Message Melese Tesfaye 2012-09-27 11:16:56 Re: BUG #7570: WHERE .. IN bug from 9.2.0 not fixed in 9.2.1