Re: DELETE taking too much memory

From: Guillaume Lelarge <guillaume(at)lelarge(dot)info>
To: vincent dephily <vincent(dot)dephily(at)mobile-devices(dot)fr>
Cc: pgsql-general(at)postgresql(dot)org, pgsql-performance(at)postgresql(dot)org
Subject: Re: DELETE taking too much memory
Date: 2011-07-07 20:26:45
Message-ID: 1310070406.2046.7.camel@laptop
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-performance

On Thu, 2011-07-07 at 15:34 +0200, vincent dephily wrote:
> Hi,
>
> I have a delete query taking 7.2G of ram (and counting) but I do not
> understant why so much memory is necessary. The server has 12G, and
> I'm afraid it'll go into swap. Using postgres 8.3.14.
>
> I'm purging some old data from table t1, which should cascade-delete
> referencing rows in t2. Here's an anonymized rundown :
>
>
> # \d t1
> Table "public.t1"
> Column | Type | Modifiers
> -----------+-----------------------------+---------------------------------
> t1id | integer | not null default
> nextval('t1_t1id_seq'::regclass)
> (...snip...)
> Indexes:
> "message_pkey" PRIMARY KEY, btree (id)
> (...snip...)
>
> # \d t2
> Table "public.t2"
> Column | Type | Modifiers
> -----------------+-----------------------------+-----------------------------
> t2id | integer | not null default
> nextval('t2_t2id_seq'::regclass)
> t1id | integer | not null
> foo | integer | not null
> bar | timestamp without time zone | not null default now()
> Indexes:
> "t2_pkey" PRIMARY KEY, btree (t2id)
> "t2_bar_key" btree (bar)
> "t2_t1id_key" btree (t1id)
> Foreign-key constraints:
> "t2_t1id_fkey" FOREIGN KEY (t1id) REFERENCES t1(t1id) ON UPDATE
> RESTRICT ON DELETE CASCADE
>
> # explain delete from t1 where t1id in (select t1id from t2 where
> foo=0 and bar < '20101101');
> QUERY PLAN
> -----------------------------------------------------------------------------
> Nested Loop (cost=5088742.39..6705282.32 rows=30849 width=6)
> -> HashAggregate (cost=5088742.39..5089050.88 rows=30849 width=4)
> -> Index Scan using t2_bar_key on t2 (cost=0.00..5035501.50
> rows=21296354 width=4)
> Index Cond: (bar < '2010-11-01 00:00:00'::timestamp
> without time zone)
> Filter: (foo = 0)
> -> Index Scan using t1_pkey on t1 (cost=0.00..52.38 rows=1 width=10)
> Index Cond: (t1.t1id = t2.t1id)
> (7 rows)
>
>
> Note that the estimate of 30849 rows is way off : there should be
> around 55M rows deleted from t1, and 2-3 times as much from t2.
>
> When looking at the plan, I can easily imagine that data gets
> accumulated below the nestedloop (thus using all that memory), but why
> isn't each entry freed once one row has been deleted from t1 ? That
> entry isn't going to be found again in t1 or in t2, so why keep it
> around ?
>
> Is there a better way to write this query ? Would postgres 8.4/9.0
> handle things better ?
>

Do you have any DELETE triggers in t1 and/or t2?

--
Guillaume
http://blog.guillaume.lelarge.info
http://www.dalibo.com

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message David Johnston 2011-07-07 20:31:56 Re: Add Foreign Keys To Table
Previous Message Guillaume Lelarge 2011-07-07 20:24:07 Re: Creating temp tables inside read only transactions

Browse pgsql-performance by date

  From Date Subject
Next Message D C 2011-07-07 20:30:39 "VACUUM FULL ANALYZE" vs. Autovacuum Contention
Previous Message D C 2011-07-07 20:23:06 "VACUUM FULL ANALYZE" vs. Autovacuum Contention