Re: [PERFORM] DELETE taking too much memory

From: Jose Ildefonso Camargo Tolosa <ildefonso(dot)camargo(at)gmail(dot)com>
To: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
Cc: Guillaume Lelarge <guillaume(at)lelarge(dot)info>, vincent dephily <vincent(dot)dephily(at)mobile-devices(dot)fr>, pgsql-general(at)postgresql(dot)org, pgsql-performance(at)postgresql(dot)org
Subject: Re: [PERFORM] DELETE taking too much memory
Date: 2011-07-08 13:37:45
Message-ID: CAETJ_S_e9qbMpK+vQtsDfS7NRJ_jap8-xTcm51Sxs+Hs-9ZYQQ@mail.gmail.com
Lists: pgsql-general pgsql-performance

On Fri, Jul 8, 2011 at 4:35 AM, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> wrote:

> > On Thu, 2011-07-07 at 15:34 +0200, vincent dephily wrote:
> >> Hi,
> >>
> >> I have a delete query taking 7.2G of ram (and counting) but I do not
> >> understand why so much memory is necessary. The server has 12G, and
> >> I'm afraid it'll go into swap. Using postgres 8.3.14.
> >>
> >> I'm purging some old data from table t1, which should cascade-delete
> >> referencing rows in t2. Here's an anonymized rundown :
> >>
> >> # explain delete from t1 where t1id in (select t1id from t2 where
> >> foo=0 and bar < '20101101');
>
> It looks as though you're hitting one of the known issues with
> PostgreSQL and FKs. The FK constraint checks and CASCADE actions are
> implemented using AFTER triggers, which are queued up during the query
> to be executed at the end. For very large queries, this queue of
> pending triggers can become very large, using up all available memory.
>
> There's a TODO item to try to fix this for a future version of
> PostgreSQL (maybe I'll have another go at it for 9.2), but at the
> moment all versions of PostgreSQL suffer from this problem.
>
> The simplest work-around for you might be to break your deletes up
> into smaller chunks, say 100k or 1M rows at a time, eg:
>
> delete from t1 where t1id in (select t1id from t2 where foo=0 and bar
> < '20101101' limit 100000);
>

I'd like to comment here. I had serious performance issues with a similar
query (the planner did horrible things), and I'm not sure whether the planner
will do the same thing in this case. My query was against the same table
(i.e. t1 = t2). I had this query:

delete from t1 where ctid in (select ctid from t1
    where created_at < '20101231' limit 10000);        <--- this was very slow.

Changed to:

delete from t1 where ctid = any(array(select ctid from t1
    where created_at < '20101231' limit 10000));       <--- a lot faster.
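
To purge the whole date range with that batched form, the delete just has to be
repeated until a pass removes no rows. A rough plpgsql sketch of such a loop
(illustrative only -- the function name is made up, and on 8.3 it has to be a
regular function since there are no DO blocks):

create or replace function purge_t1_batches() returns bigint as $$
declare
    batch integer;
    total bigint := 0;
begin
    loop
        -- same batched delete as above, 10k rows per pass
        delete from t1 where ctid = any(array(
            select ctid from t1 where created_at < '20101231' limit 10000));
        get diagnostics batch = row_count;
        total := total + batch;
        exit when batch = 0;   -- stop once a pass deletes nothing
    end loop;
    return total;              -- total number of rows removed
end;
$$ language plpgsql;

Note that the whole function call runs in a single transaction; running each
batch from a shell or cron loop instead keeps transactions (and locks) shorter.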

So... would the same principle work here? That is:

delete from t1 where t1id = any(array(select t1id from t2
    where foo = 0 and bar < '20101101' limit 100000));

Would this query be faster than the original one?
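
A cheap way to check before touching any data might be to compare the two plans
(plain explain rather than explain analyze, since explain analyze would actually
execute the delete):

explain delete from t1
 where t1id in (select t1id from t2 where foo = 0 and bar < '20101101' limit 100000);

explain delete from t1
 where t1id = any(array(select t1id from t2 where foo = 0 and bar < '20101101' limit 100000));

If the in() form comes out as a subplan that gets rescanned for every row of t1,
the any(array(...)) form should help here too; if the planner hashes the subquery,
the difference will probably be much smaller.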

>
> Regards,
> Dean
>
