Skip site navigation (1) Skip section navigation (2)

Best way to delete unreferenced rows?

From: "Tyrrill, Ed" <tyrrill_ed(at)emc(dot)com>
To: <pgsql-performance(at)postgresql(dot)org>
Subject: Best way to delete unreferenced rows?
Date: 2007-06-07 20:02:55
Message-ID: A23190A408F7094FAF446C1538222F7604092F3B@avaexch01.avamar.com (view raw or flat)
Thread:
Lists: pgsql-performance
Hey All,

I have a table, let's call it A, whose primary key, a_id, is referenced
in a second table, let's call it B.  For each unique A.a_id there are
generally many rows in B with the same a_id.  My problem is that I want
to delete a row in A when the last row in B that references it is
deleted.  Right now I just query for rows in A that aren't referenced by
B, and that worked great when the tables were small, but it takes over
an hour now that the tables have grown larger (over 200 million rows in
B and 14 million in A).  The delete has to do a sequential scan of both
tables since I'm looking for what's not in the indexes.

I was going to try creating a trigger after delete on B for each row to
check for more rows in B with the same a_id, and delete the row in A if
none found.  In general I will be deleting 10's of millions of rows from
B and 100's of thousands of rows from A on a daily basis.  What do you
think?  Does anyone have any other suggestions on different ways to
approach this?

Thanks,
Ed

Responses

pgsql-performance by date

Next:From: Dan HarrisDate: 2007-06-07 21:26:56
Subject: Re: reclaiming disk space after major updates
Previous:From: Greg SmithDate: 2007-06-07 19:42:41
Subject: Re: VERY slow queries at random

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group