DELETE ... WHERE ctid IN (...) vs. Iteration

From: Jonathan Gardner <jgardner(at)jonathangardner(dot)net>
To: pgsql-performance(at)postgresql(dot)org
Subject: DELETE ... WHERE ctid IN (...) vs. Iteration
Date: 2003-12-30 16:32:38
Message-ID: 200312300832.44945.jgardner@jonathangardner.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

I guess this may have come up before, but now that 7.4 has the IN with
improved performance, it may be time to revisit this topic.

Compare these two algorithms (in plpgsql):

(a)
DELETE FROM foo WHERE ctid IN (
SELECT foo.ctid
FROM ... WHERE ...
);

(b)
FOR result IN SELECT foo.ctid FROM ... WHERE ... LOOP
DELETE FROM foo WHERE ctid = result;
END LOOP;

My poor understanding of how the IN operator works leaves me to believe
that for a large set of data in the IN group, a hash is used and a
tablescan done on foo. However, for a small set of data in the IN group,
no tablescan is performed.

I assume that (a) works at O(ln(N)) for large N, and O(N) for small N,
while (b) works at O(N) universally. Therefore, (a) is the superior
algorithm. I welcome criticism and correction.

- --
Jonathan Gardner
jgardner(at)jonathangardner(dot)net
Live Free, Use Linux!
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.2 (GNU/Linux)

iD8DBQE/8aipWgwF3QvpWNwRAk8GAJoDWISjxG7LMB1FdCFmwlOafsmZTwCePx18
lyHLNBJ8nP0RHzv6WfRzQ+M=
=FPdW
-----END PGP SIGNATURE-----

Browse pgsql-performance by date

  From Date Subject
Next Message Vivek Khera 2003-12-31 17:17:50 Re: deferred foreign keys
Previous Message Nicholas Shanny 2003-12-30 15:00:30 Re: Question about difference in performance of 2 queries