Re: good experience with performance in 8.2 for multi column indexes

From: Thomas Markus <t(dot)markus(at)proventis(dot)net>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: good experience with performance in 8.2 for multi column indexes
Date: 2008-04-18 12:33:50
Message-ID: 4808952E.6080406@proventis.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hi,

your query cant perform well on 8.1
better use a query like

delete from pluext1 using pluext2 where pluext1.plunmbr =
pluext2.plunmbr and pluext1.pluexttype = pluext2.pluexttype

it should perform much faster. be sure to use indizes

regards
thomas

Michael Enke schrieb:
> Hi lists,
> I want to let you take part in my experience of performance boost for
> delete operations
> where more than one column is part of a primary key.
>
> For my setup, in 8.1 a delete query which deletes 200000 entries
> depending on rows in another table
> runs about 7h, in 8.2 (and later) it runs 9s!
>
> I have two tables looking exactly the same, with two columns in the
> pk, one varchar(20) and one char(1).
> Both tables contain the same contents.
>
> Explain produces the following difference:
> 8.1:
>
> tplinux=> explain delete from pluext1 where (plunmbr,pluexttype) in
> (select plunmbr,pluexttype from pluext2);
> QUERY PLAN
> ----------------------------------------------------------------------------------------
>
> Hash Join (cost=24267.10..155886.35 rows=48236 width=6)
> Hash Cond: ("outer".pluexttype = "inner".pluexttype)
> Join Filter: ("outer".plunmbr = "inner".plunmbr)
> -> Seq Scan on pluext1 (cost=0.00..6945.00 rows=138900 width=46)
> -> Hash (cost=24116.37..24116.37 rows=13891 width=40)
> -> Unique (cost=23074.62..24116.37 rows=13891 width=40)
> -> Sort (cost=23074.62..23421.87 rows=138900 width=40)
> Sort Key: pluext2.plunmbr, pluext2.pluexttype
> -> Seq Scan on pluext2 (cost=0.00..6945.00
> rows=138900 width=40)
> (9 rows)
>
> (this delete took 7h)
>
> 8.2:
>
> tplinux=> explain delete from pluext1 where (plunmbr,pluexttype) in
> (select plunmbr,pluexttype from pluext2);
> QUERY PLAN
> ---------------------------------------------------------------------------------------------------------
>
> Nested Loop IN Join (cost=0.00..13362.14 rows=41106 width=6)
> -> Seq Scan on pluext1 (cost=0.00..6411.25 rows=128225 width=46)
> -> Index Scan using pluext2_pk on pluext2 (cost=0.00..0.50 rows=3
> width=40)
> Index Cond: ((pluext1.plunmbr = pluext2.plunmbr) AND
> (pluext2.pluexttype = pluext1.pluexttype))
> (4 rows)
>
> (this delete took 9s)
>
> I could not find an explanation for this in the release notes for 8.2,
> I thought it was much earlier that multi column indexes could be used.
> Anyway, it saved my life that new version is fast. Many thanks to the
> developer!
>
> BTW I do not understand the output of the 8.2 explain:
> From my understanding it should do a seq scan on pluext2 (and not
> pluext1)
> and an index scan on pluext1 (and not pluext2).
>
> Regards,
> Michael
>

Attachment Content-Type Size
t_markus.vcf text/x-vcard 255 bytes

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Terry Lee Tucker 2008-04-18 12:40:54 Re: How to retore a pg_dumpall dump?
Previous Message Karsten Hilbert 2008-04-18 12:12:44 Re: Which Python library - psycopg2 or pygresql?