Re: how to efficiently update tuple in many-to-many relationship?

From: Drew Wilson <drewmwilson(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: how to efficiently update tuple in many-to-many relationship?
Date: 2007-04-10 01:29:41
Message-ID: 22512E5C-A60F-43E7-A370-257F7AF956FD@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

My apologies. That function call was some test code to verify that my
subselect was only being called once.

Let me try again, please.

Here's the query plan for a SELECT statement that returns 1,207,161
rows in 6 seconds.
MatchBox=# explain select count(translation_pair_id) from
translation_pair_data
join instance i using(translation_pair_id)
join loc_submission ls using(loc_submission_id)
where ls.is_public = true;
QUERY PLAN
------------------------------------------------------------------------
-----------------------------------
Aggregate (cost=299276.72..299276.73 rows=1 width=4)
-> Hash Join (cost=59962.72..294036.83 rows=2095954 width=4)
Hash Cond: (i.translation_pair_id =
translation_pair_data.translation_pair_id)
-> Hash Join (cost=369.15..177405.01 rows=2095954 width=4)
Hash Cond: (i.loc_submission_id = ls.loc_submission_id)
-> Seq Scan on instance i (cost=0.00..99016.16
rows=5706016 width=8)
-> Hash (cost=296.92..296.92 rows=5778 width=4)
-> Index Scan using loc_submission_is_public
on loc_submission ls (cost=0.00..296.92 rows=5778 width=4)
Index Cond: (is_public = true)
Filter: is_public
-> Hash (cost=31861.92..31861.92 rows=1690292 width=4)
-> Seq Scan on translation_pair_data
(cost=0.00..31861.92 rows=1690292 width=4)

And here's the query plan for the UPDATE query that seems to never
complete. (Execution time > 30 minutes.)
MatchBox=# explain update translation_pair_data set is_public = true
where translation_pair_id in
(select translation_pair_id from translation_pair_data
join instance i using(translation_pair_id)
join loc_submission ls using(loc_submission_id)
where ls.is_public = true);
QUERY PLAN
------------------------------------------------------------------------
-------------------------------------------------------------
Hash IN Join (cost=328000.49..453415.65 rows=1690282 width=90)
Hash Cond: (public.translation_pair_data.translation_pair_id =
public.translation_pair_data.translation_pair_id)
-> Seq Scan on translation_pair_data (cost=0.00..31861.82
rows=1690282 width=90)
-> Hash (cost=293067.74..293067.74 rows=2067660 width=8)
-> Hash Join (cost=59958.35..293067.74 rows=2067660 width=8)
Hash Cond: (i.translation_pair_id =
public.translation_pair_data.translation_pair_id)
-> Hash Join (cost=365.00..177117.92 rows=2067660
width=4)
Hash Cond: (i.loc_submission_id =
ls.loc_submission_id)
-> Seq Scan on instance i
(cost=0.00..99016.16 rows=5706016 width=8)
-> Hash (cost=293.75..293.75 rows=5700 width=4)
-> Index Scan using
loc_submission_is_public on loc_submission ls (cost=0.00..293.75
rows=5700 width=4)
Index Cond: (is_public = true)
Filter: is_public
-> Hash (cost=31861.82..31861.82 rows=1690282 width=4)
-> Seq Scan on translation_pair_data
(cost=0.00..31861.82 rows=1690282 width=4)

I figure I must be doing something wrong here. Thanks for the help,

Drew

On Apr 9, 2007, at 2:43 PM, Tom Lane wrote:

> Drew Wilson <drewmwilson(at)gmail(dot)com> writes:
>> I have 2 tables (A,B) joined in a many-to-many relationship via a
>> join table ("membership"), where updating table A based on table B
>> takes a very long time.
>> ...
>> -> Function Scan on a (cost=0.00..12.50 rows=1000 width=4)
>
> I think you've left out some relevant details ... there's nothing
> in what you said about a set-returning function ...
>
> regards, tom lane

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2007-04-10 02:13:05 Re: how to efficiently update tuple in many-to-many relationship?
Previous Message s d 2007-04-10 00:45:53 Re: Beginner Question