Re: MERGE: performance advices

From: Steve Clark <sclark(at)netwolves(dot)com>
To: Richard Broersma <richard(dot)broersma(at)gmail(dot)com>
Cc: Ivan Sergio Borgonovo <mail(at)webthatworks(dot)it>, PostgreSQL <pgsql-general(at)postgresql(dot)org>
Subject: Re: MERGE: performance advices
Date: 2008-09-02 16:47:33
Message-ID: 48BD6E25.4050507@netwolves.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Richard Broersma wrote:
> On Tue, Sep 2, 2008 at 8:10 AM, Steve Clark <sclark(at)netwolves(dot)com> wrote:
>
>
>>Is there a way to do something similar with the following? I am an SQL noob
>>and the
>>following takes longer to run than is reasonable, on the order of hours.
>>
>>insert into myevents select * from t_unit_event_log a where exists
>> (select b.event_log_no from myevents b
>> where a.event_status = 1 and a.event_ref_log_no IS NOT NULL
>> and a.event_ref_log_no = b.event_log_no and a.event_log_no not
>>in
>> (select event_log_no from myevents)
>> )
>
>
>
> To start off with, this SQL statement can be refined a bit. Many of
> the sub-query WHERE clause constraints have nothing to do with the
> Correlated sub-query. The refinement would look like so:
>
> INSERT INTO Myevents
> SELECT *
> FROM T_unit_event_log AS A
> WHERE A.event_status = 1
> AND A.event_ref_log_no IS NOT NULL
> AND A.event_log_no NOT IN ( SELECT event_log_no
> FROM Myevents)
> AND EXISTS ( SELECT B.event_log_no
> FROM Myevents AS B
> WHERE A.event_ref_log_no = B.event_log_no );
>
>
> The next step would be to rework the NOT IN sub-query into a LEFT JOIN
> WHERE IS NULL;
>
> INSERT INTO Myevents
> SELECT *
> FROM T_unit_event_log AS A
> LEFT JOIN Myevents AS C
> ON A.event_log_no = C.event_log_no
> WHERE A.event_status = 1
> AND A.event_ref_log_no IS NOT NULL
> AND C.event_log_no IS NULL
> AND EXISTS ( SELECT B.event_log_no
> FROM Myevents AS B
> WHERE A.event_ref_log_no = B.event_log_no );
>
> There is one possible alteration that may or many not improve
> performance. This would be to replace the EXISTS with a LEFT JOIN
> WHERE IS NOT NULL;
>
> INSERT INTO Myevents
> SELECT *
> FROM T_unit_event_log AS A
> LEFT JOIN Myevents AS C
> ON A.event_log_no = C.event_log_no
> LEFT JOIN Myevents AS B
> ON A.event_ref_log_no = B.event_log_no
> WHERE C.event_log_no IS NULL
> AND B.event_log_no IS NOT NULL
> AND A.event_status = 1
> AND A.event_ref_log_no IS NOT NULL;
>
>

Hi Richard and thanks for the response. When I try the last two queries i get and error. I have listed
the results of explain on all three.

srm2=# explain
srm2-# INSERT INTO Myevents
srm2-# SELECT *
srm2-# FROM T_unit_event_log AS A
srm2-# WHERE A.event_status = 1
srm2-# AND A.event_ref_log_no IS NOT NULL
srm2-# AND A.event_log_no NOT IN ( SELECT event_log_no
srm2(# FROM Myevents)
srm2-# AND EXISTS ( SELECT B.event_log_no
srm2(# FROM Myevents AS B
srm2(# WHERE A.event_ref_log_no = B.event_log_no );
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------
Index Scan using indx_tuel_usn_ec_es on t_unit_event_log a (cost=31711.73..3607445990.61 rows=51844 width=146)
Index Cond: (event_status = 1)
Filter: ((event_ref_log_no IS NOT NULL) AND (subplan) AND (NOT (subplan)))
SubPlan
-> Materialize (cost=31711.73..42857.85 rows=830612 width=4)
-> Seq Scan on myevents (cost=0.00..28041.12 rows=830612 width=4)
-> Index Scan using indx1myevents on myevents b (cost=0.00..8.37 rows=1 width=4)
Index Cond: ($1 = event_log_no)
(8 rows)

srm2=#
srm2=#
srm2=# explain
srm2-# INSERT INTO Myevents
srm2-# SELECT *
srm2-# FROM T_unit_event_log AS A
srm2-# LEFT JOIN Myevents AS C
srm2-# ON A.event_log_no = C.event_log_no
srm2-# WHERE A.event_status = 1
srm2-# AND A.event_ref_log_no IS NOT NULL
srm2-# AND C.event_log_no IS NULL
srm2-# AND EXISTS ( SELECT B.event_log_no
srm2(# FROM Myevents AS B
srm2(# WHERE A.event_ref_log_no = B.event_log_no );
ERROR: INSERT has more expressions than target columns
srm2=#
srm2=# explain
srm2-# INSERT INTO Myevents
srm2-# SELECT *
srm2-# FROM T_unit_event_log AS A
srm2-# LEFT JOIN Myevents AS C
srm2-# ON A.event_log_no = C.event_log_no
srm2-# LEFT JOIN Myevents AS B
srm2-# ON A.event_ref_log_no = B.event_log_no
srm2-# WHERE C.event_log_no IS NULL
srm2-# AND B.event_log_no IS NOT NULL
srm2-# AND A.event_status = 1
srm2-# AND A.event_ref_log_no IS NOT NULL;
ERROR: INSERT has more expressions than target columns

I really appreciate your help.

Steve

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Richard Broersma 2008-09-02 16:52:07 Re: MERGE: performance advices
Previous Message David Fetter 2008-09-02 16:13:13 Re: pg_catalog forward compatibility