Re: Trouble running nested select - Join query

From: David G Johnston <david(dot)g(dot)johnston(at)gmail(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Trouble running nested select - Join query
Date: 2014-05-26 17:14:32
Message-ID: 1401124472687-5804995.post@n5.nabble.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Rushi wrote
> copy (select * from (select s1.head as h1,s1.tail as t1 ,s2.tail as
> neighbor from miami_2d s1, miami_directednetwork s2 where s1.tail=s2.head
> and s2.tail not in (select tail from miami_2d where head=s1.head)) as O
> where O.h1!=O.neighbor) to '/tmp/tmp.txt'

Some thoughts:

The O sub-query could probably be top-level

A correlated sub-query in the where clause should generally use EXISTS, not
IN ...
... Though if you can get rid of the whole thing that would be better

You might want to use the statement_timeout GUC to kill the query if it
takes too long

You could separate the the select and the copy into two statements via use
of a temporary table - depending on how large the output is that would split
the action into two statements that could be independently monitored.
Though, to be honest now that I re-read I have no clue how this query
inter-plays with PostgresXC...

Can you run a smaller input size and use that estimate how long a 100M^2
query should take to complete?

Can you calculate in smaller block sizes on the input?

David J.

--
View this message in context: http://postgresql.1045698.n5.nabble.com/Trouble-running-nested-select-Join-query-tp5804938p5804995.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Jack Douglas 2014-05-26 21:37:04 Re: new index type with clustering in mind.
Previous Message Tom Lane 2014-05-26 15:34:36 Re: postgreSQL : duplicate DB names