Skip site navigation (1) Skip section navigation (2)

Re: Query performance

From: S Arvind <arvindwill(at)gmail(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: Query performance
Date: 2009-10-12 13:23:22
Message-ID: abf9211d0910120623i49d3789ah38d2ac2ff9fe4e2d@mail.gmail.com (view raw or flat)
Thread:
Lists: pgsql-performance
Thanks very much Matthew , its more then my expectation...

Without changing the query is there any way to optimize it, like by changing
the pg configuration for handling these kind queries?

-Arvind S


On Mon, Oct 12, 2009 at 6:31 PM, Matthew Wakeling <matthew(at)flymine(dot)org>wrote:

> On Mon, 12 Oct 2009, S Arvind wrote:
>
>> I can understand left join, actually can any one tell me why sort
>> operation is carried
>> out and wat Materialize means...
>> Can anyone explain me the mentioned plan with reason(s)?
>>
>
>   Merge Left Join  (cost=62451.86..67379.08 rows=286789 width=0)
>>     Merge Cond: (a.id = b.id)
>>     ->  Sort  (cost=18610.57..18923.27 rows=125077 width=8)
>>          Sort Key: a.id
>>          ->  Seq Scan on a  (cost=0.00..6309.77 rows=125077 width=8)
>>     ->  Materialize  (cost=43841.28..47426.15 rows=286789 width=8)
>>          ->  Sort  (cost=43841.28..44558.26 rows=286789 width=8)
>>              Sort Key: b.id
>>             ->  Seq Scan on b (cost=0.00..13920.89 rows=286789 width=8)
>>
>
> This is a merge join. A merge join joins together two streams of data,
> where both streams are sorted, by placing the two streams side by side and
> advancing through both streams finding matching rows. The algorithm can use
> a pointer to a position in both of the streams, and advance the pointer of
> the stream that has the earlier value according to the sort order, and
> therefore get all the matches.
>
> You are performing a query over the whole of both of the tables, so the
> cheapest way to obtain a sorted stream of data is to do a full sequential
> scan of the whole table, bring it into memory, and sort it. An alternative
> would be to follow a B-tree index if one was available on the correct
> column, but that is usually more expensive unless the table is clustered on
> the index or only a small portion of the table is to be read. If you had put
> a "LIMIT 10" clause on the end of the query and had such an index, it would
> probably switch to that strategy instead.
>
> The materialise step is effectively a buffer that allows one of the streams
> to be rewound cheaply, which will be necessary if there are multiple rows
> with the same value.
>
> Does that answer your question?
>
> Matthew
>
> --
> The only secure computer is one that's unplugged, locked in a safe,
> and buried 20 feet under the ground in a secret location...and i'm not
> even too sure about that one.                         --Dennis Huges, FBI

In response to

Responses

pgsql-performance by date

Next:From: Grzegorz JaśkiewiczDate: 2009-10-12 13:29:13
Subject: Re: Query performance
Previous:From: Matthew WakelingDate: 2009-10-12 13:01:55
Subject: Re: Query performance

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group