Re: Query performance

From: S Arvind <arvindwill(at)gmail(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: Query performance
Date: 2009-10-12 13:23:22
Message-ID: abf9211d0910120623i49d3789ah38d2ac2ff9fe4e2d@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Thanks very much Matthew , its more then my expectation...

Without changing the query is there any way to optimize it, like by changing
the pg configuration for handling these kind queries?

-Arvind S

On Mon, Oct 12, 2009 at 6:31 PM, Matthew Wakeling <matthew(at)flymine(dot)org>wrote:

> On Mon, 12 Oct 2009, S Arvind wrote:
>
>> I can understand left join, actually can any one tell me why sort
>> operation is carried
>> out and wat Materialize means...
>> Can anyone explain me the mentioned plan with reason(s)?
>>
>
> Merge Left Join (cost=62451.86..67379.08 rows=286789 width=0)
>> Merge Cond: (a.id = b.id)
>> -> Sort (cost=18610.57..18923.27 rows=125077 width=8)
>> Sort Key: a.id
>> -> Seq Scan on a (cost=0.00..6309.77 rows=125077 width=8)
>> -> Materialize (cost=43841.28..47426.15 rows=286789 width=8)
>> -> Sort (cost=43841.28..44558.26 rows=286789 width=8)
>> Sort Key: b.id
>> -> Seq Scan on b (cost=0.00..13920.89 rows=286789 width=8)
>>
>
> This is a merge join. A merge join joins together two streams of data,
> where both streams are sorted, by placing the two streams side by side and
> advancing through both streams finding matching rows. The algorithm can use
> a pointer to a position in both of the streams, and advance the pointer of
> the stream that has the earlier value according to the sort order, and
> therefore get all the matches.
>
> You are performing a query over the whole of both of the tables, so the
> cheapest way to obtain a sorted stream of data is to do a full sequential
> scan of the whole table, bring it into memory, and sort it. An alternative
> would be to follow a B-tree index if one was available on the correct
> column, but that is usually more expensive unless the table is clustered on
> the index or only a small portion of the table is to be read. If you had put
> a "LIMIT 10" clause on the end of the query and had such an index, it would
> probably switch to that strategy instead.
>
> The materialise step is effectively a buffer that allows one of the streams
> to be rewound cheaply, which will be necessary if there are multiple rows
> with the same value.
>
> Does that answer your question?
>
> Matthew
>
> --
> The only secure computer is one that's unplugged, locked in a safe,
> and buried 20 feet under the ground in a secret location...and i'm not
> even too sure about that one. --Dennis Huges, FBI

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Grzegorz Jaśkiewicz 2009-10-12 13:29:13 Re: Query performance
Previous Message Matthew Wakeling 2009-10-12 13:01:55 Re: Query performance