Re: Any better plan for this query?..

From: Dimitri <dimitrik(dot)fr(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Aidan Van Dyk <aidan(at)highrise(dot)ca>, Merlin Moncure <mmoncure(at)gmail(dot)com>, PostgreSQL Performance <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Any better plan for this query?..
Date: 2009-05-19 12:00:41
Message-ID: 5482c80a0905190500k5119de7em1d9b6d719e6fde7d@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

I may confirm the issue with hash join - it's repeating both with
prepared and not prepared statements - it's curious because initially
the response time is lowering near ~1ms (the lowest seen until now)
and then once workload growing to 16 sessions it's jumping to 2.5ms,
then with 32 sessions it's 18ms, etc..

I've retested on 24 isolated cores, so any external secondary effects
are avoided.

Rgds,
-Dimitri

On 5/19/09, Dimitri <dimitrik(dot)fr(at)gmail(dot)com> wrote:
> On 5/19/09, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>>
>> On Tue, 2009-05-19 at 00:33 +0200, Dimitri wrote:
>>> >
>>> > In particular, running the tests repeatedly using
>>> > H.REF_OBJECT = '0000000001'
>>> > rather than varying the value seems likely to benefit MySQL. The
>>>
>>> let me repeat again - the reference is *random*,
>>> the '0000000001' value I've used just to show a query execution
>>> plan.
>>>
>>> also, what is important - the random ID is chosen in way that no one
>>> user use the same to avoid deadlocks previously seen with PostgreSQL
>>> (see the "Deadlock mystery" note 2 years ago
>>> http://dimitrik.free.fr/db_STRESS_BMK_Part1.html#note_4355 )
>>
>> OK, didn't pick up on that.
>>
>> (Like Tom, I was thinking query cache)
>>
>> Can you comment on the distribution of values for that column? If you
>> are picking randomly, this implies distribution is uniform and so I am
>> surprised we are mis-estimating the selectivity.
>
> yes, the distribution of reference values is uniform between
> '0000000001' to '0010000000' (10M), only one OBJECT row by one
> reference, and only 20 rows with the same reference in HISTORY table.
>
>>
>>> I think yes (but of course I did not try to replay it several times)
>>
>> If you could that would be appreciated. We don't want to go chasing
>> after something that is not repeatable.
>
> I'll retry and let you know.
>
> Rgds,
> -Dimitri
>

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Matthew Wakeling 2009-05-19 12:01:43 Re: Any better plan for this query?..
Previous Message Simon Riggs 2009-05-19 11:41:00 Re: Any better plan for this query?..