Re: Performance differential when 0 values present vs when 1 values present. Planner return 52k rows when 0 expected.

From: Emil Iggland <emil(dot)iggland(at)metrima(dot)com>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-performance <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Performance differential when 0 values present vs when 1 values present. Planner return 52k rows when 0 expected.
Date: 2022-04-28 08:09:55
Message-ID: 48870736-281c-6356-c162-60c4f0bb3a43@metrima.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

> I don't think that index can be used for your original query. It could
> only be used if "channel" is unique in "valueseries" and you'd written
> the query as:

Thanks! That explanation I can understand, now I know how to avoid this
in future.

> I guess "channel" must not be the primary key to "valueseries" and
> that's why you use an IN().
Correct. We create a new valueseries in some circumstances, so multiple
valueseries can point to the same channel.

On 2022-04-27 10:22, David Rowley wrote:
> On Wed, 27 Apr 2022 at 19:54, Emil Iggland <emil(dot)iggland(at)metrima(dot)com> wrote:
>>
>> > You've got the wrong column order (for this query anyway) in that
>> > index. It'd work a lot better if dataview were the first column;
>
>> I might be misunderstanding you, but I assume that you are suggesting an
>> index on (dataview, valuetimestamp).
>> We have that index, it is the primary key. For some reason it isn't
>> being selected.
>
> I don't think that index can be used for your original query. It could
> only be used if "channel" is unique in "valueseries" and you'd written
> the query as:
>
> select * from datavalue
> where dataview = (select id from valueseries where channel = 752433)
> ORDER BY VALUETIMESTAMP DESC
> FETCH FIRST ROW only;
>
> that would allow a backwards index scan using the (dataview,
> valuetimestamp) index. Because you're using the IN clause to possibly
> look for multiple "dataview" values matching the given "channel", the
> index range scan does not have a single point to start at. What
> you've done with the LATERAL query allows the index to be scanned once
> for each "valueseries" row with a "channel" value matching your WHERE
> clause.
>
> I guess "channel" must not be the primary key to "valueseries" and
> that's why you use an IN().
>
> The above query would return an error if multiple rows were returned
> by the subquery.
>
> David

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Peter Eser HEUFT [Germany] 2022-04-28 14:28:15 LISTEN NOTIFY sometimes huge delay
Previous Message Tom Lane 2022-04-28 02:08:14 Re: Unworkable plan above certain row count