From: | Lincoln Swaine-Moore <lswainemoore(at)gmail(dot)com> |
---|---|
To: | "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com> |
Cc: | pgsql-performance(at)lists(dot)postgresql(dot)org |
Subject: | Re: Awkward Join between generate_series and long table |
Date: | 2023-11-09 02:19:34 |
Message-ID: | CABcidk+1h0RidMehRcDL31XxO+o6e1fJTVyAEQmfcRc-tnoyQw@mail.gmail.com |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
> Maybe doesn't help overall but this can be equivalently written as:
s + '1 day'::interval as period_end
Ah, so I've glossed over a detail here which is that I'm relying on some
timezone specific behavior and not actually generate_series itself. If
you're curious, the details are here:
https://www.postgresql.org/message-id/2582288.1696428710%40sss.pgh.pa.us
I think that makes the window function necessary, or at least something a
little more sophisticated than addition of a day (though I'd be happy to be
wrong about that).
> LEFT JOIN LATERAL (SELECT
Oh wow, this seems to get the index used! That's wonderful news--thank you.
I'd be super curious if anyone has any intuition about why the planner is
so much more successful there--most of what I see online about LATERAL
JOINs is focused as you said on semantics not performance. But in terms of
solving my problem, this seems to do the trick.
Thanks again!
On Wed, Nov 8, 2023 at 5:45 PM David G. Johnston <david(dot)g(dot)johnston(at)gmail(dot)com>
wrote:
> On Wed, Nov 8, 2023 at 6:26 PM Lincoln Swaine-Moore <
> lswainemoore(at)gmail(dot)com> wrote:
>
>> SELECT
>>
> s at time zone 'utc' AS period_start,
>> LEAD(s) OVER (
>> ORDER BY
>> s
>> ) at time zone 'utc' AS period_end
>>
>
> Maybe doesn't help overall but this can be equivalently written as:
> s + '1 day'::interval as period_end
>
> Resorting to a window function here is expensive waste, the lead() value
> can be computed, not queried.
>
>
>> SELECT
>> p.period_start,
>> p.period_end,
>> COUNT (distinct d.id)
>> FROM
>> periods p
>> LEFT JOIN data d
>> ON
>> d.timestamp >= (p.period_start)
>> AND d."timestamp" < (p.period_end)
>> AND d.sn = 'BLAH'
>>
>
> This seems better written (semantically, not sure about execution
> dynamics) as:
>
> FROM periods AS p
> LEFT JOIN LATERAL (SELECT count(distinct? d.id) FROM data AS d WHERE
> d.timestamp >= p.period_start AND d.timestamp < p.period_end AND d.sn =
> 'BLAH') AS cnt_d
> -- NO grouping required at this query level
>
> David J.
>
>
--
Lincoln Swaine-Moore
From | Date | Subject | |
---|---|---|---|
Next Message | Philip Semanchuk | 2023-11-09 13:54:36 | Re: Awkward Join between generate_series and long table |
Previous Message | David G. Johnston | 2023-11-09 01:44:53 | Re: Awkward Join between generate_series and long table |