Re: Awkward Join between generate_series and long table

From: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
To: Lincoln Swaine-Moore <lswainemoore(at)gmail(dot)com>
Cc: pgsql-performance(at)lists(dot)postgresql(dot)org
Subject: Re: Awkward Join between generate_series and long table
Date: 2023-11-09 01:44:53
Message-ID: CAKFQuwbfrcSWBUuzVoPvnQX9gOW64bUoEK0cxr+AgpHU3udk_g@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Wed, Nov 8, 2023 at 6:26 PM Lincoln Swaine-Moore <lswainemoore(at)gmail(dot)com>
wrote:

> SELECT
>
s at time zone 'utc' AS period_start,
> LEAD(s) OVER (
> ORDER BY
> s
> ) at time zone 'utc' AS period_end
>

Maybe doesn't help overall but this can be equivalently written as:
s + '1 day'::interval as period_end

Resorting to a window function here is expensive waste, the lead() value
can be computed, not queried.

> SELECT
> p.period_start,
> p.period_end,
> COUNT (distinct d.id)
> FROM
> periods p
> LEFT JOIN data d
> ON
> d.timestamp >= (p.period_start)
> AND d."timestamp" < (p.period_end)
> AND d.sn = 'BLAH'
>

This seems better written (semantically, not sure about execution dynamics)
as:

FROM periods AS p
LEFT JOIN LATERAL (SELECT count(distinct? d.id) FROM data AS d WHERE
d.timestamp >= p.period_start AND d.timestamp < p.period_end AND d.sn =
'BLAH') AS cnt_d
-- NO grouping required at this query level

David J.

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Lincoln Swaine-Moore 2023-11-09 02:19:34 Re: Awkward Join between generate_series and long table
Previous Message Lincoln Swaine-Moore 2023-11-09 01:26:39 Awkward Join between generate_series and long table