Re: Infinite Interval

From: jian he <jian(dot)universality(at)gmail(dot)com>
To: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
Cc: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Joseph Koshakow <koshy44(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Gregory Stark (as CFM)" <stark(dot)cfm(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Infinite Interval
Date: 2023-09-20 10:27:16
Message-ID: CACJufxHFmB6uOWd8Rz6cppqG2-UOdqTZ6_2DG0FJ9FCfvErpHQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Sep 19, 2023 at 7:14 PM Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> wrote:
>
>
> I haven't reviewed this part in any detail yet, but I can confirm that
> there are some impressive performance improvements for avg(). However,
> for me, sum() seems to be consistently a few percent slower with this
> patch.

Now there should be no degradation.

> The introduction of an internal transition state struct seems like a
> promising approach, but I think there is more to be gained by
> eliminating per-row pallocs, and IntervalAggState's MemoryContext
> (interval addition, unlike numeric addition, doesn't require memory
> allocation, right?).

"eliminating per-row pallocs"
I guess I understand. If not , please point it out.

> IntervalAggState's MemoryContext
> (interval addition, unlike numeric addition, doesn't require memory
> allocation, right?).

if I remove IntervalAggState's element: MemoryContext, it will not work.
so I don't understand what the above sentence means...... Sorry. (it's
my problem)

> Also, this needs to include serialization and deserialization
> functions, otherwise these aggregates will no longer be able to use
> parallel workers. That makes a big difference to queryE, if the size
> of the test data is scaled up.
>
I tried, but failed. sum(interval) result is correct, but
avg(interval) result is wrong.

Datum
interval_avg_serialize(PG_FUNCTION_ARGS)
{
IntervalAggState *state;
StringInfoData buf;
bytea *result;
/* Ensure we disallow calling when not in aggregate context */
if (!AggCheckCallContext(fcinfo, NULL))
elog(ERROR, "aggregate function called in non-aggregate context");
state = (IntervalAggState *) PG_GETARG_POINTER(0);
pq_begintypsend(&buf);
/* N */
pq_sendint64(&buf, state->N);
/* Interval struct elements, one by one. */
pq_sendint64(&buf, state->sumX.time);
pq_sendint32(&buf, state->sumX.day);
pq_sendint32(&buf, state->sumX.month);
/* pInfcount */
pq_sendint64(&buf, state->pInfcount);
/* nInfcount */
pq_sendint64(&buf, state->nInfcount);
result = pq_endtypsend(&buf);
PG_RETURN_BYTEA_P(result);
}

SELECT sum(b) ,avg(b)
,avg(b) = sum(b)/count(*) as should_be_true
,avg(b) * count(*) = sum(b) as should_be_true_too
from interval_aggtest_1m; --1million row.
The above query expects two bool columns to return true, but actually
both returns false.(spend some time found out parallel mode will make
the number of rows to 1_000_002, should be 1_000_0000).

> This comment:
>
> + int64 N; /* count of processed numbers */
>
> should be "count of processed intervals".

fixed.

Attachment Content-Type Size
v21-0001-Move-integer-helper-function-to-int.h.patch text/x-patch 3.3 KB
v21-0002-Check-for-overflow-in-make_interval.patch text/x-patch 5.0 KB
v21-0005-doc-for-special-interval-value.patch text/x-patch 1.3 KB
v21-0003-Add-infinite-interval-values.patch text/x-patch 96.9 KB
v21-0004-Revert-Remove-dead-code-in-DecodeInterval.patch text/x-patch 1021 bytes
v21-0006-refactor-avg-interval-sum-interval-aggregate.patch text/x-patch 25.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2023-09-20 10:49:18 Re: Comment about set_join_pathlist_hook()
Previous Message Amit Kapila 2023-09-20 10:11:29 Re: [PoC] pg_upgrade: allow to upgrade publisher node