From: | Chanukya SDS <chanukyasds(at)gmail(dot)com> |
---|---|
To: | pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | pg_running_stats - mergeable running statistics (Welford/Chan) extension for postgresql |
Date: | 2025-10-17 05:55:54 |
Message-ID: | CAB4f4B6ga-bjBAmWu2FEjXiR539M6zV2J4EOyOoO5EtPGaTFKQ@mail.gmail.com |
Lists: | pgsql-hackers |
Hi all,
I’d like to share a new PostgreSQL extension called pg_running_stats.
It implements mergeable, numerically stable running statistics using the
Welford and Chan algorithms.
Unlike the built-in aggregates such as avg(), variance(), and stddev(),
which rescan the full dataset each time they are evaluated, pg_running_stats
maintains a compact internal state that can be updated or merged incrementally.
This makes it well-suited for:
1. streaming or real-time analytics where data arrives continuously,
2. incremental computation over large tables,
3. parallel or distributed queries that need to merge partial aggregates
efficiently.
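To illustrate what "updated or merged incrementally" means, here is a minimal sketch of a Welford-style update and a Chan-style merge for count, mean, variance, and min/max. It is written in Python for brevity and is not the extension's actual code; the names RunningStats, update, merge, variance, and from_values are hypothetical and chosen only for this example.

```python
import math
from dataclasses import dataclass


@dataclass
class RunningStats:
    """Hypothetical mergeable state: count, mean, M2, min, max."""
    n: int = 0
    mean: float = 0.0
    m2: float = 0.0          # sum of squared deviations from the mean
    lo: float = math.inf     # running minimum
    hi: float = -math.inf    # running maximum

    def update(self, x: float) -> None:
        """Welford update: fold one value into the state."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)
        self.lo = min(self.lo, x)
        self.hi = max(self.hi, x)

    def merge(self, other: "RunningStats") -> "RunningStats":
        """Chan merge: combine two partial states into a new one."""
        if self.n == 0:
            return RunningStats(other.n, other.mean, other.m2, other.lo, other.hi)
        if other.n == 0:
            return RunningStats(self.n, self.mean, self.m2, self.lo, self.hi)
        n = self.n + other.n
        delta = other.mean - self.mean
        mean = self.mean + delta * other.n / n
        m2 = self.m2 + other.m2 + delta * delta * self.n * other.n / n
        return RunningStats(n, mean, m2, min(self.lo, other.lo), max(self.hi, other.hi))

    def variance(self):
        """Sample variance, or None when fewer than two values were seen."""
        return self.m2 / (self.n - 1) if self.n > 1 else None


def from_values(values) -> RunningStats:
    """Build a state by feeding values one at a time."""
    state = RunningStats()
    for v in values:
        state.update(v)
    return state
```

With this sketch, two partial states built from disjoint chunks (for example by two parallel workers) can be merged, and the result agrees with a single pass over all the values up to floating-point rounding.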
The extension computes mean, variance, standard deviation, skewness,
kurtosis, and min/max, all in a single pass.
It’s written entirely in C, depends only on PostgreSQL headers, and builds
cleanly on macOS (Homebrew) and Linux using PGXS.
Source and documentation:
https://github.com/chanukyasds/pg_running_stats
Any feedback, testing, or suggestions for improvement would be very welcome.
Thanks,
Chanukya