| From: | Yugo Nagata <nagata(at)sraoss(dot)co(dot)jp> |
|---|---|
| To: | Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> |
| Cc: | Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)postgresql(dot)org |
| Subject: | Re: Allow to collect statistics on virtual generated columns |
| Date: | 2026-03-24 10:57:27 |
| Message-ID: | 20260324195727.43416b69c71b5f63eed3b4bc@sraoss.co.jp |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On Tue, 24 Mar 2026 09:18:06 +0000
Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> wrote:
Thank you for your comments!
> On Wed, 31 Dec 2025 at 10:45, Yugo Nagata <nagata(at)sraoss(dot)co(dot)jp> wrote:
> >
> > I've attached an updated patch that fixes the broken test since 10c4fe074a.
> >
>
> I took a look at this, and the patch appears to work as intended.
> However, I have my doubts as to whether this is the best approach.
>
> Building stats on a virtual generated column is potentially quite
> expensive, and not something that everyone will want, so I think this
> really should be an optional feature that people can selectively
> enable, if they want.
I agree that collecting statistics on a virtual generated column should
be optional. While it can be disabled or enabled with:
ALTER TABLE ... ALTER COLUMN ... SET STATISTICS ...
this is not very intuitive or discoverable for users, so a more explicit
interface might be preferable.
> Therefore, I think that the previous approach was probably better. If
> I'm understanding it correctly, that allowed
>
> CREATE STATISTICS stat_name ON virt_col FROM tbl;
>
> as well as allowing statistics to be built on expressions including
> virtual generated columns, making it more flexible.
>
> The problem with this previous approach was that it didn't work
> correctly if a virtual generated column's expression was changed using
> ALTER TABLE ... SET EXPRESSION. I think that could be solved by
> expanding generated column expressions at ANALYZE time, rather than at
> CREATE STATISTICS time.
Yes. I think this approach could work.
One concern is that CREATE STATISTICS does not allow creating extended
statistics on a single colmun. If we try this, it raises the following error:
ERROR: extended statistics require at least 2 columns
Therefore, if we take this approach, we would need to add an exception
to this rule for virtual columns.
Regards,
Yugo Nagata
--
Yugo Nagata <nagata(at)sraoss(dot)co(dot)jp>
| From | Date | Subject | |
|---|---|---|---|
| Next Message | John Naylor | 2026-03-24 10:59:19 | Re: Reduce timing overhead of EXPLAIN ANALYZE using rdtsc? |
| Previous Message | Srinath Reddy Sadipiralla | 2026-03-24 10:55:47 | Re: Teach isolation tester about injection points in background workers |