Re: Allow to collect statistics on virtual generated columns

From: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
To: Yugo Nagata <nagata(at)sraoss(dot)co(dot)jp>
Cc: Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Allow to collect statistics on virtual generated columns
Date: 2026-03-26 17:25:23
Message-ID: CAEZATCXkZwJ_6FCM7RMKFiNC4ui+CLmL-=Y9AiYmDpnPS+ftWw@mail.gmail.com
Lists: pgsql-hackers

On Thu, 26 Mar 2026 at 16:00, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> wrote:
>
> On Thu, 26 Mar 2026 at 15:09, Yugo Nagata <nagata(at)sraoss(dot)co(dot)jp> wrote:
> >
> > I've attached an updated patch including the documentation and tests.

Looking at get_relation_statistics(), I think that you need to call
expand_generated_columns_in_expr() *before* ChangeVarNodes() so that
Vars in the expanded expression end up with the correct varno.

This obviously affects queries with more than one table in the FROM
clause, e.g.:

drop table if exists foo;
create table foo (a int, b int generated always as (a*2) virtual);
insert into foo select x from generate_series(1,10) x;
insert into foo select 100 from generate_series(1,500);
create statistics s on b from foo;
analyse foo;
explain select * from foo f1, foo f2 where f1.b = 200 and f2.b = 200;

                            QUERY PLAN
-------------------------------------------------------------------
 Nested Loop  (cost=0.00..47.56 rows=1500 width=16)
   ->  Seq Scan on foo f1  (cost=0.00..10.65 rows=500 width=4)
         Filter: ((a * 2) = 200)
   ->  Materialize  (cost=0.00..10.66 rows=3 width=4)
         ->  Seq Scan on foo f2  (cost=0.00..10.65 rows=3 width=4)
               Filter: ((a * 2) = 200)
(6 rows)
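
To spell out what the plan shows: the scan on f1 is estimated at 500 rows,
which matches the 500 rows inserted with a = 100, while the identical filter
on f2 is estimated at 3 rows, which is consistent with the default equality
selectivity (0.005) applied to 510 rows, i.e. the statistics not being
matched for the second range table entry. A single-table query is a useful
baseline here. The following is only a sketch, assuming the table and
statistics object created above and a build with the patch applied; no
expected output is shown because it depends on the patch version:

```sql
-- Sketch only: reuses table "foo" and statistics object "s" from the
-- example above; assumes a server built with the patch under discussion.

-- Baseline: foo is the only range table entry, so no varno adjustment
-- is involved when the statistics expressions are loaded.
explain select * from foo where b = 200;

-- Two-table case from above, for comparison: both scans apply the same
-- filter to the same data, so their row estimates should agree once the
-- expanded expression carries the correct varno.
explain select * from foo f1, foo f2 where f1.b = 200 and f2.b = 200;
```

If the baseline estimate agrees with f1 but not with f2, that points at the
varno of the expanded expression rather than at the statistics themselves.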

Regards,
Dean
