| From: | Feng Wu <wufengwufengwufeng(at)gmail(dot)com> |
|---|---|
| To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
| Cc: | pgsql-hackers(at)lists(dot)postgresql(dot)org |
| Subject: | Re: [PATCH] Avoid collation lookup for "char" statistics |
| Date: | 2026-06-28 13:59:49 |
| Message-ID: | CACK3muq-pzJsWYV_N3OuP3tiJVHMob+dAs-NbFi7KCLUWVPLLA@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
Hi Tom,
Thanks for looking at it. I should have included the exact reproducer in
the first email.
The failure needs histogram statistics on a "char" column, so simple
catalog joins such as the pg_type example do not necessarily reach the
problematic path. For example, I just rechecked that your query plans as
a hash join for me:
explain select * from pg_type a join pg_type b using (typtype);
To reach the failing path, this reproduces it for me on current master:
create temp table char_stats_1 (c "char");
create temp table char_stats_2 (c "char");
insert into char_stats_1
select v::"char"
from unnest(array['I','S','c','i','m','p','r','t','v']) as v,
generate_series(1,
case when v in ('i','v','r','t') then 50 else 1 end);
insert into char_stats_2
select v::"char"
from unnest(array['a','e','i']) as v,
generate_series(1, case when v = 'i' then 50 else 5 end);
analyze char_stats_1;
analyze char_stats_2;
set enable_hashjoin = off;
set enable_nestloop = off;
explain (costs off)
select count(*)
from char_stats_1 s1
join char_stats_2 s2 on s1.c = s2.c;
After ANALYZE, the first column has histogram statistics, e.g.:
tablename | attname | n_distinct | most_common_vals | histogram_bounds
-------------+---------+------------+------------------+------------------
char_stats_1 | c | 9 | {i,r,t,v} | {I,S,c,m,p}
char_stats_2 | c | 3 | {i,a,e} |
Without the patch, planning fails with:
ERROR: XX000: cache lookup failed for collation 0
LOCATION: pg_newlocale_from_collation, pg_locale.c:1211
The reason is that mergejoinscansel() calls scalarineqsel() while costing
the merge join. That can reach convert_to_scalar() for histogram bounds.
For CHAROID, convert_string_datum() builds a one-byte string, but then
still calls pg_newlocale_from_collation() with the clause input collation,
which is InvalidOid for the non-collatable "char" type.
The patch only skips that collation lookup for CHAROID. For text,
varchar, bpchar, and name, the existing locale handling is unchanged.
Regards,
Feng
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Tom Lane | 2026-06-28 15:39:00 | Re: [PATCH] Avoid collation lookup for "char" statistics |
| Previous Message | Jeevan Chalke | 2026-06-28 12:56:02 | Re: ON EMPTY clause for aggregate and window functions |