| From: | Steve Chavez <steve(at)supabase(dot)io> |
|---|---|
| To: | Daniel Gustafsson <daniel(at)yesql(dot)se> |
| Cc: | PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
| Subject: | Re: [PATCH] Add hint for misspelled relations |
| Date: | 2025-12-10 06:10:28 |
| Message-ID: | CAGRrpzZ27b8fmpyqZvssgvYGVHYtU19x1PGW340axzX6n-bHng@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
Hi Kirill and Daniel, many thanks for your feedback.
> Why do we even do this? Can't we just pass fuzzystate->distance to
varstr_levenshtein_less_equal? It is initialized in outer func to
MAX_FUZZY_DISTANCE + 1
This was an oversight. I've corrected it on the new attached patch.
> Worst case is that it never contains the a close enough match and
we'll spend a scan + hash + calculation of MAX_REL_HINT_CANDIDATES for each
typo.
> Are there ways you can pare down the scan, perhaps a scankey on
relispartition
and only consider base relations?
I've improved the code (new patch attached) and the hash is no
longer necessary.
Now the scan omits partitions as you suggested plus they're now restricted
to each schema on the search_path.
It was also possible to do more scans for each relevant relkind
(RELKIND_RELATION, RELKIND_VIEW, etc) but I wasn't sure if that was more
expensive than doing the filtering on the loop as it's currently done.
It also made the code a bit more complicated with more nested loops, so
I've left it as is.
Perhaps we should limit the amount of schema scans too in case the
search_path has too many schemas?
> Hinting on relations can also give hints for relations the user does not
have
permissions on which further reduce the useability.
I've tested the current column hint and it also doesn't consider privileges:
```
create role new nosuperuser;
grant usage on schema public to new;
create table subitems (id int, name text);
grant select (id) on subitems to new;
begin;
set local role to new;
select nam from subitems;
2025-12-09 18:16:38.017 -05 [542091] ERROR: column "nam" does not exist at
character 8
2025-12-09 18:16:38.017 -05 [542091] HINT: Perhaps you meant to reference
the column "subitems.name".
2025-12-09 18:16:38.017 -05 [542091] STATEMENT: select nam from subitems ;
ERROR: column "nam" does not exist
LINE 1: select nam from subitems ;
HINT: Perhaps you meant to reference the column "subitems.name".
```
So perhaps that can be a separate enhancement for both hints?
> The tab-completion in psql is one tool offered to avoid misspellings which
reduce the need.
True, but it's still useful to offer a hint for other clients.
What do you think?
Best regards,
Steve Chavez
On Tue, 2 Dec 2025 at 04:27, Daniel Gustafsson <daniel(at)yesql(dot)se> wrote:
> > On 2 Dec 2025, at 03:46, Steve Chavez <steve(at)supabase(dot)io> wrote:
>
> > Currently misspelled columns offer a hint but not misspelled relations.
>
> The tab-completion in psql is one tool offered to avoid misspellings which
> reduce the need.
>
> > One problem is that scanning all pg_class entries can get expensive on
> big catalogs, so the number of searches is capped by
> MAX_REL_HINT_CANDIDATES. I've set this to 4096 arbitrarily, any guidance on
> what would be a good number is appreciated. Personally I've seen a catalog
> that contains 125K tables, with mostly auto generated names. For these
> cases I don't think the hint helps that much anyway, so it seemed fine to
> bail here.
>
> What makes hints for columns appealing is that it's a pretty contained
> problem
> across data we've already accumulated, relations are quite different as
> they
> require a catalog lookup making it a lot less appealing. The number of
> relations can easily become quite large, especially when large partitioning
> hierarchies are involved, and while capping prevents large scans there is
> no
> guarantee that the MAX_REL_HINT_CANDIDATES set contains the most likely
> entries. Worst case is that it never contains the a close enough match and
> we'll spend a scan + hash + calculation of MAX_REL_HINT_CANDIDATES for each
> typo.
>
> Are there ways you can pare down the scan, perhaps a scankey on
> relispartition
> and only consider base relations?
>
> Hinting on relations can also give hints for relations the user does not
> have
> permissions on which further reduce the useability.
>
> --
> Daniel Gustafsson
>
>
| Attachment | Content-Type | Size |
|---|---|---|
| v2-0001-add-a-hint-for-a-missing-relation.patch | text/x-patch | 12.9 KB |
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Bertrand Drouvot | 2025-12-10 06:39:40 | Re: Mark function arguments of type "T *" as "const T *" where possible |
| Previous Message | Nico Williams | 2025-12-10 05:57:46 | Re: [oauth] SASL mechanisms |