Re: [PATCH] Add hint for misspelled relations

From: Kirill Reshke <reshkekirill(at)gmail(dot)com>
To: Steve Chavez <steve(at)supabase(dot)io>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] Add hint for misspelled relations
Date: 2025-12-02 03:32:44
Message-ID: CALdSSPgcJ=zueMiR3su7Dh4YE4TSY4dG_F8FDszbbZC2XqPeUg@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, 2 Dec 2025 at 07:46, Steve Chavez <steve(at)supabase(dot)io> wrote:
>
> Hello hackers,
>
> Currently misspelled columns offer a hint but not misspelled relations.
>
> This patch enables that, the result is like:
>
> -- having this table
> create table clients (id int);
> -- we misspell the table name
> select * from cliants;
> ERROR: relation "cliants" does not exist
> LINE 1: select * from cliants;
> HINT: Perhaps you meant to reference the table "public.clients".
>
> The possible matches are searched in pg_class for the schemas present in search_path (or if the relation is qualified, it only searches matches in that schema). The logic reuses the `varstr_levenshtein_less_equal` function similar to how the column matching is done.
>
> If there's a tie in the fuzzy match, it's solved by preferring the schema that appears first on the search path. If that fails, then the lexicographic order is used to break the tie.
>
> One problem is that scanning all pg_class entries can get expensive on big catalogs, so the number of searches is capped by MAX_REL_HINT_CANDIDATES. I've set this to 4096 arbitrarily, any guidance on what would be a good number is appreciated. Personally I've seen a catalog that contains 125K tables, with mostly auto generated names. For these cases I don't think the hint helps that much anyway, so it seemed fine to bail here.
>
> The changes are split into two commits, one refactoring some reusable functions for easier review and another one implementing the relation hint.
>
> Any feedback is welcomed.
>
> Best regards,
> Steve Chavez
>
> [1]:

Hi! I did not reviewed this patch closely, but I stopped a this:

> + reldistance = varstr_levenshtein_less_equal(target->relname, strlen(target->relname),
> + candidate, candidatelen,
> + 1, 1, 1,
> + Min(fuzzystate->distance + 1, MAX_FUZZY_DISTANCE + 1),
> + true);
> +
> + /* The above can return MAX_FUZZY_DISTANCE + 1 results, skip these */
> + if (reldistance > MAX_FUZZY_DISTANCE)
> + continue;

Why do we even do this? Can't we just pass fuzzystate->distance to
varstr_levenshtein_less_equal? It is initialized in outer func to
MAX_FUZZY_DISTANCE + 1

--
Best regards,
Kirill Reshke

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Chengpeng Yan 2025-12-02 03:48:26 Add a greedy join search algorithm to handle large join problems
Previous Message Corey Huinker 2025-12-02 03:18:24 Re: Is there value in having optimizer stats for joins/foreignkeys?