Re: bug of pg_trgm?

From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: bug of pg_trgm?
Date: 2012-08-10 16:39:55
Message-ID: CAHGQGwF2oqOOaFikCY2EzJvkRjZrTjLrKiqTUC+zn6THGQgu1w@mail.gmail.com
Lists: pgsql-hackers

On Thu, Aug 9, 2012 at 3:05 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> ... btw, I think there is another problem here, which is that
> generate_wildcard_trgm will restart get_wildcard_part at the
> same place that the second loop exits, which means it would
> do the wrong thing if what it returns is a pointer to the
> second char of an escape pair. Consider for instance
>
> foo\\%bar
>
> The first call of get_wildcard_part will correctly extract "foo",
> but then return a pointer to the second backslash. So the second
> call will think that the % is escaped, which it is not, leading to
> a wrong decision about whether to pad "bar".

Good catch!

> Probably a minimal fix for this could be made by backing up "endword"
> one byte before returning it if in_escape is true when the second
> loop exits. That would not scale up to preserving the state of
> in_wildcard_meta, but since the second loop never advances past a
> meta char, that's okay for the moment.

Or what about extending get_wildcard_part() so that it accepts a pointer
to in_escape as an argument? generate_wildcard_trgm() would then know the
last value of in_escape and could pass it to the next call of
get_wildcard_part(). That looks very simple.
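
To make the idea concrete, here is a rough sketch. It is a toy model and not
the pg_trgm source: next_part(), last_part_after_wildcard() and the
after_wildcard flag are hypothetical names, isalnum() stands in for
iswordchr(), and characters are assumed to be single-byte. The sketch also
assumes the second loop stops without clearing in_escape, so the flag still
describes the character that the returned pointer points at.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Toy stand-in for get_wildcard_part(): copy the next run of word
 * characters into buf and report whether an unescaped wildcard (% or _)
 * immediately precedes it. *in_escape is owned by the caller and carries
 * the "next character is escaped" state from one call to the next.
 * Returns the position where scanning stopped, or NULL if no word is left.
 */
static const char *
next_part(const char *p, char *buf, bool *after_wildcard, bool *in_escape)
{
    bool in_meta = false;

    /* Loop 1: skip to the first word character. */
    for (; *p; p++)
    {
        if (*in_escape)
        {
            /* An escaped character is never a wildcard. */
            *in_escape = false;
            in_meta = false;
            if (isalnum((unsigned char) *p))
                break;
        }
        else if (*p == '\\')
            *in_escape = true;
        else if (*p == '%' || *p == '_')
            in_meta = true;
        else if (isalnum((unsigned char) *p))
            break;
        else
            in_meta = false;
    }
    if (*p == '\0')
        return NULL;
    *after_wildcard = in_meta;

    /* Loop 2: copy word characters, stripping escapes. */
    for (; *p; p++)
    {
        if (*in_escape)
        {
            if (!isalnum((unsigned char) *p))
                break;      /* stop on the escaped char, flag stays set */
            *in_escape = false;
            *buf++ = *p;
        }
        else if (*p == '\\')
            *in_escape = true;
        else if (isalnum((unsigned char) *p))
            *buf++ = *p;
        else
            break;
    }
    *buf = '\0';
    return p;
}

/*
 * Toy stand-in for the loop in generate_wildcard_trgm(): scan the whole
 * pattern and return the after_wildcard flag of its last word.
 * carry_state = false models the current behaviour, where every call
 * starts with in_escape reset. Words are assumed to fit in 64 bytes.
 */
static bool
last_part_after_wildcard(const char *pattern, bool carry_state)
{
    char        buf[64];
    bool        after_wildcard = false;
    bool        in_escape = false;
    const char *p = pattern;

    while ((p = next_part(p, buf, &after_wildcard, &in_escape)) != NULL)
    {
        if (!carry_state)
            in_escape = false;  /* state lost between calls */
    }
    return after_wildcard;
}
```

For the pattern foo\\%bar, the first call stops on the second backslash with
in_escape still set. If that state is carried into the next call, the % is
seen as a real wildcard and "bar" is reported as following a wildcard, so it
should not get left padding. If the state is reset, the second backslash is
taken as a fresh escape, the % is treated as a literal, and "bar" is padded
by mistake.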

Regards,

--
Fujii Masao
