Re: t1.col like '%t2.col%'

From:	"Dan Kaplan" <dkaplan(at)citizenhawk(dot)com>
To:
Cc:	<pgsql-performance(at)postgresql(dot)org>
Subject:	Re: t1.col like '%t2.col%'
Date:	2008-02-29 23:52:31
Message-ID:	001401c87b2e$26386000$1d00a8c0@dan
Lists:	pgsql-performance

I learned a little about pg_trgm here:
http://www.sai.msu.su/~megera/postgres/gist/pg_trgm/README.pg_trgm

But this seems like it's for finding similarities, not substrings. How can
I use it to speed up t1.col like '%t2.col%'?

Thanks,
Dan

-----Original Message-----
From: pgsql-performance-owner(at)postgresql(dot)org
[mailto:pgsql-performance-owner(at)postgresql(dot)org] On Behalf Of Oleg Bartunov
Sent: Wednesday, February 27, 2008 9:47 PM
To: Dan Kaplan
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: [PERFORM] t1.col like '%t2.col%'

On Wed, 27 Feb 2008, Dan Kaplan wrote:

> I've got a lot of rows in one table and a lot of rows in another table. I
> want to do a bunch of queries on their join column. One of these is like
> this: t1.col like '%t2.col%'

We have an idea for speeding up wildcard search at the expense of index
size: we have to index all permutations of the original word. Then we could
use GIN for queries like a*b.
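
One way to picture this idea is a permuterm-style index. The sketch below is only an illustration in Python, not anything that exists in PostgreSQL or GIN: it assumes "permutations" means the rotations of each word plus an end marker, and the helper names (`rotations`, `build_permuterm`, `wildcard`, `contains`) and the `$` marker are hypothetical. A query like a*b is rotated to b$a and answered with a prefix lookup over the sorted rotations.

```python
import bisect

END = "$"  # assumed end-of-word marker; must not occur in the indexed words


def rotations(word):
    """All rotations of word + END, e.g. 'ab' -> 'ab$', 'b$a', '$ab'."""
    marked = word + END
    return [marked[i:] + marked[:i] for i in range(len(marked))]


def build_permuterm(words):
    """Sorted list of (rotation, original word) pairs."""
    return sorted((rot, w) for w in set(words) for rot in rotations(w))


def prefix_lookup(entries, prefix):
    """Original words that have at least one rotation starting with prefix."""
    start = bisect.bisect_left(entries, (prefix,))
    found = set()
    for rot, word in entries[start:]:
        if not rot.startswith(prefix):
            break
        found.add(word)
    return found


def wildcard(entries, pattern):
    """Answer a pattern with exactly one '*', such as 'a*b'."""
    head, tail = pattern.split("*")
    return prefix_lookup(entries, tail + END + head)


def contains(entries, fragment):
    """Answer a '%fragment%' style query: words containing fragment."""
    return prefix_lookup(entries, fragment)


entries = build_permuterm(["alphab", "arab", "ab", "banana", "cab"])
print(sorted(wildcard(entries, "a*b")))
print(sorted(contains(entries, "an")))
```

The cost is the one mentioned above: a word of n characters produces n + 1 index entries.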

> I know that always sucks. I'm wondering how I can make it better. First, I
> should let you know that I can likely hold both of these tables entirely in
> ram. Since that's the case, would it be better to accomplish this with my
> programming language? Also you should know that in most cases, t1.col and
> t2.col is 2 words or less. I'm not sure if that matters, I mention it
> because it may make tsearch2 perform badly.

contrib/pg_trgm should help you.
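
Since both tables may fit in RAM, the trigram idea can also be tried outside the database. The following is a minimal Python sketch of the principle only, not of how pg_trgm is implemented (pg_trgm pads words with blanks, for example, and this sketch does not). It relies on one fact: if a t2 value is a substring of a t1 value, every trigram of the t2 value also occurs in the t1 value. The names `t1_rows`, `t2_rows`, `build_index` and `substring_matches` are hypothetical.

```python
from collections import defaultdict


def trigrams(text):
    """Set of all 3-character substrings of text (case-sensitive, no padding)."""
    return {text[i:i + 3] for i in range(len(text) - 2)}


def build_index(rows):
    """Map each trigram to the set of row positions whose text contains it."""
    index = defaultdict(set)
    for pos, text in enumerate(rows):
        for gram in trigrams(text):
            index[gram].add(pos)
    return index


def substring_matches(needle, rows, index):
    """Positions in rows whose text contains needle as a substring."""
    grams = trigrams(needle)
    if grams:
        # A matching row must contain every trigram of the needle.
        candidates = set.intersection(*(index.get(g, set()) for g in grams))
    else:
        # Needles shorter than 3 characters give no trigram to filter on,
        # so every row has to be checked.
        candidates = range(len(rows))
    # The trigram filter can return false positives, so recheck each candidate.
    return sorted(pos for pos in candidates if needle in rows[pos])


t1_rows = ["red apple pie", "green apple", "pineapple juice", "banana"]
t2_rows = ["apple", "an", "grape"]

index = build_index(t1_rows)
for value in t2_rows:
    print(value, substring_matches(value, t1_rows, index))
```

The final recheck is what keeps the result exact; the index only narrows down which rows need the real substring test. Values shorter than three characters get no benefit and fall back to a full scan.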

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru)
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83
