Re: ltree + gist index performance degrades significantly over a night

From: CG <cgg007(at)yahoo(dot)com>
To: Bernhard Weisshuhn <bkw(at)weisshuhn(dot)de>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: ltree + gist index performance degrades significantly over a night
Date: 2006-02-27 18:27:20
Message-ID: 20060227182720.48012.qmail@web32502.mail.mud.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general


That would do the job, wouldn't it? :)

I don't think it's a naive question at all. Its quite a good question, and the
solution you suggest is a good option to have, and would probably work better
than the single-vector ltree index for simple substring matching. In my case,
the ltree+gist index table actually contains more pages than the table of data
itself. I'd need to see if the space required for the varchar+btree tables are
comparible, better, or worse than the ltree+gist tables with regards to size.
Now that I think about it, building substrings out of ltree nodes would be
incredible overkill comapred to the effetiveness of the varchar+btree.

The extra advantages of ltree are the ability to match and extract nodes in a
path based not only on contents but also proximity, and aggregate on those
characteristics.

In my case this might be good for serial numbers where each digit or grouping
of digits have special values which would be used to aggregate on.

The ltree method was suggested to me a while back when I was frustrated with
the performance of "like '%something%'" ...

--- Bernhard Weisshuhn <bkw(at)weisshuhn(dot)de> wrote:

> On Mon, Feb 27, 2006 at 09:14:40AM -0800, CG <cgg007(at)yahoo(dot)com> wrote:
>
> > I could probably get even better performance out of the table, at the cost
> of a
> > significant increase in table and index size, by chopping up the columns
> into
> > smaller chunks.
> >
> > "Hello World" would yield
> >
> > 'h.e.l.l.o.w.o.r.l.d'
> > 'e.l.l.o.w.o.r.l.d'
> > 'l.l.o.w.o.r.l.d'
> > 'l.o.w.o.r.l.d'
> > 'o.w.o.r.l.d'
> > 'w.o.r.l.d'
> > 'o.r.l.d'
> > 'r.l.d'
> >
> > and using a wildcard search "search_vector ~ 'o.r.l.*'" would jump right to
> the
> > vectors which start with "o.r.l" ...
>
> But with this approch you'd be fine with a normal varchar_ops btree index
> for textfields and searching using "like 'world%'", wouldn't you?
> Or is the ltree approch more efficient?
>
> I'm not trying to be smart-assed, it's a naive question, since I'm
> looking for an efficient substring search solution in postgresql myself.
>
> regards,
> bkw
>
>

__________________________________________________
Do You Yahoo!?
Tired of spam? Yahoo! Mail has the best spam protection around
http://mail.yahoo.com

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Volkan YAZICI 2006-02-27 18:41:52 Breaking Path/Polygon Data into Pieces
Previous Message Emi Lu 2006-02-27 17:48:33 Re: A question about Vacuum analyze