Skip site navigation (1) Skip section navigation (2)

Re: ~* + LIMIT => infinite time?

From: Hannu Krosing <hannu(at)tm(dot)ee>
To: typea(at)l-i-e(dot)com
Cc: tgl(at)sss(dot)pgh(dot)pa(dot)us, pgsql-performance(at)postgresql(dot)org
Subject: Re: ~* + LIMIT => infinite time?
Date: 2002-12-16 00:03:54
Message-ID: (view raw, whole thread or download thread mbox)
Lists: pgsql-performance
typea(at)l-i-e(dot)com kirjutas P, 15.12.2002 kell 05:41:
> It occurred to me last night that the actual data *MIGHT* be involved --
> It's some OCR text, and there are a few scattered non-ASCII characters
> involved...  So *MAYBE* the actual text getting scanned could also be
> important.
> It seems unlikely, since the non-LIMIT query returns all the data just
> fine, but just in case...

Have you tried using DECLARE CURSOR...; FETCH 1; CLOSE CURSOR; instead
of LIMIT ?

> Here's a schema and a full dump for anybody that wants to dig in:

gzipping the data could make sense - data.sql goes from 200M to 60M ;)

> I could provide PHP source as well, or the query posted in this thread can
> serve as the test case.
> At the moment, I've altered the application to not use LIMIT when I have
> ~* in the query, and it "works" -- only the paging of results is broken,
> and the whole page takes twice as long to load as it should in those
> cases, since it's doing the same query twice and snarfing all the monster
> data and then throwing away the majority of rows in both cases.  I need
> the first row to get the highest score, and the rows for paging in the
> real application...
> Anyway, my point is that the queries seem fine without the LIMIT clause,
> and "hang" with both "~*" and LIMIT, and I've even gone so far as to
> incorporate that into the application logic for now, just to have a page
> that loads at all instead of one that hangs.
> Meanwhile, I guess I should flail at it and try 7.3 in the hopes the bug
> disappeared.

I tested (part of) it on 7.3 , had to manually change ::int to
case-when-then-else-end as there is no cast from bool to int in7.3

This ran fine:

  0 + case when (title ilike '%albert einstein%') then 10 else 0 end
    + case when  (    title iLIKE '%einstein%'
            AND title iLIKE '%albert%'
            AND (    (title ~* 'einstein.{0,20}albert')
                  OR (title ~* 'albert.{0,20}einstein'))) then 8 else 0
  as points
 FROM article
   OR (title iLIKE '%albert%')
   OR (author_flattened iLIKE '%albert%')
   OR (subject_flattened iLIKE '%albert%')
   OR (title iLIKE '%einstein%')
   OR (author_flattened iLIKE '%einstein%')
   OR (subject_flattened iLIKE '%einstein%')
ORDER BY points desc, volume, number, article.article

I also changed 
   "lower(field) like '%albert%'"
   "field ilike '%albert%'"

and got about 20% speed boost - EXPLAIN ANALYZE reported 0.189 insead of
0.263 sec as actual time.

>   I was hoping to know for sure that it was a fixed bug in
> that upgrade path.
> Boss actually said we should go ahead and upgrade just on principle
> anyway.  It's nice to have a smart boss. :-)
> ---------------------------(end of broadcast)---------------------------
> TIP 2: you can get off all lists at once with the unregister command
>     (send "unregister YourEmailAddressHere" to majordomo(at)postgresql(dot)org)
Hannu Krosing <hannu(at)tm(dot)ee>

In response to


pgsql-performance by date

Next:From: typeaDate: 2002-12-16 08:21:14
Subject: Re: ~* + LIMIT => infinite time?
Previous:From: Josh BerkusDate: 2002-12-15 20:22:25
Subject: Re: ~* + LIMIT => infinite time?

Privacy Policy | About PostgreSQL
Copyright © 1996-2017 The PostgreSQL Global Development Group