Re: ~* + LIMIT => infinite time?

From: Hannu Krosing <hannu(at)tm(dot)ee>
To: typea(at)l-i-e(dot)com
Cc: tgl(at)sss(dot)pgh(dot)pa(dot)us, pgsql-performance(at)postgresql(dot)org
Subject: Re: ~* + LIMIT => infinite time?
Date: 2002-12-16 00:03:54
Message-ID: 1039997033.12952.6.camel@rh72.home.ee
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

typea(at)l-i-e(dot)com kirjutas P, 15.12.2002 kell 05:41:
> It occurred to me last night that the actual data *MIGHT* be involved --
> It's some OCR text, and there are a few scattered non-ASCII characters
> involved... So *MAYBE* the actual text getting scanned could also be
> important.
>
> It seems unlikely, since the non-LIMIT query returns all the data just
> fine, but just in case...

Have you tried using DECLARE CURSOR...; FETCH 1; CLOSE CURSOR; instead
of LIMIT ?

> Here's a schema and a full dump for anybody that wants to dig in:
> http://bulletinarchive.org/pg_dump/

gzipping the data could make sense - data.sql goes from 200M to 60M ;)

> I could provide PHP source as well, or the query posted in this thread can
> serve as the test case.
>
> At the moment, I've altered the application to not use LIMIT when I have
> ~* in the query, and it "works" -- only the paging of results is broken,
> and the whole page takes twice as long to load as it should in those
> cases, since it's doing the same query twice and snarfing all the monster
> data and then throwing away the majority of rows in both cases. I need
> the first row to get the highest score, and the rows for paging in the
> real application...
>
> Anyway, my point is that the queries seem fine without the LIMIT clause,
> and "hang" with both "~*" and LIMIT, and I've even gone so far as to
> incorporate that into the application logic for now, just to have a page
> that loads at all instead of one that hangs.
>
> Meanwhile, I guess I should flail at it and try 7.3 in the hopes the bug
> disappeared.

I tested (part of) it on 7.3 , had to manually change ::int to
case-when-then-else-end as there is no cast from bool to int in7.3

This ran fine:

SELECT DISTINCT
*,
0 + case when (title ilike '%albert einstein%') then 10 else 0 end
+ case when ( title iLIKE '%einstein%'
AND title iLIKE '%albert%'
AND ( (title ~* 'einstein.{0,20}albert')
OR (title ~* 'albert.{0,20}einstein'))) then 8 else 0
end
as points
FROM article
WHERE FALSE
OR (title iLIKE '%albert%')
OR (author_flattened iLIKE '%albert%')
OR (subject_flattened iLIKE '%albert%')
OR (title iLIKE '%einstein%')
OR (author_flattened iLIKE '%einstein%')
OR (subject_flattened iLIKE '%einstein%')
ORDER BY points desc, volume, number, article.article
LIMIT 1 OFFSET 1;

I also changed
"lower(field) like '%albert%'"
to
"field ilike '%albert%'"

and got about 20% speed boost - EXPLAIN ANALYZE reported 0.189 insead of
0.263 sec as actual time.

> I was hoping to know for sure that it was a fixed bug in
> that upgrade path.
>
> Boss actually said we should go ahead and upgrade just on principle
> anyway. It's nice to have a smart boss. :-)
>
>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 2: you can get off all lists at once with the unregister command
> (send "unregister YourEmailAddressHere" to majordomo(at)postgresql(dot)org)
--
Hannu Krosing <hannu(at)tm(dot)ee>

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message typea 2002-12-16 08:21:14 Re: ~* + LIMIT => infinite time?
Previous Message Josh Berkus 2002-12-15 20:22:25 Re: ~* + LIMIT => infinite time?