Re: Planner should use index on a LIKE 'foo%' query

From: Moritz Onken <onken(at)houseofdesign(dot)de>
To:
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Planner should use index on a LIKE 'foo%' query
Date: 2008-06-30 16:42:07
Message-ID: 35332CE1-8962-40D2-99DA-16BA80087979@houseofdesign.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

>
> The thing here is that you are effectively causing Postgres to run a
> sub-select for each row of the "result" table, each time generating
> either an empty list or a list with one or more identical URLs. This
> is effectively forcing a nested loop. In a way, you have two
> constraints where you only need one.
>
> You can safely take out the constraint in the subquery, so it is
> like this:
>
> SELECT COUNT(*) FROM result WHERE url IN (SELECT shorturl FROM item);
>
> This will generate equivalent results, because those rows that
> didn't match the constraint wouldn't have affected the IN anyway.
> However, it will alter the performance, because the subquery will
> contain more results, but it will only be run once, rather than
> multiple times. This is effectively forcing a hash join (kind of).
>
> Whereas if you rewrite the query as I demonstrated earlier, then you
> allow Postgres to make its own choice about which join algorithm
> will work best.
>
> Matthew

Thank you! I learned a lot today :-)
I thought the subquery will be run on every row thus I tried to make
it as fast as possible by using a where clause. I didn't try your
first query on the hole table so it could be faster than mine approach.

greetings,

moritz

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Franck Routier 2008-06-30 16:57:58 Does max size of varchar influence index size
Previous Message Alvaro Herrera 2008-06-30 16:23:06 Re: VACUUM ANALYZE blocking both reads and writes to a table