Re: directory tree query with big planner variation

From: Axel Rau <Axel(dot)Rau(at)Chaos1(dot)DE>
To: Michael Stone <mstone+postgres(at)mathom(dot)us>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: directory tree query with big planner variation
Date: 2006-07-31 11:54:24
Message-ID: FA56C9FF-86FC-4AC5-B442-2C139167A331@Chaos1.DE
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance


Am 31.07.2006 um 13:15 schrieb Michael Stone:

> On Mon, Jul 31, 2006 at 12:48:11PM +0200, Axel Rau wrote:
>> WHERE P.path ~ '^%@/[^/]*/$' ) AS NLPC
>
> This can't be indexed. You might try something like WHERE P.path
> LIKE '%(at)%' AND P.path ~ '^%@/[^/]*/$'
Why does it quite well in this case:
---------------------------------------
-> Index Scan using path_name_idx on path p (cost=0.00..3.02 rows=1
width=97) (actual time=15.480..56.935 rows=27 loops=1)
Index Cond: ((path >= '/Users/axel/Library/
Preferences/'::text) AND (path < '/Users/axel/Library/
Preferences0'::text))
Filter: ((path ~ '^/Users/axel/Library/Preferences/[^/]*/
$'::text) AND (rtrim("replace"(path, '/Users/axel/Library/
Preferences/'::text, ''::text), '/'::text) <> ''::text))
---------------------------------------
as compared to this case(ignoring the index on path):
---------------------------------------
-> Index Scan using path_pkey on path p (cost=0.00..2567.57
rows=1941 width=97) (actual time=527.805..1521.911 rows=69 loops=1)
Filter: ((path ~ '^/Users/axel/[^/]*/$'::text) AND (rtrim
("replace"(path, '/Users/axel/'::text, ''::text), '/'::text) <>
''::text))
---------------------------------------
? With all longer path names, I get the above (good)result.
Should I put the rtrim/replace on the client side?
>
> The schema could be a lot more intelligent here. (E.g., store path
> seperately from file/directory name, store type (file or directory)
> seperately, etc.) Without improving the schema I don't think this
> will ever be a speed demon.
PATH holds complete pathnames of directories, FILENAME holds
filenames and pathname components.
Currently the schema is the lowest common denominator between SQLite,
MySQL and pg and the bacula people will stay with that (-;).
Axel
Axel Rau, ☀Frankfurt , Germany +49-69-951418-0

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Rod Taylor 2006-07-31 12:09:42 Re: sub select performance due to seq scans
Previous Message Michael Stone 2006-07-31 11:15:42 Re: directory tree query with big planner variation