Re: contrib/ltree patches

From: "Dan Langille" <dan(at)langille(dot)org>
To: Teodor Sigaev <teodor(at)stack(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org, oleg(at)sai(dot)msu(dot)su
Subject: Re: contrib/ltree patches
Date: 2002-11-27 15:24:05
Message-ID: 3DE49D45.28806.DD0519D0@localhost
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 27 Nov 2002 at 12:16, Teodor Sigaev wrote:

> Dan Langille wrote:
> > I have been looking at contrib/ltree in the PostgreSQL repository.
> > I've modified the code to allow / as a node delimiter instead of .
> > which is the default.

> What is the reason for changing delimiter?

My tree represents a file system. Here are some entries:

# select id, pathname from element_pathnames order by pathname;
77024 | doc/de_DE.ISO8859-1
77028 | doc/de_DE.ISO8859-1/books
84590 | doc/de_DE.ISO8859-1/books/Makefile.inc
77029 | doc/de_DE.ISO8859-1/books/faq
84591 | doc/de_DE.ISO8859-1/books/faq/Makefile
77030 | doc/de_DE.ISO8859-1/books/faq/book.sgml
77691 | doc/de_DE.ISO8859-1/books/handbook
77704 | doc/de_DE.ISO8859-1/books/handbook/Makefile
110592 | doc/de_DE.ISO8859-1/books/handbook/advanced-networking

> > Below are the patches to make this change. I have also moved the
> > delimiter to a DEFINE so that other customizations are easily done.
> > This is a work in progress.

> It's good.

Thank you. More patches will follow as I get closer to my objective.

> > -#define ISALNUM(x) ( isalnum((unsigned int)(x)) || (x) == '_' )
> > +#define ISALNUM(x) ( isalnum((unsigned int)(x)) || (x) == '_' ||
> > +#(x) == NODE_DELIMITER )

> It seems to me that it's mistake. ISALNUM shoud define correct
> character in name of node (level). Try to test with incorrect ltree
> value 'a..b'.

I just did some simple tests and I see what you mean:

ltree_test=# select * from tree;
id | pathname
----+------------------
1 | /ports
2 | ports/security
2 | ports//security
2 | /ports//security
2 | a..b
(5 rows)

Then I removed NODE_DELIMITER from ISALNUM and tried again:

ltree_test=# insert into tree values (2, '/ports//security');
ERROR: Syntax error in position 0 near '/'
ltree_test=# insert into tree values (2, 'ports//security');
ERROR: Syntax error in position 6 near '/'
ltree_test=# insert into tree values (2, 'ports/security');
INSERT 29955201 1
ltree_test=# insert into tree values (2, 'ports/security/');
ERROR: Unexpected end of line
ltree_test=# insert into tree values (2, 'ports/security/things');
INSERT 29955202 1

ltree_test=# select * from tree;
id | pathname
----+-----------------------
1 | /ports
2 | ports/security
2 | ports//security
2 | /ports//security
2 | a..b
2 | ports/security
2 | ports/security/things
(7 rows)

Removing NODE_DELIMITER from ISALNUM makes sense. Thank you. Here
is the reason why NODE_DELIMITER was added. My initial data sample
was of the form "/usr/local/" (i.e. it started with a
NODE_DELIMITER). I have since changed my data so it does not start
with a leading / because queries were not working.

Based upon the sample data I was using (approximately 120,000 nodes
as taken from a real file system), I had to change ISALNUM as I went
along. Here is the current definition for ISALNUM:

#define ISALNUM(x) ( isalnum((unsigned int)(x)) || (x) == '_' ||
(x) == '-' || (x) == '.' || (x) == '+' || (x) == ':' || (x) == '~' ||
(x) == '%' || (x) == ',' || (x) == '#')

Given that I am trying to allow any valid filename, I think ISALNUM
needs to allow any ASCII character.

I also think I will need to modify the parsing within lquery_in to
allow escaping of characters it recognizes but which may be part of a
file name (e.g. :%~ may be part of a file name, but these are special
characters to lquery_in). That I think will be the biggest change.

Thank you for your interest and help.

--
Dan Langille : http://www.langille.org/

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Merlin Moncure 2002-11-27 15:27:35 Re: Need Quote for 7.3
Previous Message Tom Lane 2002-11-27 15:23:39 Re: Boolean casting in 7.3 -> changed?