Re: WIP: SP-GiST, Space-Partitioned GiST

From: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To: oleg(at)sai(dot)msu(dot)su
Cc: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP: SP-GiST, Space-Partitioned GiST
Date: 2011-09-01 12:14:21
Message-ID: CAPpHfdtKdjaXdY93=O+EYht4wFgE2A2B8Sf3AErN4=p7gn+FBw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi!

Ie expect some problems in support of comparison operators for text, because
locale string comparison can have unexpected behaviour.
Let's see the example. Create table with words and add extra leading space
to some of them.

test=# create table dict(id serial, word text);
NOTICE: CREATE TABLE will create implicit sequence "dict_id_seq" for serial
column "dict.id"
CREATE TABLE
test=# \copy dict(word) from '/usr/share/dict/american-english';
test=# update dict set word = ' '||word where id%2=0;
UPDATE 49284

I use Ubuntu 11.04 with ru_RU.utf8 locale. So, comparison operators ignores
leading spaces.

test=# select * from dict where word between 'cart' and 'cary';
id | word
-------+----------------
3029 | Carter
3031 | Cartesian
3033 | Carthage's
3035 | Cartier
3037 | Cartwright
3039 | Caruso
3041 | Carver
28419 | cart
28421 | carted
28423 | cartel's
28425 | cartilage
28427 | cartilages
28429 | carting
28431 | cartographer's
28433 | cartography
28435 | carton
28437 | cartons
28439 | cartoon's
28441 | cartooning
28443 | cartoonist's
28445 | cartoons
28447 | cartridge's
28449 | carts
28451 | cartwheel's
28453 | cartwheeling
28455 | carve
28457 | carver
28459 | carvers
28461 | carving
28463 | carvings
3030 | Carter's
3032 | Carthage
3034 | Carthaginian
3036 | Cartier's
3038 | Cartwright's
3040 | Caruso's
3042 | Carver's
28420 | cart's
28422 | cartel
28424 | cartels
28426 | cartilage's
28428 | cartilaginous
28430 | cartographer
28432 | cartographers
28434 | cartography's
28436 | carton's
28438 | cartoon
28440 | cartooned
28442 | cartoonist
28444 | cartoonists
28446 | cartridge
28448 | cartridges
28450 | cartwheel
28452 | cartwheeled
28454 | cartwheels
28456 | carved
28458 | carver's
28460 | carves
28462 | carving's
(59 rows)

But if I create spgist index query result differs.

test=# create index dict_idx on dict using spgist (word);
CREATE INDEX
test=# select * from dict where word between 'cart' and 'cary';
id | word
-------+----------------
28419 | cart
28421 | carted
28423 | cartel's
28425 | cartilage
28427 | cartilages
28429 | carting
28431 | cartographer's
28433 | cartography
28435 | carton
28437 | cartons
28439 | cartoon's
28441 | cartooning
28443 | cartoonist's
28445 | cartoons
28447 | cartridge's
28449 | carts
28451 | cartwheel's
28453 | cartwheeling
28455 | carve
28457 | carver
28459 | carvers
28461 | carving
28463 | carvings
(23 rows)

------
With best regards,
Alexander Korotkov.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2011-09-01 12:40:31 Re: [v9.1] sepgsql - userspace access vector cache
Previous Message Oleg Bartunov 2011-09-01 12:10:24 Re: WIP: SP-GiST, Space-Partitioned GiST