Custom index structure and strange count problem

From: Carsten Kropf <ckropf2(at)fh-hof(dot)de>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Custom index structure and strange count problem
Date: 2010-06-09 08:35:01
Message-ID: 965BE573-CEE9-455E-9800-1617EAC89990@fh-hof.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi *,
during the last few months I've been building a new index structure as part of a research project.
Everything seems to work properly, however I have some strange issues with the count sql command.
I introduced some custom structures (mainly document and hybrid_query) with which my index access method is supposed to work.
There is an operator "&&" which is supposed to use my index structure (what also works properly).
The function that maps to the operator "&&" is called hybrid_index_query, which I use to compare my results given from the index with the real results that are supposed to appear in the final result set.
Having described the outer circumstances (in a very short way), I will now show the strange stuff that happens:

test=# select id from documents where hybrid_index_query(to_document(words, points), row('radio pleas news'::tsvector, '[(-90,-180),(90, 180)]')) order by id;
id
------
2137
2151
2168
2207
2208
2209
2210
2211
2266
2296
(10 rows)

This query takes a sequential scan and works properly (returning 10 rows).

test=# select id from documents where to_document(words, points) && row('pleas radio news'::tsvector, '[(-90,-180),(90,180)]') order by id;
id
------
2137
2151
2168
2207
2208
2209
2210
2211
2266
2296
(10 rows)

This query uses my index structure and returns the same result as in the sequential scan above.
Until here, everything seems to work fine. However, if I issue the same queries using the count aggregate function in SQL, there are some odd results:
test=# select count(*) from documents where hybrid_index_query(to_document(words, points), row('radio pleas news'::tsvector, '[(-90,-180),(90, 180)]'));
count
-------
10
(1 row)

Using the sequential scan, still, everything seems fine.
However, if I now do the index scan (my function will be called 11 times, returning false at the end), I get the following result:
test=# select count(*) from documents where to_document(words, points) && row('pleas radio news'::tsvector, '[(-90,-180),(90,180)]');
count
-------
7
(1 row)

This seems strange, because the same query returned 10 rows (when I didn't use the aggregate). If I issue queries that count the id column, I receive the following:
test=# select count(id) from documents where hybrid_index_query(to_document(words, points), row('radio pleas news'::tsvector, '[(-90,-180),(90, 180)]'));
count
-------
10
(1 row)

test=# select count(id) from documents where to_document(words, points) && row('pleas radio news'::tsvector, '[(-90,-180),(90,180)]');
count
-------
10
(1 row)

These two queries do again return the same results.
Thus, I don't know, what's wrong here, does anybody know about that behaviour, or is it my fault that the results are wrong, somehow?
Thanks in advance

Carsten Kropf

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2010-06-09 08:44:01 Re: [BUGS] Server crash while trying to read expression using pg_get_expr()
Previous Message Florian Pflug 2010-06-09 07:59:47 Re: LLVM / clang