Quick Links

Re: GIN improvements part2: fast scan

From:	Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To:	Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc:	pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: GIN improvements part2: fast scan
Date:	2013-06-24 22:20:59
Message-ID:	CAPpHfduJ-X9WKR9h-vyTrNEVNR37ZDx05qOLkA3ys3j-14uKXA@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Fri, Jun 21, 2013 at 11:43 PM, Heikki Linnakangas <
hlinnakangas(at)vmware(dot)com> wrote:

> On 19.06.2013 11:56, Alexander Korotkov wrote:
>
>> On Wed, Jun 19, 2013 at 12:49 PM, Heikki Linnakangas<
>> hlinnakangas(at)vmware(dot)com> wrote:
>>
>> On 19.06.2013 11:30, Alexander Korotkov wrote:
>>>
>>> On Wed, Jun 19, 2013 at 11:48 AM, Heikki Linnakangas<
>>>> hlinnakangas(at)vmware(dot)com> wrote:
>>>>
>>>> On 18.06.2013 23:59, Alexander Korotkov wrote:
>>>>
>>>>>
>>>>> I would like to illustrate that on example. Imagine you have fulltext
>>>>>
>>>>>> query
>>>>>> "rare_term& frequent_term". Frequent term has large posting tree
>>>>>> while
>>>>>>
>>>>>> rare term has only small posting list containing iptr1, iptr2 and
>>>>>> iptr3.
>>>>>> At
>>>>>> first we get iptr1 from posting list of rare term, then we would like
>>>>>> to
>>>>>> check whether we have to scan part of frequent term posting tree where
>>>>>> iptr
>>>>>> < iptr1. So we call pre_consistent([false, true]), because we know
>>>>>> that
>>>>>> rare term is not present for iptr< iptr2. pre_consistent returns
>>>>>> false.
>>>>>> So
>>>>>> we can start scanning frequent term posting tree from iptr1. Similarly
>>>>>> we
>>>>>> can skip lags between iptr1 and iptr2, iptr2 and iptr3, from iptr3 to
>>>>>> maximum possible pointer.
>>>>>>
>>>>>>
>>>>>> Thanks, now I understand the rare-term& frequent-term problem.
>>>>> Couldn't
>>>>>
>>>>> you do that with the existing consistent function? I don't see why you
>>>>> need
>>>>> the new pre-consistent function for this.
>>>>>
>>>>>
>>>> In the case of two entries I can. But in the case of n entries things
>>>> becomes more complicated. Imagine you have "term_1& term_2& ...&
>>>> term_n"
>>>>
>>>> query. When you get some item pointer from term_1 you can skip all the
>>>> lesser item pointers from term_2, term_3 ... term_n. But if all you have
>>>> for it is consistent function you have to call it with following check
>>>> arguments:
>>>> 1) [false, false, false, ... , false]
>>>> 2) [false, true, false, ... , false]
>>>> 3) [false, false, true, ... , false]
>>>> 4) [false, true, true, ..., false]
>>>> ......
>>>> i.e. you have to call it 2^(n-1) times. But if you know the query
>>>> specific
>>>> (i.e. in opclass) it's typically easy to calculate exactly what we need
>>>> in
>>>> single pass. That's why I introduced pre_consistent.
>>>>
>>>>
>>> Hmm. So how does that work with the pre-consistent function? Don't you
>>> need to call that 2^(n-1)-1 times as well?
>>>
>>
>>
>> I call pre-consistent once with [false, true, true, ..., true].
>> Pre-consistent knows that each true passed to it could be false positive.
>> So, if it returns false it guarantees that consistent will be false for
>> all
>> possible combinations.
>>
>
> Ok, I see.
>
> I spent some time pondering this. I'd like to find a way to do something
> about this without requiring another user-defined function. A couple of
> observations:
>
> 1. The profile of that rare-term & frequent-term quest, without any patch,
> looks like this:
>
> 28,55% postgres ginCompareItemPointers
> 19,36% postgres keyGetItem
> 15,20% postgres scanGetItem
> 7,75% postgres checkcondition_gin
> 6,25% postgres gin_tsquery_consistent
> 4,34% postgres TS_execute
> 3,85% postgres callConsistentFn
> 3,64% postgres FunctionCall8Coll
> 3,19% postgres check_stack_depth
> 2,60% postgres entryGetNextItem
> 1,35% postgres entryGetItem
> 1,25% postgres MemoryContextReset
> 1,12% postgres MemoryContextSwitchTo
> 0,31% libc-2.17.so __memcpy_ssse3_back
> 0,24% postgres collectMatchesForHeapRow
>
> I was quite surprised by seeing ginCompareItemPointers at the top. It
> turns out that it's relatively expensive to do comparisons in the format we
> keep item pointers, packed in 6 bytes, in 3 int16s. I hacked together a
> patch to convert ItemPointers into uint64s, when dealing with them in
> memory. That helped quite a bit.
>
> Another important thing in the above profile is that calling
> consistent-function is taking up a lot of resources. And in the example
> test case you gave, it's called with the same arguments every time. Caching
> the result of consistent-function would be a big win.
>
> I wrote a quick patch to do that caching, and together with the
> itempointer hack, I was able to halve the runtime of that test case. That's
> impressive, we probably should pursue that low-hanging fruit, but it's
> still slower than your "fast scan" patch by a factor of 100x. So clearly we
> do need an algorithmic improvement here, along the lines of your patch, or
> something similar.
>

For sure, many advantages can be achieved without "fast scan". For example,
sphinx is known to be fast, but it straightforwardly scan each posting list
just like GIN now.

> 2. There's one trick we could do even without the pre-consistent function,
> that would help the particular test case you gave. Suppose that you have
> two search terms A and B. If you have just called consistent on a row that
> matched term A, but not term B, you can skip any subsequent rows in the
> scan that match A but not B. That means that you can skip over to the next
> row that matches B. This is essentially the same thing you do with the
> pre-consistent function, it's just a degenerate case of it. That helps as
> long as the search contains only one frequent term, but if it contains
> multiple, then you have to still stop at every row that matches more than
> one of the frequent terms.
>

Yes, two terms case is confluent and there is no direct need of
preConsistent.

> 3. I'm still not totally convinced that we shouldn't just build the "truth
> table" by calling the regular consistent function with all the combinations
> of matching keys, as discussed above. I think that would work pretty well
> in practice, for queries with up to 5-10 frequent terms. Per the profiling,
> it probably would make sense to pre-compute such a table anyway, to avoid
> calling consistent repeatedly.
>

Why do you mention 5-10 _frequent_ items? If we have 5-10 frequent items
and 20 rare items we would have to create "truth table" of frequent items
for each new combination of rare items. For 20 rare items truth
combinations can be unique for each item pointer, in that case you would
have to calculate small "truth table" of frequent items for each item
pointers. And then it can appear to be not so small. I mean it seems to me
that we should take into account both "frequent" and "rare" items when
talking about "truth table".

> 4. If we do go with a new function, I'd like to just call it "consistent"
> (or consistent2 or something, to keep it separate form the old consistent
> function), and pass it a tri-state input for each search term. It might not
> be any different for the full-text search implementation, or any of the
> other ones for that matter, but I think it would be a more understandable
> API.

Understandable API makes sense. But for now, I can't see even potentional
usage of third state (exact false). Also, with preConsistent interface "as
is" in some cases we can use old consistent method as both consistent and
preConsistent when it implements monotonous boolean function. For example,
it's consistent function for opclasses of arrays.

Revised version of patch is attaches with more comments and docs.

------
With best regards,
Alexander Korotkov.

Attachment	Content-Type	Size
gin_fast_scan.4.patch.gz	application/x-gzip	8.2 KB

In response to

Re: GIN improvements part2: fast scan at 2013-06-21 19:43:01 from Heikki Linnakangas

Responses

Re: GIN improvements part2: fast scan at 2013-06-28 19:31:03 from Alexander Korotkov
Re: GIN improvements part2: fast scan at 2013-07-06 16:48:00 from Tomas Vondra

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Alexander Korotkov	2013-06-24 22:24:37	Re: GIN improvements part 3: ordering in index
Previous Message	Michael Paquier	2013-06-24 22:19:40	Re: Support for REINDEX CONCURRENTLY