Re: GIN improvements part2: fast scan

From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: GIN improvements part2: fast scan
Date: 2013-06-21 19:43:01
Message-ID: 51C4ACC5.2090707@vmware.com
Lists: pgsql-hackers

On 19.06.2013 11:56, Alexander Korotkov wrote:
> On Wed, Jun 19, 2013 at 12:49 PM, Heikki Linnakangas<
> hlinnakangas(at)vmware(dot)com> wrote:
>
>> On 19.06.2013 11:30, Alexander Korotkov wrote:
>>
>>> On Wed, Jun 19, 2013 at 11:48 AM, Heikki Linnakangas<
>>> hlinnakangas(at)vmware(dot)com> wrote:
>>>
>>> On 18.06.2013 23:59, Alexander Korotkov wrote:
>>>>
>>>>> I would like to illustrate that with an example. Imagine you have the
>>>>> fulltext query "rare_term & frequent_term". The frequent term has a
>>>>> large posting tree, while the rare term has only a small posting list
>>>>> containing iptr1, iptr2 and iptr3. At first we get iptr1 from the
>>>>> posting list of the rare term, then we would like to check whether we
>>>>> have to scan the part of the frequent term's posting tree where iptr
>>>>> < iptr1. So we call pre_consistent([false, true]), because we know
>>>>> that the rare term is not present for iptr < iptr1. pre_consistent
>>>>> returns false, so we can start scanning the frequent term's posting
>>>>> tree from iptr1. Similarly we can skip the gaps between iptr1 and
>>>>> iptr2, between iptr2 and iptr3, and from iptr3 to the maximum
>>>>> possible pointer.
>>>>>
>>>>>
>>>> Thanks, now I understand the rare-term & frequent-term problem.
>>>> Couldn't you do that with the existing consistent function? I don't
>>>> see why you need the new pre-consistent function for this.
>>>>
>>>>
>>>
>>> In the case of two entries I can. But in the case of n entries things
>>> become more complicated. Imagine you have a "term_1 & term_2 & ... &
>>> term_n" query. When you get some item pointer from term_1 you can skip
>>> all the lesser item pointers from term_2, term_3 ... term_n. But if
>>> all you have for it is the consistent function, you have to call it
>>> with the following check arguments:
>>> 1) [false, false, false, ... , false]
>>> 2) [false, true, false, ... , false]
>>> 3) [false, false, true, ... , false]
>>> 4) [false, true, true, ..., false]
>>> ......
>>> i.e. you have to call it 2^(n-1) times. But if you know the query
>>> specifics (i.e. in the opclass) it's typically easy to calculate
>>> exactly what we need in a single pass. That's why I introduced
>>> pre_consistent.
>>>
>>>
>>
>> Hmm. So how does that work with the pre-consistent function? Don't you
>> need to call that 2^(n-1)-1 times as well?
>
>
> I call pre-consistent once with [false, true, true, ..., true].
> Pre-consistent knows that each true passed to it could be a false
> positive. So, if it returns false, it guarantees that consistent will
> be false for all possible combinations.

Ok, I see.

I spent some time pondering this. I'd like to find a way to do something
about this without requiring another user-defined function. A couple of
observations:

1. The profile of that rare-term & frequent-term query, without any
patch, looks like this:

28.55% postgres ginCompareItemPointers
19.36% postgres keyGetItem
15.20% postgres scanGetItem
7.75% postgres checkcondition_gin
6.25% postgres gin_tsquery_consistent
4.34% postgres TS_execute
3.85% postgres callConsistentFn
3.64% postgres FunctionCall8Coll
3.19% postgres check_stack_depth
2.60% postgres entryGetNextItem
1.35% postgres entryGetItem
1.25% postgres MemoryContextReset
1.12% postgres MemoryContextSwitchTo
0.31% libc-2.17.so __memcpy_ssse3_back
0.24% postgres collectMatchesForHeapRow

I was quite surprised to see ginCompareItemPointers at the top. It
turns out that comparisons are relatively expensive in the format we
keep item pointers in: packed into 6 bytes, as three int16s. I hacked
together a patch to convert ItemPointers into uint64s when dealing
with them in memory. That helped quite a bit.

Another important thing in the above profile is that calling
consistent-function is taking up a lot of resources. And in the example
test case you gave, it's called with the same arguments every time.
Caching the result of consistent-function would be a big win.

I wrote a quick patch to do that caching, and together with the
itempointer hack, I was able to halve the runtime of that test case.
That's impressive, and we probably should pursue that low-hanging
fruit, but it's still slower than your "fast scan" patch by a factor
of 100. So clearly we do need an algorithmic improvement here, along
the lines of your patch, or something similar.

2. There's one trick we could do even without the pre-consistent
function, that would help the particular test case you gave. Suppose
that you have two search terms A and B. If you have just called
consistent on a row that matched term A, but not term B, you can skip
any subsequent rows in the scan that match A but not B. That means that
you can skip over to the next row that matches B. This is essentially
the same thing you do with the pre-consistent function, it's just a
degenerate case of it. That helps as long as the search contains only
one frequent term, but if it contains multiple, you still have to
stop at every row that matches more than one of the frequent terms.

3. I'm still not totally convinced that we shouldn't just build the
"truth table" by calling the regular consistent function with all the
combinations of matching keys, as discussed above. I think that would
work pretty well in practice, for queries with up to 5-10 frequent
terms. Per the profiling, it probably would make sense to pre-compute
such a table anyway, to avoid calling consistent repeatedly.

4. If we do go with a new function, I'd like to just call it
"consistent" (or consistent2 or something, to keep it separate from
the old consistent function), and pass it a tri-state input for each
search term. It might not be any different for the full-text search
implementation, or any of the other ones for that matter, but I think
it would be a more understandable API.

- Heikki
