Re: Remove 1MB size limit in tsvector

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Ildus Kurbangaliev <i(dot)kurbangaliev(at)postgrespro(dot)ru>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Remove 1MB size limit in tsvector
Date: 2017-09-07 21:08:14
Message-ID: f240a088-aab8-832c-f4e5-f6fdf2624ac8@2ndquadrant.com
Lists: pgsql-hackers

Hi,

On 08/17/2017 12:23 PM, Ildus Kurbangaliev wrote:
> In my benchmarks when database fits into buffers (so it's measurement of
> the time required for the tsvectors conversion) it gives me these
> results:
>
> Without conversion:
>
> $ ./tsbench2 -database test1 -bench_time 300
> 2017/08/17 12:04:44 Number of connections: 4
> 2017/08/17 12:04:44 Database: test1
> 2017/08/17 12:09:44 Processed: 51419
>
> With conversion:
>
> $ ./tsbench2 -database test1 -bench_time 300
> 2017/08/17 12:14:31 Number of connections: 4
> 2017/08/17 12:14:31 Database: test1
> 2017/08/17 12:19:31 Processed: 43607
>
> I ran a bunch of these tests, and these results are stable on my
> machine. So in these specific tests performance regression about 15%.
>
> Same time I think this could be the worst case, because usually data
> is on disk and conversion will not affect so much to performance.
>

That seems like a fairly significant regression, TBH. I don't quite
agree we can simply assume in-memory workloads don't matter; plenty of
databases have a 99% cache hit ratio (particularly when considering not
just shared buffers, but also the page cache).
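
For reference, the quoted "about 15%" can be checked directly from the two
"Processed" counts. The snippet below is only a rough sketch of that
arithmetic: the function names are made up for illustration, and it assumes
both runs lasted the full 300 seconds given by -bench_time. With the numbers
above it works out to roughly 15.2% fewer processed items.

```python
# Sketch: recompute the regression from the two quoted benchmark runs.
# Assumption: both runs used the same duration (-bench_time 300) and the
# same number of connections, so the "Processed" counts are comparable.

BENCH_TIME_SECONDS = 300          # from "-bench_time 300" in the quoted runs
PROCESSED_WITHOUT_CONVERSION = 51419
PROCESSED_WITH_CONVERSION = 43607


def throughput(processed: int, seconds: int) -> float:
    """Items processed per second over the whole run."""
    return processed / seconds


def regression_percent(baseline: int, candidate: int) -> float:
    """Relative drop of candidate versus baseline, in percent."""
    return (baseline - candidate) / baseline * 100.0


if __name__ == "__main__":
    base = throughput(PROCESSED_WITHOUT_CONVERSION, BENCH_TIME_SECONDS)
    conv = throughput(PROCESSED_WITH_CONVERSION, BENCH_TIME_SECONDS)
    drop = regression_percent(PROCESSED_WITHOUT_CONVERSION,
                              PROCESSED_WITH_CONVERSION)
    print(f"without conversion: {base:.1f} items/s")
    print(f"with conversion:    {conv:.1f} items/s")
    print(f"regression:         {drop:.1f}%")
```

This only restates the single pair of runs quoted above; it says nothing
about variance between runs, which is one more reason to have the benchmark
itself available.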

Can you share the benchmarks, so that others can retry running them?

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
