Re: Btree Index on PostgreSQL and Wiredtiger (MongoDB3.2)

From: Mark Kirkwood <mark(dot)kirkwood(at)catalyst(dot)net(dot)nz>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: kskim(at)bitnine(dot)net, Lukas Fittl <lukas(at)fittl(dot)com>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Btree Index on PostgreSQL and Wiredtiger (MongoDB3.2)
Date: 2016-08-15 07:04:01
Message-ID: ae92d93a-e00d-4fe7-ece3-f452bab73979@catalyst.net.nz
Views: Raw Message | Whole Thread | Download mbox
Thread:
Lists: pgsql-hackers

On 13/08/16 05:44, Jeff Janes wrote:
> On Fri, Aug 12, 2016 at 1:40 AM, Mark Kirkwood
>> However your index rebuild gets you from 5 to 3 GB - does that really help
>> performance significantly?
> It can make a big difference, depending on how much RAM you have.
>

Yeah - I suspect this is the issue - loading up a similar type of schema
with records with a primary key of form 'userxxxxx' for (uniformly)
randomly distributed xxxxx... (I was gonna use the Yahoo benchmark but
it is soooo slow...). Also I'm using 10000000 rows instead of 100000000
to avoid waiting a long time (10000000 should be enough to show the point):

prefix=# \d prefix
Table "public.prefix"
Column | Type | Modifiers
--------+-----------------------+-----------
uid | character varying(30) | not null
filler | character(255) |
Indexes:
"prefix_pkey" PRIMARY KEY, btree (uid)

Doing an uncached indexed read by forcing a buffer cache clear:

# echo 3 > /proc/sys/vm/drop_caches

prefix=# SELECT
relfilenode,relname,reltuples,pg_relation_size(oid)/1024/1024 AS mb
FROM pg_class WHERE relname LIKE 'prefix%';
relfilenode | relname | reltuples | mb
-------------+-------------+-----------+-----
6017817 | prefix | 1e+07 | 422
6017819 | prefix_pkey | 1e+07 | 391
(2 rows)

prefix=# EXPLAIN ANALYZE SELECT count(*)
FROM prefix WHERE uid='user10000';
QUERY PLAN

--------------------------------------------------------------------------------
-----------------------------------------------
Aggregate (cost=8.46..8.46 rows=1 width=0) (actual time=3.408..3.408
rows=1 lo
ops=1)
-> Index Only Scan using prefix_pkey on prefix (cost=0.43..8.45
rows=1 widt
h=0) (actual time=3.406..3.406 rows=0 loops=1)
Index Cond: (uid = 'user10000'::text)
Heap Fetches: 0
Planning time: 19.362 ms
Execution time: 3.429 ms
(6 rows)

Repeating this after REINDEX:

# echo 3 > /proc/sys/vm/drop_caches

prefix=# SELECT
relfilenode,relname,reltuples,pg_relation_size(oid)/1024/1024 AS mb
FROM pg_class WHERE relname LIKE 'prefix%';
relfilenode | relname | reltuples | mb
-------------+-------------+-----------+-----
6017817 | prefix | 1e+07 | 422
6017819 | prefix_pkey | 1e+07 | 300
(2 rows)

prefix=# EXPLAIN ANALYZE SELECT count(*)
FROM prefix WHERE uid='user10000';
QUERY PLAN

--------------------------------------------------------------------------------
-----------------------------------------------
Aggregate (cost=8.46..8.46 rows=1 width=0) (actual time=3.868..3.868
rows=1 lo
ops=1)
-> Index Only Scan using prefix_pkey on prefix (cost=0.43..8.45
rows=1 widt
h=0) (actual time=3.866..3.866 rows=0 loops=1)
Index Cond: (uid = 'user10000'::text)
Heap Fetches: 0
Planning time: 19.366 ms
Execution time: 3.889 ms
(6 rows)

So certainly not significantly *slower* with the physically bigger
index. This suggests that Jeff's analysis was spot on - likely that the
larger index didn't fix in RAM.

Cheers

Mark

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Artur Zakirov 2016-08-15 08:24:07 Re: to_date_valid()
Previous Message Venkata B Nagothi 2016-08-15 06:33:27 patch proposal