Re: char() overhead on read-only workloads not so insignifcant as the docs claim it is...

From: Gurjeet Singh <singh(dot)gurjeet(at)gmail(dot)com>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Cc: Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>
Subject: Re: char() overhead on read-only workloads not so insignifcant as the docs claim it is...
Date: 2009-06-15 12:28:07
Message-ID: 65937bea0906150528h497b8c4ax98876789f713181e@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Comments?

On Sat, Jun 13, 2009 at 3:44 PM, Stefan Kaltenbrunner
<stefan(at)kaltenbrunner(dot)cc> wrote:

> I'm currently doing some benchmarking on a Nehalem box(
> http://www.kaltenbrunner.cc/blog/index.php?/archives/26-Benchmarking-8.4-Chapter-1Read-Only-workloads.html)
> with 8.4 and while investigating what looks like issues in pgbench I also
> noticed that using char() has more than a negligable overhead on some (very
> special) readonly(!) workloads.
>
> for example running sysbench in read-only mode against 8.4 results in a
> profile(for the full run) that looks similiar to:
>
> samples % symbol name
> 981690 11.0656 bcTruelen
> 359183 4.0487 index_getnext
> 311128 3.5070 AllocSetAlloc
> 272330 3.0697 hash_search_with_hash_value
> 258157 2.9099 LWLockAcquire
> 195673 2.2056 _bt_compare
> 190303 2.1451 slot_deform_tuple
> 168101 1.8948 PostgresMain
> 164191 1.8508 _bt_checkkeys
> 126110 1.4215 FunctionCall2
> 123965 1.3973 SearchCatCache
> 120629 1.3597 LWLockRelease
>
> the default sysbench mode actually uses a number of different queries and
> the ones dealing with char() are actually only a small part of the full set
> of queries sent.
> The specific query is causing bcTruelen to show up in the profile is:
>
> "SELECT c from sbtest where id between $1 and $2 order by c" where the
> parameters are for example
> $1 = '5009559', $2 = '5009658' - ie ranges of 100.
>
>
> benchmarking only that query results in:
>
> samples % symbol name
> 2148182 23.5861 bcTruelen
> 369463 4.0565 index_getnext
> 362784 3.9832 AllocSetAlloc
> 284198 3.1204 slot_deform_tuple
> 185279 2.0343 _bt_checkkeys
> 180119 1.9776 LWLockAcquire
> 172733 1.8965 appendBinaryStringInfo
> 144158 1.5828 internal_putbytes
> 141040 1.5486 AllocSetFree
> 138093 1.5162 printtup
> 124255 1.3643 hash_search_with_hash_value
> 117054 1.2852 heap_form_minimal_tuple
>
> at around 46000 queries/s
>
> changing the fault sysbench schema from:
>
> Table "public.sbtest"
> Column | Type | Modifiers
>
> --------+----------------+-----------------------------------------------------
> id | integer | not null default
> nextval('sbtest_id_seq'::regclass)
> k | integer | not null default 0
> c | character(120) | not null default ''::bpchar
> pad | character(60) | not null default ''::bpchar
> Indexes:
> "sbtest_pkey" PRIMARY KEY, btree (id)
> "k" btree (k)
>
>
> to
> Table "public.sbtest"
> Column | Type | Modifiers
>
> --------+-------------------+-----------------------------------------------------
> id | integer | not null default
> nextval('sbtest_id_seq'::regclass)
> k | integer | not null default 0
> c | character varying | not null default ''::character varying
> pad | character(60) | not null default ''::bpchar
> Indexes:
> "sbtest_pkey" PRIMARY KEY, btree (id)
> "k" btree (k)
>
> results in a near 50%(!) speedup in terms of tps to around 67000 queries/s.
> This is however an extreme case because the c column actually contains no
> data at all (except for an empty string).
>
> the profile for the changed testcase looks like:
> 430797 5.2222 index_getnext
> 396750 4.8095 AllocSetAlloc
> 345508 4.1883 slot_deform_tuple
> 228222 2.7666 appendBinaryStringInfo
> 227766 2.7610 _bt_checkkeys
> 193818 2.3495 LWLockAcquire
> 179925 2.1811 internal_putbytes
> 168871 2.0471 printtup
> 152026 1.8429 AllocSetFree
> 146333 1.7739 heap_form_minimal_tuple
> 144305 1.7493 FunctionCall2
> 128320 1.5555 hash_search_with_hash_value
>
>
> at the very least we should reconsider this part of our docs:
>
> " There is no performance difference between these three types, apart from
> increased storage space when using the blank-padded type, and a few extra
> CPU cycles to check the length when storing into a length-constrained
> column."
>
> from http://www.postgresql.org/docs/8.4/static/datatype-character.html
>
>
>
> regards
>
> Stefan
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>

--
Lets call it Postgres

EnterpriseDB http://www.enterprisedb.com

gurjeet[(dot)singh](at)EnterpriseDB(dot)com
singh(dot)gurjeet(at){ gmail | hotmail | indiatimes | yahoo }.com
Mail sent from my BlackLaptop device

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2009-06-15 13:51:47 Re: machine-readable explain output
Previous Message Robert Haas 2009-06-15 12:01:13 Re: machine-readable explain output