Quick Links

Performance difference in accessing differrent columns in a Postgres Table

From:	Dinesh Kumar <dns98944(at)gmail(dot)com>
To:	pgsql-performance(at)postgresql(dot)org
Subject:	Performance difference in accessing differrent columns in a Postgres Table
Date:	2018-07-29 05:38:31
Message-ID:	CAEe=mRnNNL3RDKJDmY=_mpcpAb5ugYL9NcchELa6Qgtoz2NjCw@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-performance

Hello All,

I created a table with 200 bigint column, 200 varchar column. (Postgres
10.4)

create table i200c200 ( pk bigint primary key, int1 bigint, int2
bigint,....., int200 bigint, char1 varchar(255),......, char200
varchar(255)) ;

Inserted values only in pk,int1,int200 columns with some random data ( from
generate series) and remaining columns are all null. The table has 1000000
rows.

I found performance variance between accessing int1 and int200 column which
is quite large.

Reports from pg_stat_statements:

query | total_time | min_time |
max_time | mean_time | stddev_time
-----------------------------------------+------------+----------+----------+-----------+--------------------
select pk,int1 from i200c200 limit 200 | 0.65 | 0.102 |
0.138 | 0.13 | 0.0140142784330839
select pk,int199 from i200c200 limit $1 | 1.207 | 0.18 |
0.332 | 0.2414 | 0.0500583659341773
select pk,int200 from i200c200 limit 200| 1.67 | 0.215 |
0.434 | 0.334 | 0.0697825193010399

Explain Analyse:

explain analyse select pk,int1 from i200c200 limit 1000;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------
Limit (cost=0.00..23.33 rows=1000 width=16) (actual
time=0.014..0.390 rows=1000 loops=1)
-> Seq Scan on i200c200 (cost=0.00..23334.00 rows=1000000
width=16) (actual time=0.013..0.268 rows=1000 loops=1)
Planning time: 0.066 ms
Execution time: 0.475 ms

explain analyse select pk,int200 from i200c200 limit 1000;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------
Limit (cost=0.00..23.33 rows=1000 width=16) (actual
time=0.012..1.001 rows=1000 loops=1)
-> Seq Scan on i200c200 (cost=0.00..23334.00 rows=1000000
width=16) (actual time=0.011..0.894 rows=1000 loops=1)
Planning time: 0.049 ms
Execution time: 1.067 ms

I am curious in getting this postgres behaviour and its internals.

Note: I have the tried the same query with int199 column which is null in
all rows,it is still performance variant.Since,postgres doesn't store null
values in data instead it store in null bit map,there should not be this
variation(because i'm having data only for pk,int1,int200).I am wondering
that this null bit map lookup is slowing down this , because each row in my
table is having a null bit map of size (408 bits).As newbie I am wondering
whether this null bit map lookup for non-earlier column is taking too much
time (for scanning the null bit map itself).Am i thinking in right way?

Thanks in advance,

Dineshkumar.P

Postgres Newbie.

Responses

Re: Performance difference in accessing differrent columns in a Postgres Table at 2018-07-29 21:53:55 from David Rowley

Browse pgsql-performance by date

	From	Date	Subject
Next Message	David Rowley	2018-07-29 21:53:55	Re: Performance difference in accessing differrent columns in a Postgres Table
Previous Message	Nicolas Even	2018-07-26 19:32:27	Re: Query with "ILIKE ALL" does not use the index