Performance problems with postgres and null Values?

From: "Sven Kerkling" <kerkling(at)bds-online(dot)com>
To: <pgsql-performance(at)postgresql(dot)org>
Subject: Performance problems with postgres and null Values?
Date: 2016-04-21 09:49:54
Message-ID: 003501d19bb3$2b1458a0$813d09e0$@bds-online.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

After remodeling a table we have some performance problems.

The Original tables have much more fields and we thought it came from these
many fields. After some testing I tried these test layout and the
performance problem is not solved.

Postgresql 9.3.12

Former DB-Layout was table _masterOld_ with 2 tables inherits from
_masterOld_: _part1Old_ and _part2Old_. In _masterOld_ were 7 million rows
(_part1Old_: 5 millions, _part2Old_: 2 millions).

Now we have only one new table _masterNew_ with 7 million rows.

DDL:

export:

CREATE TABLE public.export (

id_firma BIGINT,

status VARCHAR(32)

)

WITH (oids = false);

CREATE INDEX export_idx ON public.export

USING btree (id_firma);

masterNew:

CREATE TABLE public."masterNew" (

id_firma BIGINT,

id_bestand BIGINT NOT NULL,

status VARCHAR(32),

sperre VARCHAR(32),

CONSTRAINT "masterNew_2016_pkey" PRIMARY KEY(id_bestand)

)

WITH (oids = false);

CREATE INDEX "masterNew_2016_pi_idx" ON public."masterNew"

USING btree (id_firma)

WHERE ((status IS NULL) AND (sperre IS NULL));

CREATE INDEX "masterNew_sperre_2016" ON public."masterNew"

USING btree (sperre COLLATE pg_catalog."default");

CREATE INDEX "masterNew_status_2016" ON public."masterNew"

USING btree (status COLLATE pg_catalog."default");

masterOld:

CREATE TABLE public."masterOld" (

id_firma BIGINT,

id_bestand BIGINT NOT NULL,

status VARCHAR(32),

sperre VARCHAR(32),

CONSTRAINT "masterOld_pkey" PRIMARY KEY(id_bestand)

)

WITH (oids = false);

CREATE INDEX "masterOld_idx" ON public."masterOld"

USING btree (id_firma);

CREATE INDEX "masterOld_sperre" ON public."masterOld"

USING btree (sperre COLLATE pg_catalog."default");

CREATE INDEX "masterOld_status" ON public."masterOld"

USING btree (status COLLATE pg_catalog."default");

part1Old:

CREATE TABLE public."part1Old" (

CONSTRAINT "part1Old_idx" PRIMARY KEY(id_bestand)

) INHERITS (public."masterOld")

WITH (oids = false);

CREATE INDEX "part1Old_idx1" ON public."part1Old"

USING btree (id_firma);

CREATE INDEX "part1Old_idx2" ON public."part1Old"

USING btree (status COLLATE pg_catalog."default");

CREATE INDEX "part1Old_idx3" ON public."part1Old"

USING btree (sperre COLLATE pg_catalog."default");

part2Old:

CREATE TABLE public."part2Old" (

CONSTRAINT "part2Old_idx" PRIMARY KEY(id_bestand)

) INHERITS (public."masterOld")

WITH (oids = false);

CREATE INDEX "part2Old_idx1" ON public."part2Old"

USING btree (id_firma);

CREATE INDEX "part2Old_idx2" ON public."part2Old"

USING btree (status COLLATE pg_catalog."default");

CREATE INDEX "part2Old_idx3" ON public."part2Old"

USING btree (sperre COLLATE pg_catalog."default");

In the _export_ table are 1.2 million rows.

Old:

EXPLAIN

SELECT b.id, b.status

FROM export b, masterOld mb

WHERE mb.sperre IS NULL

AND mb.status IS NULL

AND b.id_firma = mb.id_firma

LIMIT 100;

<a href="http://explain.depesz.com/s/SCBo">Plan on explain.depesz.com</a>

- Plan:

Node Type: "Limit"

Startup Cost: 0.00

Total Cost: 0.09

Plan Rows: 100

Plan Width: 90

Plans:

- Node Type: "Nested Loop"

Parent Relationship: "Outer"

Join Type: "Inner"

Startup Cost: 0.00

Total Cost: 118535034.59

Plan Rows: 126126068850

Plan Width: 90

Plans:

- Node Type: "Seq Scan"

Parent Relationship: "Outer"

Relation Name: "export"

Alias: "b"

Startup Cost: 0.00

Total Cost: 79129.80

Plan Rows: 5485680

Plan Width: 90

- Node Type: "Append"

Parent Relationship: "Inner"

Startup Cost: 0.00

Total Cost: 21.56

Plan Rows: 3

Plan Width: 8

Plans:

- Node Type: "Seq Scan"

Parent Relationship: "Member"

Relation Name: "masterOld"

Alias: "mb"

Startup Cost: 0.00

Total Cost: 1.10

Plan Rows: 1

Plan Width: 8

Filter: "((sperre IS NULL) AND (status IS NULL) AND (b.id =
id))"

- Node Type: "Index Scan"

Parent Relationship: "Member"

Scan Direction: "Forward"

Index Name: "part1Old_idx9"

Relation Name: "part1Old"

Alias: "mb_1"

Startup Cost: 0.43

Total Cost: 12.20

Plan Rows: 1

Plan Width: 8

Index Cond: "(id = b.id)"

Filter: "((sperre IS NULL) AND (status IS NULL))"

- Node Type: "Index Scan"

Parent Relationship: "Member"

Scan Direction: "Forward"

Index Name: "part2Old_idx"

Relation Name: "part2Old"

Alias: "mb_2"

Startup Cost: 0.43

Total Cost: 8.26

Plan Rows: 1

Plan Width: 8

Index Cond: "(id = b.id)"

Filter: "((sperre IS NULL) AND (status IS NULL))"

There were no speed problems.

New:

EXPLAIN

SELECT b.id, b.status

FROM export b, masterNew mb

WHERE mb.sperre IS NULL

AND mb.status IS NULL

AND b.id = mb.id

LIMIT 100;

<a href="http://explain.depesz.com/s/eAqG">Plan on explain.depesz.com</a>

- Plan:

Node Type: "Limit"

Startup Cost: 5.38

Total Cost: 306.99

Plan Rows: 100

Plan Width: 90

Plans:

- Node Type: "Nested Loop"

Parent Relationship: "Outer"

Join Type: "Inner"

Startup Cost: 5.38

Total Cost: 14973468.06

Plan Rows: 4964540

Plan Width: 90

Join Filter: "(b.id = mb.id)"

Plans:

- Node Type: "Seq Scan"

Parent Relationship: "Outer"

Relation Name: "export"

Alias: "b"

Startup Cost: 0.00

Total Cost: 79129.80

Plan Rows: 5485680

Plan Width: 90

- Node Type: "Materialize"

Parent Relationship: "Inner"

Startup Cost: 5.38

Total Cost: 717.51

Plan Rows: 181

Plan Width: 8

Plans:

- Node Type: "Bitmap Heap Scan"

Parent Relationship: "Outer"

Relation Name: "masterNew"

Alias: "mb"

Startup Cost: 5.38

Total Cost: 716.61

Plan Rows: 181

Plan Width: 8

Recheck Cond: "((status IS NULL) AND (sperre IS NULL))"

Plans:

- Node Type: "Bitmap Index Scan"

Parent Relationship: "Outer"

Index Name: "masterNew_2016_pi_idx"

Startup Cost: 0.00

Total Cost: 5.34

Plan Rows: 181

Plan Width: 0

There we have our problem.

We have tried to fix it using a partial Index on _id_ with `status is null
and sperre is null` .

If we don't use `sperre is null` in this query it is quick. I think we have
these problems because _sperre_ and _status_ have much null values.
_status_: 67% null and _sperre_: 97% null .

On each table there are btree indexes on _id_, _sperre_ and _status_.

On _masterNew_ there is a partial Index on _id_ with `sperre is null and
status is null`.

Can somebody help me with these performance Problem.

What can I try to solve this?

Best regards,

Sven Kerkling

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Teodor Sigaev 2016-04-21 10:04:13 Re: Performant queries on table with many boolean columns
Previous Message David G. Johnston 2016-04-20 21:49:10 Re: Performant queries on table with many boolean columns