Re: Yet another vectorized engine

From: Hubert Zhang <hzhang(at)pivotal(dot)io>
To: Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org, Gang Xiong <gxiong(at)pivotal(dot)io>, Asim R P <apraveen(at)pivotal(dot)io>, Ning Yu <nyu(at)pivotal(dot)io>
Subject: Re: Yet another vectorized engine
Date: 2020-02-27 03:01:27
Message-ID: CAB0yrekS3ydO3NgRToj18xB1Ez0zCrMsBoMK+rD2JXyKUyjJqA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Feb 26, 2020 at 7:59 PM Konstantin Knizhnik <
k(dot)knizhnik(at)postgrespro(dot)ru> wrote:

>
>
> On 26.02.2020 13:11, Hubert Zhang wrote:
>
>
>
>> and with JIT:
>>
>> 13.88% postgres postgres [.] tts_buffer_heap_getsomeattrs
>> 7.15% postgres vectorize_engine.so [.] vfloat8_accum
>> 6.03% postgres postgres [.] HeapTupleSatisfiesVisibility
>> 5.55% postgres postgres [.] bpchareq
>> 4.42% postgres vectorize_engine.so [.] VExecStoreColumns
>> 4.19% postgres postgres [.] hashbpchar
>> 4.09% postgres vectorize_engine.so [.] vfloat8pl
>>
>>
> I also tested Q1 with your latest code. Result of vectorized is still slow.
> PG13 native: 38 secs
> PG13 Vec: 30 secs
> PG13 JIT: 23 secs
> PG13 JIT+Vec: 27 secs
>
>
> It is strange that your results are much slower than my and profile is
> very different.
> Which postgres configuration you are using?
>
>
./configure CFLAGS="-O3 -g -march=native" --prefix=/usr/local/pgsql/
--disable-cassert --enable-debug --with-llvm
I also use `PGXS := $(shell $(PG_CONFIG) --pgxs)` to compile
vectorized_engine. So it will share the same compile configuration.

My perf result is as belows. There are three parts:
> 1. lookup_hash_entry(43.5%) this part is not vectorized yet.
>
> It is vectorized in some sense: lookup_hash_entry performs bulk of hash
> lookups and pass array with results of such lookups to aggregate transmit
> functions.
> It will be possible to significantly increase speed of HashAgg if we store
> data in order of grouping attributes and use RLE (run length encoding) to
> peform just one
> hash lookup for group of values. But it requires creation of special
> partitions (like it is done in Vertica and VOPS).
>
>
Yes, Vertica's partition needed to be pre-sorted on user defined columns.
So for TPCH Q1 on Postgres, we could not have that assumption. And my Q1
plan uses HashAgg instead of GroupAgg based on cost.

> 2. scan part: fetch_input_tuple(36%)
> 3. vadvance_aggregates part(20%)
> I also perfed on PG96 vectorized version and got similar perf results and
> running time of vectorized PG96 and PG13 are also similar. But PG13 is much
> faster than PG96. So I just wonder whether we merge all the latest executor
> code of PG13 into the vectorized PG13 branch?
>
>
> Sorry, I do not understand the question. vectorize_executor contains
> patched versions of nodeSeqscan and nodeAgg from standard executor.
> When performing porting to PG13, I took the latest version of nodeAgg and
> tried to apply your patches to it. Certainly not always it was possible and
> I have to rewrite a lt of places. Concerning nodeSeqscan - I took old
> version from vectorize_executor and port it to PG13.
>

> It is strange that I am not seeing lookup_hash_entry in profile in my
> case.
>
>
So you already have the PG13 nodeAgg, that is good.
Yes, it is strange. Hash table probing is always the costly part.
My perf command `perf record --call-graph dwarf -p pid`
Could you share your lineitem schema and Q1 query?
My schema and Q1 query are:
CREATE TABLE lineitem (
l_orderkey BIGINT NOT NULL,
l_partkey INTEGER NOT NULL,
l_suppkey INTEGER NOT NULL,
l_linenumber INTEGER NOT NULL,
l_quantity double precision NOT NULL,
l_extendedprice double precision NOT NULL,
l_discount double precision NOT NULL,
l_tax double precision NOT NULL,
l_returnflag CHAR(1) NOT NULL,
l_linestatus CHAR(1) NOT NULL,
l_shipdate DATE NOT NULL,
l_commitdate DATE NOT NULL,
l_receiptdate DATE NOT NULL,
l_shipinstruct CHAR(25) NOT NULL,
l_shipmode CHAR(10) NOT NULL,
l_comment VARCHAR(44) NOT NULL
);
select
l_returnflag,
l_linestatus,
sum(l_quantity) as sum_qty,
sum(l_extendedprice) as sum_base_price,
sum(l_extendedprice * (1 - l_discount)) as sum_disc_price,
sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) as sum_charge,
avg(l_quantity) as avg_qty,
avg(l_extendedprice) as avg_price,
avg(l_discount) as avg_disc,
count(l_discount) as count_order
from
lineitem
where
l_shipdate <= date '1998-12-01' - interval '106 day'
group by
l_returnflag,
l_linestatus
;

--
Thanks

Hubert Zhang

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Thomas Munro 2020-02-27 03:10:14 Re: Collation versioning
Previous Message Michael Paquier 2020-02-27 02:39:42 Re: Commit fest manager for 2020-03