Re: PG-Strom - A GPU optimized asynchronous executor module

From: Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: PgHacker <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PG-Strom - A GPU optimized asynchronous executor module
Date: 2012-01-23 06:38:54
Message-ID: CADyhKSWA3nSbokM2TFGzNB1rKYubfws1ZJzFpNjP5Kue1EdU-g@mail.gmail.com

2012/1/23 Robert Haas <robertmhaas(at)gmail(dot)com>:
> On Sun, Jan 22, 2012 at 10:48 AM, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp> wrote:
>> I tried to implement an FDW module that is designed to utilize GPU
>> devices to execute qualifiers of sequential scans on foreign tables
>> managed by this module.
>>
>> It was named PG-Strom, and the following wikipage gives a brief
>> overview of this module.
>>    http://wiki.postgresql.org/wiki/PGStrom
>>
>> In our measurements, it is about 10 times faster on sequential scans
>> with complex qualifiers, though of course this depends heavily on the
>> type of workload.
>
> That's pretty neat.  In terms of tuning the non-GPU based
> implementation, have you done any profiling?  Sometimes that leads to
> an "oh, woops" moment.
>
Not yet, except for \timing.

What options are available to see how the time spent in a particular
query breaks down by component?
I tried googling some keywords, but did not find anything useful.

As an aside, I also tried modifying is_device_executable_qual() to always
return false, which disables qualifier push-down.
In this case, 2100ms of the 7679ms total was consumed within this module,
so I guess the remaining 5500ms or so was mostly consumed by ExecQual(),
although that is just an estimate...

postgres=# SET pg_strom.exec_profile = on;
SET
Time: 1.075 ms
postgres=# SELECT count(*) FROM ftbl WHERE sqrt((x-25.6)^2 + (y-12.8)^2) < 10;
INFO: PG-Strom Exec Profile on "ftbl"
INFO: Total PG-Strom consumed time: 2100.898 ms
INFO: Time to JIT Compile GPU code: 0.000 ms
INFO: Time to initialize devices: 0.000 ms
INFO: Time to Load column-stores: 7.013 ms
INFO: Time to Scan column-stores: 1219.746 ms
INFO: Time to Fetch virtual tuples: 874.095 ms
INFO: Time of GPU Synchronization: 0.000 ms
INFO: Time of Async memcpy: 0.000 ms
INFO: Time of Async kernel exec: 0.000 ms
 count
-------
  3159
(1 row)

Time: 7679.342 ms
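
The estimate above can be reproduced with a few lines of arithmetic on the
figures in this profile output. This is only a minimal sketch: the variable
names are made up for illustration, and attributing the remainder to
ExecQual() is a guess, not something the profile measures.

```python
# Figures copied from the profile output above (milliseconds).
total_query_ms = 7679.342      # psql \timing result for the whole query
pgstrom_total_ms = 2100.898    # "Total PG-Strom consumed time"
pgstrom_parts_ms = {
    "load column-stores": 7.013,
    "scan column-stores": 1219.746,
    "fetch virtual tuples": 874.095,
}

# Time spent outside the module. Presumably dominated by ExecQual(),
# but that attribution is an assumption, not a measurement.
outside_ms = total_query_ms - pgstrom_total_ms

# Part of the module's own total not covered by the itemized timers.
unitemized_ms = pgstrom_total_ms - sum(pgstrom_parts_ms.values())

print(f"outside PG-Strom: {outside_ms:.3f} ms "
      f"({100 * outside_ms / total_query_ms:.1f}% of the query)")
print(f"unitemized inside PG-Strom: {unitemized_ms:.3f} ms")
```

This puts roughly 5578 ms outside the module, which is where the "5500ms or
so" figure comes from.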

Thanks,
--
KaiGai Kohei <kaigai(at)kaigai(dot)gr(dot)jp>
