Re: asynchronous and vectorized execution

From: Bert <biertie(at)gmail(dot)com>
To: Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: asynchronous and vectorized execution
Date: 2016-05-10 21:00:27
Message-ID: CAFCtE1mNh9wDSZb_YnKZCPazwPFp+fo-FHBcV=8OaMxKU_SqoA@mail.gmail.com
Lists: pgsql-hackers

Hmm, the morsels paper looks really interesting at first sight.
Let's see if we can get a PoC working in PostgreSQL? :-)

On Tue, May 10, 2016 at 9:42 PM, Konstantin Knizhnik <
k(dot)knizhnik(at)postgrespro(dot)ru> wrote:

> On 05/10/2016 08:26 PM, Robert Haas wrote:
>
>> On Tue, May 10, 2016 at 3:00 AM, konstantin knizhnik
>> <k(dot)knizhnik(at)postgrespro(dot)ru> wrote:
>>
>>> What's wrong with a worker being blocked? You can just have more workers
>>> (more than CPU cores) to let the others continue doing useful work.
>>>
>> Not really. The workers are all running the same plan, so they'll all
>> make the same decision about which node needs to be executed next. If
>> that node can't accommodate multiple processes trying to execute it at
>> the same time, it will have to block all of them but the first one.
>> Adding more processes just increases the number of processes sitting
>> around doing nothing.
>>
>
> Doesn't this actually mean that we need a normal job scheduler, which is
> given a queue of jobs and, using some pool of threads, is able to organize
> efficient execution of queries? The optimizer can build a pipeline (graph)
> of tasks corresponding to the execution plan nodes, i.e. SeqScan, Sort, ...
> Each task is split into several jobs which can be concurrently scheduled by
> a task dispatcher. So you will not have a blocked worker waiting for
> something, and all system resources will be utilized. Such an approach with
> a dispatcher makes it possible to implement quotas, priorities, ... Also,
> the dispatcher can take care of NUMA and cache optimizations, which is
> especially critical on modern architectures. One more reference:
> http://db.in.tum.de/~leis/papers/morsels.pdf
>
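
Just to make the morsel-driven idea concrete, here is a minimal, purely
illustrative sketch (plain pthreads, nothing from PostgreSQL; all names are
made up) of a fixed pool of worker threads pulling small fixed-size morsels
of a scan from a shared dispatcher counter, so no worker ever sits blocked
on a particular plan node:

/* Illustrative morsel-style dispatcher sketch; not PostgreSQL code. */
#include <pthread.h>
#include <stdio.h>
#include <stdatomic.h>

#define NROWS      1000000
#define MORSEL_SZ  10000
#define NWORKERS   4

static int rows[NROWS];
static atomic_long next_row;            /* dispatcher: next unclaimed morsel */
static long partial_sums[NWORKERS];

static void *worker(void *arg)
{
    long id = (long) arg;

    for (;;)
    {
        /* claim the next morsel; lock-free, so nobody blocks waiting here */
        long start = atomic_fetch_add(&next_row, MORSEL_SZ);
        long end;

        if (start >= NROWS)
            break;
        end = start + MORSEL_SZ > NROWS ? NROWS : start + MORSEL_SZ;
        for (long i = start; i < end; i++)
            partial_sums[id] += rows[i];  /* the "job": e.g. SeqScan + Agg */
    }
    return NULL;
}

int main(void)
{
    pthread_t threads[NWORKERS];
    long total = 0;

    for (long i = 0; i < NROWS; i++)
        rows[i] = 1;
    for (long i = 0; i < NWORKERS; i++)
        pthread_create(&threads[i], NULL, worker, (void *) i);
    for (long i = 0; i < NWORKERS; i++)
    {
        pthread_join(threads[i], NULL);
        total += partial_sums[i];
    }
    printf("sum = %ld\n", total);
    return 0;
}

A real dispatcher would of course hand out morsels per pipeline stage and
could place them NUMA-locally, as the paper describes.
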
> Sorry, maybe I am wrong, but I still think that async ops are "multitasking
> for the poor" :)
> Yes, maintaining threads and especially separate processes adds significant
> overhead. It leads to extra resource consumption, and context switches are
> quite expensive. And I know from my own experience that replacing several
> concurrent processes performing some IO (for example with sockets) with
> just one process using multiplexing makes it possible to increase
> performance. But still, async ops are just a way to make the programmer
> responsible for managing the state machine instead of relying on the OS to
> make context switches. A manual transmission is still more efficient than
> an automatic one, but most drivers still prefer the latter ;)
>
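
For what it's worth, a rough sketch (my own, just to pin down the terms) of
what "one process using multiplexing" means here: a single poll() loop
drives several descriptors, and the programmer advances each connection's
state machine by hand instead of letting the OS context-switch between
blocked processes:

/* Illustration only: pipes stand in for client sockets. */
#include <poll.h>
#include <unistd.h>
#include <stdio.h>

#define NCONN 3

int main(void)
{
    int pipes[NCONN][2];
    struct pollfd fds[NCONN];
    char buf[64];
    int pending = NCONN;

    /* stand-ins for connections: three pipes with pending data */
    for (int i = 0; i < NCONN; i++)
    {
        pipe(pipes[i]);
        write(pipes[i][1], "hello", 5);
        fds[i].fd = pipes[i][0];
        fds[i].events = POLLIN;
    }

    /* one process, one loop: poll() says which "connections" are ready,
     * and only those state machines are advanced -- no blocked workers */
    while (pending > 0 && poll(fds, NCONN, -1) > 0)
        for (int i = 0; i < NCONN; i++)
            if (fds[i].revents & POLLIN)
            {
                ssize_t n = read(fds[i].fd, buf, sizeof(buf));

                printf("conn %d: read %zd bytes\n", i, n);
                fds[i].events = 0;      /* done with this connection */
                pending--;
            }
    return 0;
}
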
> Seriously, I carefully read your response to Kochei, but I am still not
> convinced that async ops are what we need. Or maybe we just understand
> different things by this notion.
>
>
>
>
>>> But there is some research, for example:
>>>
>>> http://www.vldb.org/pvldb/vol4/p539-neumann.pdf
>>>
>>> showing that the same or an even better effect can be achieved by
>>> generating native code for the query execution plan (which is not so
>>> difficult now, thanks to LLVM).
>>> It eliminates interpretation overhead and increases cache locality.
>>> I got similar results with my own experiments on accelerating SparkSQL.
>>> Instead of native code generation I used conversion of query plans to C
>>> code and experimented with different data representations. A "horizontal
>>> model" with loading columns on demand shows better performance than a
>>> columnar store.
>>>
>> Yes, I think this approach should also be considered.
>>
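
To illustrate what such plan-to-C translation can buy (a sketch of my own,
not taken from the cited work): for a qual like "x > 10 AND y < 100" an
interpreter walks an expression tree for every row, whereas the generated
code collapses to a single branch:

#include <stdbool.h>
#include <stdio.h>

typedef struct Row { int x; int y; } Row;

/* interpreted flavour: a tiny expression tree walked for every row */
typedef enum { OP_GT, OP_LT, OP_AND } Op;
typedef struct Expr {
    Op op;
    int column;                 /* leaf nodes only: 0 = x, 1 = y */
    int constant;
    struct Expr *left, *right;  /* used by OP_AND */
} Expr;

static bool eval_qual(const Expr *e, const Row *row)
{
    switch (e->op)
    {
        case OP_GT:  return (e->column ? row->y : row->x) > e->constant;
        case OP_LT:  return (e->column ? row->y : row->x) < e->constant;
        case OP_AND: return eval_qual(e->left, row) && eval_qual(e->right, row);
    }
    return false;
}

/* "generated" flavour: what a plan-to-C translator could emit for the qual */
static bool qual_compiled(const Row *row)
{
    return row->x > 10 && row->y < 100;
}

int main(void)
{
    Expr gt   = { OP_GT, 0, 10, NULL, NULL };
    Expr lt   = { OP_LT, 1, 100, NULL, NULL };
    Expr both = { OP_AND, 0, 0, &gt, &lt };
    Row  r    = { 42, 7 };

    printf("interpreted: %d, compiled: %d\n",
           eval_qual(&both, &r), qual_compiled(&r));
    return 0;
}
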
>>> At this moment (February) they have implemented translation of only a few
>>> PostgreSQL operators used by ExecQual and do not support aggregates.
>>> They get about a 2x speedup on synthetic queries and a 25% improvement on
>>> TPC-H Q1 (for Q1 the most critical part is generation of native code for
>>> the aggregates, because ExecQual itself takes only 6% of the time for
>>> this query).
>>> Actually, these 25% for Q1 were achieved not by using dynamic code
>>> generation, but by switching from a PULL to a PUSH model in the executor.
>>> It seems to be yet another interesting PostgreSQL executor
>>> transformation.
>>> As far as I know, they are going to publish the results of their work as
>>> open source...
>>>
>> Interesting. You may notice that in "asynchronous mode" my prototype
>> works using a push model of sorts. Maybe that should be taken
>> further.
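
A toy contrast between the two models (again only a sketch, not Robert's
prototype), for something like SELECT sum(x) ... WHERE x > 10: in the pull
model the aggregate calls next() on its child for every tuple, while in the
push model the scan drives the pipeline and pushes qualifying tuples into
the consumer:

#include <stdio.h>
#include <stdbool.h>

#define NROWS 8
static int table[NROWS] = { 5, 12, 7, 40, 3, 18, 9, 11 };

/* --- pull model: the consumer calls next() on its child per tuple --- */
static int scan_pos;

static bool scan_next(int *out)
{
    while (scan_pos < NROWS)
    {
        int v = table[scan_pos++];

        if (v > 10)                 /* filter pulled through the scan */
        {
            *out = v;
            return true;
        }
    }
    return false;
}

static long agg_pull(void)
{
    long sum = 0;
    int v;

    while (scan_next(&v))           /* one call/branch per returned tuple */
        sum += v;
    return sum;
}

/* --- push model: the scan drives the pipeline, pushing into the consumer --- */
static void agg_consume(int v, long *sum)
{
    *sum += v;
}

static void scan_push(void (*consume)(int, long *), long *state)
{
    for (int i = 0; i < NROWS; i++)
        if (table[i] > 10)
            consume(table[i], state);   /* data flows up, control stays here */
}

int main(void)
{
    long pushed = 0;

    scan_push(agg_consume, &pushed);
    printf("pull: %ld, push: %ld\n", agg_pull(), pushed);
    return 0;
}
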
>>
>>
>
> --
> Konstantin Knizhnik
> Postgres Professional: http://www.postgrespro.com
> The Russian Postgres Company
>
>
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>

--
Bert Desmet
0477/305361
