Re: [PATCH] Introduce array_shuffle() and array_sample()

From: Martin Kalcher <martin(dot)kalcher(at)aboutsource(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, John Naylor <john(dot)naylor(at)enterprisedb(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [PATCH] Introduce array_shuffle() and array_sample()
Date: 2022-07-19 07:29:03
Message-ID: 8617d454-0cd6-93bb-b18d-23a97ba40cb4@aboutsource.net
Lists: pgsql-general pgsql-hackers

On 19.07.22 at 00:52, Martin Kalcher wrote:
>
> On the contrary! I am pretty sure there are people out there wanting
> sampling-without-shuffling. I will think about that.

I gave it some thought. Even though there might be use cases where a
stable order is desired, I would consider them edge cases that are not
worth the additional complexity. I personally would not expect
array_sample() to return elements in any specific order. I looked up
some sample() implementations. Each of them either makes no guarantee
about the order of the resulting array or explicitly states that the
resulting array is in random or selection order.

- Python random.sample [0]
- Ruby Array#sample [1]
- Rust rand::seq::SliceRandom::choose_multiple [2]
- Julia StatsBase.sample [3] (a stable order must be requested explicitly)

[0] https://docs.python.org/3/library/random.html#random.sample
[1] https://ruby-doc.org/core-3.0.0/Array.html#method-i-sample
[2] https://docs.rs/rand/0.6.5/rand/seq/trait.SliceRandom.html#tymethod.choose_multiple
[3] https://juliastats.org/StatsBase.jl/stable/sampling/#StatsBase.sample
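
For anyone who does need a stable order, it can be layered on top of a
plain sample without touching the sampling function itself. A minimal
sketch in Python, not part of the patch: the helper name stable_sample
and its rng parameter are made up for illustration. It samples distinct
positions, sorts them, and then picks the elements.

```python
import random


def stable_sample(items, k, rng=random):
    """Hypothetical helper: pick k random elements, keeping input order."""
    # Sample k distinct positions, then sort the positions so the
    # result preserves the relative order of the input sequence.
    positions = sorted(rng.sample(range(len(items)), k))
    return [items[i] for i in positions]


data = list(range(10))
rng = random.Random(42)  # seeded only to make the run repeatable

# Plain sample: elements come back in selection order.
plain = rng.sample(data, 4)

# Stable sample: elements come back in the order they appear in `data`.
stable = stable_sample(data, 4, rng)
```

The same idea should carry over to SQL by sampling subscripts and
sorting them, which is one reason I would keep array_sample() itself
free of ordering guarantees.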
