| From: | Thomas Munro <thomas(dot)munro(at)gmail(dot)com> |
|---|---|
| To: | Melanie Plageman <melanieplageman(at)gmail(dot)com> |
| Cc: | "Jonathan S(dot) Katz" <jkatz(at)postgresql(dot)org>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
| Subject: | Re: Trying out read streams in pgvector (an extension) |
| Date: | 2025-11-11 23:19:26 |
| Message-ID: | CA+hUKG+zLmkD9zus=JOjjC+j5p9R1+CSXNZgd5=exZ01ZTaKoA@mail.gmail.com |
| Lists: | pgsql-hackers |

On Wed, Nov 12, 2025 at 11:52 AM Melanie Plageman
<melanieplageman(at)gmail(dot)com> wrote:
> On Tue, Nov 11, 2025 at 4:22 PM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
> > But for now, to fix pgvector's woes, I wonder if it might make sense
> > to call this a bug in v18, and back-patch the tiniest possible change.
> > Something like what I posted[2] in this thread almost two years ago.
> > I don't think it really affects any core code: we use
> > read_stream_reset() only in very minimal ways there (I could
> > elaborate), and it's quite arguable that the existing policy is wrong
> > for them too, but we'd need to confirm that and perhaps think about
> > other extensions that might be using it.
>
> If we are worried about regressing other extensions using
> read_stream_reset(), we could make the read stream reset which
> preserves the distance a different function in backbranches.

Hmm, yeah, interesting idea. Candidate names might include
read_stream_restart() and read_stream_continue(). The point being
that the block number callback reported end-of-stream, but that was
only temporary, and now it has more information and would like to
continue. Those are some of the names I bounced around for a new
read_stream_reset() flag argument for v19 (I rather liked "continue"),
but I also like this separate function idea. Back-patching a new
function would certainly remove all doubt about unintended
consequences for existing callers of read_stream_reset(), so yeah,
that wins on pure conservative safety grounds. As for the future,
hmm, it might even be better to have an explicit separate API for this
operation in master too, as it is turning out to be quite a common
requirement and the naming is much clearer like that. We don't
usually design new APIs while back-patching, though, which is probably
why I didn't think of it. But if we view this as a design bug that
folded too many jobs into read_stream_reset(), which we now want to
fix by splitting one off, maybe that's OK? Seems pretty risk-free,
anyway.
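
To make the distinction concrete, here is a toy sketch in C. It is not PostgreSQL code and not a proposed patch: ToyStream and all toy_stream_* names are hypothetical, and the doubling ramp-up is only a stand-in for whatever policy the real stream uses. It assumes, as the discussion above implies, that the existing reset discards the accumulated look-ahead distance, while a separate "continue" operation would only clear the temporary end-of-stream state.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Toy model, NOT PostgreSQL's ReadStream: every name here is hypothetical.
 * "distance" stands in for how far ahead the stream is willing to read.
 */
typedef struct ToyStream
{
    int  distance;      /* current look-ahead distance, in blocks */
    int  max_distance;  /* upper bound on distance */
    bool ended;         /* block number callback reported end-of-stream */
} ToyStream;

static void
toy_stream_init(ToyStream *s, int max_distance)
{
    assert(max_distance >= 1);
    s->distance = 1;    /* start small, as if the data were cached */
    s->max_distance = max_distance;
    s->ended = false;
}

/* Simulate an I/O miss: grow the distance, capped at max_distance. */
static void
toy_stream_miss(ToyStream *s)
{
    s->distance *= 2;
    if (s->distance > s->max_distance)
        s->distance = s->max_distance;
}

/* The callback ran out of block numbers (possibly only for now). */
static void
toy_stream_end(ToyStream *s)
{
    s->ended = true;
}

/* Reset: start over from scratch, forgetting the learned distance. */
static void
toy_stream_reset(ToyStream *s)
{
    s->ended = false;
    s->distance = 1;
}

/* Continue: the end was temporary, so keep the learned distance. */
static void
toy_stream_continue(ToyStream *s)
{
    s->ended = false;
}
```

In this model, a caller that repeatedly hits a temporary end-of-stream and then calls toy_stream_reset() has to ramp up from a distance of 1 every time, whereas toy_stream_continue() picks up where it left off. Keeping the two as separate functions leaves the behaviour seen by existing reset callers untouched.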