Re: Trying out read streams in pgvector (an extension)

From: Melanie Plageman <melanieplageman(at)gmail(dot)com>
To: Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, "Jonathan S(dot) Katz" <jkatz(at)postgresql(dot)org>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Trying out read streams in pgvector (an extension)
Date: 2025-11-20 15:27:59
Message-ID: CAAKRu_Zwj83zCJhahhMO578-+JdfTbqMV_ktxr-XjiE8BHLo9g@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Nov 19, 2025 at 2:28 AM Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com> wrote:
>
> > To make sure 1) distance isn't reset to a resume_distance from
> > read_stream_begin_relation() and 2) unexpected buffers aren't returned
> > from the read stream, we could error out in read_stream_resume() if
> > pinned_buffers > 0. And in read_stream_reset(), we would save distance
> > in resume_distance before clearing distance. That would allow calling
> > read_stream_resume() either if you called read_stream_reset() or if
> > you exhausted the stream yourself. See rough attached patch for a
> > sketch of this.
>
> This looks correct to me. What do you think about using an assert
> instead of erroring out?

I'm not totally opposed to this. My rationale for making it an error
is that the developer could have test cases where all the buffers are
consumed but the code is written such that that won't always happen.
Then if a real production query doesn't consume all the buffers, it
could return wrong results (I think). That will mean the user can't
complete their query until the extension author releases a new version
of their code. But I'm not sure what the right answer is here.

- Melanie

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Álvaro Herrera 2025-11-20 15:43:58 Re: Remove useless casts to (void *)
Previous Message jian he 2025-11-20 15:26:51 Re: ON CONFLICT DO SELECT (take 3)