Well, thanks a lot for the attention. My main purpose was to reduce the memory footprint. But before I ran the tests, I expected the new method to be slower than the old one, so it would only be better on large files, i.e. where the reduced memory usage mattered more than raw speed. This was because of the extra pass through the array.
On Wed, 23 Aug 2006, Luis Vilar Flores wrote:
To all who already forgot the first emails: I developed a modified version of the toBytes method from the org.postgresql.util.PGbytea class. The old method uses 3 buffers to translate the data from the network to the client, which uses too much memory. My method uses only 2 buffers, but makes one extra pass through the original buffer (to calculate its final size).
I'm not super impressed with these timing results. They are certainly showing some effects due to GC; consider the rise in getBytes time here at 10.5MB.
The new method is very similar to the old one; it just computes the final size before the copy. The old method executes fewer instructions to convert an array, so the new method is only faster when the old one is slowed down by garbage collection/memory allocation. A sketch of the two-pass idea follows the timing results below.
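For readers who haven't seen the patch, here is a minimal sketch of the two-pass technique being discussed. This is my illustration, not the actual PGbytea code; it assumes well-formed input in bytea's text escape format, where "\\" encodes a single backslash and "\nnn" encodes one byte as three octal digits.

import java.util.Arrays;

public class TwoPassDecodeSketch {
    public static byte[] toBytes(byte[] s) {
        // Pass 1: only count how many bytes the escaped input decodes to.
        int len = 0;
        for (int i = 0; i < s.length; i++) {
            if (s[i] == '\\') {
                // "\\" consumes 2 input bytes, "\nnn" consumes 4;
                // either way it produces exactly 1 output byte.
                i += (s[i + 1] == '\\') ? 1 : 3;
            }
            len++;
        }
        // Pass 2: allocate the exact-size result and decode into it,
        // so no oversized intermediate buffer is ever needed.
        byte[] out = new byte[len];
        int p = 0;
        for (int i = 0; i < s.length; i++) {
            if (s[i] == '\\') {
                if (s[i + 1] == '\\') {
                    out[p++] = '\\';
                    i += 1;
                } else {
                    // Three octal digits: value = d1*64 + d2*8 + d3.
                    out[p++] = (byte) (((s[i + 1] - '0') << 6)
                                     + ((s[i + 2] - '0') << 3)
                                     +  (s[i + 3] - '0'));
                    i += 3;
                }
            } else {
                out[p++] = s[i];
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // "a\001" escaped as text becomes the bytes a \ 0 0 1.
        byte[] escaped = { 'a', '\\', '0', '0', '1' };
        System.out.println(Arrays.toString(toBytes(escaped))); // [97, 1]
    }
}

The trade-off is exactly what is described above: the escaped input is scanned twice, but the output array is allocated once at its final size instead of being built in an oversized buffer and copied.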
OLD method:
size: 9.5MB execute+next: 804ms getBytes: 377ms used mem: 66169KB
size: 10.5MB execute+next: 634ms getBytes: 546ms used mem: 73112KB
size: 11.5MB execute+next: 689ms getBytes: 450ms used mem: 80057KB
size: 12.5MB execute+next: 748ms getBytes: 482ms used mem: 87001KB
I came up with my own contrived benchmark (attached) that attempts to focus solely on the getBytes() call and avoid the cost of fetching results, but it doesn't give very consistent results, and I haven't been able to come up with a case where the new method is actually faster, even with 30MB of data. This is on Debian Linux / 2x Opteron 246 / JDK 1.5.0-05.
I think the old option should stay for a while, but I hope the new method proves to be as fast as the old one, so we can just discard MAX_3_BUFF_SIZE and always compute the final size - the method's code would be clearer that way.
I've committed this to CVS HEAD with a rather arbitrarily set MAX_3_BUFF_SIZE value of 2MB. Note that this is also the escaped size, so we may actually be dealing with output data a quarter of that size. If anyone could do some more testing to find a good crossover point, that would be a good thing.
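For context, the committed crossover can be pictured like this. This is only a sketch: MAX_3_BUFF_SIZE is the constant mentioned above, but the two helper method names are hypothetical, chosen just to show the dispatch.

// 2MB threshold, measured against the *escaped* length, so the decoded
// output may be only about a quarter of this size.
private static final int MAX_3_BUFF_SIZE = 2 * 1024 * 1024;

public static byte[] toBytes(byte[] s) {
    if (s == null)
        return null;
    // Large inputs take the new two-pass path to keep the memory
    // footprint down; small inputs keep the old, faster path.
    if (s.length > MAX_3_BUFF_SIZE)
        return toBytesTwoPass(s);      // hypothetical helper: new method
    return toBytesThreeBuffer(s);      // hypothetical helper: old method
}

Benchmarking different input sizes around this threshold is exactly the testing being asked for above.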
It's me who should be thanking you, for such a great product ...
Thanks for your patience with this item.
Kris Jurka
import java.sql.*;

public class ByteaTest2 {
    public static void main(String args[]) throws Exception {
        Class.forName("org.postgresql.Driver");
        Connection conn = DriverManager.getConnection(
            "jdbc:postgresql://localhost:5432/jurka", "jurka", "");
        // Five timed rounds, so JIT warmup and GC effects show up
        // as differences between rounds.
        for (int k = 0; k < 5; k++) {
            long t1 = System.currentTimeMillis();
            long total = 0;
            for (int j = 0; j < 10; j++) {
                PreparedStatement pstmt = conn.prepareStatement(
                    "SELECT varcharsend(repeat(?,?))");
                pstmt.setString(1, "a\\001");
                pstmt.setInt(2, 150000);
                ResultSet rs = pstmt.executeQuery();
                rs.next();
                // Decode the same row 100 times to isolate the cost of
                // getBytes() from the cost of fetching the result.
                for (int i = 0; i < 100; i++) {
                    byte b[] = rs.getBytes(1);
                    total += b.length;
                }
                rs.close();
                pstmt.close();
            }
            long t2 = System.currentTimeMillis();
            System.out.println(t2 - t1);
        }
    }
}
Luis Flores
Systems Analyst
Evolute - Consultoria Informática
Email: lflores@evolute.pt
Tel: (+351) 212949689