Wednesday, October 10, 2007

statement.setMaxrows in JDBC vs Oracle rownum in SQL

If you want to limit the number of rows retrieved from an Oracle database in a Java/Servlet application using JDBC, you can accomplish this in two ways. One is using statement.setMaxRows in the JDBC or using rownum in the Oracle SQL.

For example, lets say there are 1000 employee records in the EMPLOYEES table which has the columns: empid, first_name and last_name. In order to retreive the first 200 rows, in java you can say

....

Connection dbConn = db.getConnection();
Statement stmt = dbConn.createStatement();
stmt.setMaxRows(200);
String sql = "select empid, first_name, last_name from employees";
ResultSet rs = stmt.executeQuery(sql);

...............

This gets you a resultset with 200 rows and everybody is happy. You could have alternatively done that using rownum on your sql query itself without using setMaxRows() on statement like below:

....

Connection dbConn = db.getConnection();
Statement stmt = dbConn.createStatement();
String sql = "select empid, first_name, last_name from employees where rownum<201";
ResultSet rs = stmt.executeQuery(sql);
...............

Even this time, the resultset has 200 rows and everybody is happy. Now the question is, which one should you use and why. I may not be an expert in talking about this, but here are my observations...

If you are dealing with a table which has relatively less records like in thousands, then either way works just fine. But if you are dealing with potentially huge database which has like millions of records, then I guess using rownum in the sql makes more sense. This is because the impact on your Java heap space would be lesser.

According to JDBC, the implementation of setMaxRows is as given below:
-----------------------------------
The maxRows limit is set to limit the number of rows that any ResultSet can contain. If the limit is exceeded, the excess rows are silently dropped.
------------------------------------

For example, lets say there are 14 million records in a table, from which you want to retrieve the first 3000 rows. If you use statement.setMaxRows(), the JDBC call retreives the entire 14 million records from the database table and sets the first 3000 records on the resultset object. But the retrieval of such huge number of records puts a huge lump of data on your JVM, and you are dependent on the Garbage Collector for retrieval of this space. But if you use rownum in the SQL, then the number of rows retrieved from the DB by JDBC itself is limited and there won't be any heap space issues. I ran a trial session with our DBA and it was confirmed that the statement.setMaxRows() doesn't modify the sql using rownum automatically. We traced the session and it was retrieving all the million records. And I came to a conclusion that the heap space issues I was facing was due to the usage of statement.setMaxRows() instead of using rownum on the SQL. Also the speed of retrieval of rows when you limit by rownum could be more, though I don't have substantial evidence to state it as a fact.

If you are an expert in Java / Oracle Databases and feel that part/total of this post doesn't make any sense or if it looks like complete bullshit, then please feel to share your expertise on this matter / bash me in the Comments:)

3 comments:

Roy said...

I think your statement that the JDBC driver fetches all the rows and discards more than maxRows number of rows might not be accurate. I tested with Oracle JDBC driver by setting maxRows as 10 on a table which has 3000 rows. When I sniffed the packets sent back and forth, I figured that only 10 rows were sent from the database, not 3000 rows.

So what you are saying might be true for some specific database, but not a general rule.

- Roy

Anonymous said...

rownum is oracle specific, limit is mysql specific. setMaxRows is portable.

Matt Neilson said...

I know this is a little late, but I found this article about 2 months ago and thought my environment was doing the same thing, the JDBC driver was fetching all rows and just passing whatever my setMaxRows limit was on to the result set. I found after thorough testing that this is not the case for me. The JDBC driver only retrieves up to the maxRow value. I'm using the Oracle JDBC driver in my scenario.