Best Practices for JDBC Programming

As a consultant, developer and database administrator, I've often been asked to provide coding guidelines and tuning assistance for Java code that utilizes JDBC. Over time, I've been introduced to or developed standard coding practices that make JDBC code faster and less error-prone, and easier to read, understand and use. This article documents some of the more important "best practices" for using JDBC libraries to perform database access. As most of my clients are using Oracle database technologies, I've included several practices that are Oracle-specific.

For the purposes of this article the goals of best practices for JDBC programming are maintainability, portability and performance.

  • Maintainability refers to the ease with which developers can understand, debug and modify JDBC code that they didn't write.
  • Portability refers to the ease with which JDBC code can be used with multiple databases. It turns out that JDBC doesn't make database programming as platform independent as I'd like. In addition, I consider portability a noble goal even if you have no current plans to support multiple databases. Who knows how long your code will be around and what kinds of changes will have to be made to it?
  • Performance refers to optimizing the speed and/or memory needed to run JDBC code.
While I've labeled my recommendations best practices, these recommendations change as technology changes and as I discover even better coding practices. In addition, I'm always annoyed by articles that make recommendations and then don't explain the rationale for making them. I'll try not to make that mistake here.

Best Practices for JDBC Programming
The most common recommendations I make to Java programmers using JDBC are the following (discussed individually later):

  • Use host variables for literals - avoid hard-coding them (Oracle specific).
  • Always close statements, prepared statements and connections.
  • Consolidate formation of SQL statement strings.
  • Use the delegate model for database connection.
  • Use Date, Time and Timestamp objects as host variables for temporal fields (avoid using strings).
  • Limit use of column functions.
  • Always specify a column list with a select statement (avoid "select *").
  • Always specify a column list with an insert statement.
Use Host Variables for Literals in SQL Statements (Oracle Specific)
I recommend that developers use host variables in SQL statements instead of hard-coding literals in SQL strings. As a convenience, many developers embed literals in SQL statements instead; the following code is an example that places a user ID directly in the SQL statement. Host variables greatly improve performance on Oracle, and using them won't hurt performance on any other database platform that I'm aware of. (As an aside, note that this example uses the "+" operator for string concatenation. While this is convenient, using StringBuffers and the StringBuffer.append() method is a faster way to concatenate strings.)

Statement stmt;
ResultSet rst;
Connection dbconnection;
...
stmt = dbconnection.createStatement();
// The user ID is concatenated into the SQL text as a literal.
rst = stmt.executeQuery(
    "select count(*) from portfolio_info where USER_ID = " + userID);
if (rst.next()) {
    count = rst.getInt(1);
}

To get the benefit of Oracle's optimizations, we need to use PreparedStatements instead of Statements for SQL that will be executed multiple times. Furthermore, we need to use host variables instead of literals for values that will change between executions. In the code above, the SQL statement for user ID 1 will be different from the one for user ID 2 ("where USER_ID = 1" is different from "where USER_ID = 2"). A better way to approach this SQL statement is the following:

ResultSet rst;
PreparedStatement pstmt;
Connection dbconnection;
...
pstmt = dbconnection.prepareStatement("select count(*) from portfolio_info where USER_ID = ? ");
pstmt.setDouble(1,userID);
rst = pstmt.executeQuery();
if(rst.next()){
count = rst.getInt(1);
}

In this code, because we're using host variables instead of literals, the SQL statement is identical no matter what the qualifying user ID is. Furthermore, we used a PreparedStatement instead of a statement. So that we can better understand the source of the performance benefit, let's walk through how SQL statements are processed by the Oracle optimizer. When SQL statements are executed, Oracle will execute (roughly speaking) the following steps:

  1. Look up the statement in the shared pool to see if it has already been parsed or interpreted. If yes, Oracle will go directly to step 4.
  2. Parse (or interpret) the statement.
  3. Figure out how it will get the data you want; record that information in a portion of memory called the shared pool.
  4. Get your data.
A flowchart of this decision process can be found in Figure 1.

When Oracle looks up a SQL statement to see if it's already been executed (step 1), it attempts a character-by-character match of the SQL statement. If it finds a match, it can use the parse information already in the shared pool and doesn't have to do steps 2 and 3 above because the work has already been done. If you hard-code literals in your SQL statements, the probability of finding a match is very low ("where USER_ID = 1" isn't the same as "where USER_ID = 2"). This means that Oracle will have to reparse the first code example for each portfolio selected. Had the code used host variables, that statement (which would look something like "where USER_ID = :1" in the shared pool) would have been parsed once and only once.
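
The effect of exact-text matching can be illustrated with a toy model. The sketch below is not how Oracle implements its shared pool; it only assumes a cache keyed by the exact SQL string, and the class and method names (ParseCacheDemo, countParses) are hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

/** Toy model of a parse cache keyed by exact SQL text (an illustration, not Oracle's implementation). */
class ParseCacheDemo {
    private final Set<String> parsedStatements = new HashSet<>();
    private int parseCount = 0;

    /** Simulates steps 1 to 3: look up the text and parse only when no exact match exists. */
    void execute(String sql) {
        if (parsedStatements.add(sql)) {
            parseCount++; // text not seen before, so it has to be parsed
        }
    }

    int parseCount() {
        return parseCount;
    }

    /** Issues the same query for userCount users, with a literal or with a host variable. */
    static int countParses(int userCount, boolean useHostVariable) {
        ParseCacheDemo cache = new ParseCacheDemo();
        for (int userId = 1; userId <= userCount; userId++) {
            String sql = useHostVariable
                ? "select count(*) from portfolio_info where USER_ID = :1"
                : "select count(*) from portfolio_info where USER_ID = " + userId;
            cache.execute(sql);
        }
        return cache.parseCount();
    }
}
```

In this model, 100 users produce 100 distinct statement texts when literals are embedded, but a single text when a host variable is used.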

I've experienced anywhere from a 5% to a 25% performance increase by writing SQL statements that are reusable (results vary with transaction volume, number of users, network latency and many other things). More information on this can be found in the Oracle Tuning manual. Within this manual look at the "Writing Identical SQL Statements" subheading within the "Tuning the Shared Pool" section.

While this best practice is Oracle-specific, many database platforms optimize preparing and reusing similar SQL statements. Most database platforms do this by optimizing reuse of PreparedStatement objects. Some databases, such as Cloudscape, optionally will store prepared statements in the database so they can be reused and shared by many users. Following this practice won't hurt performance with any database platform I'm aware of.

Always Close Statements, Prepared Statements and Connections
Many databases allocate resources to servicing statements, prepared statements and connections. Many database platforms continue to allocate those resources for a period of time if these objects aren't closed after use. With Oracle databases it's possible to get a "max cursors exceeded" error message when you don't close statements or prepared statements. In addition, with Oracle databases, the connections stay around on the server. Closing these objects consistently reduces the time and resources spent on maintenance to keep such errors from happening.

An example can be found in Listing 1. Note that I use a "finally" block to close the PreparedStatement. I don't close the connection in the example method as it is used elsewhere in the application. Note also that I call a utility to close the PreparedStatement for me. The code for this utility can be found in Listing 2. I use a utility to do the close so I don't have to replicate the exception-catching code everywhere.
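
Since the listings are not reproduced here, the following is a minimal sketch of the same idea rather than the original code. The names DbUtil, closeQuietly and PortfolioDao are hypothetical, and the user ID is assumed to be a long.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

/** Hypothetical helper that keeps the exception handling for close() in one place. */
class DbUtil {
    /** Closes a JDBC resource; returns false if close() threw. A null resource is ignored. */
    static boolean closeQuietly(AutoCloseable resource) {
        if (resource == null) {
            return true;
        }
        try {
            resource.close();
            return true;
        } catch (Exception e) {
            System.err.println("Could not close resource: " + e);
            return false;
        }
    }
}

class PortfolioDao {
    /** Counts portfolios for one user; the statement is closed even if the query fails. */
    static int countPortfolios(Connection conn, long userId) throws SQLException {
        PreparedStatement pstmt = null;
        ResultSet rst = null;
        try {
            pstmt = conn.prepareStatement(
                "select count(*) from portfolio_info where USER_ID = ?");
            pstmt.setLong(1, userId);
            rst = pstmt.executeQuery();
            return rst.next() ? rst.getInt(1) : 0;
        } finally {
            DbUtil.closeQuietly(rst);
            DbUtil.closeQuietly(pstmt);
            // The connection is left open on purpose: it is owned by the caller.
        }
    }
}
```

On Java 7 and later, a try-with-resources statement achieves the same effect with less code; the explicit "finally" form is shown because it matches the approach described above.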

Consolidate Formation of SQL Statement Strings
As a database administrator, a substantial portion of my time is spent reading the code of others and suggesting ways to improve performance. As you might expect, looking at the SQL statements being issued is of particular interest to me. It's hard to follow SQL statements that are constructed by string manipulation scattered over several methods. Developers who maintain this kind of code must have the same problem. It greatly enhances readability if you consolidate the logic that forms the SQL statement in one place.

Listing 2 is a good example of this point. The string manipulation to form the SQL statement is located in one place, and the SQL statement logic is in a separate static block instead of within the method itself. This is done to reduce the number of times this string concatenation happens. Also note that StringBuffers are used for the string manipulation, not Strings. StringBuffers are more efficient at string concatenation than Strings are. In a project I recently completed the development team adopted this convention of consolidating SQL statements in static blocks directly above the method in which they were used. We found this practice quite readable and maintainable.
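
As an illustration of the convention (not the original listing), a sketch might look like the following. The class name PortfolioQueries and the constant name are hypothetical.

```java
/** Hypothetical example of building a SQL string once, in one place, in a static block. */
class PortfolioQueries {
    static final String COUNT_BY_USER_SQL;

    // The statement is assembled once, when the class is loaded, not on every call.
    static {
        StringBuffer sql = new StringBuffer();
        sql.append("select count(*) ");
        sql.append("from portfolio_info ");
        sql.append("where USER_ID = ?");
        COUNT_BY_USER_SQL = sql.toString();
    }
}
```

A method placed directly below the static block would then pass PortfolioQueries.COUNT_BY_USER_SQL to prepareStatement(). On current JVMs, StringBuilder is the unsynchronized counterpart of StringBuffer and would serve equally well here.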

Use Delegate Model for Database Connection
I recently had the task of making the same application runnable on Oracle 8i, Cloudscape and Oracle Lite with as few modifications to existing code as possible. The development team wanted to avoid making JDBC-related classes platform-aware. In addition, the team wanted to take advantage of some platform-specific features, such as array processing and write batching in Oracle 8i, in special cases.

I was able to port the application to multiple environments largely through manipulation of one class responsible for managing our database connection. We had the foresight to create a delegate class for java.sql.Connection that manages needed connection functions and allows us to take advantage of platform-specific performance-tuning enhancements. All of our code used the delegate, not a native JDBC connection, as illustrated in Figure 2. While the specific class used for the project is proprietary, I've created another delegate, dvt.util.db.Connection, that illustrates the concept for the purposes of this article. The source for this delegate can be found in Listing 3.

Note that dvt.util.db.Connection determines which database platform is being used. If the platform is Oracle 8i, I establish array processing by setting the default row prefetch size (available with Oracle database connections) to improve the performance of our "select" statements. I also establish write batching to improve performance of update, insert and delete statements.

Since I consolidate the platform-specific code in my connection object delegate, classes that use my connection delegate don't need to be platform specific. In case they do, however, developers can use getPlatform() to get information about the database platform being used. Furthermore, I can add support for additional database platforms (e.g., Cloudscape and Sybase) largely by changing this class. The connection delegate won't solve all portability issues, but it will solve a good percentage of them.
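
The shape of such a delegate can be sketched as follows. This is not the dvt.util.db.Connection source; the class name ConnectionDelegate, the Platform enum, the name-matching rule and the fetch size of 50 are all assumptions made for illustration, and the sketch uses only the standard JDBC fetch-size hint rather than any vendor-specific call.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.Locale;

/** Hypothetical connection delegate that keeps platform-specific tuning in one class. */
class ConnectionDelegate {
    enum Platform { ORACLE, OTHER }

    private static final int ORACLE_FETCH_SIZE = 50; // assumed value, tune per application

    private final Connection target;
    private final Platform platform;

    ConnectionDelegate(Connection target) throws SQLException {
        this.target = target;
        this.platform = detectPlatform(target.getMetaData().getDatabaseProductName());
    }

    /** Maps the product name reported by the driver to a platform (assumed matching rule). */
    static Platform detectPlatform(String productName) {
        if (productName != null && productName.toLowerCase(Locale.ROOT).contains("oracle")) {
            return Platform.ORACLE;
        }
        return Platform.OTHER;
    }

    Platform getPlatform() {
        return platform;
    }

    /** Callers prepare statements through the delegate, so tuning is applied in one place. */
    PreparedStatement prepareStatement(String sql) throws SQLException {
        PreparedStatement pstmt = target.prepareStatement(sql);
        if (platform == Platform.ORACLE) {
            pstmt.setFetchSize(ORACLE_FETCH_SIZE); // standard JDBC hint for rows per round trip
        }
        return pstmt;
    }

    void close() throws SQLException {
        target.close();
    }
}
```

Application classes would hold a ConnectionDelegate instead of a java.sql.Connection, so supporting another platform means extending detectPlatform() and the tuning branches rather than touching every caller.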

I recommend using a connection delegate even for projects that currently support only one database platform. As we saw from recent Y2K efforts, you may find that your code is used for longer than you think, and used in other applications down the road.

Use Date, Time and Timestamp Objects as Host Variables for Temporal Fields (Avoid Using Strings)
For convenience, I've seen many developers use strings as host variables to represent dates, times and timestamps. I think they consider java.sql.Date, Time and Timestamp awkward. I agree with them from a coding perspective. Unfortunately, using strings as host variables for temporal fields can affect data access performance.

The following code snippet contains a SQL statement meant for an Oracle platform that uses a string variable to represent a DATE field. Without an understanding of how the database optimizers work, this appears to be an acceptable coding technique. For the small inconvenience of using a "to_char" function in the SQL statement, we avoid the Java work of converting a java.sql.Date or Timestamp into a more easily displayable data type elsewhere in the code.

Select sum(sale_price)
From order_sales
Where to_char(sale_dt,'YYYY-MM-DD') >= ?

Unfortunately, Oracle and most database optimizers can't use an index to speed up performance of the query in this snippet. The database will have to read all rows of the order_sales table and convert the sale_dt of all rows to a string before it can do the comparison to see which rows satisfy the where clause of the query.

If we rewrite the query in the snippet to use a java.sql.Timestamp host variable, Oracle (and most of the common database platforms) will use an index and significantly improve performance in most cases, as follows:

Select sum(sale_price)
From order_sales
Where sale_dt >= ?

For applications that use Oracle exclusively, I recommend using java.sql.Timestamp exclusively. Oracle's DATE data type actually contains time information (hours, minutes, seconds) as well as date information. Most other database platforms would call this type of field a TIMESTAMP. Oracle has no direct counterpart for a DATE (which has year, month and day only) and TIME data type offered by other platforms.
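
On the Java side, binding the host variable might look like the sketch below. The class and method names (SalesQueries, startOfDay, totalSalesSince) are hypothetical, and the sketch assumes the caller starts from an ISO-formatted date string and wants midnight in the JVM's local time zone.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Timestamp;
import java.time.LocalDate;

/** Hypothetical query class that binds a Timestamp instead of a date string. */
class SalesQueries {
    static final String SALES_SINCE_SQL =
        "select sum(sale_price) from order_sales where sale_dt >= ?";

    /** Converts an ISO date such as "2000-01-15" to a Timestamp at midnight, local time. */
    static Timestamp startOfDay(String isoDate) {
        return Timestamp.valueOf(LocalDate.parse(isoDate).atStartOfDay());
    }

    /** Sums sales on or after the given date; the column is compared without any function. */
    static double totalSalesSince(Connection conn, String isoDate) throws SQLException {
        try (PreparedStatement pstmt = conn.prepareStatement(SALES_SINCE_SQL)) {
            pstmt.setTimestamp(1, startOfDay(isoDate));
            try (ResultSet rst = pstmt.executeQuery()) {
                return rst.next() ? rst.getDouble(1) : 0.0;
            }
        }
    }
}
```

The string-to-Timestamp conversion happens once in Java, so the database compares sale_dt directly and remains free to use an index on it.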

Limit Use of Column Functions
I generally recommend that developers limit use of column functions to the select lists of select statements. Moreover, I tend to stick to aggregate functions (e.g., count, sum, average) needed for select statements that use a "group by" clause. I make this recommendation for two reasons: performance and portability. Limiting function use to select lists (and keeping it out of where clauses) means that the use of a function won't block the use of an index. In the same way that the use of the "to_char" function prohibited the database from using an index in the earlier code snippet, column functions in where clauses likely prohibit the database from using an index.

In addition, many of the operations for which developers use SQL column functions (data type conversion, value formatting, etc.) are faster in Java than if the database did them. I've had between a 5% and a 20% performance improvement in many applications by opting to avoid some column functions and implementing the logic in Java instead. Another way to look at it is that column functions aren't tunable as we don't control the source code. Implementing that logic in Java makes it code that we can tune if need be.
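
For example, date formatting that might otherwise be done with "to_char" in the select list can be moved into Java. The sketch below is one way to do it; the class name SaleDateFormatter and the choice to render null as an empty string are assumptions.

```java
import java.sql.Timestamp;
import java.time.format.DateTimeFormatter;

/** Hypothetical formatter that replaces a to_char call in the select list. */
class SaleDateFormatter {
    private static final DateTimeFormatter DISPLAY = DateTimeFormatter.ofPattern("yyyy-MM-dd");

    /** Formats a Timestamp read from the ResultSet; a SQL NULL is shown as an empty string. */
    static String format(Timestamp saleDt) {
        return saleDt == null ? "" : saleDt.toLocalDateTime().format(DISPLAY);
    }
}
```

The query then selects sale_dt unchanged, and the caller passes the result of rst.getTimestamp(...) to SaleDateFormatter.format().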

Moreover, using non-ANSI-standard column functions can also cause portability problems. There are large differences in which column functions are implemented by the database vendors. For instance, one of my favorite Oracle column functions, "decode", which allows you to translate one set of values into another, isn't implemented in many of the other major database platforms. In general, column function use such as the use of "decode" has the potential to become a portability issue.
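
A value translation of the kind "decode" performs can be kept portable by doing it in Java after the row is fetched. In the sketch below, the status codes and labels are invented for illustration, as is the class name StatusTranslator.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical Java-side replacement for a decode(status_cd, 'A', 'Active', ...) expression. */
class StatusTranslator {
    private static final Map<String, String> LABELS = new HashMap<>();

    static {
        // Invented codes, for illustration only.
        LABELS.put("A", "Active");
        LABELS.put("C", "Closed");
        LABELS.put("S", "Suspended");
    }

    /** Returns the label for a code, or "Unknown" when no mapping exists (like decode's default). */
    static String label(String code) {
        return LABELS.getOrDefault(code, "Unknown");
    }
}
```

The select statement returns the raw code, and the same translation works unchanged on any database platform.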

Always Specify a Column List with a Select Statement (Avoid "Select *")
A common shortcut for developers is to use the "*" in select statements to avoid having to type out a column list. The line below illustrates this shortcut, while the snippet immediately following illustrates the alternative where desired columns are explicitly listed.

Select * from customer

Select last_nm, first_nm, address, city, state, customer_nbr from customer

I recommend that developers explicitly list columns in select statements as illustrated above. The reason is that if the columns in any of the tables in the select are reordered or new columns are added, the results obtained with the select-asterisk shortcut will change and the class will have to be modified. For example, suppose a database administrator changes the order of the columns and puts column customer_nbr first (there are valid reasons why a DBA could reorder columns). In addition, suppose the DBA adds a column called country. The developer who used the shortcut "select * from customer" will have to change code, because all the offset references used in processing the ResultSet will change. The developer who explicitly listed all columns can be oblivious to the change because the code will still work.

Explicitly listing columns in a select statement is a best practice because it prevents the need for maintenance in some cases.
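
One way to keep the column list and the ResultSet offsets in step is to derive both from a single array, as in the sketch below. The names CustomerQueries, position and describe are hypothetical.

```java
import java.sql.ResultSet;
import java.sql.SQLException;

/** Hypothetical query class whose ResultSet offsets follow the explicit column list. */
class CustomerQueries {
    static final String[] COLUMNS =
        {"last_nm", "first_nm", "address", "city", "state", "customer_nbr"};

    static final String SELECT_CUSTOMERS_SQL =
        "select " + String.join(", ", COLUMNS) + " from customer";

    /** Returns the 1-based ResultSet offset of a column in the select list. */
    static int position(String column) {
        for (int i = 0; i < COLUMNS.length; i++) {
            if (COLUMNS[i].equals(column)) {
                return i + 1;
            }
        }
        throw new IllegalArgumentException("Column not in select list: " + column);
    }

    /** Reads one row; the offsets depend on the select list, not on the table definition. */
    static String describe(ResultSet rst) throws SQLException {
        return rst.getInt(position("customer_nbr")) + ": "
            + rst.getString(position("first_nm")) + " "
            + rst.getString(position("last_nm"));
    }
}
```

If the DBA reorders the table or adds a country column, neither the SQL text nor the offsets change.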

Always Specify a Column List with an Insert Statement
A common shortcut for developers is to omit the column list in insert statements to avoid having to type out a column list. By default, the column order is the same as physically defined in the table. The first snippet below illustrates this shortcut while the next one illustrates the alternative where desired columns are explicitly listed.

Insert into customer
Values ('Ashmore','Derek','3023 N. Clark','Chicago','IL', 555555)

Insert into customer
(last_nm, first_nm, address, city, state, customer_nbr)
Values (?,?,?,?,?,?)

I recommend that developers explicitly list columns in insert statements as illustrated in the second snippet above. The reason is the same as for select statements. If the columns of the table being inserted into are reordered or new columns are added, the insert could generate an exception and the class that issues it will have to be modified. For example, suppose a DBA, as in the previous example, changes the order of the columns, puts column customer_nbr first and adds a column called country. The developer who used the first shortcut above will have to change code. The developer who explicitly listed all columns may be oblivious to the change because the code may still work. In addition, note that the version in the second snippet above uses host variables, so the same PreparedStatement can be used for all inserts if there are multiple inserts.

Explicitly listing columns in an insert statement is a best practice because it prevents the need for maintenance in many cases.
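
Reusing one PreparedStatement for several inserts might look like the following sketch. The names CustomerInserts, placeholderCount and insertAll are hypothetical, and each row is assumed to be an Object[] holding values in the same order as the column list.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

/** Hypothetical insert helper with an explicit column list and host variables. */
class CustomerInserts {
    static final String INSERT_CUSTOMER_SQL =
        "insert into customer (last_nm, first_nm, address, city, state, customer_nbr) "
        + "values (?, ?, ?, ?, ?, ?)";

    /** Counts the "?" placeholders, as a sanity check against the column list. */
    static int placeholderCount(String sql) {
        int count = 0;
        for (char c : sql.toCharArray()) {
            if (c == '?') {
                count++;
            }
        }
        return count;
    }

    /** Inserts every row with the same PreparedStatement, so the SQL text never varies. */
    static void insertAll(Connection conn, List<Object[]> rows) throws SQLException {
        try (PreparedStatement pstmt = conn.prepareStatement(INSERT_CUSTOMER_SQL)) {
            for (Object[] row : rows) {
                for (int i = 0; i < row.length; i++) {
                    pstmt.setObject(i + 1, row[i]); // JDBC parameters are 1-based
                }
                pstmt.executeUpdate();
            }
        }
    }
}
```

Where the driver supports it, addBatch() and executeBatch() could replace the per-row executeUpdate() call to reduce round trips.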

Recommendations for Stored Procedure Usage
Stored procedure programming languages (such as Oracle's PL/SQL) are handy and in many cases very convenient. I use them often for utility scripts and data-cleansing activities. I'm often asked about recommendations for stored procedure use in applications, but as their capabilities differ greatly among the major database platforms, I can't give platform-independent advice on the subject. I can, however, provide some thoughts on stored procedure use as it relates to portability and performance.

As these languages differ so greatly, their use within applications causes portability issues. For instance, some stored procedure languages allow procedures to return result sets, some do not. Some stored procedure languages allow temporary tables (usable within the current session only), some do not. We could find many more differences, but I think the point is clear. If portability is a concern, I recommend avoiding use of stored procedures except for database triggers.

Performance is a tougher issue because it differs radically between database vendors. Stored procedure use for some database platforms enhances performance; in others it degrades it. For Oracle platforms I advocate stored procedures within Java applications for database triggers only. For most other situations their use provides no benefit. If you want a more detailed discussion on when and how to use stored procedures, functions and packages within Oracle databases, see my article in JDJ December 1999 (Vol. 4, issue 12).

Summary
This article has discussed several ways to make JDBC code more performance-, maintenance- and portability-friendly on an individual basis. I always recommend team code reviews and documented coding standards as ways to develop more best practices and consistently apply existing practices. Furthermore, team code reviews help further the goals of best practices by improving the maintainability and general quality of code within an application.

More Stories By Derek Ashmore

Derek Ashmore is a consultant and the author of the J2EE
Architect's Handbook, available at www.dvtpress.com.
