Sunday, July 8, 2018

Building a high performance Java application

Developers often assume that performance optimization is a complicated topic that requires a lot of experience and knowledge. Getting the best possible performance out of an application is indeed not an easy task, but that doesn’t mean you can’t do anything until you have acquired that knowledge. There are several easy-to-follow recommendations and best practices that help you create a well-performing application.

System resources such as threads, database connections, socket connections, file and socket I/O streams, and JNI calls are often the real bottlenecks of a Java application. Reusing them properly can improve the application’s performance considerably.

Factors that cause bottlenecks (CPU, I/O, heap usage) in a Java application

  • Excessive encryption/decryption of large data with the strongest algorithms
  • Excessive remote web service calls, with socket connections created and closed each time
  • Frequent file I/O operations, and scanning a physical directory to look up a specific file
  • Heavy logging (writing tons and tons of data to disk)
  • Regular expressions, especially repeated compilation of regex Patterns
  • JNI calls into native code (C/C++, .NET)
  • Too many active application threads
  • Excessive GC cycles, and large WAR files with a large number of JARs
  • Uncontrolled pooling (connections/objects/threads) and complicated SQL queries
  • Code problems that cause excessive back-end calls, such as infinite loops




How would you improve the performance of a Java application?

  • Pool valuable system resources like threads, database connections and socket connections. Emphasize reuse of threads from a pool: creating new threads and discarding them after use can adversely affect performance. Also consider multi-threading in single-threaded applications where it can enhance performance. Size the pools according to system and application specifications and requirements. Too many threads in a pool can also cause performance and scalability problems, because each thread has its own stack (memory consumption) and the CPU spends time context switching between threads instead of doing real computation.
  • Minimize network overheads by retrieving several related items simultaneously in one remote invocation if possible. Remote method invocations involve a network round-trip, marshaling and unmarshaling of parameters, which can cause huge performance problems if the remote interface is poorly designed.
  • Use a distributed cache. Infinispan is an in-memory data grid platform for fast data access. In-memory data grids are commonly used as low-latency, highly available and elastic data storage back ends, often as NoSQL solutions. For static data, for example, use a second-level cache to avoid network round trips to the database and the bottleneck of remote database connections.
  • Use a high-performance asynchronous messaging library. ZeroMQ is used in distributed or concurrent applications. It provides a message queue, but unlike message-oriented middleware, a ZeroMQ system can run without a dedicated message broker. It connects code in any language on any platform, carries messages across inproc, IPC, TCP, TIPC and multicast transports, and offers smart patterns such as pub-sub, push-pull and router-dealer on top of a high-speed asynchronous I/O engine, all in a tiny library.
  • Use persistent connections. An HTTP persistent connection, also called HTTP keep-alive or HTTP connection reuse, is a network communication channel that remains open for further HTTP requests and responses rather than closing after a single exchange. A single TCP connection carries multiple request/response pairs, as opposed to opening a new connection for every pair. The newer HTTP/2 protocol takes the idea further and allows multiple concurrent requests/responses to be multiplexed over a single connection.
  • Use non-blocking asynchronous logging (Log4j2) built on LMAX Disruptor technology. Asynchronous Loggers internally use the Disruptor, a lock-free inter-thread communication library, instead of queues, resulting in higher throughput and lower latency.
  • Use streams and a compact format for I/O. Protocol Buffers, usually referred to as Protobuf, is a format developed by Google for serializing and deserializing structured data. Compared with JSON, it typically produces smaller messages and parses faster, and its schema makes the data contract easier to maintain.
  • Java Design Patterns to manage the objects efficiently - Flyweight design pattern to create a pool of shared objects , static factory methods instead of constructors to recycle immutable objects , visitor pattern to avoid “instanceof” constructs in frequently accessed methods. 
  • Choose the right garbage collector. GC pauses used to be a common complaint, but the current generation of garbage collectors has mostly solved that issue and, with proper tuning and sizing, can run with no noticeable collection pauses. Getting there does take an in-depth understanding of both GC on the JVM as a whole and the specific profile of the application.
  • Improve JDBC performance. Use connection pooling (creating a new connection takes time, which you avoid by reusing an existing one), JDBC batching (sending multiple SQL statements in a single database round trip) and statement caching (depending on the underlying JDBC driver, a PreparedStatement can be cached on the client side, in the driver, or on the database side).
  • Make architectural improvements. An application with a poor architecture may perform well for the current requirements, but when the requirements grow and the architecture is not flexible enough to extend, it becomes a maintenance nightmare where fixing code in one area breaks code in another. Eventually the application has to be rewritten.

Most applications need to retrieve data from, and save or update data in, one or more databases. Database calls are remote calls over the network. In general, data should be lazily loaded (i.e. loaded only when required, as opposed to pre-loaded on the assumption that it may be used later) to conserve memory. There are use cases, however (e.g. when several database calls are needed), where eagerly loading and caching data improves performance by minimizing network trips to the database. Data can be eagerly loaded with the help of SQL scripts with complex joins or stored procedures, and cached using a third-party framework or one you build yourself.
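
The thread-pooling advice in the list above can be illustrated with a minimal sketch built on the standard ExecutorService. The class name ThreadPoolSketch, the method sumOfSquares and the pool sizing rule are assumptions made for this example; the squaring task merely stands in for real work.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ThreadPoolSketch {

    // Runs taskCount small tasks on a fixed-size pool and sums their results.
    // The same few worker threads are reused for every task.
    static long sumOfSquares(int taskCount, int poolSize) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Future<Long>> pending = new ArrayList<>();
            for (int i = 1; i <= taskCount; i++) {
                final long n = i;
                Callable<Long> task = () -> n * n;   // stand-in for real work
                pending.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> result : pending) {
                total += result.get();               // blocks until that task is done
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("task failed", e.getCause());
        } finally {
            pool.shutdown();                         // lets the worker threads exit
        }
    }

    public static void main(String[] args) {
        // Assumed sizing rule for CPU-bound work: roughly one thread per core.
        int poolSize = Math.max(2, Runtime.getRuntime().availableProcessors());
        System.out.println(sumOfSquares(100, poolSize));
    }
}
```

A long-running application would normally create the pool once and share it, rather than creating and shutting it down per call as this self-contained sketch does.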

How would you refresh Infinispan 2nd level remote cache?

  • Timed cache strategy, where the cache is replenished periodically (e.g. every 30 minutes or every hour). This is a simple strategy, applicable when it is acceptable to show stale data at times and the data in the database does not change very frequently.
  • Dirty check strategy, where your application is the only one that can mutate (i.e. modify) the data in the database. You can set an “isDirty” flag to true when the data is modified in the database through your application, and the cache can then be refreshed based on the “isDirty” flag.
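
Both strategies can be combined in a few lines of plain Java. The sketch below is not Infinispan’s API: RefreshingCache, markDirty and the Supplier-based loader are hypothetical names used only to show the refresh logic, with the loader standing in for a database query.

```java
import java.util.function.Supplier;

class RefreshingCache<V> {
    private final Supplier<V> loader;   // hypothetical loader, e.g. a database query
    private final long maxAgeMillis;    // timed strategy: reload after this age
    private V value;
    private long loadedAtMillis;
    private boolean dirty = true;       // dirty-check strategy; true forces the first load

    RefreshingCache(Supplier<V> loader, long maxAgeMillis) {
        this.loader = loader;
        this.maxAgeMillis = maxAgeMillis;
    }

    // Called by the application whenever it modifies the underlying data.
    synchronized void markDirty() {
        dirty = true;
    }

    // Returns the cached value, reloading it when it is dirty or too old.
    synchronized V get() {
        long now = System.currentTimeMillis();
        if (dirty || now - loadedAtMillis >= maxAgeMillis) {
            value = loader.get();
            loadedAtMillis = now;
            dirty = false;
        }
        return value;
    }

    public static void main(String[] args) {
        int[] loads = {0};
        RefreshingCache<String> cache =
                new RefreshingCache<>(() -> "v" + (++loads[0]), 30 * 60 * 1000L);
        System.out.println(cache.get()); // first access loads the value
        System.out.println(cache.get()); // served from the cache
        cache.markDirty();               // the application changed the data
        System.out.println(cache.get()); // reloaded
    }
}
```

The dirty flag only works when every write goes through the same application, which is exactly the limitation the next question addresses.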

How would you refresh your cache (distributed data grid) if your database is shared by more than one application?

  • Database triggers: use database triggers to communicate between applications sharing the same database, and write pollers that poll the database periodically to determine when the cache should be refreshed.
  • XML messaging (enterprise JMS) between the applications sharing the same database, or separate databases, to determine when the cache should be refreshed.
  • Distributed platform: Infinispan distributes data evenly across the cluster and can be used from different languages, i.e. as a language-independent service accessed remotely over a variety of protocols (Hot Rod, REST, Memcached and WebSockets). The purpose of Infinispan is to expose a data structure that is distributed, highly concurrent and designed from the ground up to make the most of modern multi-processor and multi-core architectures. It is often used as a distributed cache, but also as a NoSQL key/value store or object database.

Optimize your I/O operations: 

Use buffering when writing to and reading from files and/or streams. If you are dealing only with ASCII characters, avoid readers/writers and use byte streams instead, which are faster because they skip character encoding and decoding. Avoid premature flushing of buffers. Also make use of the performance and scalability features offered by NIO (New I/O), such as non-blocking and asynchronous I/O and memory-mapped files.
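
As a minimal sketch of the buffering advice, the example below writes and reads a temporary file one byte at a time through BufferedOutputStream and BufferedInputStream, so most calls hit an in-memory buffer instead of the operating system. The class name BufferedIoSketch and the method roundTrip are assumptions for illustration.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

class BufferedIoSketch {

    // Writes byteCount bytes to a temp file, reads them back, and returns
    // the number of bytes read. Both directions go through a buffer.
    static long roundTrip(int byteCount) {
        Path file = null;
        try {
            file = Files.createTempFile("buffered-io", ".bin");
            try (OutputStream out = new BufferedOutputStream(Files.newOutputStream(file))) {
                for (int i = 0; i < byteCount; i++) {
                    out.write(i & 0xFF);     // lands in the buffer, not on disk
                }
            }                                // close() flushes once, at the end
            long read = 0;
            try (InputStream in = new BufferedInputStream(Files.newInputStream(file))) {
                while (in.read() != -1) {    // served from the buffer most of the time
                    read++;
                }
            }
            return read;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            if (file != null) {
                try {
                    Files.deleteIfExists(file);
                } catch (IOException ignored) {
                    // best-effort cleanup of the temp file
                }
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(1_000_000));
    }
}
```

Note that the sketch never calls flush() inside the loop; the single flush on close is what the advice against premature flushing amounts to.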

Establish whether you have a potential memory problem and manage your objects efficiently: 
1). Remove references to the short-lived objects from long-lived objects like Java collections etc to minimize any potential memory leaks. Also reuse objects where possible. It is cheaper to recycle objects than creating new objects each time. 

2). Avoid creating extra objects unnecessarily. For example, use the mutable StringBuilder class (or StringBuffer when thread safety is required) instead of immutable String objects in computation-expensive loops, and use static factory methods instead of constructors to recycle immutable objects.

3). Automatic garbage collection is one of the most highly touted conveniences of Java. However, it comes at a price. Creating and destroying objects occupies a significant chunk of the JVM's time. Wherever possible, you should look for ways to minimize the number of objects created in your code:
  • For complex objects that are used frequently, consider creating a pool of recyclable objects rather than always instantiating new ones. This puts an additional burden on the programmer to manage the pool, but in selected cases it can bring a significant performance gain. Use the flyweight design pattern to create a pool of shared objects. Flyweights are typically instantiated by a flyweight factory that creates a limited number of flyweights based on some criteria. The invoking object does not instantiate flyweights directly. It gets them from the flyweight factory, which checks whether the pool (e.g. a HashMap) already holds a flyweight that fits the specific criteria (e.g. with or without GST). If the flyweight exists, the factory returns a reference to it. If it does not exist, the factory instantiates one for the specific criteria, adds it to the pool and then returns it to the invoking object.
  • If repeating code within a loop, avoid creating new objects for each iteration. Create objects before entering the loop (i.e. outside the loop) and reuse them if possible.
  • Use lazy initialization when you want to distribute the load of creating large amounts of objects. Use lazy initialization only when there is merit in the design. 
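
The flyweight factory described above can be sketched as follows, using the “with or without GST” criterion from the text. The names FlyweightSketch, TaxRule and TaxRuleFactory, and the 10% rate, are assumptions made purely for illustration.

```java
import java.util.HashMap;
import java.util.Map;

class FlyweightSketch {

    // Immutable flyweight: safe to share between all callers.
    static final class TaxRule {
        private final boolean withGst;

        private TaxRule(boolean withGst) {
            this.withGst = withGst;
        }

        double apply(double amount) {
            return withGst ? amount * 1.10 : amount;   // 10% rate is an assumed value
        }
    }

    // Flyweight factory: callers never instantiate TaxRule themselves.
    static final class TaxRuleFactory {
        private final Map<Boolean, TaxRule> pool = new HashMap<>();

        // Returns the pooled instance for the criterion, creating it on first use.
        synchronized TaxRule get(boolean withGst) {
            return pool.computeIfAbsent(withGst, TaxRule::new);
        }

        synchronized int poolSize() {
            return pool.size();
        }
    }

    public static void main(String[] args) {
        TaxRuleFactory factory = new TaxRuleFactory();
        TaxRule first = factory.get(true);
        TaxRule second = factory.get(true);
        System.out.println(first == second);      // same shared instance
        System.out.println(factory.poolSize());   // only one object was created
    }
}
```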

Where applicable apply the following performance tips in your code: 

  • Use ArrayLists, HashMap etc as opposed to Vector, Hashtable etc where possible. This is because the methods in ArrayList, HashMap etc are not synchronized. Even better is to use just arrays where possible. 
  • Use StringBuilder for String concatenation in loops. String concatenation is a very common operation, and also an inefficient one. Simply put, the problem with using += to append Strings is that every operation allocates a new String.
  • Use + to concatenate Strings in one statement. When you implemented your first application in Java, someone probably told you that you shouldn’t concatenate Strings with +. That’s correct if you’re concatenating Strings in your application logic: Strings are immutable, and the result of each concatenation is stored in a new String object, which requires additional memory and slows down your application, especially inside a loop. It is different when you just break a String into multiple lines within a single statement to improve readability. When all the operands are compile-time constants, the compiler folds them into one String, so + costs nothing at runtime.
  • Use primitives where possible. Another quick and easy way to avoid overhead and improve the performance of your application is to use primitive types instead of their wrapper classes: an int instead of an Integer, or a double instead of a Double. A local primitive can be kept on the stack instead of being allocated as an object on the heap, which reduces memory consumption and lets the JVM handle it more efficiently.
  • Avoid deep recursion. Recursive logic that ends in a StackOverflowError is another common scenario in Java applications. Tail recursion alone does not help here, because mainstream JVMs do not eliminate tail calls; where the recursion depth can grow large, rewrite the logic as a loop or use an explicit stack.
  • Use regular expressions carefully. Regular expressions are useful in a lot of scenarios, but they often have a high performance cost. It’s also important to be aware of the JDK String methods that use regular expressions, such as String.replaceAll() and String.split().
  • Avoid creating and destroying too many threads. Creating and disposing of threads is a common cause of performance issues on the JVM, as threads are relatively heavy to create and destroy. If your application uses a large number of threads, a thread pool allows these expensive objects to be reused. To that end, the Java ExecutorService is the foundation: it provides a high-level API to define the semantics of the thread pool and interact with it. ForkJoinPool is one ExecutorService implementation that manages its own worker threads.
  • Set the initial capacity of a collection (e.g. ArrayList, HashMap etc) and StringBuffer/StringBuilder appropriately. This is because these classes must grow periodically to accommodate new elements. So, if you have a very large ArrayList or a StringBuffer, and you know the size in advance then you can speed things up by setting the initial size appropriately.
  • Minimize the use of casting or runtime type checking like instanceof in frequently executed methods or in loops. The “casting” and “instanceof” checks for a class marked as final will be faster. Using “instanceof” construct is not only ugly but also unmaintainable. Look at using visitor pattern to avoid “instanceof” constructs in frequently accessed methods. 
  • Do not compute constants inside a large loop. Compute them outside the loop. For applets compute it in the init() method. Avoid nested loops (i.e. a “for” loop within another “for” loop etc) where applicable and make use of a Collection class. 
  • Exception creation can be expensive because it has to create the full stack trace. The stack trace is obviously useful if you are planning to log or display the exception to the user. But if you are using your exception to just control the flow, which is not recommended, then throw an exception, which is precreated. An efficient way to do this is to declare a public static final Exception in your exception class itself.
  • Avoid System.out.println and use a logging framework such as Log4j2, which supports asynchronous, buffered logging.
  • Minimize calls to Date, Calendar, etc related classes.
  • Minimize JNI calls in your code
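
Two of the tips above, compiling a regex Pattern once and presizing a StringBuilder, can be sketched together. The class name HotPathTips, the comma-separated input format and the ';' separator are assumptions chosen for the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class HotPathTips {

    // Compiled once and reused; String.split(regex) may recompile on each call.
    private static final Pattern COMMA = Pattern.compile("\\s*,\\s*");

    static String[] splitCsv(String line) {
        return COMMA.split(line.trim());
    }

    // Joins parts with ';' using a StringBuilder sized up front,
    // so the internal buffer never has to grow.
    static String join(List<String> parts) {
        int capacity = 0;
        for (String part : parts) {
            capacity += part.length() + 1;   // +1 leaves room for a separator
        }
        StringBuilder sb = new StringBuilder(capacity);
        for (int i = 0; i < parts.size(); i++) {
            if (i > 0) {
                sb.append(';');
            }
            sb.append(parts.get(i));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] fields = splitCsv(" a , b,c ");
        System.out.println(join(Arrays.asList(fields)));
    }
}
```

Whether either change matters in a given application is something to confirm with a profiler or a benchmark harness such as JMH rather than assume.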

When in the development process should you consider performance issues? 

Set performance requirements in the specifications, include a performance focus in the analysis and design and also create a performance test environment.

When designing your new code, what level of importance would you give to the following attributes? 

  • Performance
  • Maintainability
  • Extendibility
  • Ease of use
  • Scalability
You should not compromise on architectural principles for just performance. You should make effort to write architecturally sound programs as opposed to writing only fast programs. If your architecture is sound enough then it would allow your program not only to scale better but also allows it to be optimized for performance if it is not fast enough. So you should think about extendibility (i.e. ability to evolve with additional requirements), maintainability, ease of use, performance and scalability (i.e. ability to run in multiple servers or machines) during the design phase. List all possible design alternatives and pick the one which is conducive to sound design architecturally (i.e. scalable, easy to use, maintain and extend) and will allow it to be optimized later if not fast enough.

How would you detect and minimize memory leaks in Java?

In Java, memory leaks are caused by poor program design where object references are long lived and the garbage collector is unable to reclaim those objects.

Detecting memory leaks:

  • Use tools like JProbe, OptimizeIt etc to detect memory leaks.
  • Use operating system process monitors like task manager on NT systems, ps, vmstat, iostat, netstat etc on UNIX systems.
  • Write your own utility class with the help of the totalMemory() and freeMemory() methods of the Java Runtime class. Place these calls strategically before and after the code you suspect of leaking memory, to record memory usage pre and post execution. An even better approach than a utility class is to use dynamic proxies or Aspect Oriented Programming (AOP) for the pre and post recording, so that you can activate memory measurement only when needed.
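
A minimal version of such a utility class might look like the sketch below. The name MemoryProbe and its methods are hypothetical; the figures it reports are approximate, because the garbage collector can run at any point between the two readings.

```java
class MemoryProbe {

    // Heap currently in use, as reported by the Runtime class.
    static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    // Runs the work and returns the approximate change in used heap.
    // The result can be negative if a collection happens in between.
    static long measure(Runnable work) {
        long before = usedBytes();
        work.run();
        return usedBytes() - before;
    }

    public static void main(String[] args) {
        byte[][] holder = new byte[1][];   // keeps the allocation reachable
        long delta = measure(() -> holder[0] = new byte[8 * 1024 * 1024]);
        System.out.println("approximate growth in bytes: " + delta);
    }
}
```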

Minimizing memory leaks:


In Java, a memory leak typically occurs when an object with a longer life cycle holds a reference to objects with a short life cycle. This prevents the short-lived objects from being garbage collected. The developer must remember to remove the references to the short-lived objects from the long-lived objects. Objects with the same life cycle do not cause any issues, because the garbage collector is smart enough to deal with circular references.

  • Design applications with an object’s life cycle in mind, instead of relying on the clever features of the JVM. Letting go of the object’s reference in one’s own class as soon as possible can mitigate memory problems. Example: myRef = null;
  • Collections that are no longer needed but still reachable can magnify a memory leak problem. In Java it is easy to let go of an entire collection by setting its root reference to null. The garbage collector will then reclaim all the objects (unless some are still referenced elsewhere).
  • Use weak references for data that should live only as long as something else still uses its key. WeakHashMap is a combination of HashMap and WeakReference: it fits problems where you need a HashMap of information but want an entry to be garbage collected once the map is the only thing still referencing its key.
  • Free native system resources such as AWT frames, files and JNI resources when you have finished with them. Example: the Frame, Dialog and Graphics classes require dispose() to be called on them when they are no longer used, to free the system resources they reserve.
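
The WeakHashMap point can be sketched as a side table that attaches extra information to objects owned by other code. WeakCacheSketch and its methods are hypothetical names. The sketch prints nothing after the key is released, because when the entry actually disappears depends on when the garbage collector runs.

```java
import java.util.Map;
import java.util.WeakHashMap;

class WeakCacheSketch {

    // Keys are held weakly: an entry becomes eligible for removal once
    // nothing outside this map strongly references its key.
    private final Map<Object, String> metadata = new WeakHashMap<>();

    void attach(Object key, String info) {
        metadata.put(key, info);
    }

    String lookup(Object key) {
        return metadata.get(key);
    }

    public static void main(String[] args) {
        WeakCacheSketch cache = new WeakCacheSketch();
        Object session = new Object();          // owned by the caller
        cache.attach(session, "session-data");
        System.out.println(cache.lookup(session));
        session = null;   // from here on the entry may be collected at any time
    }
}
```

Two caveats apply: a value must not strongly reference its own key, and keys that are never collected (such as interned String literals) defeat the purpose.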

Why does the JVM crash with a core dump or a Dr.Watson error?

Any problem in pure Java code throws a Java exception or error. Java exceptions or errors will not cause a core dump (on UNIX systems) or a Dr. Watson error (on Win32 systems). A serious problem such as memory exhaustion results in an OutOfMemoryError thrown by the JVM with a stack trace, and if nothing handles it the JVM may exit. These Java stack traces are very useful for identifying the cause of an abnormal exit of the JVM. So is there a way to know that an OutOfMemoryError is about to occur? Since J2SE 5.0, the java.lang.management package has provided JMX beans for monitoring and managing the JVM. One of these beans is the MemoryMXBean.
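
One simple way to use the MemoryMXBean is to poll heap usage and compare it with a threshold. The sketch below assumes a hypothetical class HeapWatch and an arbitrary 90% threshold; it only reads current usage and does not predict an OutOfMemoryError with any certainty.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

class HeapWatch {

    // Fraction of the heap limit currently in use.
    static double heapUsedFraction() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = memory.getHeapMemoryUsage();
        long max = heap.getMax();                        // -1 when no maximum is defined
        long limit = max > 0 ? max : heap.getCommitted();
        return (double) heap.getUsed() / limit;
    }

    // True when heap usage has reached the given fraction of the limit.
    static boolean isNearLimit(double threshold) {
        return heapUsedFraction() >= threshold;
    }

    public static void main(String[] args) {
        System.out.println("heap used fraction: " + heapUsedFraction());
        if (isNearLimit(0.90)) {                         // 0.90 is an assumed threshold
            System.out.println("warning: heap is almost full");
        }
    }
}
```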

An OutOfMemoryError can be thrown due to one of the following 4 reasons:

  • JVM may have a memory leak due to a bug in its internal heap management implementation. But this is highly unlikely because JVMs are well tested for this.
  • The application may not have enough heap memory allocated. To overcome this, allocate more JVM heap (with the -Xmx parameter) or decrease the amount of memory your application uses. To increase the heap space: java -Xms1024M -Xmx1024M. Take care not to make the -Xmx value too large, because it can slow down your application. The secret is to make the maximum heap size the right size.
  • Another, less prevalent cause is running out of a memory area called the “perm” (permanent generation), which sits next to the heap. The binary code of all currently loaded classes is kept in the “perm” area. It matters if your application, or any of the third-party JAR files you use, generates classes dynamically or loads a large number of classes. For example, “perm” space is consumed when XSLT templates are dynamically compiled into classes, and when J2EE application servers, JasperReports, JAXB etc. use Java reflection to generate classes dynamically. To increase perm space: java -XX:PermSize=256M -XX:MaxPermSize=256M. On Java 8 and later the permanent generation no longer exists; class metadata lives in Metaspace, which is sized with -XX:MetaspaceSize and -XX:MaxMetaspaceSize instead.
  • The fourth and the most common reason is that you may have a memory leak in your application when an object of a longer lifecycle has a reference to objects of a short life cycle.

So why does the JVM crash with a core dump or a Dr. Watson error?

A core dump on a UNIX operating system and a Dr. Watson error on Win32 systems mean the same thing. The JVM is a process like any other, and when a process crashes a core dump is created. A core dump is a snapshot of the memory of a process at the time it crashed. This can happen for one of the following reasons:

  • Using JNI (Java Native Interface) code, which has a fatal bug in its native code. Example: using Oracle OCI drivers, which are written partially in native code or JDBC-ODBC bridge drivers, which are written in non Java code. Using 100% pure Java drivers (communicates directly with the database instead of through client software utilizing the JNI) instead of native drivers can solve this problem. We can use Oracle thin driver, which is a 100% pure Java driver.
  • The operating system on which your JVM is running might require a patch or a service pack. 
  • The JVM implementation you are using may have a bug in translating system resources like threads, file handles and sockets from platform-neutral Java bytecode into platform-specific operations. If the JVM’s native code performs an illegal operation, the operating system instantly kills the process and usually generates a core dump file, a binary snapshot of the program’s state in memory at the time of the error. Core dump files are generated by the operating system in response to certain signals, which the operating system uses to notify threads and processes of certain events. The JVM can also intercept certain signals, such as SIGQUIT (kill -3 <process id>): it responds by printing a Java stack trace and then continues to run, because it has a special built-in debug routine that traps signal 3. On the other hand, signals like SIGSTOP (kill -23 on some UNIX systems; the number is platform-dependent) and SIGKILL (kill -9) cause the JVM process to stop or die. The following JVM argument tells the JVM not to pause on a SIGQUIT signal from the operating system.
          java -Xsqnopause

List of performance measurement tools in Java:


No. | Tool Name | Purpose
--- | --- | ---
1 | jVisualVM | JVM profiler
2 | GCeasy | Universal garbage collection log analysis
3 | fastThread | Java thread dump analyzer
4 | HeapHero | Java heap dump analyzer
5 | sar | Real-time monitoring of Linux system performance
6 | ksar | Java-based front-end tool that plots easy-to-understand graphs over a period of time
7 | JMeter | A pure Java application designed to load test functional behavior and measure performance
8 | jvmtop | Provides JVM internals (e.g. memory information) of running JVMs / Java processes
9 | MAT | Eclipse Memory Analyzer, a fast and feature-rich Java heap analyzer
10 | jstack | Thread dump: Java HotSpot VM tool that provides information about performance and resource consumption of running applications
11 | jmap | Heap dump: takes a histogram and heap dump from a running Java process
12 | jstat | Runtime JVM statistics monitoring from the command line
13 | Visual GC + jstatd | Visual garbage collection monitoring tool that connects to a server-side jstatd port
14 | JMH | Java harness for building, running and analysing nano/micro/milli/macro benchmarks written in Java and other languages targeting the JVM


Reference URLs:

  • http://gceasy.io/
  • http://fastthread.io/
  • http://heaphero.io/
  • https://www.eclipse.org/mat/
  • https://code.google.com/archive/p/jvmtop/
  • https://visualvm.github.io/
  • http://openjdk.java.net/projects/code-tools/jmh/
