Monday, November 12, 2018

Micro Benchmarking with JMH - "Measure, don’t guess"


Java Micro Benchmark with JMH

  • Java Microbenchmark Harness (JMH) is a Java toolkit by OpenJDK for creating benchmarks.  
  • It is an open source framework for measuring the performance of your Java code.

"Measure, don’t guess" :-


We’ve all faced performance problems in our projects and been asked to tune random bits of source code in the hope that performance will improve. Instead, we should set up a stable performance environment (operating system, JVM, application server, database), measure continuously, set performance goals, and take action when those goals are not met. Continuous delivery and continuous testing are one thing, but continuous measuring is another step.


What is JMH ?

  • JMH is a Java harness for building, running, and analyzing nano/micro/milli/macro benchmarks written in Java and other languages targeting the JVM.

When to use Micro Benchmarking ?


  • You identified the code segment that eats most of the resources in your application and the improvement can be tested by micro benchmarks.
  • You cannot pinpoint the code segment that eats most of the resources in an application, but you have a suspect.

How to do a Micro Benchmark ?

  • We often do not worry about performance requirements. We start by building functionality and concentrate on making things work, without paying attention to how well they perform. Once the software goes to production, we face the inevitable.
  • It is good to have the performance objectives written down before writing the code.
    We should weigh the importance of performance with respect to the functional requirements and should come up with a balance between them.
  • Just because your code runs a certain way in an extremely isolated, artificial situation does not mean it will run the same way inside your production code. To name but a few issues: in a real program the CPU caches will be under pressure from other parts of your code, any object creation will have a downstream effect on GC, and the JIT may have inlined and compiled code from other parts of your program in ways that conflict with the code you have benchmarked.

Why are Java Microbenchmarks Hard ?

  • Writing benchmarks that correctly measure the performance of a small part of a larger application is hard. There are many optimizations that the JVM or underlying hardware may apply to your component when the benchmark executes that component in isolation. These optimizations may not be possible to apply when the component is running as part of a larger application. Badly implemented microbenchmarks may thus make you believe that your component's performance is better than it will be in reality.
  • Writing a correct Java microbenchmark typically entails preventing the optimizations the JVM and hardware may apply during microbenchmark execution which could not have been applied in a real production system. That is what JMH, the Java Microbenchmark Harness, helps you do.
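
One such optimization is dead-code elimination: if a benchmark computes a value that nothing uses, the JIT may drop the computation altogether. The sketch below shows the usual JMH countermeasures, returning the result or handing it to a Blackhole. The class name, the field x, and the choice of Math.log are illustrative assumptions, not code from this post.

```java
package org.sample;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.infra.Blackhole;

@State(Scope.Thread)
public class DeadCodeSketch {

    // Non-final field, so the JIT cannot treat the input as a compile-time constant.
    double x = Math.PI;

    // Risky: the result is discarded, so the JIT is free to remove the call.
    @Benchmark
    public void discarded() {
        Math.log(x);
    }

    // Safer: JMH consumes returned values, which keeps the computation alive.
    @Benchmark
    public double returned() {
        return Math.log(x);
    }

    // Safer: with several results, pass each one to the Blackhole that JMH injects.
    @Benchmark
    public void consumed(Blackhole bh) {
        bh.consume(Math.log(x));
        bh.consume(Math.sqrt(x));
    }
}
```

If the discarded variant reports a much better score than the other two, that is a sign the work was optimized away rather than measured.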

Purpose

  • Benchmarking is the process of recording the performance of a system.
  • JMH takes care of warm-up iterations, forks JVM processes so that benchmarks don't interfere with each other, and collates the results and presents them in a uniform manner.
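
To make the warm-up and result-collation steps concrete, here is a deliberately naive hand-rolled timer that uses only the JDK. It is a sketch of the bookkeeping JMH automates, not a replacement for it: it does no forking and offers none of JMH's protection against JIT optimizations. The class and method names are made up for illustration.

```java
import java.util.function.IntSupplier;

final class NaiveHarness {

    // Written after each loop so the task results are observably used.
    static volatile int sink;

    /** Average nanoseconds per call, computed over the measured iterations only. */
    static double averageNanos(IntSupplier task, int warmupIterations, int measuredIterations) {
        int acc = 0;
        // Warm-up: give the JIT a chance to compile the task; these calls are not timed.
        for (int i = 0; i < warmupIterations; i++) {
            acc += task.getAsInt();
        }
        sink = acc;

        long start = System.nanoTime();
        for (int i = 0; i < measuredIterations; i++) {
            acc += task.getAsInt();
        }
        long elapsed = System.nanoTime() - start;
        sink = acc;

        return (double) elapsed / measuredIterations;
    }

    public static void main(String[] args) {
        double avg = averageNanos(() -> Integer.bitCount(12345), 10_000, 100_000);
        System.out.printf("%.1f ns/op%n", avg);
    }
}
```

Numbers from a loop like this vary from run to run and from machine to machine, which is part of the reason to let a harness handle the repetition and statistics.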

Micro benchmarks are generally done for two reasons.

  • To compare different implementations of the same logic and choose the best one to use.
  • To identify any bottlenecks in a suspected area of code during performance optimization.

Benchmarking Modes:

At a basic level, JMH has two main types of measure:  throughput and time-based.

Throughput Measuring

  • Throughput is the number of operations that can be completed per unit of time. In throughput mode, JMH calls the benchmark method repeatedly for a time-bounded iteration and counts how many calls complete.
  • Ensure the method under test is well isolated, and that dependencies such as test object creation are handled outside the method, in a setup method that runs before the test.
  • With Throughput, the higher the value, the better as it indicates that more operations can be run per unit-time.

Time-Based Measuring

  • Time-based measuring is the counterpart to throughput.
  • The goal of time-based measuring is to identify how long a particular operation takes to run.
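
The two kinds of measurement are reciprocals of each other, which a few lines of arithmetic make clear. The helper below is a hypothetical illustration using only the JDK and is not part of JMH. In JMH itself the choice is made with @BenchmarkMode, for example Mode.Throughput or Mode.AverageTime.

```java
import java.util.concurrent.TimeUnit;

final class Scores {

    /** Operations completed per second: higher is better. */
    static double throughputPerSecond(long operations, long elapsedNanos) {
        return operations * (double) TimeUnit.SECONDS.toNanos(1) / elapsedNanos;
    }

    /** Average nanoseconds per operation: lower is better. */
    static double averageNanosPerOp(long operations, long elapsedNanos) {
        return (double) elapsedNanos / operations;
    }

    public static void main(String[] args) {
        // 2,000,000 operations completed in half a second.
        long operations = 2_000_000L;
        long elapsedNanos = TimeUnit.MILLISECONDS.toNanos(500);

        System.out.println(throughputPerSecond(operations, elapsedNanos)); // prints 4000000.0
        System.out.println(averageNanosPerOp(operations, elapsedNanos));   // prints 250.0
    }
}
```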

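JMH Command-Line Options

Each option below is passed on the command line with a leading hyphen, for example -wi 5.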

  • i - Number of measurement iterations to do. Measurement iterations are counted towards the benchmark score.
  • bs - Batch size: number of benchmark method calls per operation.
  • r - Minimum time to spend at each measurement iteration.
  • wi - Number of warmup iterations to do.
  • wbs - Warmup batch size: number of benchmark method calls per operation.
  • w - Minimum time to spend at each warmup iteration.
  • to - Timeout for benchmark iteration.
  • t - Number of worker threads to run with.
  • bm - Benchmark mode.
  • si - Should JMH synchronize iterations?
  • gc - Should JMH force GC between iterations?
  • foe - Should JMH fail immediately if any benchmark had experienced an unrecoverable error?
  • v - Verbosity mode.
  • f - How many times to fork a single benchmark. Use 0 to disable forking altogether.
  • wf - How many warmup forks to make for a single benchmark. 
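
Most of these settings can also be fixed in the source with annotations, which is handy when a benchmark should always run the same way; values given on the command line still override them. The sketch below is illustrative: the class, the method, and the chosen values are assumptions, roughly matching -wi 5 -i 5 -f 1 -t 1 -bm avgt.

```java
package org.sample;

import java.util.concurrent.TimeUnit;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Fork;
import org.openjdk.jmh.annotations.Measurement;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.annotations.Threads;
import org.openjdk.jmh.annotations.Warmup;

@State(Scope.Thread)
@BenchmarkMode(Mode.AverageTime)        // -bm avgt
@OutputTimeUnit(TimeUnit.NANOSECONDS)   // unit used for the reported scores
@Warmup(iterations = 5)                 // -wi 5
@Measurement(iterations = 5)            // -i 5
@Fork(1)                                // -f 1
@Threads(1)                             // -t 1
public class AnnotatedOptionsSketch {

    // Non-final field, so the loop bound is not a compile-time constant.
    int limit = 100;

    @Benchmark
    public int sumBelowLimit() {
        int sum = 0;
        for (int i = 0; i < limit; i++) {
            sum += i;
        }
        return sum; // returned so the loop is not eliminated as dead code
    }
}
```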

 There are two ways to run a benchmark:

  1. The recommended way is to generate a pom file and use that to create a jar. Running mvn install uses the shade plugin to build a self-contained jar, so you don't have to write a main method.
  2. Add the JMH Maven dependencies to your pom file and then add a main method to your code using the Runner object. This is useful if you want to run in your IDE.

 Method 1 - Using the command line


 Generate a pom file using this mvn command:

mvn archetype:generate \
          -DinteractiveMode=false \
          -DarchetypeGroupId=org.openjdk.jmh \
          -DarchetypeArtifactId=jmh-java-benchmark-archetype \
          -DgroupId=com.jenkov \
          -DartifactId=first-benchmark \
          -Dversion=1.0

  • This will create a project called first-benchmark with an empty benchmark in it called MyBenchmark.
  • To build the project, just use mvn clean install. This will build a jar called benchmarks.jar.
  • It is benchmarks.jar that should be run to execute the benchmark, not any of the other jars produced along the way in your target folder.

To run the benchmark, use the command  java -jar target/benchmarks.jar JMHSample04 -wi 5 -t 1 -i 5 -f 1


 Method 2 - For running in your IDE


Add these dependencies to your Maven pom.xml file:

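<dependency>
  <groupId>org.openjdk.jmh</groupId>
  <artifactId>jmh-core</artifactId>
  <version>1.5.1</version>
</dependency>

<dependency>
  <groupId>org.openjdk.jmh</groupId>
  <artifactId>jmh-generator-annprocess</artifactId>
  <version>1.5.1</version>
</dependency>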


Then decide which methods you want benchmarked and add the annotation @Benchmark to them. If you need any initialisation code add it in a method which should be marked @Setup. 

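The easiest way to run the benchmark is by adding this implementation to your main method. It needs Runner and RunnerException from org.openjdk.jmh.runner, and Options and OptionsBuilder from org.openjdk.jmh.runner.options.
public static void main(String[] args) throws RunnerException {
        // Run only the benchmarks whose name matches JMHSample04
        Options opt = new OptionsBuilder()
                .include(JMHSample04.class.getSimpleName())
                .warmupIterations(5)
                .measurementIterations(5)
                .threads(1)
                .forks(1)
                .build();
        new Runner(opt).run();
}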
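
If you want to work with the scores in code rather than read them from the console, Runner.run() returns the collected results. The following is a minimal sketch under some assumptions: the class name RunnerSketch and the buildOptions helper are hypothetical, and the average-time mode and microsecond unit are illustrative choices rather than settings used in this post.

```java
package org.sample;

import java.util.Collection;
import java.util.concurrent.TimeUnit;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.results.RunResult;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.RunnerException;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;

public class RunnerSketch {

    /** Builds run options for every benchmark whose name matches the given regex. */
    static Options buildOptions(String includeRegex) {
        return new OptionsBuilder()
                .include(includeRegex)
                .mode(Mode.AverageTime)
                .timeUnit(TimeUnit.MICROSECONDS)
                .warmupIterations(5)
                .measurementIterations(5)
                .threads(1)
                .forks(1)
                .build();
    }

    public static void main(String[] args) throws RunnerException {
        Collection<RunResult> results = new Runner(buildOptions("JMHSample04")).run();
        // Print one line per benchmark: name, score, and the unit of the score.
        for (RunResult result : results) {
            System.out.printf("%s: %.3f %s%n",
                    result.getParams().getBenchmark(),
                    result.getPrimaryResult().getScore(),
                    result.getPrimaryResult().getScoreUnit());
        }
    }
}
```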


JMH benchmark Results :

To show the format of a JMH benchmark, here are the benchmarks I ran and what my results looked like:

Example 1 : Arrays sort and Collection sort Benchmark Comparison

package org.sample;

import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;
import java.util.concurrent.TimeUnit;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Level;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;

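@State(Scope.Thread)
public class JMHSample02 {
 List<Integer> arrayList;
 int[] array;
 Random random;

 // Runs once per trial: fills both containers with the same 150 random numbers.
 @Setup(Level.Trial)
 public void init() {
  random = new Random();
  array = new int[150];
  arrayList = new ArrayList<>();
  for (int i = 0; i < 150; i++) {
   int randomNumber = random.nextInt();
   array[i] = randomNumber;
   arrayList.add(randomNumber);
  }
 }

 @Benchmark
 @BenchmarkMode(Mode.Throughput)
 @OutputTimeUnit(TimeUnit.SECONDS)
 public int[] arraysSort() {
  // Sort a copy: sorting the shared array in place would hand
  // already-sorted input to every call after the first.
  int[] copy = array.clone();
  Arrays.sort(copy);
  return copy;
 }

 @Benchmark
 @BenchmarkMode(Mode.Throughput)
 @OutputTimeUnit(TimeUnit.SECONDS)
 public List<Integer> collectionsSort() {
  // Same reasoning as above: sort a fresh copy on every call.
  List<Integer> copy = new ArrayList<>(arrayList);
  Collections.sort(copy);
  return copy;
 }
}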
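
A general caveat with benchmarks that sort: an in-place sort on data prepared once in @Setup(Level.Trial) leaves that data sorted, so every invocation after the first would measure sorting already-sorted input. The JDK-only sketch below shows the effect and one common workaround, sorting a copy. The class and method names are made up for illustration, and the cost of the copy becomes part of what is measured.

```java
import java.util.Arrays;

final class SortedInputPitfall {

    /** True if the array is in non-decreasing order. */
    static boolean isSorted(int[] a) {
        for (int i = 1; i < a.length; i++) {
            if (a[i - 1] > a[i]) {
                return false;
            }
        }
        return true;
    }

    /** Sorts a private copy, leaving the shared input untouched for the next call. */
    static int[] sortCopy(int[] input) {
        int[] copy = input.clone();
        Arrays.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        int[] shared = {5, 3, 9, 1};

        sortCopy(shared);
        System.out.println(isSorted(shared)); // prints false: the input is still unsorted

        Arrays.sort(shared);                  // the in-place sort mutates the shared input
        System.out.println(isSorted(shared)); // prints true: later calls would see sorted data
    }
}
```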


JMH benchmark Results :



Example 2 : Java 8 Stream API (Parallel Stream vs Sequential Stream vs  for-each vs iterator )


Java 8 claimed that the Stream API would use multi-core CPUs to process data in parallel, removing the difficulty of writing multi-threaded code while gaining the performance advantages of multi-core CPUs.

In this benchmark, the parallel stream was the slowest of the four approaches to summing the integers in the list.

package org.sample;

import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.TimeUnit;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;

@State(Scope.Benchmark)
public class JMHSample_01_HelloWorld {

 volatile int counts = 9999999;
 volatile List<Integer> values = new ArrayList<>(counts); // typed list: the raw List would not compile with the for-each and stream code below
 volatile int processors = Runtime.getRuntime().availableProcessors();

 @Setup
 public void setup() {
  populate(values);

 }

 public void populate(List<Integer> list) {
  for (int i = 0; i < counts; i++) {
   if (i < counts / 2) {
    list.add(i, i);
   } else {
    list.add(i, i - counts);
   }
  }
 }

 @Benchmark
 @BenchmarkMode(Mode.Throughput)
 @OutputTimeUnit(TimeUnit.MICROSECONDS)
 public int iteratorSumIntegers() {
  int result = 0;
  Iterator<Integer> ite = values.iterator();
  while (ite.hasNext()) {
   result += ite.next();
  }
  return result;
 }

 @Benchmark
 @BenchmarkMode(Mode.Throughput)
 @OutputTimeUnit(TimeUnit.MICROSECONDS)
 public int forEachSumIntegers() {
  int result = 0;
  for (Integer value : values) {
   result += value.intValue();
  }
  return result;
 }

 @Benchmark
 @BenchmarkMode(Mode.Throughput)
 @OutputTimeUnit(TimeUnit.MICROSECONDS)
 public int parallelSumIntegers() {
  int result = values.parallelStream().mapToInt(i -> i).sum();
  return result;
 }

 @Benchmark
 @BenchmarkMode(Mode.Throughput)
 @OutputTimeUnit(TimeUnit.MICROSECONDS)
 public int sequentialSumIntegers() {
  int result = values.stream().mapToInt(i -> i).sum();
  return result;
 }
}
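
Before comparing the speed of the four approaches, it is worth confirming that they compute the same value; otherwise the comparison means little. The sketch below is a JDK-only check that reuses the fill pattern of populate() with a much smaller list. The class name and the size of 1,000 are chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

final class SumAgreementCheck {

    /** Same fill pattern as the benchmark's populate(), for a given size. */
    static List<Integer> populate(int count) {
        List<Integer> list = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            list.add(i < count / 2 ? i : i - count);
        }
        return list;
    }

    static int iteratorSum(List<Integer> values) {
        int result = 0;
        Iterator<Integer> it = values.iterator();
        while (it.hasNext()) {
            result += it.next();
        }
        return result;
    }

    static int forEachSum(List<Integer> values) {
        int result = 0;
        for (int value : values) {
            result += value;
        }
        return result;
    }

    static int sequentialSum(List<Integer> values) {
        return values.stream().mapToInt(Integer::intValue).sum();
    }

    static int parallelSum(List<Integer> values) {
        return values.parallelStream().mapToInt(Integer::intValue).sum();
    }

    public static void main(String[] args) {
        List<Integer> values = populate(1_000);
        // All four lines print -500.
        System.out.println(iteratorSum(values));
        System.out.println(forEachSum(values));
        System.out.println(sequentialSum(values));
        System.out.println(parallelSum(values));
    }
}
```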


JMH benchmark Results :


Example 3 : "Protobuf performs up to 6 times faster than JSON."


Protobuf, the binary format crafted by Google, surpasses JSON performance even on JavaScript environments like Node.js/V8 and web browsers.

Protocol buffers, or Protobuf, is a binary format created by Google to serialize data between different services. Google made this protocol open source and now it provides support, out of the box, to the most common languages, like JavaScript, Java, C#, Ruby and others. In our tests, it was demonstrated that this protocol performed up to 6 times faster than JSON.



package org.sample;

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.concurrent.TimeUnit;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Level;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;
import com.proto.StudentBookProtos;
import com.proto.StudentService;
import com.proto.model.StudentBook;

@State(Scope.Thread)
public class JMHSample04 {

 private byte[] studentBookAsJSON;
 private StudentBook studentBookObject;
 private StudentBookProtos.StudentBook.Builder studentBuilder;
 private StudentService studentService;

 @Setup(Level.Trial)
 public void init() {
  studentService = new StudentService();
  studentBookObject = studentService.getStudentBookObject();
  studentBuilder = studentService.getStudentBookProtoObject();
  studentBookAsJSON = studentService.studentBookAsJSONByte(studentBookObject);

  // try-with-resources closes each stream, so no explicit close() is needed
  try (FileOutputStream output = new FileOutputStream("abc_proto.txt")) {
   studentBuilder.build().writeTo(output);
  } catch (IOException e) {
   e.printStackTrace();
  }

  try (FileOutputStream output = new FileOutputStream("abc_json.txt")) {
   output.write(studentBookAsJSON);
  } catch (IOException e) {
   e.printStackTrace();
  }

 }

 @Benchmark
 @BenchmarkMode(Mode.Throughput)
 @OutputTimeUnit(TimeUnit.SECONDS)
 public void writeProtoToFile() {
  try (FileOutputStream output = new FileOutputStream("abcd_proto.txt")) {
   studentBuilder.build().writeTo(output);
  } catch (IOException e) {
   e.printStackTrace();
  }
 }

 @Benchmark
 @BenchmarkMode(Mode.Throughput)
 @OutputTimeUnit(TimeUnit.SECONDS)
 public void writeJSONToFile() {
  try (FileOutputStream output = new FileOutputStream("abcd_json.txt")) {
   output.write(studentBookAsJSON);
  } catch (IOException e) {
   e.printStackTrace();
  }
 }

 @Benchmark
 @BenchmarkMode(Mode.Throughput)
 @OutputTimeUnit(TimeUnit.SECONDS)
 public com.proto.StudentBookProtos.StudentBook deserialize_protobuf_to_student_object() {
  com.proto.StudentBookProtos.StudentBook studentBook = null;
  try (FileInputStream inputStream = new FileInputStream("abc_proto.txt")) {
   studentBook = com.proto.StudentBookProtos.StudentBook.parseFrom(inputStream);
  } catch (IOException e) {
   e.printStackTrace();
  }
  return studentBook;
 }

 @Benchmark
 @BenchmarkMode(Mode.Throughput)
 @OutputTimeUnit(TimeUnit.SECONDS)
 public StudentBook deserialize_json_to_student_object() {
  StudentBook studentBook = null;
  try (FileInputStream inputStream = new FileInputStream("abc_json.txt")) {
   byte[] fileContent = new byte[inputStream.available()];
   inputStream.read(fileContent);
   studentBook = studentService.getStudentBookFromJSONByte(fileContent);
  } catch (IOException e) {
   e.printStackTrace();
  }
  return studentBook;
 }
}

JMH benchmark Results :
