Configure a DataSource application’s JVM heap

Default heap size

If no JVM heap size is configured in a DataSource component’s configuration, then the DataSource component uses the JVM’s default values for initial and maximum heap size.

Java 8 default heap size

When no heap size is specified at start-up, the default maximum heap size for a 64-bit HotSpot Server JVM is 25% of the total physical memory, up to a maximum of 32GB. For more information, see Parallel collection: Default Heap Size in Oracle’s Java 8 documentation.
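As a rough illustration of the rule above, the sketch below estimates the default maximum heap size from a host's physical memory. The function name default_max_heap_mb is hypothetical, and the calculation only mirrors the simplified 25%/32GB rule stated here, not every case the JVM handles.

```shell
# Estimate the Java 8 default maximum heap size (in MB) for a 64-bit
# HotSpot Server JVM, using the simplified rule stated above:
# 25% of physical memory, capped at 32GB.
# The function name is hypothetical; the argument is physical memory in MB.
default_max_heap_mb() (
  phys_mb=$1
  quarter_mb=$(( phys_mb / 4 ))
  cap_mb=$(( 32 * 1024 ))
  if [ "$quarter_mb" -gt "$cap_mb" ]; then
    echo "$cap_mb"
  else
    echo "$quarter_mb"
  fi
)

default_max_heap_mb 16384    # 16GB host: prints 4096
default_max_heap_mb 262144   # 256GB host: prints 32768
```

To see the value a particular JVM actually selects, you can run java -XX:+PrintFlagsFinal -version and look for MaxHeapSize in the output.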

We recommend that you always configure an initial and maximum heap size for a DataSource’s JVM.

The table below lists the recommended JVM heap configuration for a number of combinations of components and modules:

Recommended heap sizes

Component        Modules                       Initial heap size  Max heap size
Liberator        JMX                           256MB              256MB
Liberator        Permissioning Service         1024MB             1024MB
Liberator        JMX + Permissioning Service   1024MB             1024MB
Transformer      JMX                           256MB              256MB
Transformer      Refiner                       See note 1         See note 1
Transformer      JMX + Refiner                 See note 1         See note 1
C DataSource     JMX                           256MB              256MB
Java DataSource  JMX                           See note 2         See note 2

Note 1: Refiner’s heap memory requirements are workload dependent. We recommend beginning with 2048MB and tuning upwards or downwards from there.

Note 2: For Caplin Java DataSources, see the DataSource’s java.conf file. For custom Java DataSources, we recommend beginning with 256MB and tuning upwards or downwards from there.

Configuring JVM heap size for a C DataSource

These are generic instructions for all C DataSources. For instructions specific to Liberator and Transformer, see Configuring JVM heap size for Liberator and Configuring JVM heap size for Transformer below.

To configure the JVM heap size of a C DataSource deployed to a Deployment Framework, edit the file global_config/overrides/<datasource_name>/DataSource/etc/java.conf.

The initial and maximum heap sizes are configured by using the jvm-options configuration item to set the JVM start-up options -Xms (initial heap size) and -Xmx (maximum heap size).

Although all C DataSources set JVM heap size parameters using jvm-options, some C DataSources, like Liberator and Transformer, define heap size in configuration variables. Check the structure of the DataSource’s java.conf file and edit according to the convention used.

Syntax for -Xms and -Xmx JVM start-up parameters:

  • Initial heap size: -Xms{size}{k|K|m|M}

  • Maximum heap size: -Xmx{size}{k|K|m|M}
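To make the size syntax concrete, here is a minimal sketch that converts a size value such as 1024m into bytes. The helper name heap_size_bytes is hypothetical. Treating a value with no suffix as a byte count goes beyond the syntax listed above; it reflects how the JVM reads these options.

```shell
# Convert a heap size value (the part after -Xms or -Xmx) into bytes.
# Hypothetical helper for illustration; accepts k|K and m|M suffixes,
# and treats a bare number as bytes.
heap_size_bytes() (
  value=$1
  case $value in
    *[kK]) number=${value%?}; factor=1024 ;;
    *[mM]) number=${value%?}; factor=$(( 1024 * 1024 )) ;;
    *)     number=$value;     factor=1 ;;
  esac
  # Reject anything that is not a plain non-negative integer.
  case $number in
    ''|*[!0-9]*) echo "invalid heap size: $value" >&2; exit 1 ;;
  esac
  echo $(( number * factor ))
)

heap_size_bytes 1024m     # prints 1073741824
heap_size_bytes 262144k   # prints 268435456 (the same as 256m)
```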

For example, to set the initial and maximum size for the JVM heap to 1024 megabytes, set the configuration below:

Deployment Framework: global_config/overrides/<datasource_name>/DataSource/etc/java.conf
# Initial heap size
jvm-options -Xms1024m

# Maximum heap size
jvm-options -Xmx1024m

For maximum performance, we recommend that you set initial heap size and maximum heap size to the same value.
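One way to check this recommendation is to read the -Xms and -Xmx values back out of a java.conf file and compare them. The sketch below assumes the literal jvm-options layout shown in the example above; the function names heap_opt and heap_opts_match are hypothetical. It compares the values as text, so 1024m and 1048576k would be reported as different, and it finds nothing in files that define heap size through configuration variables.

```shell
# Print the value of -Xms or -Xmx from the jvm-options lines of a java.conf.
# Usage: heap_opt <java.conf> <Xms|Xmx>
# Hypothetical helper; comment lines (starting with #) are ignored and the
# last matching setting is printed.
heap_opt() (
  conf=$1
  opt=$2
  grep -v '^[[:space:]]*#' "$conf" |
    grep '^[[:space:]]*jvm-options' |
    grep -o -- "-${opt}[0-9][0-9]*[kKmM]\{0,1\}" |
    tail -n 1 |
    sed "s/^-${opt}//"
)

# Succeed only if both options are present and textually identical.
heap_opts_match() (
  xms=$(heap_opt "$1" Xms)
  xmx=$(heap_opt "$1" Xmx)
  [ -n "$xms" ] && [ "$xms" = "$xmx" ]
)

# Demonstration against a temporary copy of the example configuration.
conf=$(mktemp)
printf '%s\n' '# Initial heap size' 'jvm-options -Xms1024m' '' \
  '# Maximum heap size' 'jvm-options -Xmx1024m' > "$conf"
heap_opts_match "$conf" && echo "initial and maximum heap sizes match"
rm -f "$conf"
```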

Configuring JVM heap size for Liberator

The default configured JVM heap size for a Liberator deployed to a Deployment Framework is 256MB.

To change the configured defaults for JVM heap size, edit the file global_config/overrides/servers/Liberator/etc/java.conf and change the values assigned to the configuration variables below:

Deployment Framework: global_config/overrides/servers/Liberator/etc/java.conf
# Initial JVM heap size (in megabytes)
define LIBERATOR_JVM_XMS_HEAP_SIZE        256

# Maximum JVM heap size (in megabytes)
define LIBERATOR_JVM_XMX_HEAP_SIZE        256

For maximum performance, we recommend that you set initial heap size and maximum heap size to the same value.

Configuring JVM heap size for Transformer

The default configured JVM heap size for a Transformer deployed to a Deployment Framework is 256MB.

To change the configured defaults for JVM heap size, edit the file global_config/overrides/servers/Transformer/etc/java.conf and change the values assigned to the configuration variables below:

Deployment Framework: global_config/overrides/servers/Transformer/etc/java.conf
# Initial JVM heap size (in megabytes)
define JVM_XMS_HEAP_SIZE        256

# Maximum JVM heap size (in megabytes)
define JVM_XMX_HEAP_SIZE        256

For maximum performance, we recommend that you set initial heap size and maximum heap size to the same value.
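If you script your deployments, the define values for Liberator and Transformer can be changed without opening an editor. The sketch below is one possible approach, assuming the define lines have the layout shown above; the function name set_define is hypothetical. The demonstration works on a temporary file and uses 1024, the recommended size for a Liberator running the Permissioning Service.

```shell
# Set the numeric value of a "define NAME value" line in a java.conf file.
# Usage: set_define <java.conf> <NAME> <megabytes>
# Hypothetical helper; lines that do not match are left untouched.
set_define() (
  file=$1
  name=$2
  value=$3
  tmp=$(mktemp) || exit 1
  # Keep "define", the name, and the original spacing; replace only the number.
  sed "s/^\(define[[:space:]][[:space:]]*${name}[[:space:]][[:space:]]*\)[0-9][0-9]*/\1${value}/" \
    "$file" > "$tmp" && cat "$tmp" > "$file"
  status=$?
  rm -f "$tmp"
  exit "$status"
)

# Demonstration on a temporary file with the Liberator defaults.
demo=$(mktemp)
printf '%s\n' \
  'define LIBERATOR_JVM_XMS_HEAP_SIZE        256' \
  'define LIBERATOR_JVM_XMX_HEAP_SIZE        256' > "$demo"
set_define "$demo" LIBERATOR_JVM_XMS_HEAP_SIZE 1024
set_define "$demo" LIBERATOR_JVM_XMX_HEAP_SIZE 1024
cat "$demo"
rm -f "$demo"
```

For Transformer, the same call would use the names JVM_XMS_HEAP_SIZE and JVM_XMX_HEAP_SIZE.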

Configuring JVM heap size for a Java DataSource

To configure the JVM heap size of a Java DataSource deployed to a Deployment Framework, edit the Deployment Framework file kits/<datasource_name>/DataSource/bin/start-jar.sh.

The start-jar.sh file runs the DataSource using the java command. The initial and maximum JVM heap sizes are passed to the java command in the options -Xms (initial heap size) and -Xmx (maximum heap size).

Some variations of start-jar.sh set literal values for -Xms and -Xmx; other variations set Bash variables for -Xms and -Xmx. Check the structure of the DataSource’s start-jar.sh and edit according to the convention used.

Syntax for -Xms and -Xmx JVM start-up parameters:

  • Initial heap size: -Xms{size}{k|K|m|M}

  • Maximum heap size: -Xmx{size}{k|K|m|M}

For example, to set the initial size and the maximum size for the JVM heap to 1024 megabytes, prepend -Xms1024m -Xmx1024m to the java command’s arguments:

Deployment Framework: kits/<datasource_name>/DataSource/bin/start-jar.sh
java ... -Xms1024m -Xmx1024m ...

For maximum performance, we recommend that you set initial heap size and maximum heap size to the same value.
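For scripts that follow the Bash-variable convention, the heap options are typically assembled from variables before being passed to java. The sketch below illustrates the idea only: the variable names JVM_XMS and JVM_XMX and the function heap_args are hypothetical, so check your own start-jar.sh for the names it actually uses.

```shell
# Hypothetical variable names; a real start-jar.sh may use different ones.
JVM_XMS="1024m"   # initial heap size
JVM_XMX="1024m"   # maximum heap size

# Build the heap options from the variables above.
heap_args() {
  printf '%s %s\n' "-Xms${JVM_XMS}" "-Xmx${JVM_XMX}"
}

heap_args   # prints: -Xms1024m -Xmx1024m
```

In a start script, the result would be placed among the java command's options, in the same position as the literal values in the example above.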

Tuning JVM heap size

This section provides a brief introduction to tuning a DataSource application’s JVM heap size. For more detail, see the Java Platform, Standard Edition HotSpot Virtual Machine Garbage Collection Tuning Guide on the Oracle website.

The twin goals of tuning a DataSource application’s JVM heap size are as follows:

  • To ensure that the DataSource JVM heap has enough capacity for known production-level workloads plus enough capacity in reserve for unexpected higher workloads.

  • To ensure that the frequency and duration of garbage collection events are within acceptable performance limits for your application.

The instructions below use the jstat JVM profiler to collect statistics from a running JVM. If you prefer a graphical interface, try VisualVM with the Visual GC plugin.

To tune JVM heap size, follow the steps below:

  1. Stop the DataSource application if it is running.

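  2. Set an initial and maximum heap size in the DataSource’s configuration. For instructions, see the section above that matches your component: Configuring JVM heap size for a C DataSource, for Liberator, for Transformer, or for a Java DataSource.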

  3. Start the DataSource application.

  4. Discover the LVMID (local virtual machine identifier) for the JVM you want to tune. On Linux, this is synonymous with the process identifier (PID). The example below discovers the PID for a local Transformer server:

    $ pgrep transformer
    19533

  5. Run a JVM profiler to monitor the JVM’s heap and garbage collection statistics. The example below uses the jstat command to monitor the JVM with PID 19533:

    $ jstat -gc -h 10 -t 19533 15s

    For information on the options used above, see jstat on the Oracle website.

  6. Run a production-level workload against your Caplin stack and monitor the results.

    If the DataSource JVM raises out-of-memory errors during your tests, then stop the test, increase the maximum heap size for the DataSource component, and begin again.

  7. Analyse the results.

    Sample jstat output for a Transformer with a heap size of 256MB
    Timestamp        S0C    S1C     S0U    S1U      EC       EU        OC         OU       MC       MU    CCSC    CCSU     YGC    YGCT   FGC     FGCT     GCT
             9746.5 10752.0 10752.0  0.0    0.0   65536.0  65139.7   175104.0    2861.2   11136.0 10272.2 1408.0 1173.8      5    0.039   3      0.061    0.100
             9761.5 10752.0 10752.0  0.0    0.0   65536.0  65536.0   175104.0    2861.2   11136.0 10272.2 1408.0 1173.8      5    0.039   3      0.061    0.100
             9776.5 10752.0 10752.0 2816.0  0.0   65536.0   1497.2   175104.0    2869.2   11136.0 10320.8 1408.0 1175.4      6    0.063   3      0.061    0.123

    For a description of each of the columns in the output above, see the jstat documentation on the Oracle website.

    Analysing jstat output is a complex topic. For an introduction, see Analysing jstat output below.

  8. Repeat the test with smaller and larger heap sizes to observe how different heap sizes affect garbage collection performance.
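Steps 4 and 5 above can be combined into a small helper, which is convenient when you repeat the test with several heap sizes. This is a minimal sketch: the function name monitor_heap is hypothetical, and it assumes pgrep and jstat are on the PATH and that the first matching process is the JVM you want to monitor.

```shell
# Look up a process by name and monitor its heap with jstat.
# Usage: monitor_heap <process-name> [interval]
# Hypothetical helper; uses the same jstat options as step 5 above.
monitor_heap() (
  name=$1
  interval=${2:-15s}
  pid=$(pgrep "$name" | head -n 1)
  if [ -z "$pid" ]; then
    echo "no running process matches '$name'" >&2
    exit 1
  fi
  jstat -gc -h 10 -t "$pid" "$interval"
)

# Example (not run here): monitor_heap transformer 15s | tee transformer-gc.log
```

Saving the output to a file, as in the commented example, keeps the results of each run available for later comparison.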

Analysing jstat output

A full discussion of how to analyse jstat output is beyond the scope of this page, but some questions to ask of the data include:

  • How much memory is in use (EU + S0U + S1U + OU) after a full garbage collection (FGC) event?

  • Young-generation garbage collection:

    • What is the average interval between executions?

    • What is the total execution time? (YGCT)

    • What is the average execution time? (YGCT/YGC)

  • Full garbage collection:

    • What is the average interval between executions?

    • What is the total execution time? (FGCT)

    • What is the average execution time? (FGCT/FGC)
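Several of the questions above can be answered mechanically from saved jstat -gc -t output. The sketch below is one way to do so with awk; the function name jstat_summary and its output labels are hypothetical. It reports memory in use for the latest sample, memory in use for any sample in which the FGC count rose, and averages calculated over the JVM's whole uptime (the Timestamp column), so the intervals are approximations.

```shell
# Summarise "jstat -gc -t" output read from standard input.
# Hypothetical helper; columns are located by header name, so the input
# must include the header row.
jstat_summary() {
  awk '
    $1 == "Timestamp" {              # header row: map column names to positions
      for (i = 1; i <= NF; i++) col[$i] = i
      next
    }
    NF > 0 && ("FGC" in col) {       # data row
      used = $(col["EU"]) + $(col["S0U"]) + $(col["S1U"]) + $(col["OU"])
      fgc = $(col["FGC"]) + 0
      # A rise in FGC means a full collection ran since the previous sample.
      if (seen && fgc > prev_fgc) printf "used_after_fgc_kb=%.1f\n", used
      prev_fgc = fgc
      seen = 1
      uptime = $(col["Timestamp"]) + 0
      ygc = $(col["YGC"]) + 0
      ygct = $(col["YGCT"]) + 0
      fgct = $(col["FGCT"]) + 0
    }
    END {
      if (!seen) exit 1
      printf "used_kb=%.1f\n", used
      if (ygc > 0) {
        printf "avg_ygc_interval_s=%.1f\n", uptime / ygc
        printf "avg_ygc_time_s=%.4f\n", ygct / ygc
      }
      if (fgc > 0) {
        printf "avg_fgc_interval_s=%.1f\n", uptime / fgc
        printf "avg_fgc_time_s=%.4f\n", fgct / fgc
      }
    }'
}

# Demonstration using the sample Transformer output from this page.
jstat_summary <<'EOF'
Timestamp S0C S1C S0U S1U EC EU OC OU MC MU CCSC CCSU YGC YGCT FGC FGCT GCT
9746.5 10752.0 10752.0 0.0 0.0 65536.0 65139.7 175104.0 2861.2 11136.0 10272.2 1408.0 1173.8 5 0.039 3 0.061 0.100
9761.5 10752.0 10752.0 0.0 0.0 65536.0 65536.0 175104.0 2861.2 11136.0 10272.2 1408.0 1173.8 5 0.039 3 0.061 0.100
9776.5 10752.0 10752.0 2816.0 0.0 65536.0 1497.2 175104.0 2869.2 11136.0 10320.8 1408.0 1175.4 6 0.063 3 0.061 0.123
EOF
```

For the sample data, the latest row shows 7182.4KB in use and an average young-generation collection time of about 0.0105 seconds. No full collection ran during the three samples, so no used_after_fgc_kb line is printed.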

If there is relatively little heap memory free after a full garbage collection, then consider increasing the heap size.

If garbage collection events occur frequently, then consider increasing the heap size.

As a general rule, increasing the heap size reduces the number of garbage collections but increases the time each collection takes. In practice, this relationship is also affected by generational garbage collection, object lifetime, and the relatively faster performance of the young-generation garbage collector. If an application produces a large number of short-lived objects, increasing the heap size may reduce overall garbage collection time by reducing the promotion of objects to the old generation, where garbage collection is more costly.

