    17. Engine Configuration

    CloverDX internal settings (defaults) are stored in the defaultProperties file located in the CloverDX engine. This source file contains various parameters that are loaded at run-time and used during transformation execution. We do not recommend changing values in this file.

    In Designer, the path to the file is plugins/com.cloveretl.gui/lib/lib/cloveretl.engine.jar. In Server Core, the path to the file is WEB-INF/lib/cloveretl.engine.jar.

    If you need to change the default setting, create a local file with only those properties you need to override and place the file in the project directory. To instruct CloverDX to retrieve the properties from this local file, go to Window → Preferences → CloverDX → CloverDX Runtime and either define the path to the file in the CloverDX Engine Properties field or put the following parameter in the VM parameters field:

    -Dclover.engine.config.file=/full/path/to/file.properties

    Note: engine properties have to be set for each workspace individually.
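The idea of overriding only selected properties can be pictured with Java's standard java.util.Properties class, which supports a layer of defaults. This is a minimal sketch and not the engine's actual loading code: the class name EngineOverrideSketch, the inline property text and the override value 2048 are hypothetical, and the sketch assumes that properties missing from the override file keep their default values.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical illustration; the CloverDX engine may load its settings differently.
class EngineOverrideSketch {

    // Parses properties-file text into a Properties object, optionally backed by defaults.
    static Properties load(String text, Properties defaults) {
        Properties props = (defaults == null) ? new Properties() : new Properties(defaults);
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // Builds the effective settings: defaults first, the override file layered on top.
    static Properties effective() {
        // Two entries copied from the defaults listed in this chapter.
        String defaultsText =
            "Record.RECORD_INITIAL_SIZE = 65536\n" +
            "Lookup.LOOKUP_INITIAL_CAPACITY = 512\n";
        // Assumed content of a local override file (the value is made up).
        String overrideText = "Lookup.LOOKUP_INITIAL_CAPACITY = 2048\n";
        return load(overrideText, load(defaultsText, null));
    }

    public static void main(String[] args) {
        Properties p = effective();
        System.out.println(p.getProperty("Lookup.LOOKUP_INITIAL_CAPACITY"));
        System.out.println(p.getProperty("Record.RECORD_INITIAL_SIZE"));
    }
}
```

In this sketch, getProperty returns the overridden value when the key is present in the override text and falls back to the defaults layer otherwise.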

    Content of defaultProperties file

    The following list shows some of the properties and their values as they appear in the defaultProperties file:

    • Record.RECORD_LIMIT_SIZE = 268435456

      Limits the maximum size of a record. Theoretically, the limit can be set very high, but you should keep it as low as possible for easier error detection. For more details on memory demands, see Edge Memory Allocation.

    • Record.RECORD_INITIAL_SIZE = 65536

      Sets the initial amount of memory allocated to each record. The memory can grow dynamically up to Record.RECORD_LIMIT_SIZE, depending on how memory-greedy an edge is. See Edge Memory Allocation.

    • Record.FIELD_LIMIT_SIZE = 268435456

      Limits the maximum size of one field within a record. For more details on memory demands, see Edge Memory Allocation.

    • Record.FIELD_INITIAL_SIZE = 65536

      Sets the initial amount of memory allocated to each field within a record. The memory can grow dynamically up to Record.FIELD_LIMIT_SIZE, depending on how memory-greedy an edge is. See Edge Memory Allocation.

    • Record.DEFAULT_COMPRESSION_LEVEL = 5

      This sets the compression level for compressed data fields (cbyte).

    • DEFAULT_INTERNAL_IO_BUFFER_SIZE = 32768

      Determines the internal buffer size the components allocate for I/O operations. Increasing this value affects performance negligibly.

    • USE_DIRECT_MEMORY = false

      The CloverDX engine can use direct memory for manipulating data records. For example, when direct memory is enabled, the underlying memory of CloverBuffer (a container for serialized data records) is allocated as direct memory. The default value is false.

      Using direct memory can slightly improve performance in some cases. However, direct memory is allocated outside of the Java heap space and is therefore not under the control of the Java Virtual Machine. If an OutOfMemory exception occurs while direct memory is enabled, try turning it off.

      In CloverDX 4.9.0-M2, the default value was changed from true to false.

    • DEFAULT_DATE_FORMAT = yyyy-MM-dd

    • DEFAULT_TIME_FORMAT = HH:mm:ss

    • DEFAULT_DATETIME_FORMAT = yyyy-MM-dd HH:mm:ss

    • DEFAULT_REGEXP_TRUE_STRING = true|T|TRUE|YES|Y|t|1|yes|y

    • DEFAULT_REGEXP_FALSE_STRING = false|F|FALSE|NO|N|f|0|no|n

    • DataParser.DEFAULT_CHARSET_DECODER = UTF-8

    • DataFormatter.DEFAULT_CHARSET_ENCODER = UTF-8

    • Lookup.LOOKUP_INITIAL_CAPACITY = 512

      The initial capacity of a lookup table when created without specifying the size.

    • DataFieldMetadata.DECIMAL_LENGTH = 12

      Determines the default maximum precision of decimal data field metadata. Precision is the number of digits in a number, e.g. the number 123.45 has a precision of 5.

    • DataFieldMetadata.DECIMAL_SCALE = 2

      Determines the default scale of decimal data field metadata. Scale is the number of digits to the right of the decimal point in a number, e.g. the number 123.45 has a scale of 2.

    • Record.MAX_RECORD_SIZE = 33554432

      This property is deprecated. Use Record.RECORD_LIMIT_SIZE instead.

      Limits the maximum size of a record. Theoretically, the limit is tens of MBs, but you should keep it as low as possible for easier error detection.
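The definitions of precision and scale given for DataFieldMetadata.DECIMAL_LENGTH and DataFieldMetadata.DECIMAL_SCALE match those of Java's java.math.BigDecimal, so the example number 123.45 can be checked directly. The fits method below is a hypothetical helper, not part of CloverDX: it assumes a value fits a decimal field when it has no more fractional digits than the scale and no more integer digits than precision minus scale. How the engine actually treats values that do not fit (rounding or an error) is not covered by this sketch.

```java
import java.math.BigDecimal;

// Hypothetical illustration of precision and scale using java.math.BigDecimal.
class DecimalPrecisionSketch {

    // Returns true if the value can be stored without loss in a decimal
    // field with the given precision (total digits) and scale (fractional digits).
    static boolean fits(BigDecimal value, int precision, int scale) {
        int integerDigits = value.precision() - value.scale();
        return value.scale() <= scale && integerDigits <= precision - scale;
    }

    public static void main(String[] args) {
        BigDecimal n = new BigDecimal("123.45");
        System.out.println(n.precision()); // total number of digits
        System.out.println(n.scale());     // digits right of the decimal point
        // Defaults from this chapter: DECIMAL_LENGTH = 12, DECIMAL_SCALE = 2
        System.out.println(fits(n, 12, 2));
        System.out.println(fits(new BigDecimal("12345678901.5"), 12, 2));
    }
}
```

With the default length of 12 and scale of 2, this check leaves room for at most 10 digits to the left of the decimal point.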

    You can define the locale to be used as the default one with the following setting:

    # DEFAULT_LOCALE = en.US

    By default, CloverDX uses the system locale. If you uncomment this row, you can set the DEFAULT_LOCALE property to any locale supported by CloverDX; see the List of all Locale.

    Similarly, the default time zone can be overridden by uncommenting the following entry:

    # DEFAULT_TIME_ZONE = 'java:America/Chicago';'joda:America/Chicago'

    For more information about time zones, see the Time Zone section.
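The time zone value shown above consists of quoted, semicolon-separated entries, each carrying a prefix (java or joda) followed by a zone ID. The sketch below shows one way such a value could be split and the java entry resolved with the standard java.time.ZoneId class. The class TimeZoneSettingSketch and its zoneIdFor method are hypothetical, and the parsing rules are an assumption inferred from the example value, not the engine's actual parser.

```java
import java.time.ZoneId;

// Hypothetical parser for a value such as
// 'java:America/Chicago';'joda:America/Chicago'
class TimeZoneSettingSketch {

    // Returns the zone ID of the entry with the given prefix, or null if absent.
    static String zoneIdFor(String setting, String prefix) {
        for (String part : setting.split(";")) {
            String entry = part.trim();
            // Strip the surrounding single quotes, if present.
            if (entry.length() >= 2 && entry.startsWith("'") && entry.endsWith("'")) {
                entry = entry.substring(1, entry.length() - 1);
            }
            if (entry.startsWith(prefix + ":")) {
                return entry.substring(prefix.length() + 1);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String setting = "'java:America/Chicago';'joda:America/Chicago'";
        // ZoneId.of throws an exception if the zone ID is unknown.
        ZoneId zone = ZoneId.of(zoneIdFor(setting, "java"));
        System.out.println(zone.getId());
    }
}
```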

    Properties specific for Wrangler

    • CSVAnalyzer.LINES_TO_ANALYZE = 1000

      Maximum number of lines read during metadata analysis of a CSV data file. This sets the data sample size Wrangler uses to detect columns and their data types for CSV sources.

    • CSVAnalyzer.BYTES_TO_ANALYZE = 524288

      Maximum number of bytes read during metadata analysis of a CSV data file. This sets the data sample size Wrangler uses to detect columns and their data types for CSV sources.

    • CSVAnalyzer.MAJORITY_TYPE_GUESS_THRESHOLD = 90

      The confidence needed for CSV data type detection of a column, in percent. With the default setting, at least 90% of the values in a column must be integers for Wrangler to detect the column as integer. If only 89% of the values are integers, Wrangler attempts to generalize to a broader data type such as decimal or string.

    • XLSAnalyzer.LINES_TO_ANALYZE = 1000

      Maximum number of lines read during metadata analysis of an Excel file. This sets the data sample size Wrangler uses to detect columns and their data types for Excel sources.

    • Wrangler.MAX_NUMBER_OF_COLUMNS = 1000

      Maximum number of columns allowed in a Wrangler data set. Data sets with more columns are rejected and cannot be worked with.

    • Wrangler.DEFAULT_SORT_LOCALE = en.US

      Locale used for sorting Wrangler data sets. If not specified, DEFAULT_LOCALE is used.
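The effect of CSVAnalyzer.MAJORITY_TYPE_GUESS_THRESHOLD can be sketched as a simple majority vote over the sampled values. This is an illustration under stated assumptions, not Wrangler's implementation: the class TypeGuessSketch, its method names, the parsing rules for integers and decimals, and the fixed fallback order integer, decimal, string are all hypothetical.

```java
import java.math.BigDecimal;
import java.util.List;

// Hypothetical majority-vote type detection for one column of sampled values.
class TypeGuessSketch {

    static boolean isInteger(String s) {
        try {
            Long.parseLong(s.trim());
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    static boolean isDecimal(String s) {
        try {
            new BigDecimal(s.trim());
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    // Returns "integer", "decimal" or "string" for the sampled values.
    static String guessType(List<String> values, int thresholdPercent) {
        if (values.isEmpty()) {
            return "string";
        }
        int integers = 0;
        int decimals = 0;
        for (String v : values) {
            if (isInteger(v)) {
                integers++;
                decimals++; // every integer is also a valid decimal
            } else if (isDecimal(v)) {
                decimals++;
            }
        }
        // Integer arithmetic avoids rounding issues: count/size >= threshold/100.
        if (integers * 100 >= thresholdPercent * values.size()) {
            return "integer";
        }
        if (decimals * 100 >= thresholdPercent * values.size()) {
            return "decimal";
        }
        return "string";
    }
}
```

With a threshold of 90, a sample of 100 values containing 89 integers and 11 decimal numbers is classified as decimal in this sketch, while 89 integers and 11 arbitrary text values fall back to string.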

    Compatibility

    In 4.4.0-M2, the default encoding was changed from ISO-8859-1 to UTF-8. Therefore, DataParser.DEFAULT_CHARSET_DECODER and DataFormatter.DEFAULT_CHARSET_ENCODER were set to UTF-8.

    In 6.0.0, the Wrangler properties were added.