Version

    4. Writing data in Wrangler

    Wrangler allows you to write data in CSV files and to Microsoft Excel xlsx files. See below for more details about each data target.

    Downloading and viewing output files

    When files are written by Wrangler, they are stored in your workspace. You can download your files from Wrangler user interface from My jobs screen by clicking on the arrow icon next to the output file name:

    my jobs downloading outputs
    Figure 9. My jobs screen showing outputs of two successfully executed jobs

    As with files you upload, you can also see the output files via CloverDX Server Console in Sandboxes module. The files are located in your Wrangler workspace sandbox in data-out directory.

    CSV data target

    CSV data target allows you to write your data into CSV files that are stored in your Wrangler workspace. To change settings of your data target, click on the Target in the overview diagram of your job at the top of the transformation editor screen:

    job diagram
    Figure 10. Job diagram showing main parts of your Wrangler job

    CSV data target configuration

    Following settings are available for CSV data target:

    csv data target configuration
    Figure 11. CSV data target configuration dialog
    • Output file name: name of the output file. Default value is the name of your Wrangler job with .csv extension.

    • Encoding: encoding for the output file. Default (recommended) value is UTF-8 which is an universal encoding that will allow you to save data in any language while making it readable in many other apps. Other common encodings include windows-1252 (US English encoding commonly used for files created on Windows) or ISO-8859-1 (US ASCII encoding). Selecting the wrong encoding here can produce a file in which some characters are not stored correctly.

    • Field delimiter: a single character to use as a separator between columns in the output file. You can either select one of the provided options or type your own character. Most common delimiter is either a comma or a tab.

    • Record delimiter: configures character(s) that are used to separate rows in the output file. The default value is a "Windows new line" which means that each row appears on a separate line in the output file. Select the option based on your target platform - if you are targeting Windows users, use Windows CRLF option, for Linux or Mac users pick Linux LF option. You can also type your own delimiter if needed.

    • Quoting: allows you to configure whether values in your file are surrounded by quotes or not. If you enable quotes, the data may look like this (assuming two columns: Name and Age):

      "Name","Age"
      "Alice","22"
      "Bob","23"

      Quoting is needed when your data can contain your delimiter character - e.g., if your delimiter is a comma and your data can contain commas as well. When configuring quoting you can pick which character to use - whether double quotes or single quotes. Double quotes (the default value) are more common and recommended if quoting is needed.

    • Header: allows you to configure whether to write file header or not. If enabled, the file header will be written as the first row of the output and will contain labels of all columns in your data. We recommend enabling the header as that will allow data recipients to know what is in the data.

    • Error handling: determines what to do with rows that contain errors in any of their cells. Two options are provided:

      • Skip rows with invalid cells (default value): if selected, any rows that contain errors will be ignored on output. When you run the job, you will be able to see number of skipped rows on My jobs page when job execution ends:

        my jobs rows skipped
      • Fail job on the first invalid cell: when an error is encountered when writing the file, the job is stopped and the error will be reported on My jobs page:

        my jobs job failed on output

    CSV data target limitations

    CSV target will allow you to work with any size of data - unlimited number of rows or columns. It does not allow you to append data to existing file. The output file is overwritten on every job run.

    Excel data target

    Excel data target allows you to write your data to Microsoft Excel files in xlsx format. The target produces a file with one sheet and a header. Formatting of data in the output file will follow formatting configured for columns in your Wrangler job.

    excel target output
    Figure 12. Sample Excel file create with CloverDX Wrangler

    Excel data target configuration

    Following settings are available for the Excel target:

    excel data target configuration
    Figure 13. Excel data target configuration dialog
    • Output file name: name of the output file. The default value is the name of your Wrangler job with .xlsx extension.

    • Sheet name: name of the sheet to create in the output file. Note that Excel limits the length of the sheet name to 31 characters.

    • Error handling: determines what to do with rows that contain errors in any of their cells. Two options are provided:

      • Skip rows with invalid cells (default value): if selected, any rows that contain errors will be ignored on output. When you run the job, you will be able to see number of skipped rows on My jobs page when job execution ends:

        my jobs rows skipped
      • Fail job on the first invalid cell: when an error is encountered when writing the file, the job is stopped and the error will be reported on My jobs page:

        my jobs job failed on output

    Formatting of Excel files

    The formatting of data in Excel output files depends on formats configured on columns in your data set. Wrangler will set the corresponding Excel format depending on display format for each column.

    When saving to Excel file, Wrangler will save the complete value even if the formatting restricts the display of data. This is especially important for numbers which may be stored with higher precision than what is displayed in Wrangler preview.

    For example, consider a value like 3.1415926535 written in Wrangler with decimal format set to #.### (i.e. three decimal places):

    excel target formatting wrangler data

    The column $formatted_constant was created as a duplicate of $constant_value column and then configured to show just three decimal places. The same is then replicated in Excel output in which complete value is stored and the formatting only applies during data display:

    excel target formatting sheet data

    Note that the formats may not always be the same since Excel formatting and Wrangler formatting have slightly different capabilities.

    Excel data target limitations

    Note that some restrictions apply when working with Excel files. Some of the restrictions below are limitations of the Excel file format itself while others are technical limitations of the Excel data target in Wrangler.

    • You cannot write more than 1048576 rows into an Excel file. This is a limitation of the Excel file format. If you need to work with more data, you will need to use CSV data target.

    • You cannot have more than 16384 columns in the output. This is a limitation of Excel file format. Note that this is much higher than the default soft-limit for Wrangler columns - 1000 columns - so you are unlikely to reach this.

    • You can only create a single sheet in the output file using Wrangler.

    • You cannot append to the output file - the file is overwritten after each job run. This also means that you cannot use templates and add a sheet to them or modify them in any way.