1. Getting started
- How Wrangler works (video)
- Quick tutorial (video)
- Source: Choosing data for your job
- Combining multiple data sources
- Job anatomy: Understanding steps and repeatability
- What to consider before adding a column
- Column data types
- Formulas and calculations
- Modify values based on a condition
- Working with strings (text)
- Working with decimals (including currency)
- Working with dates
- Errors in your data
- No data or some missing?
- No value: Working with null and empty values
- Replacing empty value with a default value
- Filtering data
- Getting results from your job
- Target file configuration
Welcome to Wrangler!
To get yourself comfortable with CloverDX Wrangler, we recommend you take the following path to learn the Wrangler basics so that you can later explore additional functionality on your own:
-
Start with a quick introduction video How Wrangler works (2 minutes).
-
Watch this 15 minute Quick tutorial that will guide you through a sample task.
-
Explore the rest of this Getting Started section (use the menu on the left) to learn the key principles of the Wrangler.
-
Start your own project!
-
Use the steps reference part of this documentation to understand each transformation step in detail.
How Wrangler works (video)
This video provides a brief overview of what Wrangler is and what you can do with it.
Quick tutorial (video)
We highly recommend you follow this quick tutorial before you start exploring on your own.
The video will walk you through building your first Wrangler job and explain the core principles in detail. You can then read additional details in the rest of this guide.
Source: Choosing data for your job
CloverDX Wrangler uses concepts of data sources and data targets to define where the data comes from and where it is written to at the end of the transformation.
Data source is the location from where your data is loaded. All the sources that you want to work with need to be loaded or added into Wrangler in the Data catalog > My Sources section.
The sources can be:
-
a CSV file uploaded to Wrangler under Data Catalog > My Sources.
-
an existing data source in the Explorer section, where sources are defined and managed by your organization’s IT team. To add it to your sources, select it in the Explorer section of the Data catalog and click on the Use in a new job button.
See Reading Data in Wrangler for more information.
Data target defines the location where the data is written to. CloverDX Wrangler supports CSV and Excel data targets. Both write output files into Wrangler workspace. Output files can be downloaded from the My jobs page once a job has finished successfully. See Getting results from your job for more information.
Combining multiple data sources
Each job must have one main source and it can also optionally include any number of additional lookup sources.
To add (join) additional sources to your job:
-
First, add the additional source(s) to My sources.
-
Back in the job, add the Lookup step.
-
To link multiple source files, add a new Lookup step for each source separately.
Lookup configuration
-
Lookup data source
-
Pick the second source you want to add to your dataset.
-
It needs to be in My sources first, otherwise it won’t be visible in the dropdown.
-
-
Data column
-
Typically, the "ID" or "key" column in the main source (CustomerID, InvoiceNumber, etc…).
-
-
Lookup column
-
The corresponding "ID" or "key column in the second source (the other CustomerID, InvoiceNumber, etc…).
-
-
Columns to add from lookup
-
Choose which column(s) you want to load from the second source.
-
The fewer columns you load from the second source, the better performance you’ll get.
-
Lookup hints
Things to bear in mind when working with lookups:
-
Values added from the Lookup step will always be added to a new column; it is not possible to output the values to an existing column.
-
When there are values in your dataset for which there are no matches in the lookup file, the output values will be left empty. If you want to populate all such values with a static value, you can use the Replace empty values step.
-
Lookups are case-sensitive, and the data used as keys in the lookup must match exactly with the data in your dataset in order to be matched.
-
When a data source used in an existing Lookup step is deleted and uploaded again, the step will need to be edited and the lookup source file selected again.
-
The source file should not include duplicate keys. For example, the table below includes currencies and conversion rates, where "EUR" appears twice. If this data is used in a lookup to pull the rates, only the first rate (1.07) for EUR will be used.
The data preview in Wrangler is limited to 1000 rows. If the lookup file or the source file include more than 1000 rows, it might happen that you will not see any matched data from the lookups when editing the job. Refer to the following section for more information on how to proceed. |
For more information about the Lookup step, refer here.
Job anatomy: Understanding steps and repeatability
Instead of manipulating data directly and potentially doing something you might regret in the future, Wrangler records the steps you’re taking and then plays them back on the source data when the job is executed. Source data is never modified in any way.
You can always go back to change, remove, disable, or insert steps in your job to revert or modify the actions you’ve taken in the past.
Example: We initially deleted "Postal code" but later realized we might be needing it.
We can remove Step 1 (Delete Postal column) to bring the column back to our dataset as if we never deleted it in the first place. All subsequent actions will still be applied.
Tip: Steps are your safeguard against making mistakes. Any decision you’ve made in the past can be reverted or modified.
Disable a step if you don’t want to remove it just yet. Disabled steps won’t take any effect on the data and will stay in the job until you delete or re-enable them.
Moving between steps
The preview always shows the data corresponding to the currently highlighted step. You can freely move around the steps to see how the data looks at each point in the sequence. To see the final output, highlight the last step.
Example:
Phone
is still visible here because step 3 is selected. Clicking on the last step wil delete the Phone
column as per the step configuration.
What to consider before adding a column
Before using the Add column step, note that there are steps that add a new column automatically:
-
The Lookup step will automatically create a new column with joined data
-
The Calculate formula step has an option to create a new column
In both scenarios you don’t need to explicitly add a new column beforehand - both the Lookup and Calculate formula steps add columns for you.
Column data types
When a file is loaded, CloverDX Wrangler will attempt to automatically recognize the column data types. The assigned data type is indicated by an icon in the left corner of each column header, or when hovering over a column header, the column data type will be displayed in parentheses.
Data type | Example column | Example usage |
---|---|---|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
For more information about data types, refer here.
Since having the correct data type is essential when using calculation and transformation formulas, it is important to review the loaded data, and if it is not in the desired format, convert it using one of the conversion steps:
Formulas and calculations
Formulas can be entered in the following steps:
In your formulas you will need to specify the column(s) that you want to work with. Since column names in formulas must not include spaces and other unsupported characters, and cannot exist multiple times, it is important to use the technical column name that is automatically stripped of all unsupported characters and a sequence number is added in case of duplicate column names. You can find the technical column name by hovering over a column header: use the full name, including the dollar sign.
Example: In the example below even though the name of the column is "Test Column# 3", its technical name is "$Test_Column_3".
The column names are also case-sensitive, and need to be entered in the exact form as displayed. You might otherwise receive the following error:
Error: Field 'Test_column_3' does not exist in record 'invoices_csv'
(the column cannot be found because the formula includes lowercase 'c').
Comparing values
The operators below can be used to compare values; however, note that that you can only compare values with the same data types.
Operator | Description |
---|---|
|
Equal to (double =) for all data types |
|
Not equal to |
|
Greater than |
|
Greater or equal |
|
Less than |
|
Less or equal |
Important information to bear in mind:
-
When working with decimals, you need to specify all the decimal places in the unformatted form in your formula. Decimal values can also only include decimal points; commas as decimal separators are not supported. See Working with decimals (including currency) for more information.
-
When working with dates, make sure to enter them in the
YYYY-MM-dd format
(orYYYY-MM-dd HH:mm:ss
when using timestamps). See Working with dates for more information. -
When working with strings, they need to be enclosed in double or single quotes. See Working with strings (text) for more information.
-
When working with empty values, see No value: Working with null and empty values for more information.
Example | Explanation |
---|---|
|
Returns |
|
Returns |
|
Returns |
|
Returns |
|
Often the best options. |
|
Returns |
|
Returns |
Calculations and functions
Operator | Description | Example formula | Result |
---|---|---|---|
|
Add numbers or concatenate strings |
|
|
|
Concatenate strings |
|
|
|
Subtract numbers |
|
|
|
Multiply numbers |
|
|
|
Divide integers |
|
500 if |
|
Divide decimals |
|
500.5 if |
|
Modulo (remainder after division) |
|
|
When working with columns that include numbers (integers or decimals), you can use mathematical operators.
Note: With the exception of the + operator, these operators cannot be used for any other data types than integers or decimals. If you, e.g., attempt to use the - operator when working with date columns, you will get the following error:
Error: Operator '-' is not defined for types: 'date' and 'date'
The + operator can be used as a concatenation function to join values from multiple columns when working with string values (or a combination of string and other data types values; however, in such a case a use of other functions to set the desired format might be needed: more information on this can be found in the Functions section). You can also optionally insert a space or any other characters in between the values.
Example: You want to join invoice numbers and status values and want to enter the & symbol in between, preceded and followed by a space:
$InvoiceNumber + ' & ' + $Status
For more complex cases you might want to use one of our existing functions.
Function | Description | Example formula | Result |
---|---|---|---|
|
IF THEN ELSE |
|
|
|
Difference in two dates (use day, week, month, year, hour, minute, second, millisecond as unit) |
|
|
|
Returns |
|
Returns |
|
Converts the specified string to lowercase |
|
E.g. the text RED Apple will be converted to |
Functions can also be combined, e.g. to convert values to lowercase and look for the values that include the word "green" in them, the formula would like as follows:
contains(lowerCase($Contact), "green")
For a list of existing date functions refer here.
For a list of existing string functions refer here.
For a list of existing conversion functions refer here.
Modify values based on a condition
In many transformations you’ll need to test your data and use different values in your formula depending on the result of the test. Wrangler formulas offer two ways of doing this - either with if
function or with ternary operator.
if function
if
function allows you to test a condition and return a value depending on whether the condition evaluated to true or false. Two variants are provided:
-
if(condition, value_if_true)
-
if(condition, value_if_true, value_id_false)
The parameters in the function:
-
condition
is a boolean expression - the test you want to run -
value_if_true
andvalue_if_false
are expressions that are calculated depending on the result of the test. Both must return the same data type.
The first form of the function is useful when you only need to react when the condition is true. That way if condition is met, the function returns value_if_true.
Examples:
-
Return "USD" currency code if no currency code is provided or if exchange rate is 1:
if($currencyCode == "USD" OR $exchangeRate == 1.0, "USD")
-
Use
if($datePaid > $dueDate, "Past due", "Paid on time")
to return either "Past due" for invoices paid after their due date or "Paid on time" for invoices paid before their due date. -
You can also nest
if
functions to create more complex formulas:if($Total < 20000,"Low", if($Total < 50000, "Medium", "High"))
. This expression will return one of "Low", "Medium" or "High" dependig on the value of $Total column.
Ternary operator
Second way of getting the same result is a ternary operator ?
. Its syntax is the following:
condition ? value_if_true : value_if_false
In this case, both value_if_true
and value_if_false
have to be present. Same examples as above written using ternary operator:
-
$currencyCode == "USD" OR $exchangeRate == 1.0 ? "USD" : $currencyCode
(this assumes that the result is written into the $currencyCode column) -
$datePaid > $dueDate ? "Past due" : "Paid on time"
-
$Total < 20000 ? "Low" : ($Total < 50000 ? : "Medium" : "High")
Note the use of parenthesis to make the expression easier to read.
Both ways lead to the same result and it is up to you to use whichever you are more comfortable with.
Working with strings (text)
String is one of the most common types of data. Wrangler supports quite a few steps that work with string data - see more details in Text manipulation steps.
When working with string values in formulas, you have two options of how to write them - either with double quotes or with single quotes. The following represent the same strings in your formulas:
-
"test"
-
'test'
Note that string columns can have two different types of empty values (null
values and an empty string). See No value: Working with null and empty values for more information.
Note that when working with strings, string comparisons are sensitive to white-spaces and letter case. So "TREE"
is not the same as "tree"
and "blue car"
is not the same as "blue car"
(multiple spaces between words).
For a list of string functions refer here.
Working with decimals (including currency)
Currency is represented as a decimal column with corresponding formatting on a column.
-
Right-click → Convert column to Decimal to change the column data type to decimal if it is a different data type.
-
Right-click → Change display format…
-
Choose the correct format and update the currency symbol if needed
To see the values in their unformatted form, set the display format to No specific format.
Things to bear in mind when working with decimals:
-
When using decimals in formulas, they needs to be entered with all the decimal places in their unformatted form, e.g. if there is a decimal with 5 decimal places and the column display format is changed to display just the first two decimal places, you will need to enter the full 5-place decimal in your formula.
-
Decimal values can only include decimal points, not decimal commas as the decimal separator.
Working with dates
In Wrangler there is a single date
type - it stores both date and time with millisecond precision. If you wish to write date value in a formula, you can use syntax yyyy-MM-dd
for dates or yyyy-MM-dd HH:mm:ss.SSS
for full date and time.
For example, these are all valid date values:
-
2023-01-01
- date only, time is set to midnight -
1931-12-24
- date only, time is set to midnight -
2023-01-01 14:45:30
- date and time -
2023-01-01 14:45:30.123
- full date and time with millisecond precision
You can use this syntax when writing date and time values in your formulas or in steps that require you to enter a date value (e.g., Replace empty value step).
When writing formulas that work with dates, following functions can come in handy:
-
today()
will return current date and time with full precision. Note that the value can be different for different rows in your data depending on how long your job takes. If you wish to set all rows to the same value, use Current date and time step. -
dateAdd(date fromDate, integer howMany, timeUnit)
to add to date. For example, if you wish to add 1 month to$dateColumn
, you can do this:dateAdd($dateColumn, 1, month)
. You can also add negative values to "go backwards in time". You can use several time units:year
,month
,week
,day
,hour
,minute
,second
,millisec
(note that they need to be written exactly like this without quotes - they are keywords). -
dateDiff(date1, date2, timeUnit)
to calculate difference between provided dates in given time unit. The function computesdate1 - date2
so it can return negative values ifdate2
is afterdate1
. For example, if you wish to determine how many days went by between invoice due date and when it was paid:dateDiff($dueDate, $paidDate, day)
.
Example formula | Example Dates | Result |
---|---|---|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
In many cases, you may need to break your date down into smaller "pieces" - date components. Several functions can help you do that:
-
getMonth($date)
- returns values from 1 to 12 -
getDay($date)
- returns values from 1 to 31 -
getHour($date)
- returns values from 0 to 23 -
getMinute($date)
- returns values from 0 to 59 -
getSecond($date)
- returns values from 0 to 59 -
getMillisecond($date)
- returns values from 0 to 999
You can also extract whole part of a date via following functions:
-
extractDate($date)
to return just the date part of the value (time is set to 0:00:00.000) -
extractTime($date)
to return just the time. Since there can be no date value without the date part, the result will have the date part set to 1970-01-01 (beginning of the Unix epoch).
Errors in your data
Errors (or invalid values) occur when the underlying data doesn’t match the column data type.
To correct errors:
-
Convert the column to string or other better suited data type (right-click → Convert)
-
Use the Fix errors menu to remove erroneous values or rows or replace errors with new values.
No data or some missing?
If you don’t see the data you expect, it might be caused by the limited size of the preview data. This is not an error, and you can continue working, only with limited visibility into the preview data.
Run the job to process and see all the data in the result.
How to work around the preview size limitation:
-
Keep adding steps as you need, regardless of the empty grid and warning message.
-
Run the job and inspect the full results in the target Excel/CSV.
-
You might want to apply some filters that limit the number of records you can see as the very last steps of your job.
No value: Working with null and empty values
The No value label (in italics) indicates the a cell is empty (will show as empty in the target, depending on how it handles empty values).
In formulas, No value is represented with the null
keyword (case sensitive).
Note that for string columns there are two different empty values - a null value, which comes up as No value in fields, and an empty string (i.e., a string with zero length), which comes up as an empty field in Wrangler. They do not represent the same data and when writing your formulas you’ll need to make sure you handle both cases if needed. To work with empty strings, use two straight (double or single) quotes: ""
or ''
In formulas you can use function isBlank which can determine if a string is empty - it will return true
for strings that are null, empty or contain only whitespaces.
Example to remove null values:
Example to remove empty string values:
Replacing empty value with a default value
In many cases you may want to replace empty value of given column with a default value. This can be done via Replace empty values step. This step will allow you to use single value instead of all empty values in selected column.
Note that this step will correctly handle string data - both null`
and empty string will be replaced with the value you provide.
If you wish to dynamically compute the new value, you will have to use Calculate formula step. The best way to do this is to use if
function which has the following syntax:
if(condition, value_if_true)
The function returns value_if_true
if the condition
is satisfied (i.e., if the condition evaluates to true for given row). This is what you can effectively use to replace empty values. For example, to use USD as a default currency, you can do this:
if(isBlank($currencyCode), "USD")
isBlank
function is a CTL function that returns true if a string value is null
, empty string or contains just spaces. If the condition is not met, original value remains in the column.
if
function also has a second variant with one extra parameter:
if(condition, value_if_true, value_if_false)
See below for more details on how to use if
function.
Filtering data
To filter your data, use the Filter rows based on formula step.
Getting results from your job
You must Run the job first before the complete result from the job is available. Then click the Download result button on My jobs screen.
Target file configuration
To change type or attributes of the output file, configure your target:
-
Edit the job
-
Click Target at the top of the screen