CrossJoin
Short Description
CrossJoin creates a Cartesian product of records from connected input ports.
Same input metadata | Sorted inputs | Slave inputs | Outputs | Output for driver without slave | Output for slaves without driver | Joining based on equality | Auto-propagated metadata |
---|---|---|---|---|---|---|---|
⨯ | ⨯ | 0-n | 1 | ⨯ | ⨯ | ⨯ | ✓ |
Ports
Port type | Number | Required | Description | Metadata |
---|---|---|---|---|
Input | 0 | ✓ | Master input port | Any1 |
Input | 1-n | ⨯ | Slave input port(s) | Any2 |
Output | 0 | ✓ | For output data records | Any3 |
Metadata
CrossJoin automatically generates metadata on the output port from metadata on its input ports. The generated metadata can be seen as a dynamic template.
CrossJoin Attributes
Attribute | Req | Description | Possible values |
---|---|---|---|
Advanced | | | |
Transform | | A transformation in CTL or Java defined in the graph. | |
Transform URL | | An external file defining the transformation in CTL or Java. | |
Transform class | | An external transformation class. | |
Transform source charset | | Encoding of an external file defining the transformation. | e.g. UTF-8 |
Details
CrossJoin creates a Cartesian product of input records.
It works as follows: the component takes the first record from the first port, the first record from the second port (…) and the first record from the last port, and combines them into one output record. It then takes the first record from the first port, the first record from the second port (…) and the second record from the last port. It continues with the third record from the last input port, and so on. When the records on the last port are exhausted, the component moves to the next record on the preceding port and iterates over the last port again, until every combination has been produced.
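The iteration order described above can be illustrated outside the component with a short Python sketch. This is not CrossJoin's implementation; the function name `cross_join` and the sample records are hypothetical and only show that the last port varies fastest.

```python
from itertools import product


def cross_join(*ports):
    """Yield one tuple per combination of records; the last port varies fastest."""
    # itertools.product behaves like nested for-loops in which the
    # rightmost iterable advances on every step.
    yield from product(*ports)


# Hypothetical records on three input ports.
first = ["a1", "a2"]
second = ["b1"]
last = ["c1", "c2", "c3"]

rows = list(cross_join(first, second, last))
for row in rows:
    print("|".join(row))
```

The number of output records is the product of the record counts on all ports, here 2 × 1 × 3 = 6.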
Processing Large Number of Records
If you process a very large number of records, CrossJoin may swap records to temporary files on your hard drive. This prevents excessive memory usage.
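The general idea of swapping records to disk can be sketched in Python as follows. How CrossJoin actually swaps records is not described here, so this is only an assumption-based illustration: the function `cross_join_spooled`, its `max_in_memory_bytes` parameter, and the choice to spool the second input are all hypothetical.

```python
import pickle
import tempfile


def cross_join_spooled(master, slave, max_in_memory_bytes=1024):
    """Cross join two iterables, keeping the slave records in a spool file."""
    # SpooledTemporaryFile buffers in memory until more than max_size bytes
    # are written, then rolls over to a real temporary file on disk.
    with tempfile.SpooledTemporaryFile(max_size=max_in_memory_bytes) as spool:
        count = 0
        for record in slave:
            pickle.dump(record, spool)
            count += 1
        for m in master:
            # Rewind and re-read the spooled records for every master record.
            spool.seek(0)
            for _ in range(count):
                yield m, pickle.load(spool)


# A tiny threshold forces the rollover to disk even for this small input.
pairs = list(
    cross_join_spooled(
        ["Brown", "Smith"],
        ["Pineapple", "Turnip", "Spaceship"],
        max_in_memory_bytes=16,
    )
)
```

In this sketch only one master record is held in memory at a time, while the spooled input is re-read from the file on each pass.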
Examples
Simple CrossJoin Example
The company "All on the Store Ltd." has a list of customers and a list of goods.
Customers:
Brown Smith Jones
Goods:
Pineapple Turnip Spaceship
Create a list containing all customer-goods combinations.
Solution
Connect both data sources to the CrossJoin component. No attributes of the component need to be set.
The result is:
Brown|Pineapple
Brown|Turnip
Brown|Spaceship
Smith|Pineapple
Smith|Turnip
Smith|Spaceship
Jones|Pineapple
Jones|Turnip
Jones|Spaceship
Best Practices
The edge giving the most records should be connected to the first input port.
If the transformation is specified in an external file (with Transform URL), we recommend explicitly specifying Transform source charset.
Compatibility
Version | Compatibility Notice |
---|---|
4.1.0-M1 | The CrossJoin component is available since 4.1.0-M1. |