Persistent Lookup Table
This type of lookup table is designed for a large number of data records.
The data records are stored in files; only a few records are cached in main memory.
These files are in JDBM format (http://jdbm.sourceforge.net). When you specify the file name, two files are created: one with the db extension and one with the lg extension.
A persistent lookup table can work in two modes: with key duplicates and without key duplicates. If you switch between the modes, you should delete and refill the lookup table.
Without key duplicates
With the Allow key duplicates property unchecked, the persistent lookup table does not allow storing multiple records with the same key value. You can choose whether to store the first one or the last with the Replace checkbox.
This is the default option.
With key duplicates
With the Allow key duplicates property enabled, you can store multiple records with the same key in the table. The Replace property is not used. Key duplicates in persistent lookup tables are available since version 4.3.0.
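The difference between the two modes, and the effect of Replace, can be illustrated with a small in-memory model. This is only a sketch: the class and method names are hypothetical, plain Java maps stand in for the on-disk table, and records are simplified to key/value string pairs.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative in-memory model only; the real table keeps its records in files.
class LookupModesSketch {

    // Without key duplicates: at most one record per key.
    // replace == true keeps the last record written for a key,
    // replace == false keeps the first one.
    static Map<String, String> storeUnique(List<String[]> records, boolean replace) {
        Map<String, String> table = new LinkedHashMap<>();
        for (String[] record : records) {
            if (replace) {
                table.put(record[0], record[1]);
            } else {
                table.putIfAbsent(record[0], record[1]);
            }
        }
        return table;
    }

    // With key duplicates: every record is kept, grouped under its key.
    static Map<String, List<String>> storeDuplicates(List<String[]> records) {
        Map<String, List<String>> table = new LinkedHashMap<>();
        for (String[] record : records) {
            table.computeIfAbsent(record[0], k -> new ArrayList<>()).add(record[1]);
        }
        return table;
    }
}
```

For the input records (a, 1), (b, 2), (a, 3), the model without duplicates holds (a, 1) when Replace is unchecked and (a, 3) when it is checked, while the model with duplicates holds both values for key a.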
Internally, the persistent lookup table uses a B+Tree to store the records. Wherever a node is mentioned here, it means a node of the B+Tree.
Creating Persistent Lookup Table
In the first step of the wizard, choose Persistent lookup.
Then set up the required properties: give a Name to the lookup table, select the corresponding Metadata, specify the File where the data records of the lookup table will be stored and the Key that should be used to look up data records from the table.
Advanced Properties
To overwrite old records with newer ones, check the Replace checkbox. This way, the latest record with a given key is stored. Otherwise, the first record with that key is kept.
You can disable transactions with Disable transactions. Disabling transactions increases graph performance; however, it can cause data loss if an operation on the table is interrupted.
Commit interval defines the number of records that are committed at once. When the limit or end of phase is reached, the records are committed to the lookup table.
Page size defines the number of entries (records) per node of the B+Tree.
Cache size specifies the maximum number of B+Tree nodes kept in the cache.
Allow key duplicates allows storing multiple records with the same key value.
The Replace checkbox is ignored in lookup tables with key duplicates.
Then click OK and Finish.
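The behavior of Commit interval described above can be pictured as a writer that buffers records and commits them in batches. The following is a minimal sketch under that assumption; the class and its methods are hypothetical and do not reflect the actual implementation or the JDBM API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical writer: buffers records and commits them in batches.
class CommitIntervalSketch {
    private final int commitInterval;
    private final List<String> pending = new ArrayList<>();
    private final List<String> committed = new ArrayList<>();
    private int commitCount = 0;

    CommitIntervalSketch(int commitInterval) {
        if (commitInterval < 1) {
            throw new IllegalArgumentException("commitInterval must be at least 1");
        }
        this.commitInterval = commitInterval;
    }

    // Buffers one record; commits once the interval is reached.
    void put(String record) {
        pending.add(record);
        if (pending.size() >= commitInterval) {
            commit();
        }
    }

    // Flushes whatever is still buffered when the phase ends.
    void endOfPhase() {
        if (!pending.isEmpty()) {
            commit();
        }
    }

    private void commit() {
        committed.addAll(pending);
        pending.clear();
        commitCount++;
    }

    int commitCount() { return commitCount; }
    int committedSize() { return committed.size(); }
    int pendingSize() { return pending.size(); }
}
```

In this model, writing 250 records with a commit interval of 100 results in two commits during writing and a third one at the end of the phase. A larger interval means fewer commits, at the cost of more uncommitted records at any given moment.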
Using Persistent Lookup Table
You can use LookupTableReaderWriter to add records to a persistent lookup table.
Persistent Lookup Table Configuration Tweaks
The performance of a persistent lookup table is affected by the advanced properties. These properties configure the internal B+Tree implementation and the size of its caches.
To speed up reading, increase cache size.
To speed up writing, increase commit interval.
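To see why a larger cache can speed up reading, consider a cache that holds a limited number of nodes and evicts the least recently used one. This is a sketch only: the eviction policy is an assumption (the text above does not state which policy is used), and the class name and the string standing in for a node read from disk are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical node cache with least-recently-used eviction (assumed policy).
class NodeCacheSketch {
    private final Map<Integer, String> cache;
    private int hits = 0;
    private int misses = 0;

    NodeCacheSketch(final int cacheSize) {
        // accessOrder = true keeps entries ordered from least to most recently used.
        this.cache = new LinkedHashMap<Integer, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Integer, String> eldest) {
                return size() > cacheSize;
            }
        };
    }

    // Returns the node, counting a miss whenever it is not cached.
    String readNode(int nodeId) {
        String node = cache.get(nodeId);
        if (node != null) {
            hits++;
            return node;
        }
        misses++;
        node = "node-" + nodeId; // stands in for a read from the file
        cache.put(nodeId, node);
        return node;
    }

    int hits() { return hits; }
    int misses() { return misses; }
}
```

Reading nodes 1, 2, 3, 1 with a cache size of 2 misses on every read in this model, because node 1 has been evicted by the time it is requested again. With a cache size of 3, the last read is served from the cache.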
Compatibility
Version | Compatibility Notice |
---|---|
4.3.0 | You can now use Allow key duplicates to allow storing duplicate key values in the table. |