datavault4dbt is highly customizable by using many global variables. Since they are applied on multiple levels, a high rate of standardization across your data vault 2.0 solution is guaranteed.
PREREQUISITES
The default values of those variables are set inside the packages dbt_project.yml under <your_dbt_project>/dbt_packages/datavault4dbt/dbt_project.yml and should be copied to your own dbt_project.yml. Copy all variables defined under the keyword vars, and paste them under the keyword vars in your file.
VARIABLES AND THEIR USAGES
All the following variables are prefixed with datavault4dbt.
GENERAL CONFIGURATION
| Name | Usage | Explanation |
|---|
| include_business_objects_before_appearance | Ref Table | If a Ref_Hub entry should appear in the ref_table (snapshot based), even if the snapshot date is before the first appearance of that business object. |
| enable_static_analysis_overwrite | All macros | Relevant for Fusion compatibility. For more info, see here. |
| multi_source_models_execution_aware_loading | Multi source entities | Whether multi source entities should respect the dbt command to reduce runtimes. |
COLUMN ALIASES
| Name | Usage | Explanation |
|---|
| ldts_alias | Stage, DV entities | The name of the load-date column in all DV entities. Is generated in the staging area. |
| rsrc_alias | Stage, DV entities | The name of the record-source column in all DV entities. Is generated in the staging area. |
| ledts_alias | Version 1 Satellites | The name of the load-end date column in version 1 Satellites and MA-Satellites. |
| sdts_alias | Snapshot Table, PITs | The name of the snapshot-date column in the snapshot table (+view) and all PITs. |
| snapshot_trigger_column | Snapshot Table, PITs, Post-Hook | The name of the column that shows the activation state of single snapshots. |
| stg_alias | Record Tracking Satellite | The name of the column that holds info about the staging model of each record. |
| is_current_col_alias | Version 1 Satellites | The name of the column that indicates the current row per hashkey. |
| is_active_alias | Effectivity Satellite v0 | The name of the column that marks activity. Generated by the macro. |
HASH CONFIGURATION
| Name | Usage | Explanation |
|---|
| hash | Stage | What hash algorithm should be used for generating hash values. MD5, SHA1 or SHA2. |
| hash_datatype | Stage | The datatype that hash columns should have. Needs to fit the output of the used hash algorithm. |
| hashkey_input_case_sensitive | Stage | Whether the input business keys for hashkey calculation should be case sensitive or not. |
| hashdiff_input_case_sensitive | Stage | Whether the input descriptive attributes for hashdiff calculation should be case sensitive or not. |
| hashdiff_use_trim | Stage | Define the global default if hashdiff input columns should be wrapped with TRIM() or not. Defaults to true. Available from v1.16.0 |
STAGE CONFIGURATION
| Name | Usage | Explanation |
|---|
| copy_rsrc_ldts_input_columns | Stage | Whether the columns that are used for ldts and rsrc should also be inside the stage, or not. If true, the stage would hold the ldts- & rsrc-alias columns, and the original columns. If false, only the aliased columns are kept. |
SATELLITE CONFIGURATION
| Name | Usage | Explanation |
|---|
| is_active_datatype | Effectivity Satellite v0 | Controls the datatype which is used for the is_active-column. Defaults to Bit for Fabric & Synapse, Number on Oracle and Boolean on the remaining adapters. Available from v1.10.0 |
GHOST RECORD AND ZERO KEY CONFIGURATION
| Name | Usage | Explanation |
|---|
| beginning_of_all_times | Stage, Satellites, PIT | The timestamp that represents your earliest technical timestamp. |
| end_of_all_times | Stage, Version 1 Satellites, PIT | The timestamp that represents your latest technical timestamp. We recommend to not use the maximum possible timestamp of your database. |
| timestamp_format | Stage, Version 1 Satellites, PIT | The timestamp format of the two previous variables. |
| beginning_of_all_times_date | Stage | The date that represents your earliest technical date. Used for ghost-record creation of columns with the date-datatype. |
| end_of_all_times_date | Stage | The date that represents your latest technical date. Used for ghost-record creation of columns with the date-datatype. |
| date_format | Stage | The date format of the two previous variables. |
| default_unknown_rsrc | Stage | The default unknown value for the record source column you want to use. |
| default_error_rsrc | Stage | The default error value for the record source column you want to use. |
| rsrc_default_dtype | Stage | The default datatype that should be used for the two variables above. |
| stg_default_dtype | Record Tracking Satellite | The default datatype for the stg_alias column inside a record tracking satellite. |
| derived_columns_default_dtype | Stage | The default datatype for derived columns, if no other datatype can be detected automatically. |
DATATYPE SPECIFIC DEFAULT VALUES
For each datatype there is a default unknown and error value defined. Additionally, an alternative, usually much shorter value is defined. See the applied default values in the table below.
| Datatype | Error Value | Error Value alt. | Unknown Value | Unknown Value alt. |
|---|
| STRING | (error) | e | (unknown) | u |
| Numeric | -2 | | -1 | |
Those values are best changed by adding a global variable inside your dbt project, that is called following this pattern: datavault4dbt._value_<(alt)>__. If you want to change the default alternative error value for datatype STRING, you would need to set the global variable datavault4dbt.error_value_alt__STRING to your desired value.