Testing Reference
Complete reference for all data quality tests in the nexus package, including uniqueness, not-null, and composite key validations.
The nexus package includes comprehensive data quality tests to ensure ID
uniqueness, data integrity, and proper relationships between models. All tests
are defined in models/nexus-models/nexus.yml.
0. Source Testing (Strongly Recommended)
Before nexus processing begins, create comprehensive tests for your source models. Source tests are your first line of defense against data quality issues.
Why Source Tests Matter
- Early Detection: Catch problems before they propagate through nexus processing
- Pipeline Reliability: Prevent downstream failures in identity resolution
- Data Quality Assurance: Ensure IDs are unique and required fields are populated
Essential Source Tests
Create tests for your source models following this pattern:
# models/sources/your_source/your_source.yml
version: 2
models:
  - name: your_source_events
    tests:
      - unique:
          column_name: event_id
          config:
            severity: error
    columns:
      - name: event_id
        tests:
          - not_null:
              config:
                severity: error
          - dbt_utils.expression_is_true:
              expression: "like 'evt_%'"
              config:
                severity: warn
  - name: your_source_person_identifiers
    tests:
      - unique:
          column_name: person_identifier_id
          config:
            severity: error
    columns:
      - name: person_identifier_id
        tests:
          - not_null:
              config:
                severity: error
          - dbt_utils.expression_is_true:
              expression: "like 'per_idfr_%'"
              config:
                severity: warn
Running Source Tests
# Test all models in a specific source
dbt test --select models/sources/segment/
# Test and build everything in a source folder
dbt build --select models/sources/segment/
Complete Guide: See Source Testing Best Practices for detailed examples and patterns.
1. Test Categories
Primary Key Tests
- Uniqueness: Ensures no duplicate IDs across all records
- Not Null: Ensures all ID fields have values
Composite Key Tests
- Multi-column uniqueness: Validates unique combinations across multiple fields
- Edge relationship integrity: Ensures proper identifier connections
Data Integrity Tests
- Foreign key relationships: Validates references between models
- Business rule compliance: Ensures data follows expected patterns
- ID prefix validation: Ensures all ID columns follow the expected naming convention from the create_nexus_id macro
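The foreign key checks listed above can be expressed with dbt's built-in relationships test. The following is a minimal sketch, not an excerpt from nexus.yml: it assumes that nexus_person_participants carries an event_id column pointing at nexus_events, and the warn severity is an illustrative choice.

```yaml
# Hypothetical example; the event_id column on this model is an assumption.
models:
  - name: nexus_person_participants
    columns:
      - name: event_id
        tests:
          # Fails for every event_id that has no matching row in nexus_events.
          - relationships:
              to: ref('nexus_events')
              field: event_id
              config:
                severity: warn
```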
2. Event-Level Tests
nexus_events
Purpose: Validates the unified events table from all enabled sources.
tests:
  - unique:
      column_name: event_id
      config:
        severity: error
columns:
  - name: event_id
    tests:
      - not_null:
          config:
            severity: error
      - dbt_utils.expression_is_true:
          expression: "like 'evt_%'"
          config:
            severity: warn
  - name: occurred_at
    tests:
      - not_null:
          config:
            severity: error
What it tests:
- Each event has a unique event_id
- No events are missing IDs or timestamps
- Event IDs follow the expected evt_ prefix pattern (warning level)
- Events from all sources (Gmail, Google Calendar, Notion) are properly unified
Common failures:
- Duplicate event IDs when source models generate non-unique IDs
- Event IDs not following the expected evt_ prefix pattern
2.1. ID Prefix Validation Tests
All nexus models include ID prefix validation tests to ensure consistency with
the create_nexus_id macro. These tests use dbt_utils.expression_is_true with
warning severity to validate that ID columns start with the expected
prefixes.
Expected ID Prefixes
| Entity Type | Expected Prefix | Example ID |
|---|---|---|
| Events | evt_ | evt_abc123... |
| Persons | per_ | per_def456... |
| Groups | grp_ | grp_ghi789... |
| Memberships | mem_ | mem_jkl012... |
| States | st_ | st_mno345... |
| Person Identifiers | per_idfr_ | per_idfr_pqr678... |
| Group Identifiers | grp_idfr_ | grp_idfr_stu901... |
| Person Traits | per_tr_ | per_tr_vwx234... |
| Group Traits | grp_tr_ | grp_tr_yza567... |
| Person Participants | per_prt_ | per_prt_bcd890... |
| Group Participants | grp_prt_ | grp_prt_efg123... |
| Person Edges | per_edg_ | per_edg_hij456... |
| Group Edges | grp_edg_ | grp_edg_klm789... |
Test Configuration
- dbt_utils.expression_is_true:
    expression: "like 'evt_%'"
    config:
      severity: warn
Why warning severity?
- Allows builds to continue even with prefix violations
- Provides visibility into naming convention compliance
- Enables gradual adoption of naming standards
Common prefix violations:
- Manual ID generation that bypasses the create_nexus_id macro
- Source data with unexpected ID formats
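One caveat with the LIKE patterns used by these tests: in SQL, an underscore inside a LIKE pattern matches any single character, so 'evt_%' also accepts an ID such as evtX123. If stricter matching matters, a model-level expression can compare the leading characters literally. The following is a minimal sketch, assuming the warehouse supports the left() string function. It is not taken from nexus.yml.

```yaml
# Hypothetical stricter prefix check; not part of the nexus package.
models:
  - name: nexus_events
    tests:
      - dbt_utils.expression_is_true:
          # Compare the first four characters literally, so "_" is not a wildcard.
          expression: "left(event_id, 4) = 'evt_'"
          config:
            severity: warn
```

Rows where event_id is null are not flagged by this expression; the not_null test covers that case.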
3. Identifier-Level Tests
nexus_person_identifiers
Purpose: Validates person identifiers from all sources have unique IDs.
tests:
  - unique:
      column_name: person_identifier_id
      config:
        severity: error
columns:
  - name: person_identifier_id
    tests:
      - not_null:
          config:
            severity: error
      - dbt_utils.expression_is_true:
          expression: "like 'per_idfr_%'"
          config:
            severity: warn
What it tests:
- Each person identifier record has a unique ID
- No person identifiers are missing IDs
- Person identifier IDs follow the expected per_idfr_ prefix pattern (warning level)
- Person identifiers from Gmail, Google Calendar, and Notion are properly deduplicated
Common failures:
- Same person appears multiple times with different roles but same ID
- Duplicate source data not properly deduplicated
- Missing role or timestamp in ID generation
nexus_group_identifiers
Purpose: Validates group identifiers (domains, organizations) have unique IDs.
tests:
  - unique:
      column_name: group_identifier_id
      config:
        severity: error
columns:
  - name: group_identifier_id
    tests:
      - not_null:
          config:
            severity: error
What it tests:
- Each group identifier record has a unique ID
- No group identifiers are missing IDs
- Group identifiers are properly deduplicated when multiple people from the same domain attend the same event
Common failures:
- Multiple employees from same company create duplicate group records
- Missing deduplication in source models
- Role not included in ID generation
nexus_membership_identifiers
Purpose: Validates person-to-group membership relationships have unique IDs.
tests:
  - unique:
      column_name: membership_identifier_id
      config:
        severity: error
columns:
  - name: membership_identifier_id
    tests:
      - not_null:
          config:
            severity: error
What it tests:
- Each membership relationship has a unique ID
- No memberships are missing IDs
- Same person can belong to multiple groups with different roles
Common failures:
- Same person-group combination with different roles gets same ID
- Missing role in membership ID generation
4. Trait-Level Tests
nexus_person_traits
Purpose: Validates person traits (names, emails, etc.) have unique IDs.
tests:
  - unique:
      column_name: person_trait_id
      config:
        severity: error
columns:
  - name: person_trait_id
    tests:
      - not_null:
          config:
            severity: error
What it tests:
- Each person trait record has a unique ID
- No person traits are missing IDs
- Person traits properly linked to identifiers
nexus_group_traits
Purpose: Validates group traits (domain names, organization details) have unique IDs.
tests:
  - unique:
      column_name: group_trait_id
      config:
        severity: error
columns:
  - name: group_trait_id
    tests:
      - not_null:
          config:
            severity: error
What it tests:
- Each group trait record has a unique ID
- No group traits are missing IDs
- Group traits properly linked to identifiers
5. Resolved Entity Tests
nexus_persons
Purpose: Validates final resolved person entities after identity resolution.
tests:
  - unique:
      column_name: person_id
      config:
        severity: error
columns:
  - name: person_id
    tests:
      - not_null:
          config:
            severity: error
What it tests:
- Each resolved person has a unique final ID
- Identity resolution properly merged duplicate identifiers
- No persons are missing final IDs
nexus_groups
Purpose: Validates final resolved group entities after identity resolution.
tests:
  - unique:
      column_name: group_id
      config:
        severity: error
columns:
  - name: group_id
    tests:
      - not_null:
          config:
            severity: error
What it tests:
- Each resolved group has a unique final ID
- Identity resolution properly merged duplicate identifiers
- No groups are missing final IDs
nexus_memberships
Purpose: Validates final resolved membership relationships.
tests:
  - unique:
      column_name: membership_id
      config:
        severity: error
columns:
  - name: membership_id
    tests:
      - not_null:
          config:
            severity: error
What it tests:
- Each resolved membership has a unique final ID
- Memberships properly link resolved persons to resolved groups
- No memberships are missing final IDs
6. Participant-Level Tests
nexus_person_participants
Purpose: Validates person participation in events with proper role handling.
tests:
  - unique:
      column_name: person_participant_id
      config:
        severity: error
columns:
  - name: person_participant_id
    tests:
      - not_null:
          config:
            severity: error
What it tests:
- Each person-event-role combination has unique participant ID
- Same person can participate in same event with multiple roles
- No participants are missing IDs
Common failures:
- Role not included in participant ID generation
- Same person-event combination with different roles gets same ID
nexus_group_participants
Purpose: Validates group participation in events with proper role handling.
tests:
  - unique:
      column_name: group_participant_id
      config:
        severity: error
columns:
  - name: group_participant_id
    tests:
      - not_null:
          config:
            severity: error
What it tests:
- Each group-event-role combination has unique participant ID
- Same group can participate in same event with multiple roles
- No participants are missing IDs
Common failures:
- Role not included in participant ID generation
- Same group-event combination with different roles gets same ID
7. Identity Resolution Tests
nexus_resolved_person_identifiers
Purpose: Validates resolved person identifiers after identity resolution processing.
tests:
  - unique:
      column_name: person_identifier_id
      config:
        severity: error
columns:
  - name: person_identifier_id
    tests:
      - not_null:
          config:
            severity: error
What it tests:
- Resolved identifiers maintain unique IDs
- Identity resolution process doesn't create duplicates
- All identifiers properly linked to resolved persons
nexus_resolved_group_identifiers
Purpose: Validates resolved group identifiers after identity resolution processing.
tests:
  - unique:
      column_name: group_identifier_id
      config:
        severity: error
columns:
  - name: group_identifier_id
    tests:
      - not_null:
          config:
            severity: error
What it tests:
- Resolved identifiers maintain unique IDs
- Identity resolution process doesn't create duplicates
- All identifiers properly linked to resolved groups
nexus_resolved_person_traits
Purpose: Validates resolved person traits after identity resolution processing.
tests:
  - unique:
      column_name: person_trait_id
      config:
        severity: error
columns:
  - name: person_trait_id
    tests:
      - not_null:
          config:
            severity: error
nexus_resolved_group_traits
Purpose: Validates resolved group traits after identity resolution processing.
tests:
  - unique:
      column_name: group_trait_id
      config:
        severity: error
columns:
  - name: group_trait_id
    tests:
      - not_null:
          config:
            severity: error
8. Edge Relationship Tests
nexus_person_identifiers_edges
Purpose: Validates edges connecting person identifiers for identity resolution.
tests:
  - unique:
      column_name:
        "edge_id || '|' || identifier_type_a || '|' || identifier_value_a || '|'
        || identifier_type_b || '|' || identifier_value_b"
      config:
        severity: error
columns:
  - name: edge_id
    tests:
      - not_null:
          config:
            severity: error
What it tests:
- Each edge relationship is unique across all identifier combinations
- No edges are missing IDs
- Bidirectional edges are properly handled
Note: Uses concatenated string syntax for composite key uniqueness testing.
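For comparison, dbt_utils also ships a unique_combination_of_columns test that expresses the same composite key as a column list. The following is a sketch of that alternative using the column names from the test above. It is not how nexus.yml defines the check.

```yaml
# Alternative sketch; nexus.yml uses the concatenated-string form instead.
models:
  - name: nexus_person_identifiers_edges
    tests:
      - dbt_utils.unique_combination_of_columns:
          # Each listed column is part of the composite key.
          combination_of_columns:
            - edge_id
            - identifier_type_a
            - identifier_value_a
            - identifier_type_b
            - identifier_value_b
          config:
            severity: error
```

The two forms can behave differently on nulls: in most warehouses, concatenating a null with || yields null, and the unique test ignores null values, whereas the column-list form groups rows that share a null in the same position.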
nexus_group_identifiers_edges
Purpose: Validates edges connecting group identifiers for identity resolution.
tests:
  - unique:
      column_name:
        "edge_id || '|' || identifier_type_a || '|' || identifier_value_a || '|'
        || identifier_type_b || '|' || identifier_value_b"
      config:
        severity: error
columns:
  - name: edge_id
    tests:
      - not_null:
          config:
            severity: error
What it tests:
- Each edge relationship is unique across all identifier combinations
- No edges are missing IDs
- Bidirectional edges are properly handled
9. State Management Tests
nexus_states
Purpose: Validates entity state tracking and transitions.
tests:
  - unique:
      column_name: state_id
      config:
        severity: error
columns:
  - name: state_id
    tests:
      - not_null:
          config:
            severity: error
What it tests:
- Each state record has a unique ID
- State transitions are properly tracked
- No states are missing IDs
10. Running Tests
Run All Tests
dbt test --select "nexus_*"
Run Specific Model Tests
dbt test --select nexus_person_identifiers
dbt test --select nexus_group_participants
Run Only Uniqueness Tests
dbt test --select "nexus_*,test_name:unique"
Run Tests with Increased Verbosity
dbt test --select "nexus_*" --debug
Run Only Prefix Validation Tests
# Run all expression_is_true tests (prefix validation) on nexus models
dbt test --select "nexus_*,test_name:expression_is_true"
# Check prefix compliance across all models in the project
dbt test --select "test_name:expression_is_true"
11. Test Failure Investigation
When tests fail, use these approaches:
1. Check Test Results
# View compiled test SQL
cat target/compiled/nexus/models/nexus-models/nexus.yml/unique_nexus_person_identifiers_person_identifier_id.sql
2. Run Diagnostic Queries
See Troubleshooting Duplicates for specific diagnostic queries.
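One way to make failing rows easier to inspect is dbt's store_failures config, which saves the rows returned by a failing test to a table in the warehouse. The following is a minimal sketch, assuming you are free to adjust the test config for the model under investigation. The nexus package does not necessarily enable this itself.

```yaml
# Illustrative config; store_failures is a dbt test config, not a nexus setting.
models:
  - name: nexus_person_identifiers
    tests:
      - unique:
          column_name: person_identifier_id
          config:
            severity: error
            # Persist the duplicate IDs so they can be queried after the run.
            store_failures: true
```

By default dbt writes these tables to a schema suffixed with dbt_test__audit. The same behavior can be switched on for a single run with dbt test --store-failures.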
3. Validate Fixes
# Rebuild and test incrementally
dbt run --select source_model
dbt run --select nexus_model
dbt test --select nexus_model
12. Test Configuration
Severity Levels
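- error: A failing test is reported as an error and dbt exits with a non-zero status; under dbt build, models downstream of the failure are skipped (used for uniqueness and not-null tests)
- warn: A failing test logs a warning and execution continues (used for ID prefix validation tests)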
Custom Test Thresholds
tests:
  - unique:
      column_name: person_id
      config:
        severity: error
        error_if: '>= 1' # Fail if any duplicates
        warn_if: '>= 1' # Warn if any issues (checked only when error_if is not met)
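Thresholds become more useful when warn_if and error_if differ, so that a small number of failures is surfaced without failing the run. The following is a sketch with illustrative numbers. The value 10 is an assumption, not a nexus default.

```yaml
# Illustrative thresholds only.
tests:
  - unique:
      column_name: person_id
      config:
        severity: error
        warn_if: ">= 1"    # 1 to 9 duplicated IDs: report a warning
        error_if: ">= 10"  # 10 or more duplicated IDs: report an error
```

With severity set to error, dbt checks error_if first and falls back to warn_if only when the error condition is not met.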
Test Tags
All nexus tests are automatically tagged for easy filtering:
dbt test --select tag:nexus
dbt test --select tag:identity-resolution
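Source tests that you write yourself can be made filterable in the same way by adding tags to the test config. The following is a minimal sketch. The tag name source-quality is hypothetical, and how the nexus package assigns its own tags is not shown here.

```yaml
# Hypothetical tag on a source test; the tag name is an assumption.
models:
  - name: your_source_events
    columns:
      - name: event_id
        tests:
          - not_null:
              config:
                severity: error
                tags: ["source-quality"]
```

Tests tagged this way can then be selected with dbt test --select tag:source-quality.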
For troubleshooting specific test failures, see the Troubleshooting Duplicates Guide.