Segment Template Source
Ready-to-use Segment integration for events, entity identifiers, entity traits, and attribution touchpoints with v0.3.0 entity-centric architecture
The Segment template source provides a complete integration for Segment analytics data, enabling event tracking, entity identification, and attribution analysis within the dbt-nexus v0.3.0 entity-centric framework.
Overview
This template source processes Segment data from three main event types:
- Tracks: User actions and custom events
- Pages: Page views and navigation events
- Identifies: User identification and trait updates
Features
- ✅ Event Processing: Unified event tracking across all Segment event types
- ✅ Entity Identification: Multi-identifier entity resolution (person entities)
- ✅ Entity Traits: User attribute and trait management
- ✅ Attribution Analysis: UTM parameter and click ID tracking
- ✅ Touchpoint Modeling: Attribution touchpoint identification
- ✅ v0.3.0 Compatible: Entity-centric architecture with unified entity models
Configuration
Basic Configuration
Enable the Segment template source in your dbt_project.yml:
vars:
  nexus:
    sources:
      segment:
        enabled: true
        events: true
        entities: ['person']
        attribution: true # If using touchpoints
    segment: # Keep for backward compatibility with unpivot macros
      identifiers: ['email', 'user_id'] # Optional: specify custom identifiers
      traits: ['name', 'company'] # Optional: specify custom traits
Multiple Segment Sources
When you have multiple Segment sources (e.g., different schemas or databases),
you need to create a segment_sources.yml file in your project to define all
sources:
1. Create models/sources/segment/segment_sources.yml:
version: 2
sources:
  - name: WORDPRESS_SITE
    description: 'WordPress site Segment data'
    database: DEV_RAW
    schema: WORDPRESS_SITE
    tables:
      - name: TRACKS
        description: 'Segment track events'
      - name: PAGES
        description: 'Segment page events'
      - name: IDENTIFIES
        description: 'Segment identify events'
      - name: APPOINTMENT_FORM_SUBMITTED
        description: 'Appointment form submission events'
  - name: SERVER_AWS_LAMBDA_TRACKING
    description: 'Server AWS Lambda tracking Segment data'
    database: DEV_RAW
    schema: SERVER_AWS_LAMBDA_TRACKING
    tables:
      - name: TRACKS
        description: 'Segment track events'
      - name: PAGES
        description: 'Segment page events'
      - name: IDENTIFIES
        description: 'Segment identify events'
      - name: NEW_LEAD_WEBSITE
        description: 'New lead website events'
2. Configure segment_sources in your dbt_project.yml:
vars:
  # Segment sources configuration for union_segment_sources macro
  segment_sources:
    - name: WORDPRESS_SITE
      tracks:
        - name: APPOINTMENT_FORM_SUBMITTED
          conversion: true
    - name: SERVER_AWS_LAMBDA_TRACKING
      tracks:
        - name: NEW_LEAD_WEBSITE
          conversion: true
Why This Setup?
- The segment_sources.yml file defines the actual database sources that dbt can reference.
- The segment_sources variable tells the union_segment_sources macro which sources to union and which specific track tables to include.
- This allows you to have different table structures across different Segment sources while still unioning them together.
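Because the variable and the YAML file must agree on names, a quick consistency check can catch mismatches before dbt compiles. The following is a minimal sketch in Python, not part of the nexus package: it assumes both files have already been parsed into plain dictionaries, and the helper name find_undefined_sources is hypothetical. It reports every source or track table that the segment_sources variable mentions but segment_sources.yml does not define.

```python
def find_undefined_sources(sources_yml, segment_sources_var):
    """Return (source, table) pairs referenced by the variable but not defined.

    Hypothetical helper for illustration; the nexus package does not ship it.
    """
    # Map each defined source name to the set of table names it declares.
    defined = {
        src["name"]: {tbl["name"] for tbl in src.get("tables", [])}
        for src in sources_yml.get("sources", [])
    }
    missing = []
    for entry in segment_sources_var:
        name = entry["name"]
        if name not in defined:
            missing.append((name, None))  # the whole source is undefined
            continue
        for track in entry.get("tracks", []):
            if track["name"] not in defined[name]:
                missing.append((name, track["name"]))
    return missing


# Trimmed-down stand-ins for the two example files; the second source
# is deliberately left out of the YAML to show a mismatch.
sources_yml = {
    "version": 2,
    "sources": [
        {
            "name": "WORDPRESS_SITE",
            "tables": [{"name": "TRACKS"}, {"name": "APPOINTMENT_FORM_SUBMITTED"}],
        },
    ],
}
segment_sources_var = [
    {"name": "WORDPRESS_SITE",
     "tracks": [{"name": "APPOINTMENT_FORM_SUBMITTED", "conversion": True}]},
    {"name": "SERVER_AWS_LAMBDA_TRACKING",
     "tracks": [{"name": "NEW_LEAD_WEBSITE", "conversion": True}]},
]

print(find_undefined_sources(sources_yml, segment_sources_var))
# prints [('SERVER_AWS_LAMBDA_TRACKING', None)]
```

An empty result means every name in the variable has a matching definition in the YAML file.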
Advanced Configuration
vars:
  nexus:
    segment:
      enabled: true
      location:
        database: DEV_RAW
        schema: WORDPRESS_SITE
        tables:
          tracks: TRACKS
          pages: PAGES
          identifies: IDENTIFIES
      capabilities:
        events: true
        persons: true
        groups: false
        memberships: false
        attribution: true
      # Configure which identifiers and traits to extract from identify events
      identifiers:
        - segment_anonymous_id
        - user_id
        - email
      traits:
        - segment_anonymous_id
        - user_id
        - email
        - first_name
Note: Unlike other template sources, Segment requires explicit configuration
of both database and schema as there are no universal defaults for Segment
implementations.
Configurable Traits: The identifiers and traits arrays allow you to
specify which fields to extract from Segment identify events, making the
template source adaptable to different Segment implementations.
Customizing Identifiers and Traits
If your Segment implementation doesn't include certain fields (like
first_name), you can customize the configuration:
vars:
  nexus:
    segment:
      enabled: true
      # ... other configuration ...
      identifiers:
        - segment_anonymous_id
        - user_id
        - email
        # Add custom identifiers as needed
      traits:
        - segment_anonymous_id
        - user_id
        - email
        # Remove first_name if not available
        # Add custom traits as needed
Available Fields: The template source supports these standard Segment identify fields:
- segment_anonymous_id (from anonymous_id)
- user_id
- email
- first_name
- Custom traits as configured in your Segment implementation
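To make the effect of the identifiers list concrete, here is a rough Python model of unpivoting one identify row into one record per configured identifier. It is only a sketch: the package's unpivot macros are written in SQL and Jinja and may behave differently, the function name unpivot_identifiers is hypothetical, the sample values are made up, and skipping null values is an assumption.

```python
def unpivot_identifiers(row, identifiers):
    """Turn one wide identify row into one record per configured identifier."""
    records = []
    for identifier_type in identifiers:
        value = row.get(identifier_type)
        if value is None:
            continue  # assumption: null identifiers produce no record
        records.append({
            "event_id": row["event_id"],
            "entity_type": "person",  # Segment entities are always persons
            "identifier_type": identifier_type,
            "identifier_value": value,
            "occurred_at": row["occurred_at"],
        })
    return records


# Illustrative identify row; anonymous_id is already renamed to
# segment_anonymous_id, and email is missing on purpose.
identify_row = {
    "event_id": "evt_1",
    "occurred_at": "2024-01-01T00:00:00Z",
    "segment_anonymous_id": "anon_1",
    "user_id": "user_123",
    "email": None,
}

records = unpivot_identifiers(
    identify_row, ["segment_anonymous_id", "user_id", "email"]
)
for record in records:
    print(record["identifier_type"], record["identifier_value"])
```

Removing a field from the configured list simply means no records of that identifier type are produced, which is why dropping first_name is safe when your implementation does not send it.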
Data Sources
The template source dynamically references Segment data based on your configuration. For the example configuration above, it expects:
| Table | Description | Required |
|---|---|---|
| DEV_RAW.WORDPRESS_SITE.TRACKS | Track events from Segment | Yes |
| DEV_RAW.WORDPRESS_SITE.PAGES | Page view events from Segment | Yes |
| DEV_RAW.WORDPRESS_SITE.IDENTIFIES | Identify events from Segment | Yes |
Configuration Flexibility: The actual table names are determined by your
location configuration, making the template source adaptable to any Segment
implementation structure.
File Structure for Multiple Sources
When using multiple Segment sources, your project structure should look like:
models/
└── sources/
    └── segment/
        └── segment_sources.yml # Your project-specific sources
dbt_packages/
└── nexus/
    └── models/
        └── sources/
            └── segment/
                └── segment.yml # Nexus package sources (configurable)
Important: The nexus package's segment.yml remains configurable and should
not be modified. Your project-specific sources go in
models/sources/segment/segment_sources.yml.
Models
Core Models
segment_events
Unified events table containing all Segment event types (tracks, pages, identifies).
Key Fields:
- event_id: Unique event identifier
- occurred_at: Event timestamp
- event_type: Type of event (web, identity)
- event_name: Specific event name
- source: Source system (segment)
segment_entity_identifiers
Entity identifiers from all Segment event types (person entities only).
Key Fields:
- entity_identifier_id: Unique identifier record ID
- entity_type: Entity type (always 'person' for Segment)
- event_id: Reference to source event
- identifier_type: Type of identifier (segment_anonymous_id, user_id, email)
- identifier_value: Identifier value
- occurred_at: Timestamp when captured
segment_entity_traits
Entity traits and attributes from Segment events (person entities only).
Key Fields:
- entity_trait_id: Unique trait record ID
- entity_type: Entity type (always 'person' for Segment)
- event_id: Reference to source event
- trait_name: Name of the trait
- trait_value: Trait value
- occurred_at: Timestamp when captured
Attribution Models
segment_touchpoints
Attribution touchpoints with UTM parameters and click IDs.
Key Fields:
- touchpoint_id: Unique touchpoint identifier
- source: UTM source or campaign source
- medium: UTM medium or campaign medium
- campaign: UTM campaign or campaign name
- channel: Classified channel (paid, social, organic, referral, direct)
- touchpoint_type: Type of touchpoint (campaign, facebook_click, referral, direct)
- fbclid: Facebook click ID
- gclid: Google click ID
Database Compatibility
The Segment template source is fully compatible with Snowflake and supports
the three-part naming convention (database.schema.table). It uses the
nexus_source macro for dynamic source resolution, making it adaptable to
different database structures.
Snowflake Configuration
vars:
  nexus:
    segment:
      enabled: true
      location:
        database: YOUR_DATABASE # Required for Snowflake
        schema: YOUR_SCHEMA # Required - no default
        tables:
          tracks: YOUR_TRACKS_TABLE
          pages: YOUR_PAGES_TABLE
          identifies: YOUR_IDENTIFIES_TABLE
Attribution Configuration
The template source supports attribution analysis through UTM parameters and click IDs:
UTM Parameters
- utm_source: Traffic source
- utm_medium: Marketing medium
- utm_campaign: Campaign name
- utm_content: Content identifier
- utm_term: Keyword term
Click IDs
- fbclid: Facebook click identifier
- gclid: Google Ads click identifier
Channel Classification
Events are automatically classified into channels:
- Paid: UTM parameters present
- Social: Facebook referrer or fbclid present
- Organic: Google referrer present
- Referral: External referrer (excluding internal domains)
- Direct: No attribution information
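The rules above can be read as an ordered decision list. The sketch below expresses them in Python under several assumptions that the documentation does not confirm: rules are evaluated in the order listed, any one of utm_source, utm_medium, or utm_campaign counts as "UTM parameters present", referrers are matched by simple substring, and internal referrers fall through to direct. The function name classify_channel is hypothetical; the package implements its classification in SQL.

```python
def classify_channel(utm_source=None, utm_medium=None, utm_campaign=None,
                     fbclid=None, referrer=None, internal_domains=()):
    """Classify one event into a channel, checking rules in listed order."""
    referrer = (referrer or "").lower()
    if utm_source or utm_medium or utm_campaign:
        return "paid"      # UTM parameters present
    if fbclid or "facebook." in referrer:
        return "social"    # Facebook referrer or fbclid present
    if "google." in referrer:
        return "organic"   # Google referrer present
    if referrer and not any(domain in referrer for domain in internal_domains):
        return "referral"  # external referrer, internal domains excluded
    return "direct"        # no attribution information


print(classify_channel(utm_source="newsletter"))
print(classify_channel(referrer="https://www.google.com/"))
print(classify_channel(referrer="https://blog.yourcompany.com/",
                       internal_domains=("yourcompany.com",)))
```

Because the first matching rule wins in this sketch, an event that carries both UTM parameters and an fbclid is classified as paid rather than social; check the package's SQL if that precedence matters for your reporting.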
Referral Exclusions
The template source automatically excludes internal domains from referral
classification. Configure your internal domains in your project's
dbt_project.yml:
vars:
  # Global configuration for all template sources
  internal_domains:
    - 'yourcompany.com'
    - 'subsidiary.com'
  # Attribution-specific exclusions (required for segment_touchpoints)
  referral_exclusions:
    - '%yourcompany.com%'
    - '%subsidiary.com%'
Important: The referral_exclusions variable is required for the
segment_touchpoints model to work properly. Without this configuration, you'll
get compilation errors like "NoneType object is not iterable" because the model
uses Jinja templating to iterate over these exclusions.
The exclusions use SQL LIKE operators with % wildcards to match any URL
containing your domain (including subdomains). For example, %yourcompany.com%
will exclude:
- https://www.yourcompany.com
- https://blog.yourcompany.com/page
- https://yourcompany.com/landing-page
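To preview which referrers a pattern will exclude before running the model, the LIKE semantics can be emulated outside the warehouse. This is a minimal Python sketch, not code from the package: the helper names like_to_regex and is_excluded are hypothetical, and it covers only the % and _ wildcards without escape characters.

```python
import re


def like_to_regex(pattern):
    """Translate a SQL LIKE pattern (% and _ wildcards) into a compiled regex."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")   # % matches any run of characters
        elif ch == "_":
            parts.append(".")    # _ matches exactly one character
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)


def is_excluded(referrer, referral_exclusions):
    """True when the referrer matches at least one exclusion pattern."""
    return any(like_to_regex(p).fullmatch(referrer) for p in referral_exclusions)


exclusions = ["%yourcompany.com%", "%subsidiary.com%"]
for url in ("https://www.yourcompany.com",
            "https://blog.yourcompany.com/page",
            "https://example.org/"):
    print(url, is_excluded(url, exclusions))
```

Note that a pattern wrapped in % on both sides matches any URL that merely contains the string, so %yourcompany.com% would also exclude a referrer such as https://notyourcompany.com. Use a more specific pattern if that is a concern.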
Usage Examples
Enable Segment Integration
# dbt_project.yml
vars:
  nexus:
    sources:
      segment:
        enabled: true
Query Attribution Data
-- Get attribution touchpoints
select
touchpoint_id,
source,
medium,
campaign,
channel,
touchpoint_type,
occurred_at
from {{ ref('segment_touchpoints') }}
where occurred_at >= current_date - 30
order by occurred_at desc
Analyze Person Journey
-- Get person events with identifiers
select
e.event_name,
e.occurred_at,
pi.identifier_type,
pi.identifier_value
from {{ ref('segment_events') }} e
join {{ ref('segment_entity_identifiers') }} pi
on e.event_id = pi.event_id
where pi.identifier_type = 'user_id'
and pi.identifier_value = 'user_123'
and pi.entity_type = 'person'
order by e.occurred_at desc
Testing
The template source includes comprehensive tests:
- Uniqueness: Event IDs, entity identifier IDs, entity trait IDs
- Not Null: Required fields validation
- Accepted Values: Event types, identifier types, trait names
- Expression Tests: ID format validation
Run tests with:
dbt test --select package:nexus segment
Troubleshooting
Common Issues
Models Not Building
- Ensure nexus.sources.segment.enabled: true is set in your project configuration
- Verify Segment source tables exist and are accessible
- Check that both database and schema are configured (no defaults for Segment)
Compilation Errors
- "NoneType object is not iterable": Ensure the referral_exclusions variable is configured in your project's dbt_project.yml
- "Source not found": Verify that table name casing matches your configuration (configuration keys are lowercase in YAML; table names are typically uppercase in the database)
- "Schema does not exist": Check that the schema name in your configuration matches your actual database schema
- "Source named 'X.Y' which was not found": For multiple Segment sources, ensure you have created models/sources/segment/segment_sources.yml with all your Segment sources defined
- "union_segment_sources macro error": Verify that the segment_sources variable in dbt_project.yml matches the source names in your segment_sources.yml file
Missing Attribution Data
- Check that UTM parameters are being sent in Segment events
- Verify referrer exclusions are configured correctly
- Ensure the referral_exclusions variable is defined to prevent compilation errors
Person Resolution Issues
- Ensure person identifiers are being captured in Segment
- Check that anonymous_id and user_id are being sent consistently
Snowflake-Specific Issues
- Verify three-part naming convention: database.schema.table
- Check that the database parameter is set in your configuration
- Ensure table names match your actual Snowflake table names (case-sensitive)
Debug Queries
-- Check event data availability
select
  event_type,
  count(*) as event_count,
  min(occurred_at) as earliest_event,
  max(occurred_at) as latest_event
from {{ ref('segment_events') }}
group by event_type

-- Check attribution data
select
  channel,
  touchpoint_type,
  count(*) as touchpoint_count
from {{ ref('segment_touchpoints') }}
group by channel, touchpoint_type
v0.3.0 Entity-Centric Migration
The Segment template source has been updated for dbt-nexus v0.3.0 with entity-centric architecture:
Key Changes
- Model Names: segment_person_identifiers → segment_entity_identifiers
- Model Names: segment_person_traits → segment_entity_traits
- Field Names: person_identifier_id → entity_identifier_id
- Field Names: person_trait_id → entity_trait_id
- New Field: entity_type (always 'person' for Segment)
- Configuration: nexus.segment.enabled → nexus.sources.segment.enabled
Migration Steps
- Update Configuration: Change to the new nexus.sources.segment structure
- Update References: Update any custom models referencing old model names
- Add Entity Type Filtering: Add entity_type = 'person' to queries if needed
- Test Migration: Run dbt run --select segment_entity_identifiers segment_entity_traits
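Updating references by hand is error-prone when many custom models are involved. As one possible aid, the sketch below rewrites the renamed model and field names in a SQL string and reports which ones it found. It is a hypothetical helper written in Python, not a tool shipped with dbt-nexus; the rename map is taken from the Key Changes list, and the whole-word matching is a design choice to avoid touching longer identifiers that merely contain an old name.

```python
import re

# Old name -> new name, as listed under Key Changes.
RENAMES = {
    "segment_person_identifiers": "segment_entity_identifiers",
    "segment_person_traits": "segment_entity_traits",
    "person_identifier_id": "entity_identifier_id",
    "person_trait_id": "entity_trait_id",
}

# Match old names only as whole identifiers (underscore counts as a word char).
_PATTERN = re.compile(r"\b(" + "|".join(map(re.escape, RENAMES)) + r")\b")


def migrate_sql(sql):
    """Return (rewritten_sql, old_names_found) for one model's SQL text."""
    found = []

    def _swap(match):
        found.append(match.group(1))
        return RENAMES[match.group(1)]

    return _PATTERN.sub(_swap, sql), found


old_sql = "select person_identifier_id from {{ ref('segment_person_identifiers') }}"
new_sql, found = migrate_sql(old_sql)
print(new_sql)
print(found)
```

Review the rewritten SQL before committing it; the helper does not add the entity_type = 'person' filter mentioned in the steps above.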
Backward Compatibility
The nexus.segment configuration namespace is preserved for the unpivot macros,
so existing identifier and trait configurations continue to work.
Architecture Notes
Unlike other template sources (Gmail, Google Calendar), Segment uses a simplified architecture:
- No intermediate/union layers: Direct entity models without four-layer structure
- Person-only entities: entity_type='person' is hardcoded (no groups)
- No relationships: Segment doesn't track entity-to-entity relationships
- Simplified structure: Maintains Segment's flat, straightforward approach
Migration from Legacy Sources
If migrating from a legacy Segment source implementation:
- Backup Current Implementation: Save existing models and tests
- Enable Template Source: Set nexus.sources.segment.enabled: true
- Test Migration: Run dbt run --select package:nexus segment
- Update References: Update any custom models referencing old source models
- Remove Legacy Files: Delete old Segment source files
Technical Implementation
Dynamic Source Resolution
The Segment template source uses the nexus_source macro for dynamic source
resolution:
-- Base models use the nexus_source macro
select * from {{ nexus_source('segment', 'tracks') }}
select * from {{ nexus_source('segment', 'pages') }}
select * from {{ nexus_source('segment', 'identifies') }}
This macro automatically resolves to the correct database.schema.table structure based on your configuration.
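The resolution step can be pictured as a lookup in the nexus variables followed by string assembly. The Python sketch below illustrates that idea only; it is not the nexus_source macro, whose implementation is not shown here. The function name resolve_relation is hypothetical, and falling back to the upper-cased key when a table entry is missing is an assumption.

```python
def resolve_relation(nexus_vars, source, table_key):
    """Build database.schema.table for a source from the nexus variables."""
    location = nexus_vars.get(source, {}).get("location", {})
    database = location.get("database")
    schema = location.get("schema")
    if not database or not schema:
        # Segment has no universal defaults, so fail loudly.
        raise ValueError(
            f"{source}: both location.database and location.schema must be set"
        )
    # Assumption: a missing table entry falls back to the upper-cased key.
    table = location.get("tables", {}).get(table_key, table_key.upper())
    return f"{database}.{schema}.{table}"


nexus_vars = {
    "segment": {
        "location": {
            "database": "DEV_RAW",
            "schema": "WORDPRESS_SITE",
            "tables": {"tracks": "TRACKS", "pages": "PAGES",
                       "identifies": "IDENTIFIES"},
        }
    }
}
print(resolve_relation(nexus_vars, "segment", "tracks"))
# prints DEV_RAW.WORDPRESS_SITE.TRACKS
```

Raising an error when database or schema is missing mirrors the requirement that both be configured explicitly for Segment.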
Jinja Templating
The source definitions use Jinja templating for complete configurability:
sources:
  - name: "{{ var('nexus', {}).get('segment', {}).get('location', {}).get('schema') }}"
    database: "{{ var('nexus', {}).get('segment', {}).get('location', {}).get('database', '') }}"
    tables:
      - name: "{{ var('nexus', {}).get('segment', {}).get('location', {}).get('tables', {}).get('tracks', 'TRACKS') }}"
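The chained .get calls in these expressions behave like Python dictionary lookups with defaults, so their fallbacks can be checked in plain Python. The sketch below mirrors the three expressions shown; the function name source_definition and the returned key names are hypothetical and exist only for illustration.

```python
def source_definition(nexus_var):
    """Mirror the chained .get lookups used in the source definition."""
    location = nexus_var.get("segment", {}).get("location", {})
    return {
        # No default: the source name is None when schema is not configured.
        "name": location.get("schema"),
        # Default is an empty string when database is not configured.
        "database": location.get("database", ""),
        # Default is 'TRACKS' when no tracks table name is configured.
        "tracks_table": location.get("tables", {}).get("tracks", "TRACKS"),
    }


print(source_definition({}))
print(source_definition({
    "segment": {"location": {"database": "DEV_RAW", "schema": "WORDPRESS_SITE"}}
}))
```

With an empty configuration the source name resolves to nothing at all, which is one reason the schema has to be set explicitly.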
Error Prevention
The template source includes several error prevention measures:
- Referral Exclusions: Prevents NoneType iteration errors by requiring referral_exclusions configuration
- No Hardcoded Defaults: Forces explicit configuration to prevent assumptions about Segment implementations
- Case Sensitivity: Handles table name casing correctly for different database systems
Best Practices
- Consistent Naming: Use consistent event names in Segment
- UTM Parameters: Always include UTM parameters for campaign tracking
- Person Identification: Send both anonymous_id and user_id consistently
- Data Quality: Monitor for missing or invalid timestamps
- Attribution: Configure referrer exclusions for internal domains
- Configuration: Always specify both database and schema (no defaults for Segment)
- Testing: Test compilation before running models to catch configuration issues early
Support
For issues or questions:
- Check the troubleshooting guide
- Review existing implementations in other client projects
- Consult the Segment documentation
Ready to get started? Enable the Segment template source in your project
configuration and run dbt run --select package:nexus segment to begin
processing your Segment data.