Environment Management
Effective environment management is crucial for maintaining data quality and enabling collaborative development in dbt projects. Properly separating development, testing, and production environments allows teams to develop and test transformations without disrupting critical business processes.
Why Environment Separation Matters
Imagine working in a single environment where both developers and business users access the same system. When an analytics engineer accidentally introduces a breaking change to an important model, it immediately impacts dashboards and reports that business leaders rely on. This scenario erodes trust in data and can lead to poor decision-making. Separating environments guards against this and brings several benefits:
• Reduced risk: Test changes without affecting production data or business users.
• Improved collaboration: Multiple team members can work simultaneously without conflicts.
• Better quality control: Validate transformations before they reach decision-makers.
• Deployment control: Manage the release process deliberately with proper reviews.
Common Environment Approaches
There are several ways to implement environments in dbt, with no single consensus in the community. The approach you choose depends on factors like team size, administrative capacity, and cost considerations.
1. One Database, Multiple Schemas
This is dbt's recommended approach: using different schemas within one data warehouse to separate environments, with each developer having their own development schema.
This approach provides good isolation while minimizing administrative overhead.
2. One Database per Environment
Another approach is to have separate databases for development and production, with individual schemas for each developer in the development database.
This keeps the production database clean and uncluttered, while providing clear separation between environments.
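As a rough sketch of this layout, the two targets in profiles.yml (covered under Configuring Environments) would point at different databases, with a personal schema only on the development side. The Postgres adapter, profile name, database names, and environment variable names below are all assumptions chosen for illustration.

```yaml
# profiles.yml (illustrative sketch; adapter, names, and values are assumptions)
my_project:
  target: dev
  outputs:
    dev:
      type: postgres
      host: "{{ env_var('DBT_HOST') }}"
      user: "{{ env_var('DBT_USER') }}"
      password: "{{ env_var('DBT_PASSWORD') }}"
      port: 5432
      dbname: analytics_dev    # development database, shared by all developers
      schema: dbt_alice        # one schema per developer inside it
      threads: 4
    prod:
      type: postgres
      host: "{{ env_var('DBT_HOST') }}"
      user: "{{ env_var('DBT_PROD_USER') }}"
      password: "{{ env_var('DBT_PROD_PASSWORD') }}"
      port: 5432
      dbname: analytics_prod   # production database, no developer schemas
      schema: analytics
      threads: 8
```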
3. One Database per Developer
In some cases, teams opt to give each developer a personal copy of the production database. Schemas remain identical across all databases, and developers can clone data or run models as needed.
This provides maximum isolation but can increase administrative burden and costs.
Configuring Environments in dbt
dbt provides several mechanisms for managing environments:
Using profiles.yml
The primary way to configure environments is through your profiles.yml file, where each entry under outputs describes one target environment.
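A minimal sketch of such a file for the one-database, multiple-schemas approach might look like the following. The Postgres adapter, the profile name my_project, the schema names, and the environment variable names are assumptions made for illustration; substitute the connection fields your own adapter requires.

```yaml
# profiles.yml (illustrative sketch; adapter, names, and values are assumptions)
my_project:                  # must match the `profile:` key in dbt_project.yml
  target: dev                # default target when no --target flag is passed
  outputs:
    dev:
      type: postgres
      host: "{{ env_var('DBT_HOST') }}"
      user: "{{ env_var('DBT_USER') }}"
      password: "{{ env_var('DBT_PASSWORD') }}"
      port: 5432
      dbname: analytics      # same database as production
      schema: dbt_alice      # personal development schema
      threads: 4
    prod:
      type: postgres
      host: "{{ env_var('DBT_HOST') }}"
      user: "{{ env_var('DBT_PROD_USER') }}"
      password: "{{ env_var('DBT_PROD_PASSWORD') }}"
      port: 5432
      dbname: analytics
      schema: analytics      # production schema
      threads: 8
```

Reading credentials through env_var keeps secrets out of the file itself, which matters if the profile is ever shared or committed.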
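Switch between environments by specifying the target with the --target flag, for example dbt run --target prod, where prod is a target defined in profiles.yml.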
Using Target Schemas
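One powerful feature for environment management is dbt's custom schema behavior: by default, dbt prefixes a model's custom schema with the target schema, so a custom schema named marketing is built as dbt_alice_marketing when the target schema is dbt_alice.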
This automatically creates separate schemas for each environment while maintaining consistent naming.
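As an illustration, custom schemas can be assigned per folder in dbt_project.yml. The project name and folder names below are hypothetical, and the resulting schema names in the comments assume dbt's default schema naming, which builds the final name as the target schema followed by an underscore and the custom schema.

```yaml
# dbt_project.yml (excerpt; project and folder names are hypothetical)
models:
  my_project:
    marketing:
      +schema: marketing     # applies to models under models/marketing/
    finance:
      +schema: finance       # applies to models under models/finance/

# With target schema dbt_alice: dbt_alice_marketing, dbt_alice_finance
# With target schema analytics: analytics_marketing, analytics_finance
```

Teams that want different naming in production (for instance, the bare custom schema name) typically override the generate_schema_name macro; the default shown here requires no extra code.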
Environment-Specific Variables
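Define variables whose values change based on the environment, for example as defaults under vars in dbt_project.yml that individual runs override with the --vars flag.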
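One possible arrangement, sketched below, is to keep development-friendly defaults in dbt_project.yml and override them for production runs. The variable names lookback_days and use_sample_data are hypothetical.

```yaml
# dbt_project.yml (excerpt; variable names are hypothetical)
vars:
  # Development-friendly defaults that keep runs small and cheap.
  lookback_days: 30
  use_sample_data: true
```

A production job could then override these defaults at invocation time, for example with dbt run --target prod --vars '{"lookback_days": 3650, "use_sample_data": false}'.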
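Access these in your models with dbt's var() function, for example {{ var('lookback_days') }}, where lookback_days stands in for whatever variable name you defined.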
Key Considerations for Environment Management
When designing your environment strategy, consider these important factors:
Administrative Burden
Managing multiple environments becomes complex with larger teams.
• Use custom schemas with consistent naming
• Create self-service options for refreshing development data
• Automate environment provisioning and cleanup
Team Size & Collaboration
The number of developers affects your environment strategy.
• Small teams (1-2 developers): Consider a shared development environment
• Larger teams: Implement discrete spaces to avoid conflicts
• Enable read access to other development environments for collaboration
Cost Management
Multiple copies of production data increase compute and storage costs.
• Use data filtering in development (e.g., WHERE date >= CURRENT_DATE - 30)
• Utilize database cloning features when available
• Evaluate if complete production copies are necessary
• Limit development compute resources
Environment Management in Paradime
Paradime enhances environment management with several integrated features:
Project Environments
Paradime allows you to create and manage environments through its interface, with environment-specific settings, variables, access permissions, and logging.
Deployment Management
The platform streamlines deployment between environments through deployment packages, scheduled promotions, change tracking, and rollback capabilities.
Best Practices
Regardless of your chosen approach, follow these environment management best practices:
Development
• Use consistent naming conventions
• Establish processes for regular environment cleanup
• Enable detailed logging for troubleshooting
• Set query timeouts to catch inefficient models early
Testing/Staging
• Maintain a near-exact replica of production
• Implement automated testing through CI/CD pipelines
• Regularly validate with production-like data volumes
• Perform performance testing before deployment
Production
• Implement change approval workflows
• Use appropriately sized compute resources
• Set up monitoring and alerting for performance and errors
• Maintain thorough documentation of the environment