Environment Management

Effective environment management is crucial for maintaining data quality and enabling collaborative development in dbt projects. Properly separating development, testing, and production environments allows teams to develop and test transformations without disrupting critical business processes.

Why Environment Separation Matters

Imagine working in a single environment where both developers and business users access the same system. When an analytics engineer accidentally introduces a breaking change to an important model, it immediately impacts dashboards and reports that business leaders rely on. This scenario erodes trust in data and can lead to poor decision-making.

Separating environments provides these benefits:

• Reduced risk: Test changes without affecting production data or business users
• Improved collaboration: Multiple team members can work simultaneously without conflicts
• Better quality control: Validate transformations before they reach decision-makers
• Deployment control: Manage the release process deliberately with proper reviews

Common Environment Approaches

There are several ways to implement environments in dbt, with no single consensus in the community. The approach you choose depends on factors like team size, administrative capacity, and cost considerations.

1. One Database, Multiple Schemas

This is dbt's recommended approach: using different schemas within one data warehouse to separate environments, with each developer having their own development schema.

# Example profiles.yml
my_project:
  target: dev
  outputs:
    dev:
      type: snowflake
      schema: dbt_jsmith  # Schema named after user
      # Other connection details
    prod:
      type: snowflake
      schema: analytics
      # Other connection details

This approach provides good isolation while minimizing administrative overhead.

2. One Database per Environment

Another approach is to have separate databases for development and production, with individual schemas for each developer in the development database.

This keeps the production database clean and uncluttered, while providing clear separation between environments.
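As a rough illustration of this layout, the statements below set up one production database and one shared development database with a schema per developer. This is a minimal sketch assuming a Snowflake warehouse (as in the profiles.yml example); the database names analytics_prod and analytics_dev and the developer schemas are hypothetical.

```sql
-- Hypothetical names: adjust to your own naming convention.
CREATE DATABASE IF NOT EXISTS analytics_prod;
CREATE DATABASE IF NOT EXISTS analytics_dev;

-- One schema per developer inside the shared development database.
CREATE SCHEMA IF NOT EXISTS analytics_dev.dbt_jsmith;
CREATE SCHEMA IF NOT EXISTS analytics_dev.dbt_adoe;
```

dbt normally creates missing schemas itself when the connecting role has permission, so the schema statements mainly matter if you restrict developer privileges.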

3. One Database per Developer

In some cases, teams opt to give each developer a personal copy of the production database. Schemas remain identical across all databases, and developers can clone data or run models as needed.

This provides maximum isolation but can increase administrative burden and costs.
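On warehouses that support cloning, a personal copy does not have to mean a full physical copy. The sketch below assumes Snowflake, whose zero-copy clones share storage with the source until the data diverges; the database names dev_jsmith and analytics_prod are hypothetical.

```sql
-- Snowflake zero-copy clone: creates (or refreshes) a developer's personal database.
-- Both database names are hypothetical placeholders.
CREATE OR REPLACE DATABASE dev_jsmith CLONE analytics_prod;
```

Re-running the statement replaces the developer database with a fresh clone, which discards any objects the developer created there.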

Configuring Environments in dbt

dbt provides several mechanisms for managing environments:

Using profiles.yml

The primary way to configure environments is through your profiles.yml file:

my_project:
  target: dev  # Default target
  outputs:
    dev:
      type: snowflake
      schema: dbt_jsmith
      # Other connection details
    test:
      type: snowflake
      schema: dbt_staging
      # Other connection details
    prod:
      type: snowflake
      schema: analytics
      # Other connection details

Switch between environments by specifying the target:

dbt run --target prod

Using Custom Schemas

One powerful feature for environment management is dbt's custom schema functionality. By default, dbt combines the target's schema with the custom schema you configure, in the form <target_schema>_<custom_schema>:

# In dbt_project.yml
models:
  my_project:
    +schema: marketing  # Builds into dbt_jsmith_marketing (dev), analytics_marketing (prod), etc.

Because the target schema differs per environment, this keeps each environment's models in separate schemas while maintaining consistent naming.
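Schema naming is controlled by dbt's generate_schema_name macro, whose default builds <target_schema>_<custom_schema>. If you would rather have clean schema names in production, you can override it in your project. The sketch below is one common variant, not the only option: it assumes the production target is named prod (as in the profiles.yml example), and the file name is hypothetical.

```sql
{# macros/generate_schema_name.sql (hypothetical file name) #}
{% macro generate_schema_name(custom_schema_name, node) -%}
    {%- set default_schema = target.schema -%}
    {%- if target.name == 'prod' and custom_schema_name is not none -%}
        {#- Production: use the custom schema as-is, e.g. "marketing" -#}
        {{ custom_schema_name | trim }}
    {%- else -%}
        {#- Other targets: keep everything in the target schema, e.g. "dbt_jsmith" -#}
        {{ default_schema }}
    {%- endif -%}
{%- endmacro %}
```

With this override, a model configured with +schema: marketing builds into marketing on the prod target and into the developer's own schema on every other target.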

Environment-Specific Variables

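dbt does not scope vars by target, so nesting values under dev and prod keys does not produce a row_limit variable. Instead, define a default value and branch on target.name in your models:

# In dbt_project.yml
vars:
  row_limit: 1000  # Row cap applied on non-production targets

Access the variable in your models:

SELECT *
FROM {{ ref('large_table') }}
{% if target.name != 'prod' %}
LIMIT {{ var('row_limit') }}
{% endif %}

To change the value for a single run, override it on the command line:

dbt run --vars '{"row_limit": 100}'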

Key Considerations for Environment Management

When designing your environment strategy, consider these important factors:

Administrative Burden
Details: Managing multiple environments becomes complex with larger teams
Recommendations:
• Use custom schemas with consistent naming
• Create self-service options for refreshing development data
• Automate environment provisioning and cleanup

Team Size & Collaboration
Details: The number of developers affects your environment strategy
Recommendations:
• Small teams (1-2 developers): Consider a shared development environment
• Larger teams: Implement discrete spaces to avoid conflicts
• Enable read access to other development environments for collaboration

Cost Management
Details: Multiple copies of production data increase compute and storage costs
Recommendations:
• Use data filtering in development (e.g., WHERE date >= CURRENT_DATE - 30)
• Utilize database cloning features when available
• Evaluate whether complete production copies are necessary
• Limit development compute resources
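The data-filtering recommendation can be applied directly in a model by branching on the active target. This is a minimal sketch: it assumes the development target is named dev, reuses the large_table model from the earlier example, and uses a hypothetical event_date column.

```sql
-- event_date is a hypothetical column; substitute your own date column.
SELECT *
FROM {{ ref('large_table') }}
{% if target.name == 'dev' %}
-- Development only: keep roughly the last 30 days to reduce scan costs.
WHERE event_date >= CURRENT_DATE - 30
{% endif %}
```

Production runs compile without the WHERE clause, so the full history is still built there.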

Environment Management in Paradime

Paradime enhances environment management with several integrated features:

Project Environments

Paradime allows you to create and manage environments through its interface, with environment-specific settings, variables, access permissions, and logging.

Deployment Management

The platform streamlines deployment between environments through deployment packages, scheduled promotions, change tracking, and rollback capabilities.

In Paradime, environments are designed to work with your chosen approach, whether you use schema-based separation or distinct databases. See Connection documentation for details.

Best Practices

Regardless of your chosen approach, follow these environment management best practices:

Development
• Use consistent naming conventions
• Establish processes for regular environment cleanup
• Enable detailed logging for troubleshooting
• Set query timeouts to catch inefficient models early

Testing/Staging
• Maintain a near-exact replica of production
• Implement automated testing through CI/CD pipelines
• Regularly validate with production-like data volumes
• Perform performance testing before deployment

Production
• Implement change approval workflows
• Use appropriately sized compute resources
• Set up monitoring and alerting for performance and errors
• Maintain thorough documentation of the environment
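As one way to apply the query-timeout practice for development, a warehouse-level statement timeout cancels runaway queries automatically. The sketch below assumes Snowflake and a dedicated development warehouse; the name dev_wh and the 600-second limit are hypothetical example values.

```sql
-- Cancel any statement on the development warehouse that runs longer than 10 minutes.
-- dev_wh and the 600-second limit are illustrative values, not recommendations.
ALTER WAREHOUSE dev_wh SET STATEMENT_TIMEOUT_IN_SECONDS = 600;
```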
