Security model


Introduction

In this section we describe the data security model of the Paradime application and its infrastructure components. It is intended both for practitioners seeking to understand the architecture of the Paradime application and for administrators and security professionals.

Application components

The diagram below shows the critical components of the application; we explain the security considerations for each component in more detail in the sections that follow.

[Diagram: Critical application components]

Networks

Traffic from the internet first reaches AWS edge locations and is then routed through the AWS network using Global Accelerator. This improves speed and reduces application load times for our users all over the world. Once inside the AWS Cloud, traffic passes through a Web Application Firewall (WAF) that filters out malicious traffic, DDoS attacks and bots trying to attack the Paradime application. The filtered traffic then reaches our load balancers, which route it to the Kubernetes ingress controller in our EKS cluster. This four-stage network control lets us provide low-latency, globally available and secure network traffic to the application inside the EKS cluster. All compute and storage components within the VPC run inside private subnets and are never exposed to the internet.

Where customers need additional security, we provide an AWS PrivateLink service between the Paradime and customer VPCs so that network traffic stays within the AWS network without touching the public internet. This means Paradime can be connected to the customer's VPC within the same subnet IP range as the customer.

AWS VPC

The Paradime app infrastructure lives in an AWS VPC managed by Paradime Labs. The VPC is shared by all customers within a region, but we can offer single-tenant deployment on request, deploying Paradime in a dedicated VPC at an extra cost.

AWS RDS

The Paradime app uses AWS RDS for Postgres to store application data. Every company has its own Postgres database with its own randomly generated credentials, so each company's data is effectively isolated in single-tenant mode. No secrets or user credentials are stored in the database. The names and email addresses of users within a company are stored with our identity provider (Auth0); we only store a unique alphanumeric identifier in our database, which lets us fetch user details from Auth0 into memory at application runtime over an HTTPS API call, without persisting them in the backend.
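For illustration, a minimal sketch of this lookup pattern, assuming a hypothetical Auth0 tenant domain and a short-lived Management API token (this is not Paradime's actual code):

```python
import requests

AUTH0_DOMAIN = "example.us.auth0.com"       # hypothetical tenant domain
MGMT_API_TOKEN = "<management-api-token>"   # short-lived Auth0 Management API token

def fetch_user_details(user_id: str) -> dict:
    """Resolve the opaque identifier stored in Postgres to user details, in memory only."""
    response = requests.get(
        f"https://{AUTH0_DOMAIN}/api/v2/users/{user_id}",
        headers={"Authorization": f"Bearer {MGMT_API_TOKEN}"},
        timeout=10,
    )
    response.raise_for_status()
    return response.json()  # e.g. name and email; never written back to the database
```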

HashiCorp Vault

We run HashiCorp Vault on our own infrastructure in AWS as our trusted store for secrets and credentials, encrypted at rest and in transit. This lets us store sensitive information with all the power and security of Vault. Specifically, we store the following in Vault (a minimal read sketch follows the list):

  • User-specific credentials to access the data warehouses connected to a user's company

  • User-specific environment variable keys and values

  • Company-specific credentials to access the company's warehouse, git repository and any third-party integrations that the company admin has connected.
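A minimal sketch of this read pattern using the hvac client library (the Vault address, mount point and secret path are hypothetical; this is not Paradime's actual code):

```python
import hvac  # HashiCorp Vault client for Python

# Hypothetical Vault address and KV v2 secret path, for illustration only.
client = hvac.Client(url="https://vault.internal.example:8200", token="<vault-token>")

secret = client.secrets.kv.v2.read_secret_version(
    path="companies/acme/warehouse",
    mount_point="secret",
)
warehouse_password = secret["data"]["data"]["password"]  # used in memory, never persisted
```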

AWS FSx Share

For every company, we provision an AWS FSx file system with its own subpath, and only users of that company have access to that subpath. In the FSx shares we hold the following (a sketch of the per-developer sandboxing follows the list):

  • A clone of the company's dbt™️ git repository for every developer, so that code changes made by one developer do not affect others. Each developer/admin has access only to their own folder, and their dbt™️ profiles.yml credentials also live in that folder, which nobody else can access.

  • The SSH key used to access the company-specific dbt™️ repository

  • The profiles.yml file that dbt™️ uses to execute dbt™️ CLI commands and connect to the data warehouse, sandboxed inside each developer/admin's cloned dbt™️ git repository
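As a sketch of that sandboxing, a per-developer profiles.yml might be written into the developer's own subpath roughly like this (the mount layout and profile name are hypothetical; this is not Paradime's actual code):

```python
from pathlib import Path

import yaml  # PyYAML

def write_profiles_yml(company: str, developer: str, credentials: dict) -> Path:
    """Write dbt connection credentials into the developer's own FSx subpath only."""
    # Hypothetical mount layout: /fsx/<company>/<developer>/
    developer_home = Path("/fsx") / company / developer
    profiles_path = developer_home / ".dbt" / "profiles.yml"
    profiles_path.parent.mkdir(parents=True, exist_ok=True)

    profiles = {
        "default": {
            "target": "dev",
            "outputs": {"dev": credentials},  # e.g. account, user, password, schema
        }
    }
    profiles_path.write_text(yaml.safe_dump(profiles))
    return profiles_path
```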

AWS EKS

The application uses AWS Elastic Kubernetes Service (EKS) to manage our application resources. EKS provides a high degree of reliability and scalability for the Paradime application. Within our Kubernetes application we have two relevant units:

  • Paradime controller - runs the overall Paradime application

  • Company-specific controller - runs company-specific workloads, including requesting company-specific secrets from Vault

Both controllers request secrets from Vault without caching, except for the Postgres database credentials and the git repository SSH key, which are cached for speed and performance.

Auth0

To identify and authenticate users, we use Auth0. Auth0 provides a secure authorization and identity platform and allows us to offer:

  • The ability to log in with your G-Suite or other SSO credentials such as Okta and Google SAML

  • User credentials stored in a GDPR and SOC 2 compliant secure environment outside of Paradime - more can be found here: https://auth0.com/security

Sendgrid

We use Sendgrid to send emails to notify our users when needed, e.g. during onboarding, email confirmation and password resets. Sendgrid is a GDPR and SOC 2 compliant service used by the world's biggest companies to send email. All Sendgrid emails are triggered through an authenticated HTTPS REST API. Sendgrid's security information and compliance status is available here: https://sendgrid.com/policies/security/
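For illustration, a minimal sketch of an authenticated send via the Sendgrid Python library (addresses and API key are placeholders; this is not Paradime's actual code):

```python
from sendgrid import SendGridAPIClient
from sendgrid.helpers.mail import Mail

# Hypothetical onboarding notification; the API key would be held securely, e.g. in Vault.
message = Mail(
    from_email="notifications@example.com",
    to_emails="new.user@example.com",
    subject="Welcome to Paradime",
    html_content="<p>Your workspace is ready.</p>",
)
response = SendGridAPIClient("<sendgrid-api-key>").send(message)
print(response.status_code)  # 202 means the email was accepted for delivery
```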

Third party integrations

All connections to third-party integrations are optional.

Slack

Company administrators can connect Paradime to their Slack workspace - this allows Paradime users to seamlessly invite others to the app and collaborate. The Slack integration requires administrators to use OAuth 2.0 to authenticate the Paradime Slack App with the company's Slack workspace, and the resulting OAuth token is stored in our Vault. Whenever a user wishes to invite others using Slack, we fetch the list of users from Slack using the pre-authorized token. We don't store any list of users on our infrastructure; the list is only fetched into memory at the moment a user chooses to invite others.
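For illustration, a minimal sketch of this fetch-into-memory pattern using the official Slack SDK (token handling simplified; this is not Paradime's actual code):

```python
from slack_sdk import WebClient  # official Slack SDK for Python

def list_workspace_users(oauth_token: str) -> list[dict]:
    """Fetch the workspace member list into memory only; nothing is persisted."""
    client = WebClient(token=oauth_token)  # pre-authorized OAuth token retrieved from Vault
    members: list[dict] = []
    cursor = None
    while True:
        kwargs = {"limit": 200}
        if cursor:
            kwargs["cursor"] = cursor
        response = client.users_list(**kwargs)
        members.extend(response["members"])
        cursor = response.get("response_metadata", {}).get("next_cursor")
        if not cursor:
            return members
```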

Looker

Company administrators can connect Paradime to their Looker workspace - this allows us to build the lineage from data sources to dashboards where the customer is using Looker. The Looker integration requires:

  • an SSH key generated by us, added to the git repo where the LookML files are stored

  • an API key and API secret generated for the user used to connect to Looker via Looker API 3

Tableau

Company administrators can connect Paradime to their Tableau site - this allows us to build the lineage from data sources to dashboards where the customer is using Tableau Online. The Tableau integration requires:

  • a token name and token value (a personal access token) generated for the Tableau site we are connecting to (a sign-in sketch follows)
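For illustration, a sketch of how a personal access token of this kind is typically exchanged for a session via the Tableau REST API (server URL, API version and site name are hypothetical; this is not Paradime's actual code):

```python
import requests

TABLEAU_SERVER = "https://example.online.tableau.com"  # hypothetical Tableau Online URL
API_VERSION = "3.21"                                    # assumed REST API version

def sign_in(token_name: str, token_value: str, site_content_url: str) -> str:
    """Exchange a personal access token for a short-lived Tableau session token."""
    payload = {
        "credentials": {
            "personalAccessTokenName": token_name,
            "personalAccessTokenSecret": token_value,
            "site": {"contentUrl": site_content_url},
        }
    }
    response = requests.post(
        f"{TABLEAU_SERVER}/api/{API_VERSION}/auth/signin",
        json=payload,
        headers={"Accept": "application/json"},
        timeout=10,
    )
    response.raise_for_status()
    return response.json()["credentials"]["token"]
```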

Data Security Model

[Diagram: Data Security Model]

Using AWS RDS, EKS and HashiCorp Vault, we have built a logically separated single-tenant architecture with the following features:

  • Data belonging to different companies is kept separate using Kubernetes namespaces, so each company's resources are logically separate

  • Within a company, dbt™️ code, operational data and sensitive credentials are all kept separate

  • For dbt™️ code, each developer is allocated an isolated pod, so no developer can gain access to another developer's folder

  • Access to FSx and to RDS is managed through randomly generated usernames and passwords that are stored in Vault

All of the above ensures there is no single point of failure that a potential attacker could use to compromise the security of the platform.

Data Warehouse

Only company administrators can add a data warehouse to the Paradime company account. Currently Snowflake, BigQuery, Redshift and Firebolt are supported. This connection is ONLY necessary if users within an account want to run dbt™️ CLI commands such as "dbt run" from within the application.

When connected to a data warehouse, the Paradime application lets users dispatch dbt™️ commands, which in turn dispatch SQL to the connected warehouse for transformation purposes. It is possible for users to dispatch SQL that returns customer data into the Paradime application, but this data is never persisted and only exists in memory on the Kubernetes pod attached to the currently logged-in user. Paradime's primary role is therefore always that of a data processor, not a data store.

To properly lock down data, the administrator should connect Paradime to the dev warehouse environment and apply appropriate data warehouse permissions in that environment, outside of Paradime, to prevent improper access to or storage of sensitive data. Individual users only have access to the dev environment. Only admins can set up a connection to the production warehouse, and the production warehouse is only required to run scheduled jobs, not for day-to-day use of Paradime.

The specific access requirements for each warehouse type are identical to what one would need to run dbt™️; the requirements are set out in the dbt™️ help docs:

  • BigQuery - https://docs.getdbt.com/reference/warehouse-profiles/bigquery-profile

  • Snowflake - https://docs.getdbt.com/reference/warehouse-profiles/snowflake-profile

  • Redshift - https://docs.getdbt.com/reference/warehouse-profiles/redshift-profile

  • Firebolt - https://docs.getdbt.com/reference/warehouse-profiles/firebolt-profile

  • Databricks - https://docs.getdbt.com/reference/warehouse-setups/databricks-setup

Important to note that for Snowflake and Redshift, we support per-user access credentials including database, schema, username and password.

For Snowflake, we also support OAuth using a Snowflake Security Integration. Using Snowflake OAuth, users can connect Paradime to their Snowflake account without sharing their username and password.
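For illustration, a sketch of what an OAuth-based Snowflake connection looks like with the Snowflake Python connector (account and warehouse names are hypothetical; this is not Paradime's actual code):

```python
import snowflake.connector  # snowflake-connector-python

# The user's OAuth access token, obtained via the Snowflake Security Integration flow,
# replaces a username/password pair entirely.
connection = snowflake.connector.connect(
    account="acme-xy12345",
    authenticator="oauth",
    token="<oauth-access-token>",
    warehouse="TRANSFORMING",
    database="ANALYTICS",
)
```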

Summary

In closing, we would like to summarize the following main points about the Paradime.io application security model:

  • In our AWS infrastructure, we store each company's data separately. Within a company, we store sensitive information in HashiCorp Vault, non-sensitive application data in AWS Postgres protected by a randomly generated password, and code in a company-specific protected AWS FSx subpath and per-developer Kubernetes pods.

  • As a result, there is no single point of failure in our application design.

  • We don't store or manage user credentials; we use Auth0, which provides secure and compliant storage of user information.

  • Whenever we use a third-party data processor, we use GDPR and SOC 2 compliant applications so that our users' security is not compromised.

  • For our data warehouse connection, we operate as a data processor and don’t store any data.

  • Having a warehouse connection is not mandatory, but it provides significant productivity advantages to developer workflows when using Paradime. A connection to the production data warehouse is not strictly needed to use Paradime.

  • We never access, or need to access, customers' own data in production. For the data catalog and cost analytics, we need access to the information schema and cost analytics permissions on Snowflake, NOT to actual customer data. In Workbench, when a user queries their own customer data, the information is only held in memory in the location where the Paradime workspace is set up and is erased from memory upon page refresh.

  • Our connection requirements for warehouses are identical to what dbt™️ requires to operate.


We have been independently audited and our SOC 2 report is available upon request. We use Drata, Inc. (https://drata.com) to continuously monitor our infrastructure, policies and personnel against control tests. The real-time state of our compliance and monitoring can be found in our Trust Center.
