# Airbyte CLI

The Paradime SDK provides CLI commands to interact with Airbyte, allowing you to trigger sync and reset jobs for connections and monitor them in real time. The commands work with both Airbyte Cloud and self-hosted Airbyte Server.

{% hint style="warning" %}
🔑 **API Access Required**

You will need Airbyte credentials to use these commands:

* **Airbyte Cloud**: Client ID and Client Secret from your [Airbyte Cloud account settings](https://cloud.airbyte.com/settings/application). [More here](https://docs.paradime.io/app-help/documentation/integrations/etl/airbyte-cloud)
* **Airbyte Server**: API Key and API Secret from your self-hosted Airbyte instance. [More here](#airbyte-server-self-hosted)

**🔧 Connector Configuration**

* Ensure the connection exists and is enabled in the Airbyte dashboard.
* Confirm that both the source and destination connectors have valid credentials and can connect.
* Make sure the credentials used have the necessary permissions to run syncs and access resources.
* Run manual tests or sample syncs in the Airbyte UI to validate configuration prior to large or automated runs.

{% endhint %}

## Trigger Airbyte Sync/Reset

Trigger sync or reset jobs for one or more Airbyte connections with real-time progress monitoring and comprehensive status reporting.

### CLI Command

```bash
paradime run airbyte-sync
```

#### Options

| Flag                    | Type                      | Description                                                                                                                                              |
| ----------------------- | ------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `--connection-id`       | Required, TEXT (multiple) | The ID of the Airbyte connection(s) you want to run jobs for. Can specify multiple connections by repeating the flag.                                    |
| `--job-type`            | Required, CHOICE          | Type of job to run. Choices: `sync`, `reset`.                                                                                                            |
| `--client-id`           | Required, TEXT            | Your Airbyte client ID (Cloud) or API key (Server). Can be set via `AIRBYTE_CLIENT_ID` environment variable.                                             |
| `--client-secret`       | Required, TEXT            | Your Airbyte client secret (Cloud) or API secret (Server). Can be set via `AIRBYTE_CLIENT_SECRET` environment variable.                                  |
| `--base-url`            | Optional, TEXT            | Airbyte API base URL. Default: `https://api.airbyte.com/v1` (Cloud). Can be set via `AIRBYTE_BASE_URL` environment variable.                             |
| `--use-server-auth`     | Optional, Flag            | Use authentication for self-hosted Airbyte Server (instead of OAuth for Cloud). Default: `False`. Can be set via `USE_SERVER_AUTH` environment variable. |
| `--workspace-id`        | Optional, TEXT            | Optional workspace ID to scope the job.                                                                                                                  |
| `--wait-for-completion` | Optional, Flag            | Wait for jobs to complete before returning. Shows real-time progress and final status. Default: `True`. Use `--no-wait-for-completion` to return immediately. |
| `--timeout-minutes`     | Optional, INTEGER         | Maximum time to wait for job completion (in minutes). Only used with `--wait-for-completion`. Default: 1440 (24 hours).                                  |

{% hint style="info" %}
**Recommended Setup**

For security and convenience, set your Airbyte credentials as environment variables:

**For Airbyte Cloud:**

```bash
export AIRBYTE_CLIENT_ID="your_client_id"
export AIRBYTE_CLIENT_SECRET="your_client_secret"
```

**For Airbyte Server:**

```bash
export AIRBYTE_CLIENT_ID="your_api_key"
export AIRBYTE_CLIENT_SECRET="your_api_secret"
export AIRBYTE_BASE_URL="http://localhost:8001/api/v1"
```

{% endhint %}
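
When the CLI is called from a script, it can help to fail fast if the credentials are missing. The following is a minimal sketch, assuming a POSIX-compatible shell; `require_airbyte_env` is a hypothetical helper name and is not part of the Paradime CLI.

```bash
# Hypothetical helper (not part of the Paradime CLI): fail fast when the
# credential environment variables documented above are unset or empty.
require_airbyte_env() {
  local missing=""
  [ -n "${AIRBYTE_CLIENT_ID:-}" ] || missing="$missing AIRBYTE_CLIENT_ID"
  [ -n "${AIRBYTE_CLIENT_SECRET:-}" ] || missing="$missing AIRBYTE_CLIENT_SECRET"
  if [ -n "$missing" ]; then
    echo "Missing environment variables:$missing" >&2
    return 1
  fi
}
```

A script could then guard the sync with `require_airbyte_env && paradime run airbyte-sync ...`.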

## Job Examples

### Sync a single connection (Airbyte Cloud)

```bash
# Using environment variables (recommended)
paradime run airbyte-sync \
  --connection-id "e3b2eda2-44af-4e32-b1b9-8b8c9e2d1234" \
  --job-type sync
```

### Sync multiple connections in parallel

```bash
# Using environment variables (recommended)
paradime run airbyte-sync \
  --connection-id "sales_connection_id" \
  --connection-id "marketing_connection_id" \
  --connection-id "support_connection_id" \
  --job-type sync
```
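
When the connection IDs come from a list, the repeated `--connection-id` flags can be built in a loop instead of typed by hand. This is a minimal sketch, assuming a POSIX-compatible shell; `sync_connections` is a hypothetical wrapper name.

```bash
# Hypothetical wrapper: turn each argument into a repeated --connection-id flag.
sync_connections() {
  for id in "$@"; do
    # Append the flag pair, then drop the original positional argument.
    set -- "$@" --connection-id "$id"
    shift
  done
  paradime run airbyte-sync "$@" --job-type sync
}
```

For example, `sync_connections "sales_connection_id" "marketing_connection_id"` runs the same command as the example above with two connections.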

### Trigger job without waiting for completion

```bash
# Start job and return immediately without monitoring
paradime run airbyte-sync \
  --connection-id "large_dataset_connection" \
  --job-type sync \
  --no-wait-for-completion
```

### Custom workspace and timeout

```bash
# With specific workspace and extended timeout
paradime run airbyte-sync \
  --connection-id "connection_id" \
  --job-type sync \
  --workspace-id "workspace_123" \
  --timeout-minutes 2880  # 48 hours
```
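
The examples above target Airbyte Cloud. For a self-hosted Airbyte Server, the documented `--base-url` and `--use-server-auth` flags would be combined. The sketch below wraps that call in a hypothetical function, `sync_self_hosted`, and assumes the credentials are already set as environment variables; the fallback URL is the example value from this page, not a guaranteed default for your deployment.

```bash
# Sketch: sync one connection on a self-hosted Airbyte Server.
# `sync_self_hosted` is a hypothetical wrapper; the flags are the documented ones.
sync_self_hosted() {
  paradime run airbyte-sync \
    --connection-id "$1" \
    --job-type sync \
    --base-url "${AIRBYTE_BASE_URL:-http://localhost:8001/api/v1}" \
    --use-server-auth
}
```

Call it as `sync_self_hosted "connection_id"`.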

{% hint style="success" %}

**Sample Output**

```
Starting sync jobs for 3 Airbyte connection(s)...

============================================================
🚀 TRIGGERING AIRBYTE JOBS
============================================================

[1/3] 🔌 sales_postgres_connection
----------------------------------------

[2/3] 🔌 marketing_api_connection
----------------------------------------

[3/3] 🔌 support_webhook_connection
----------------------------------------

============================================================
⚡ LIVE PROGRESS
============================================================
19:15:32 🔍 [sales_postgres_connection] Checking connection status...
19:15:32 🔍 [marketing_api_connection] Checking connection status...
19:15:32 🔍 [support_webhook_connection] Checking connection status...
19:15:33 📊 [sales_postgres_connection] Status: active
19:15:33 🚀 [sales_postgres_connection] Triggering sync job...
19:15:33 📊 [marketing_api_connection] Status: active
19:15:33 🚀 [marketing_api_connection] Triggering sync job...
19:15:33 📊 [support_webhook_connection] Status: inactive
19:15:33 ⚠️  [support_webhook_connection] Warning: Connection is inactive
19:15:34 ✅ [sales_postgres_connection] Job triggered successfully (ID: job_abc123)
19:15:34 ⏳ [sales_postgres_connection] Monitoring job progress...
19:15:34 ✅ [marketing_api_connection] Job triggered successfully (ID: job_def456)
19:15:34 ⏳ [marketing_api_connection] Monitoring job progress...
19:15:35 ⏳ [sales_postgres_connection] Job pending... (0m 1s elapsed)
19:15:35 ⏳ [marketing_api_connection] Job pending... (0m 1s elapsed)
19:16:40 🔄 [marketing_api_connection] Job running... (1m 5s elapsed)
19:16:45 🔄 [sales_postgres_connection] Job running... (1m 10s elapsed)
19:17:18 ✅ [sales_postgres_connection] Job completed successfully (1m 43s)
19:17:40 ✅ [marketing_api_connection] Job completed successfully (2m 5s)

================================================================================
📊 JOB RESULTS
================================================================================
CONNECTION                STATUS     JOB TYPE
------------------------- ---------- ----------
sales_postgres_connection ✅ SUCCESS  SYNC
marketing_api_connection  ✅ SUCCESS  SYNC
support_webhook_connection⚠️ INCOMPLETE SYNC
================================================================================
```

{% endhint %}

## List Airbyte Connections

List all available Airbyte connections with their IDs, status, and configuration details.

### CLI Command

```bash
paradime run airbyte-list-connections
```

#### Options

| Flag                | Type           | Description                                                                                                                  |
| ------------------- | -------------- | ---------------------------------------------------------------------------------------------------------------------------- |
| `--client-id`       | Required, TEXT | Your Airbyte client ID (Cloud) or API key (Server). Can be set via `AIRBYTE_CLIENT_ID` environment variable.                 |
| `--client-secret`   | Required, TEXT | Your Airbyte client secret (Cloud) or API secret (Server). Can be set via `AIRBYTE_CLIENT_SECRET` environment variable.      |
| `--base-url`        | Optional, TEXT | Airbyte API base URL. Default: `https://api.airbyte.com/v1` (Cloud). Can be set via `AIRBYTE_BASE_URL` environment variable. |
| `--use-server-auth` | Optional, Flag | Use authentication for self-hosted Airbyte Server (instead of OAuth for Cloud). Default: `False`.                            |
| `--workspace-id`    | Optional, TEXT | Filter connections by workspace ID. If not specified, lists all connections across all workspaces.                           |

### Usage Examples

```bash
# List all connections (Airbyte Cloud, using environment variables)
paradime run airbyte-list-connections

# List connections for a specific workspace
paradime run airbyte-list-connections --workspace-id "workspace_abc123"
```

{% hint style="success" %}
**Sample Output**

```
🔍 Listing all connections

================================================================================
📋 FOUND 5 CONNECTION(S)
================================================================================

[1/5] 🔌 e3b2eda2-44af-4e32-b1b9-8b8c9e2d1234
--------------------------------------------------
   Name: Sales PostgreSQL
   ✅ Status: active
   Source ID: source_postgres_abc123
   Destination ID: dest_warehouse_def456

[2/5] 🔌 f4c3feb3-55bg-5e43-c2c0-9c9d0f3e2345
--------------------------------------------------
   Name: Marketing API
   ✅ Status: active
   Source ID: source_api_ghi789
   Destination ID: dest_warehouse_def456

[3/5] 🔌 a1b2cda1-33de-4d21-a1a8-7a7b8d1c1234
--------------------------------------------------
   Name: Support Webhooks
   ⏸️ Status: inactive
   Source ID: source_webhook_jkl012
   Destination ID: dest_warehouse_def456

[4/5] 🔌 b2c3edb2-44ef-5e32-b2b9-8c8d9f2d2345
--------------------------------------------------
   Name: Analytics Events
   ✅ Status: active
   Source ID: source_events_mno345
   Destination ID: dest_warehouse_def456

[5/5] 🔌 c3d4fec3-55fg-6f43-c3c0-9d9e0g3f3456
--------------------------------------------------
   Name: Customer Data
   ❌ Status: deprecated
   Source ID: source_crm_pqr678
   Destination ID: dest_warehouse_def456

================================================================================
```

{% endhint %}
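
If you want only the connection IDs, for example to pass them to `airbyte-sync`, the human-readable listing can be filtered. This sketch assumes each connection header line has the form `[1/5] 🔌 <connection-id>` as in the sample output above. That format is not a stable contract, and `extract_connection_ids` is a hypothetical helper name.

```bash
# Hypothetical helper: read list output on stdin and print one connection ID
# per line. Matches header lines that start with "[n/m]" and prints the last field.
extract_connection_ids() {
  awk '/^\[[0-9]+\/[0-9]+\]/ { print $NF }'
}
```

Usage: `paradime run airbyte-list-connections | extract_connection_ids`.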

## Job Status Reference

The output uses the following status indicators:

### Job Status

* ✅ succeeded: Job completed successfully
* ❌ failed: Job failed and needs attention
* 🚫 cancelled: Job was cancelled before completion
* ⚠️ incomplete: Job finished but may be incomplete
* 🔄 running: Job is currently executing
* ⏳ pending: Job is queued and waiting to start

### Connection Status

* ✅ active: Connection is enabled and ready for jobs
* ⏸️ inactive: Connection is disabled
* ❌ deprecated: Connection is deprecated and should be updated

## Platform Support

### Airbyte Cloud

* **Authentication**: OAuth 2.0 with client credentials
* **Base URL**: `https://api.airbyte.com/v1`
* **Setup**: Get credentials from [Airbyte Cloud Settings](https://cloud.airbyte.com/settings/application)

### Airbyte Server (Self-hosted)

* **Authentication**: API key and secret
* **Base URL**: Custom (e.g., `http://localhost:8001/api/v1`)
* **Setup**: Generate API credentials in your Airbyte Server admin panel

## Important Notes

* **Parallel Execution**: Multiple connections process jobs simultaneously for efficiency
* **Real-time Monitoring**: Live progress updates show job status with timestamps
* **Platform Flexibility**: Works with both Cloud and self-hosted deployments
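* **Job Types**: Supports both `sync` (runs the connection using its configured sync modes) and `reset` (clears previously synced data so it can be synced again from scratch) operations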
* **Long-running Operations**: Default 24-hour timeout accommodates large data syncs
* **Connection Identification**: Use exact connection IDs from your Airbyte dashboard

## Environment Variable Reference

| Environment Variable    | Description                      | Platform |
| ----------------------- | -------------------------------- | -------- |
| `AIRBYTE_CLIENT_ID`     | Your client ID or API key        | Both     |
| `AIRBYTE_CLIENT_SECRET` | Your client secret or API secret | Both     |
| `AIRBYTE_BASE_URL`      | Custom API base URL              | Server   |
| `USE_SERVER_AUTH`       | Enable Server authentication     | Server   |

## Troubleshooting Common Issues

### Inactive Connections

If connections are inactive, jobs may fail. Check connection status in your Airbyte dashboard and ensure sources and destinations are properly configured.

### Authentication Errors

For authentication issues:

* **Cloud**: Verify client ID and secret in Airbyte Cloud settings
* **Server**: Verify API key and secret in your server admin panel

### Network Timeouts

For very large datasets, a job can outlast the default wait of 1440 minutes (24 hours). Raise the limit with `--timeout-minutes`:

```bash
paradime run airbyte-sync \
  --connection-id "big_data_connection" \
  --job-type sync \
  --timeout-minutes 2880  # 48 hours
```

### Finding Connection IDs

Use the list command to discover available connection IDs:

```bash
paradime run airbyte-list-connections
```

### Platform-specific Issues

**Airbyte Cloud:**

* Ensure you're using OAuth client credentials, not personal access tokens
* Check your Airbyte Cloud subscription limits

**Airbyte Server:**

* Verify the base URL is accessible from your environment
* Ensure API access is enabled in your server configuration
* Check server logs for detailed error messages
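
To check whether the base URL is reachable, a quick probe with `curl` is usually enough. This is a minimal sketch, assuming `curl` is installed; `check_base_url` is a hypothetical helper name. It only tests that the server answers at all, not that the credentials or the API path are correct.

```bash
# Hypothetical helper: succeed if the server returns any HTTP response.
# Without --fail, curl exits 0 even on 4xx/5xx, so this only tests reachability.
check_base_url() {
  curl --silent --output /dev/null --max-time 10 "${1:-$AIRBYTE_BASE_URL}"
}
```

For example, `check_base_url "http://localhost:8001/api/v1" || echo "Airbyte Server not reachable"`.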

## Workflow Integration

These Airbyte CLI commands are designed to integrate into your data pipeline workflows. Common use cases include:

1. **Pre-dbt™ Sync**: Trigger Airbyte syncs before running dbt™ or other transformations.
2. **Scheduled Data Ingestion**: Automate regular sync cycles for critical data sources.
3. **Event-driven Syncs**: Trigger syncs based on external events or schedules.
4. **Multi-source Orchestration**: Coordinate syncs across multiple data connectors.

For more information about integrating Airbyte syncs into your Paradime workflows, see the Bolt Schedules documentation.
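
As an illustration of the pre-dbt™ pattern, the sketch below runs the transformation only if the sync step succeeds. It assumes that `paradime run airbyte-sync` exits with a non-zero status when a job fails, which this page does not confirm, and that `dbt` is available on the `PATH`. `sync_then_dbt` is a hypothetical function name.

```bash
# Sketch of a pre-dbt sync step. Assumes `paradime run airbyte-sync` exits
# non-zero when a job fails (not confirmed above) and that `dbt` is on PATH.
sync_then_dbt() {
  paradime run airbyte-sync \
    --connection-id "$1" \
    --job-type sync \
  && dbt run
}
```

Because `--wait-for-completion` defaults to `True`, `dbt run` starts only after the sync job has finished.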

## API Rate Limits

Be aware of API rate limits when running multiple concurrent jobs:

* **Airbyte Cloud**: Subject to Airbyte Cloud's standard API rate limits
* **Airbyte Server**: Depends on your server configuration and resources

The CLI includes automatic retry logic with exponential backoff to handle temporary rate limit issues.
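
For readers unfamiliar with the pattern, the sketch below shows a generic retry loop with exponential backoff in shell. It illustrates the idea only and is not the CLI's internal implementation; `retry_with_backoff` and its arguments are hypothetical.

```bash
# Illustrative only: a generic retry loop with exponential backoff.
# This is not the CLI's internal implementation.
# Usage: retry_with_backoff <max_attempts> <initial_delay_seconds> <command...>
retry_with_backoff() {
  local max_attempts="$1" delay="$2" attempt=1
  shift 2
  until "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "Giving up after $attempt attempts" >&2
      return 1
    fi
    echo "Attempt $attempt failed; retrying in ${delay}s" >&2
    sleep "$delay"
    delay=$((delay * 2))      # double the wait after every failure
    attempt=$((attempt + 1))
  done
}
```

With `retry_with_backoff 5 2 paradime run airbyte-list-connections`, the command is tried up to five times, waiting 2, 4, 8, and 16 seconds between attempts.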
