--- alias: operation-guide-clickhouse-backupandrestoreoverview tags: - database - ClickHouse - backup - restore description: "Back up and restore ClickHouse databases within Critical Manufacturing MES using API requests and automated installation processes" --- # ClickHouse Database Backup and Restore This guide provides procedures for backing up and restoring ClickHouse databases in Critical Manufacturing MES. !!! info "Version Compatibility" This procedure applies to MES version 11.x and later. For earlier versions, please refer to the [MES v11.0.0 Infrastructure Migration Guide](https://developer.criticalmanufacturing.com/explore/guides/migration/version-specific/1100/infrastructure/). ## Overview Critical Manufacturing MES provides tools to create backup deployment packages of existing ClickHouse databases. These packages are placed in a designated MES custom installation packages folder, which triggers the MES installation process to restore from the backup instead of setting up a clean database. !!! note The [copy environment](https://developer.criticalmanufacturing.com/explore/guides/deployment/copy_environment/) process is commonly used to migrate existing environments to new infrastructure or replicate production data into staging environments for acceptance testing, troubleshooting or training purposes. In situations where the SQL Server backups exist but ClickHouse backups were not generated - and it is no longer possible to create matching backups - follow the procedure described in the [[operation-guide-clickhouse-recreateclickhousebackups]] guide. ## Database Structure {% include-markdown 'includes/tech/mes-clickhouse-databases.md' %} ## Backup Procedure This section describes how to back up a ClickHouse database schema and download the resulting package using the MES Web API. ### Prerequisites - An MES user account with the `SysAdministrators` role. - Access to the MES Web API. - A tool for making HTTP requests (e.g., `curl`, Postman, or PowerShell). - Access to the SQL Server backup process for the same environment. - Access to the System Monitoring page to verify the `Work Queue Lag` counter in `System/Database/Kafka`. !!! warning "Backup Sequencing and Lag Validation" Before starting the SQL Server backups, ensure the `Work Queue Lag` counter in `System/Database/Kafka` shows `0 Message(s)`. For more information on System Monitoring, see [[user-guide-system-monitoring]]. After the SQL Server backups have completed, start the ClickHouse backups. Even with this sequencing, some messages may still be in flight between the SQL Server backup and the ClickHouse backup. These messages may already be marked as processed in SQL Server but not yet exist in ClickHouse. For custom IoT Event Definitions, events still in flight in Kafka when the ClickHouse backup is taken cannot be recovered later through [[operation-guide-data-processing]]. ### Step 1: Obtain Authentication Credentials You can authenticate your API requests using a Bearer Token or Basic Authentication with a Personal Access Token (PAT). #### Option A: Bearer Token (Session-based) This method is convenient for manual, one-off tasks. After logging in to the MES GUI, you can retrieve the Bearer token: 1. Open your browser's developer tools (F12). 2. Navigate to the `Network` tab. 3. Make any request in the MES GUI and copy the `Authorization` header value, which will be in the format `Bearer `. #### Option B: Basic Authentication (For Automation) This method is ideal for automation scripts. You will need a Personal Access Token (PAT) created from your user account profile page (`Administration` > `Security` > `Your Account Profile`). 1. Format your string as `username:your_pat_token`. 2. Encode that string in **Base64**. === "PowerShell" ```powershell [Convert]::ToBase64String([System.Text.Encoding]::UTF8.GetBytes("username:your_pat_token")) ``` === "Bash" ```bash echo -n "username:your_pat_token" | base64 ``` The output will be a long base64-encoded string. !!! info "API Documentation" For more information on MES Web API usage, see the [Invoke MES HTTP API Guide](https://developer.criticalmanufacturing.com/explore/guides/customizations/business/invokemeshttpapi/). ### Step 2: Create Database Backup Initiate the backup process by sending a `POST` request. This operation is **asynchronous**; the server will acknowledge the request immediately while the backup runs in the background. #### Endpoint `POST {environment-url}/environment/clickhouse/backup/{database-name}` #### Request You may use PowerShell to invoke the backup API as follows: === "Using Bearer Token" ```powershell curl --location "https://mesintegration.dev/environment/clickhouse/backup/MESCDM" ` --request POST ` --header "Authorization: Bearer {your-token-here}" ``` === "Using Basic Authentication" ```powershell curl --location "https://mesintegration.dev/environment/clickhouse/backup/MESCDM" ` --request POST ` --header "Authorization: Basic {your-base64-encoded-credentials}" ``` !!! note "Environment URL" The `{environment-url}` is the base URL of your MES instance (e.g., `https://mesintegration.dev/`). #### Response The API may return one of the following responses: | Status Code | Description | Action | | --- | --- | --- | | **202 Accepted** | Backup job started successfully. | Returns the backup job GUID on response body. | | **400 Bad Request** | Invalid inputs (e.g. database name). | Ensure the DB name on the URL is correct and try again. | !!! warning Save the GUID from the response body. You will need it to retrieve the file in the next step. ### Step 3: Download Backup Package Use the GUID to check the status and download the package. Because the process is asynchronous, you may need to "poll" (request repeatedly) until the file is ready. #### Endpoint `GET {environment-url}/environment/clickhouse/backup/{guid}` #### Request You may use PowerShell to invoke the download backup API as follows: === "Using Bearer Token" ```powershell curl.exe --location "https://mesintegration.dev/environment/clickhouse/backup/{backup-guid}" ` --remote-header-name --remote-name ` --request GET ` --header "Authorization: Bearer {your-token-here}" ``` === "Using Basic Authentication" ```powershell curl.exe --location "https://mesintegration.dev/environment/clickhouse/backup/{backup-guid}" ` --remote-header-name --remote-name ` --request GET ` --header "Authorization: Basic {your-base64-encoded-credentials}" ``` #### Response The API may return one of the following responses: | Status Code | Description | Action | | --- | --- | --- | | **200 OK** | Backup is complete. | The file stream is returned; save the output. | | **202 Accepted** | Backup is still processing. | Wait a few seconds and try the request again. | | **400 Bad Request** | Invalid GUID or backup not found. | Verify the GUID or check server logs for failures. | | **500 Error** | Server-side failure. | Check the Environment Manager logs, or contact your IT Administrator if you lack access permissions. | !!! tip "Polling for Backup Completion" In a script, continue to call the GET endpoint as long as you receive a `202` status code. Once you receive a `200`, you need to save the file stream data in the response to a file. The backup file is immediately deleted once the file stream of download request that returned 200 is closed. ## Restore Procedures ### Prerequisites - The target ClickHouse instance must have volumes configured with the same names as those used in the source environment. Mismatched volume names will cause the restore to fail. ### Automated Restore (During Installation) The MES installer can automatically detect and restore from backup packages if they are placed in the correct location with the required file names. 1. **Ensure correct naming:** The downloaded backup packages need to follow a naming convention. For MES v11.3.0, the filenames are: - **MES main database:** `Cmf.ClickHouse.FullBackup.Online.11.3.0.zip` - **CDM database:** `Cmf.ClickHouse.FullBackup.CDM.11.3.0.zip` - **DWH database:** `Cmf.ClickHouse.FullBackup.DWH.11.3.0.zip` - **ODS database:** `Cmf.ClickHouse.FullBackup.ODS.11.3.0.zip` !!! warning "Full Backups with the System Name" To support restoring different environments from the same installation packages directory, the installation logic now prioritizes backups that include the system name in their filename. The installer will search for backup files in the following order: 1. **System-specific:** `.Cmf.ClickHouse.FullBackup...zip` 2. **Generic:** `Cmf.ClickHouse.FullBackup...zip` For example, for a system named `MES` running version `11.3.0`, the installer will first search for `MES.Cmf.ClickHouse.FullBackup.Online.11.3.0.zip`. If this file is not found, it will then look for the generic filename: `Cmf.ClickHouse.FullBackup.Online.11.3.0.zip`. 2. **Place packages in the correct folder:** Place all four database backup packages in the MES custom installation packages directory. ```text ├── Cmf.ClickHouse.FullBackup.Online.11.3.0.zip ├── Cmf.ClickHouse.FullBackup.CDM.11.3.0.zip ├── Cmf.ClickHouse.FullBackup.DWH.11.3.0.zip └── Cmf.ClickHouse.FullBackup.ODS.11.3.0.zip ``` 3. Run the MES installation process. The installer will automatically detect and restore from the backup packages. !!! warning "Do Not Rename Packages" Ensure that your backup packages follow the naming convention above, adapting just the version number. Therefore, you should only change the `11.3.0` to your actual MES version. ### Manual Restore For manual restoration outside of the installation process, follow the steps below: 1. **Extract the backup package:** ```bash unzip backup-package.zip -d backup-contents/ ``` 2. **Connect to ClickHouse:** ```bash clickhouse-client --host {clickhouse-host} --port 9000 ``` 3. **Execute restore commands:** Follow the ClickHouse backup restoration procedures as documented in the [official ClickHouse documentation](https://clickhouse.com/docs/operations/backup). ## Best Practices - **Regular Backups:** Schedule regular backups of all four databases. - **Backup Sequencing:** Start the SQL Server backups only after the `Work Queue Lag` counter in `System/Database/Kafka` reaches `0 Message(s)`. After the SQL Server backups have completed, back up ClickHouse. - **Verify Backups:** Test restore procedures in a non-production environment. - **Post-Restore Reconciliation:** After restoring SQL Server and ClickHouse backups, run [[operation-guide-data-processing]] for a time range that covers the backup window to reconcile any messages that were still in flight when the backups were taken. - **Storage Management:** Monitor backup storage usage and implement retention policies. - **Security:** Protect backup files and limit access to authorized personnel. - **Documentation:** Maintain records of backup schedules and restore procedures. ## Troubleshooting | Issue | Solution | |-------|----------| | Backup request fails | Verify the user has the `SysAdministrators` role and that the token is valid. | | Download returns `204 No Content` | The backup is still being generated; wait and retry. | | Download returns `400 Bad Request` | Check that the GUID is correct and properly formatted. | | Download returns `500 Internal Server Error` | Check the MES `EnvironmentManager` and ClickHouse service status and logs for errors. | | Download times out | Large databases may require extended timeout settings for the download client. | | Restore fails | Check the ClickHouse logs and verify the backup file integrity. | ## Related Documentation - [ClickHouse Official Backup Documentation](https://clickhouse.com/docs/operations/backup) - [MES Copy Environment Guide](https://developer.criticalmanufacturing.com/explore/guides/deployment/copy_environment/) - [MES Web API Guide](https://developer.criticalmanufacturing.com/explore/guides/customizations/business/invokemeshttpapi/)