Backup and Disaster Recovery

Firefly's Backup and Disaster Recovery (DR) capabilities help you recover quickly from infrastructure failures and prevent the misconfigurations that lead to outages. This guide explains how to use Firefly to diagnose and recover from failures, and how to reduce the chance of them happening in the first place.


Overview: Disaster Recovery

Disaster recovery (DR) is the process of restoring your cloud environment to a healthy state after an incident such as accidental deletion, misconfiguration, or infrastructure failure. Firefly offers:

  • Automated backups and configuration history

  • Rapid recovery tools for deleted or misconfigured assets

  • Comprehensive mutation logs for root cause analysis

  • Proactive notifications and insights to prevent disasters


Recovering from Infrastructure Failure

When an infrastructure failure occurs, Firefly provides tools to help you diagnose, resolve, and recover. The recovery process depends on whether you know which asset caused the failure.

Recovering Deleted Assets: When the Responsible Asset is Known

If you know which asset was deleted or misconfigured (e.g., a team member accidentally deleted a resource), you can restore it using Firefly's codification and GitOps integration.

Procedure:

  1. Select Inventory > Deleted.

    • This view lists all assets that have been deleted from your environment.

  2. Filter by Time Range.

    • Use the filter to narrow down the list to the relevant time period.

  3. Select the Deleted Asset and Codify.

    • Click on the asset that was deleted. Use the "Codify" action to generate the Infrastructure-as-Code (IaC) template for the asset.

  4. Create a Pull Request.

    • Firefly will prompt you to select the appropriate repository and branch for your GitOps workflow. Submit a pull request to restore the asset via code.

  5. Review and Merge.

    • Once reviewed and merged, your CI/CD pipeline will recreate the asset in your cloud environment.

Tip: This process ensures that the restored asset is managed by code, reducing the risk of future drift or manual misconfiguration.
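Step 2 of the procedure amounts to narrowing the deleted assets down to the incident window. The sketch below shows that filter in Python against a hand-written list. The record fields (`name`, `type`, `deleted_at`) and the sample data are assumptions for illustration only; they are not Firefly's export schema or API.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical records resembling rows in the Inventory > Deleted view.
# Field names are assumed for illustration, not taken from Firefly.
deleted_assets = [
    {"name": "orders-db", "type": "aws_db_instance", "deleted_at": "2024-05-02T09:15:00+00:00"},
    {"name": "web-asg", "type": "aws_autoscaling_group", "deleted_at": "2024-05-03T14:40:00+00:00"},
    {"name": "old-bucket", "type": "aws_s3_bucket", "deleted_at": "2024-04-11T08:00:00+00:00"},
]


def deleted_in_window(assets, start, end):
    """Return assets deleted within [start, end], newest first."""
    def when(asset):
        return datetime.fromisoformat(asset["deleted_at"])

    hits = [a for a in assets if start <= when(a) <= end]
    return sorted(hits, key=when, reverse=True)


# Look back 48 hours from the time the incident was noticed.
incident = datetime(2024, 5, 3, 15, 0, tzinfo=timezone.utc)
candidates = deleted_in_window(deleted_assets, incident - timedelta(hours=48), incident)

for asset in candidates:
    print(asset["deleted_at"], asset["type"], asset["name"])
```

Sorting newest first puts the deletions closest to the incident at the top, which is usually where the responsible asset is found.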

Viewing Mutations: When the Responsible Asset is Unknown

If you do not know which asset caused the failure, use Firefly's mutation tracking to investigate recent changes and identify the root cause.

Procedure:

  1. Select Inventory.

    • View all assets in your environment.

  2. Apply Filters.

    • Filter assets by data source, environment, account, and location to narrow your search.

  3. Filter by Asset Flags: Mutations.

    • Use the "Mutations" filter to show assets with recent changes or drifts.

  4. Review Mutation Log.

    • Select an asset and open its mutation log to see a timeline of configuration changes.

  5. Codify Revision.

    • For any suspicious or recent change, select the revision date and use "Codify Revision" to generate the IaC template for that point in time.

  6. Revert via Pull Request or Terraform Import.

    • Restore the asset to a previous configuration by submitting a pull request or using the provided Terraform import commands.

Tip: Mutation logs provide a detailed audit trail, including who made each change and what was modified, making root cause analysis straightforward.
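To make the idea of comparing revisions concrete, here is a minimal Python sketch that diffs two configuration snapshots and lists the attributes that changed. The snapshots and attribute names are invented for illustration; this is not how Firefly stores or computes mutations, only a way to reason about what a mutation log entry represents.

```python
def diff_config(old, new, prefix=""):
    """Yield (path, old_value, new_value) for each leaf that differs.

    Nested dicts are walked recursively. A key missing on one side is
    reported with None for that side.
    """
    for key in sorted(set(old) | set(new)):
        path = f"{prefix}{key}"
        a, b = old.get(key), new.get(key)
        if isinstance(a, dict) and isinstance(b, dict):
            yield from diff_config(a, b, prefix=f"{path}.")
        elif a != b:
            yield (path, a, b)


# Two hypothetical revisions of the same database asset.
before = {
    "instance_class": "db.m5.large",
    "multi_az": True,
    "tags": {"env": "prod", "owner": "data"},
}
after = {
    "instance_class": "db.m5.large",
    "multi_az": False,
    "tags": {"env": "prod"},
}

changes = list(diff_config(before, after))
for path, old_value, new_value in changes:
    print(f"{path}: {old_value!r} -> {new_value!r}")
```

In this example the diff surfaces `multi_az` flipping to `False`, the kind of change that would make the earlier revision the one to codify and restore.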


Preventing Misconfiguration and Reliability Risks

Proactive prevention is key to avoiding disasters. Firefly enables you to set up notifications and subscribe to insights that alert you to risky configurations or changes.

Receiving Notifications on Asset Changes

Stay informed about changes in your infrastructure by subscribing to notifications. These alerts help you:

  • Detect single points of failure

  • Monitor data protection status

  • Maintain visibility into system operations

How to Subscribe:

  • Go to Settings > Notifications in Firefly.

  • Choose your preferred notification channels (Slack, Teams, email, etc.).

  • Select which events or asset changes should trigger notifications (e.g., deletions, drifts, policy violations).

Tip: Fine-tune your notification settings to avoid alert fatigue and focus on critical events.
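One way to think about tuning notifications is as a routing table: each combination of event type and environment either maps to a channel or is suppressed. The Python sketch below models that decision. The event fields, channel names, and routing table are all hypothetical; in Firefly itself these choices are made in the notification settings, not in code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AssetEvent:
    kind: str         # e.g. "deletion", "drift", "policy_violation"
    environment: str  # e.g. "prod", "staging"
    asset: str


# Hypothetical routing table: (event kind, environment) -> channel.
# Anything not listed is suppressed to limit alert fatigue.
ROUTES = {
    ("deletion", "prod"): "pagerduty",
    ("policy_violation", "prod"): "slack:#cloud-alerts",
    ("drift", "prod"): "slack:#cloud-alerts",
    ("deletion", "staging"): "email",
}


def route(event):
    """Return the channel for an event, or None if it should be suppressed."""
    return ROUTES.get((event.kind, event.environment))


events = [
    AssetEvent("deletion", "prod", "orders-db"),
    AssetEvent("drift", "staging", "test-queue"),
    AssetEvent("policy_violation", "prod", "public-bucket"),
]

for event in events:
    channel = route(event)
    print(event.asset, "->", channel or "suppressed")
```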

Subscribing to Insights for Reliability and Misconfiguration Prevention

Firefly Insights are policy-driven checks that highlight risky configurations. Subscribing to these insights helps you proactively address issues before they lead to outages.

Top 5 Insights to Reduce Disaster Risk:

  1. Reliability: AWS Auto-Scaling Groups in a Single Availability Zone

    • Auto Scaling groups should span multiple Availability Zones for high availability. A single-zone group goes down with its zone.

  2. Reliability: AWS Database Instances in a Single Availability Zone

    • Databases in one zone are vulnerable to zone failures. Multi-AZ deployment is recommended for resilience.

  3. Reliability: AWS RDS Instance Without Deletion Protection

    • Without deletion protection, accidental or automated deletions can cause permanent data loss.

  4. Reliability: AWS DynamoDB Tables Without Point-in-Time Recovery

    • Enable point-in-time recovery (PITR) so you can restore a table to any point within the recovery window (up to 35 days on AWS) and protect against accidental writes or deletes.

  5. Misconfiguration: AWS ELB/LB Without Access Logs Enabled

    • Access logs are essential for troubleshooting, monitoring, and security analysis. Enable logging to maintain visibility.

How to Subscribe:

  • Go to Settings > Insights in Firefly.

  • Subscribe to the above insights and configure notification preferences.

Tip: Regularly review insight recommendations and remediate flagged issues to maintain a resilient infrastructure.
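The five insights above each reduce to a simple predicate over a resource's configuration. The Python sketch below expresses them as functions over plain dictionaries. The field names are modeled on AWS describe-call responses (`AvailabilityZones`, `MultiAZ`, `DeletionProtection`, `PointInTimeRecoveryStatus`, and the `access_logs.s3.enabled` load balancer attribute); the sample resources are invented, and this is not Firefly's implementation of these insights.

```python
def single_az_asg(asg):
    """Auto Scaling group spans fewer than two Availability Zones."""
    return len(asg.get("AvailabilityZones", [])) < 2


def single_az_db(db):
    """RDS instance is not a Multi-AZ deployment."""
    return not db.get("MultiAZ", False)


def no_deletion_protection(db):
    """RDS instance can be deleted without first disabling protection."""
    return not db.get("DeletionProtection", False)


def no_pitr(backups):
    """DynamoDB continuous-backups description without PITR enabled."""
    pitr = backups.get("PointInTimeRecoveryDescription", {})
    return pitr.get("PointInTimeRecoveryStatus") != "ENABLED"


def no_access_logs(attributes):
    """Load balancer attribute list without access logging turned on."""
    attrs = {a["Key"]: a["Value"] for a in attributes}
    return attrs.get("access_logs.s3.enabled", "false") != "true"


# Hypothetical resources for illustration.
asg = {"AutoScalingGroupName": "web-asg", "AvailabilityZones": ["us-east-1a"]}
db = {"DBInstanceIdentifier": "orders-db", "MultiAZ": False, "DeletionProtection": True}
backups = {"PointInTimeRecoveryDescription": {"PointInTimeRecoveryStatus": "DISABLED"}}
lb_attrs = [{"Key": "access_logs.s3.enabled", "Value": "true"}]

CHECKS = [
    ("Auto Scaling group in a single Availability Zone", single_az_asg, asg),
    ("Database instance in a single Availability Zone", single_az_db, db),
    ("RDS instance without deletion protection", no_deletion_protection, db),
    ("DynamoDB table without point-in-time recovery", no_pitr, backups),
    ("Load balancer without access logs", no_access_logs, lb_attrs),
]

findings = [title for title, check, resource in CHECKS if check(resource)]
for title in findings:
    print("FLAGGED:", title)
```

Each check treats a missing field as the risky case, so an incomplete record is flagged rather than silently passed.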


Summary

Firefly's backup and disaster recovery features empower you to:

  • Rapidly recover from accidental deletions or misconfigurations

  • Investigate and revert problematic changes

  • Proactively prevent outages with real-time notifications and policy-driven insights

By building these tools and practices into your operations, you can keep your cloud environment resilient, auditable, and secure.


Last updated 19 days ago
