Archive Overview

Why Use SparkLogs Archiving & Replication?

  • Unlimited Data Retention: Retain an additional backup copy of your observability data for as long as you need, with no hidden retention fees.
  • Strong Compliance Support: All data objects are immutable after creation, allowing safe replication to WORM, bucket-locked, or object-locked storage for regulatory requirements.
  • Maximum Flexibility: Seamlessly replicate archives to your own cloud storage (AWS S3, GCS, Azure Blob, or any S3-compatible provider) for compliance, DR, and sovereignty needs.
  • Queryable Archive: Archived data is stored as partitioned Parquet files, so it's instantly queryable in tools like AWS Athena, BigQuery, Azure Synapse, and Apache Spark (see the query sketch after this list).
  • Zero Hassle: No complex tiering or rehydration process: query historical and current data in a unified platform.
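
For example, once the archive is replicated to your own S3 bucket and registered as an external table, you can query it from AWS Athena. A minimal sketch with boto3; the database, table, bucket, and dt partition column are hypothetical placeholders, not names defined by SparkLogs:

```python
import time

import boto3

# Hypothetical names: replace with your own Athena database/table and
# the S3 location where Athena should write query results.
DATABASE = "sparklogs_archive"
OUTPUT = "s3://my-athena-results/sparklogs/"
SQL = "SELECT COUNT(*) AS events FROM logs WHERE dt = '2024-06-01'"

athena = boto3.client("athena")

# Start the query against the external table over the Parquet archive.
qid = athena.start_query_execution(
    QueryString=SQL,
    QueryExecutionContext={"Database": DATABASE},
    ResultConfiguration={"OutputLocation": OUTPUT},
)["QueryExecutionId"]

# Poll until Athena finishes, then fetch the result rows.
while True:
    status = athena.get_query_execution(QueryExecutionId=qid)
    state = status["QueryExecution"]["Status"]["State"]
    if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
        break
    time.sleep(1)

if state == "SUCCEEDED":
    results = athena.get_query_results(QueryExecutionId=qid)
    for row in results["ResultSet"]["Rows"]:
        print([col.get("VarCharValue") for col in row["Data"]])
```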

With SparkLogs archiving and replication, it's fast, simple, and cost-effective to build your own cold Data Lake for extensive historical analysis, machine learning, point-in-time forensic investigations, or meeting the strictest data governance needs.

Archiving Details

SparkLogs automatically retains your ingested data for live querying for months or even years according to your chosen retention period, with no extra retention charges. You can verify your workspace's retention policy in the Data Sovereignty section of the home dashboard. Our unique architecture makes it cost-effective to store and query data for much longer than legacy systems.

In addition, an archived backup of all your data is created daily and kept for 1 year (on Cloud plans) or any custom period (on Private Cloud deployments). This is maintained as a compressed, hive-partitioned Parquet Data Lake, yielding 10x+ space savings and direct compatibility with leading analytics systems like AWS Athena, BigQuery, Azure Synapse, Apache Spark, and more.
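
Because the archive is a hive-partitioned Parquet tree, any Parquet-aware tool can read it directly. As a minimal sketch, here is how pyarrow can load a single day's partition; the local path and the dt partition column name are assumptions about the layout for illustration, not documented guarantees:

```python
import pyarrow.dataset as ds

# Assumed layout: a hive-partitioned tree such as .../dt=2024-06-01/*.parquet.
# The path and partition column name are placeholders.
dataset = ds.dataset("sparklogs-archive/", format="parquet", partitioning="hive")

# Partition pruning: only files under dt=2024-06-01 are actually read.
table = dataset.to_table(filter=(ds.field("dt") == "2024-06-01"))
print(table.num_rows, table.schema.names)
```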

Private Cloud customers have direct access to their archived data in their configured Google Cloud Storage bucket.

SparkLogs Cloud customers can directly access their archived data by replicating it to their own object storage bucket at no extra cost.
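
As an illustration of that direct access, the sketch below lists archive objects with the google-cloud-storage client; the bucket name and prefix are hypothetical and depend on how your deployment is configured:

```python
from google.cloud import storage

# Placeholder bucket and prefix: use the bucket configured for your deployment.
client = storage.Client()
for blob in client.list_blobs("my-sparklogs-archive", prefix="archive/"):
    print(f"{blob.name}  ({blob.size} bytes)")
```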

Replication Details

With replication, you can copy archived data to one or more additional storage buckets, including AWS S3, other S3-compatible providers, GCS, and Azure Blob Storage. Follow the setup guide, test the connection to your target bucket(s), and then wait for the first daily replication to complete.

Archiving and replication operations run daily, usually within an hour after midnight UTC.
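
Since runs complete on a daily cadence, a simple way to confirm replication is healthy is to check that the previous day's objects exist in the target bucket. A minimal sketch with boto3, assuming an S3 target and a hypothetical dt=YYYY-MM-DD partition prefix:

```python
from datetime import datetime, timedelta, timezone

import boto3

# Hypothetical bucket name and partition layout; adjust to your setup.
BUCKET = "my-replica-bucket"
yesterday = (datetime.now(timezone.utc) - timedelta(days=1)).strftime("%Y-%m-%d")
prefix = f"archive/dt={yesterday}/"

s3 = boto3.client("s3")
resp = s3.list_objects_v2(Bucket=BUCKET, Prefix=prefix, MaxKeys=1)

if resp.get("KeyCount", 0) > 0:
    print(f"Replication for {yesterday} is present under {prefix}")
else:
    print(f"No objects yet for {yesterday}; the daily run may not have finished")
```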