Artie is designed to handle high throughput with low latency. We automatically handle schema evolution and use atomic operations to guarantee data consistency.
Automatic column detection, mapping and creation
Option to skip or drop deleted columns
Choose between soft or hard row deletes, or skip deletes entirely
No more missing data from pipelines that silently fail
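To make the idea concrete, here is a minimal sketch of automatic column detection and creation: an incoming change event's fields are compared against the destination table's known columns, and the DDL needed to keep them aligned is generated. The type mapping, table name, and `detect_new_columns` helper are illustrative assumptions, not Artie's actual API.

```python
# Known destination schema: column name -> destination type (illustrative).
destination_schema = {"id": "BIGINT", "email": "VARCHAR"}

# Simplified Python-to-warehouse type mapping for illustration only.
TYPE_MAP = {int: "BIGINT", float: "DOUBLE", str: "VARCHAR", bool: "BOOLEAN"}

def detect_new_columns(record: dict, schema: dict) -> list[str]:
    """Return ALTER TABLE statements for fields present in the record
    but missing from the destination schema, updating the cached schema."""
    statements = []
    for column, value in record.items():
        if column not in schema:
            col_type = TYPE_MAP.get(type(value), "VARCHAR")
            schema[column] = col_type
            statements.append(f"ALTER TABLE users ADD COLUMN {column} {col_type}")
    return statements

# A change event arrives carrying a column the destination has not seen yet.
event = {"id": 42, "email": "a@b.co", "signup_ts": "2024-01-01"}
print(detect_new_columns(event, destination_schema))
```

The same comparison also drives the “skip or drop deleted columns” choice: columns in the schema but absent from events can either be left alone or dropped.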
We hard-fail and continuously retry on every data ingestion error to guarantee data consistency
We leverage Kafka for ordering and use Kafka and SQL transactions to ensure idempotency and atomicity
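A small sketch of why merge-style writes keyed on a primary key are idempotent: applying the same batch of change events twice leaves the destination in the same state, so Kafka's at-least-once redelivery cannot corrupt data. The destination table is modeled as a plain dict here; the `__deleted` marker is an illustrative convention, not a specific wire format.

```python
def merge(table: dict, events: list[dict]) -> None:
    """Upsert (or delete) each event by primary key; replays are no-ops."""
    for e in events:
        if e.get("__deleted"):
            table.pop(e["id"], None)
        else:
            table[e["id"]] = {k: v for k, v in e.items() if k != "__deleted"}

table = {}
batch = [{"id": 1, "name": "ada"}, {"id": 2, "name": "bob", "__deleted": True}]
merge(table, batch)
snapshot = dict(table)
merge(table, batch)          # simulate Kafka redelivering the same batch
assert table == snapshot     # idempotent: the replay changes nothing
print(table)
```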
Log-based CDC replication is the least intrusive and most efficient way to replicate data from databases
Historical backfills do not require table locking and can be performed against replicas
Minimal operation log growth as logs are constantly streamed to Kafka clusters, reducing risk of replication slot overflow
Kafka is our external buffer (instead of your operation log) to handle back pressure from data pipeline errors
Extract changes from source systems
Publish data to Kafka
Consume from Kafka and stream data to a destination staging table
Align the staging and destination table schemas and merge changes
Delete staging table and commit offset in Kafka
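The five steps above can be sketched end to end in plain Python. The Kafka consumer and warehouse are stubbed out (schema alignment omitted for brevity) so the control flow is visible: the offset is committed only after the merge succeeds, so a crash mid-cycle replays the batch rather than losing it. All class and method names here are illustrative stand-ins.

```python
class StubConsumer:
    """Stand-in for a Kafka consumer with explicit offset commits."""
    def __init__(self, batches):
        self.batches = batches
        self.offset = 0
    def poll(self):
        return self.batches[self.offset] if self.offset < len(self.batches) else []
    def commit(self):
        self.offset += 1  # advance only on explicit commit

class StubWarehouse:
    """Stand-in for a destination warehouse with staging tables."""
    def __init__(self):
        self.tables = {"users": {}}
    def create_staging(self, name):
        self.tables[name] = {}
    def load(self, name, batch):
        for row in batch:
            self.tables[name][row["id"]] = row
    def merge(self, staging, dest):
        self.tables[dest].update(self.tables[staging])
    def drop(self, name):
        del self.tables[name]

def run_sync_cycle(consumer, warehouse):
    batch = consumer.poll()                  # 3. consume change events
    if not batch:
        return
    warehouse.create_staging("users__stg")
    warehouse.load("users__stg", batch)      #    stream into staging
    warehouse.merge("users__stg", "users")   # 4. merge into destination
    warehouse.drop("users__stg")             # 5. delete staging table...
    consumer.commit()                        #    ...then commit the offset

consumer = StubConsumer([[{"id": 1, "v": "a"}], [{"id": 2, "v": "b"}]])
warehouse = StubWarehouse()
run_sync_cycle(consumer, warehouse)
run_sync_cycle(consumer, warehouse)
print(warehouse.tables["users"])
```

Because the merge is idempotent, replaying an uncommitted batch after a failure converges to the same destination state.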
Get deployment analytics out-of-the-box to answer questions like “How many rows have been synced in the past hour?”, “What is the latency for our most important tables?”, or “What factors impact latency and how are those metrics trending?”
Real-time observability into database pipelines and peripheral infrastructure. Understand how systems impact one another, reduce downtime and debug issues faster, and generate proactive alerts to maintain robust infrastructure.
Enable history mode with a single click and we will create a separate table that records every data mutation along with the database timestamp and type of operation (Slowly Changing Dimension Tables Type 4).
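A minimal sketch of what history mode produces: alongside the current table, every mutation is appended to a separate history table together with the database timestamp and operation type, the SCD Type 4 pattern. The `__op`/`__db_ts` column names and the single-letter op codes are illustrative assumptions.

```python
current, history = {}, []

def apply_mutation(op: str, row: dict, db_ts: str) -> None:
    """Apply a change to the current table and append it to the history table."""
    if op == "d":                   # delete
        current.pop(row["id"], None)
    else:                           # "c" (create) or "u" (update)
        current[row["id"]] = row
    history.append({**row, "__op": op, "__db_ts": db_ts})

apply_mutation("c", {"id": 1, "plan": "free"}, "2024-01-01T00:00:00Z")
apply_mutation("u", {"id": 1, "plan": "pro"},  "2024-02-01T00:00:00Z")
apply_mutation("d", {"id": 1, "plan": "pro"},  "2024-03-01T00:00:00Z")

# The row is gone from the current table, but its full lifecycle survives.
print(len(current), len(history))
```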
Configure peak and off-peak hours with various latency requirements to reduce Snowflake compute costs.
Easily consolidate partitioned tables in your source into a single table in your destination. Turn on ‘skip deletes’ to retain fewer partitions in your source database, saving on storage costs and improving performance, while archiving all data in your data warehouse.
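One way to picture the consolidation: change events from date-suffixed source partitions all route to a single parent table in the destination. The naming convention and regex below are illustrative assumptions, not how Artie actually identifies partitions.

```python
import re

def destination_table(source_table: str) -> str:
    """Map a date-suffixed partition name (e.g. events_2024_01) to its
    parent table; non-partitioned names pass through unchanged."""
    return re.sub(r"_\d{4}_\d{2}$", "", source_table)

# Events from every monthly partition land in the same destination table.
print(destination_table("events_2024_01"))
print(destination_table("events_2024_02"))
```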