witboost

DATA CAPTURE

witboost Data Capture automatically keeps a data lake in sync with its source systems.

Every data lake has an ingestion pipeline, and increasingly that pipeline is a real-time stream of change data capture (CDC) events. This raises the problem of applying mutations to data lake storage, which is typically immutable, without resorting to tons of batch jobs that degrade data freshness and create a scheduling hell on the cluster.

witboost Data Capture:

• Decouples the pipeline from the specific CDC format

• Applies all mutations in streaming, refreshing the lake in near real time

• Guarantees ACID compliance, supporting all major table formats (Delta, Iceberg, Hudi)

• Codeless, just configuration

• Extensible

• Enables business event generation directly on the CDC stream

  • SCD2
  • Data Deduplication
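
To make the ideas in the list above more concrete, here is a minimal Python sketch of three of them: decoupling from the CDC format, deduplicating events, and keeping an SCD2 history. It is not witboost code or configuration. The `Mutation` and `LakeTable` names, the in-memory dictionaries standing in for lake storage, and the timestamp-based duplicate check are all assumptions made for illustration. The input envelope follows the Debezium-style layout (`op`, `before`, `after`, `ts_ms`).

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class Mutation:
    """Format-neutral change record (hypothetical shape, for illustration)."""
    key: Any
    op: str               # "upsert" or "delete"
    row: Optional[dict]   # new row image, None for deletes
    ts: int               # source timestamp, used for ordering


def from_debezium_style(event: dict, key_field: str) -> Mutation:
    """Adapter for a Debezium-style envelope; other formats get their own adapter."""
    if event["op"] == "d":
        return Mutation(event["before"][key_field], "delete", None, event["ts_ms"])
    return Mutation(event["after"][key_field], "upsert", event["after"], event["ts_ms"])


class LakeTable:
    """In-memory stand-in for a lake table (assumption: real storage is ACID)."""

    def __init__(self) -> None:
        self.current = {}    # key -> latest row image
        self.history = []    # SCD2 versions with valid_from / valid_to
        self._last_ts = {}   # key -> timestamp of the last applied mutation

    def apply(self, m: Mutation) -> bool:
        # Deduplication (simplified): skip redelivered or stale events per key.
        if m.ts <= self._last_ts.get(m.key, -1):
            return False
        self._last_ts[m.key] = m.ts

        # SCD2: close the currently open version of this key, if any.
        for version in self.history:
            if version["key"] == m.key and version["valid_to"] is None:
                version["valid_to"] = m.ts

        if m.op == "delete":
            self.current.pop(m.key, None)
        else:
            self.current[m.key] = m.row
            self.history.append(
                {"key": m.key, "row": m.row, "valid_from": m.ts, "valid_to": None}
            )
        return True


events = [
    {"op": "c", "before": None, "after": {"id": 1, "city": "Rome"}, "ts_ms": 100},
    {"op": "u", "before": {"id": 1, "city": "Rome"},
     "after": {"id": 1, "city": "Milan"}, "ts_ms": 200},
    # Same update delivered twice, as can happen with at-least-once delivery.
    {"op": "u", "before": {"id": 1, "city": "Rome"},
     "after": {"id": 1, "city": "Milan"}, "ts_ms": 200},
    {"op": "d", "before": {"id": 1, "city": "Milan"}, "after": None, "ts_ms": 300},
]

table = LakeTable()
applied = [table.apply(from_debezium_style(e, "id")) for e in events]
```

In this sketch only the adapter function knows the source format, so supporting another CDC format would mean adding another adapter rather than changing `LakeTable`. Comparing timestamps per key is a deliberate simplification: two distinct changes sharing a timestamp would be treated as duplicates, so a real pipeline would more likely rely on a log offset or sequence number.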
