Apache Paimon

Streaming Lakehouse Tables for Fast Inserts and Analytics

by Trex Team

€8.62 incl. VAT
Format: EPUB, DRM: none, 5.7 MB

Description

"Apache Paimon: Streaming Lakehouse Tables for Fast Inserts and Analytics"
If you’re already fluent in modern data platforms but tired of choosing between streaming ingestion speed and lakehouse analytics, this book is written for you. It targets experienced engineers and architects building multi-engine lakehouses who need a precise mental model of how a streaming table format behaves under real production pressure—high write rates, concurrent readers, evolving schemas, and operational constraints across object storage and metastores.
You’ll go deep on how Paimon’s snapshot-based commit system, manifests, and metadata structures enable atomic visibility, time travel, and incremental processing. The book then builds from table semantics—append-only versus primary-key upserts—into the LSM-style merge model, changelog production, and correctness boundaries for CDC and replay. You’ll learn to engineer physical layouts (partitioning, bucketing, sorting) that control amplification and keep queries stable, and to implement ingestion patterns that coordinate engine checkpoints with consistent commits. Finally, it operationalizes the platform: compaction strategy, small-file SLOs, retention and expiration safety, automation, observability, and cross-engine querying with clear governance and consistency contracts.
Expect an advanced, systems-oriented treatment with decision criteria, failure modes, and upgrade/compatibility risk management across Flink, Spark, Trino/Hive, and varied storage backends. Familiarity with distributed systems, stream processing fundamentals, and lakeho

Product details

ISBN 6610001180393
Publisher NobleTrex Press
Publication date 10 March 2026
Language English