OpenLineage

Standardized Data Lineage for Modern Pipelines

by Trex Team

€8.62 incl. VAT
Format: EPUB, DRM: none, 5.6 MB

Description

"OpenLineage: Standardized Data Lineage for Modern Pipelines"
Data lineage is only useful when it stays correct under retries, partial failures, and heterogeneous tooling—and that’s where most implementations quietly break. This book is written for experienced data engineers, platform engineers, and architects who need lineage they can trust in production, not just a diagram for a slide. It focuses on the practical contracts and design choices that determine whether lineage becomes reliable operational telemetry or an expensive, fragmented graph.
You’ll build a rigorous mental model of lineage goals and scope, then dive into OpenLineage as an event standard: producers, transports, and consumers. From there, the book goes deep on the core event model—RunEvents, Jobs, Datasets, and input/output semantics—so you can produce deterministic graphs across orchestrators and execution engines. You’ll learn lifecycle correctness (START/terminal, idempotency, retries), identity and naming strategies that don’t drift, and facet design for rich metadata without interoperability loss. Practical chapters cover capture-layer decisions, mixing automatic and custom instrumentation safely, configuring transports for real networks, validating events, monitoring lineage completeness, and emitting governance-relevant metadata without creating privacy risk.
Expect a specification-aware, failure-mode-driven approach with clear decision criteria and operational guardrails. Familiarity with modern data stacks (orchestrators, warehouses/lakes, streaming, CI/CD, observability) is assumed; the payof

Product details

ISBN 6610001179151
Publisher NobleTrex Press
Publication date 9 March 2026
Language English