FastFlowTransform Documentation Hub
FastFlowTransform (FFT) is a SQL + Python data modeling engine with a deterministic DAG, parallel executor, optional caching, incremental builds, auto-generated docs, snapshots, and built-in data-quality tests. The fft CLI orchestrates compilation, execution, docs, validation, and history tracking across DuckDB, Postgres, BigQuery (pandas + BigFrames), Databricks/Spark, and Snowflake Snowpark.
Use this page as the front door into the docs: start with the orientation section, then jump to the guide that matches your task.
Table of Contents
- Quick Orientation
- Build & Run Projects
- Modeling & Configuration
- Execution & State Management
- Testing & Data Quality
- Docs, Debugging & Operations
- Examples & Tutorials
- Reference & Contribution
- Need Help?
Quick Orientation
- New to FFT? Read the Quickstart for installation (venv + editable install), seeding, and the first fft run.
- Want the bigger picture? The Technical Overview explains the project layout, DAG, scheduler, registry, executors, and the roadmap snapshot.
- Learning the CLI surface area? Browse the CLI Guide for command groups such as fft run, fft snapshot run, fft dag, fft docgen, fft test, and fft utest.
Build & Run Projects
- Project layout & CLI workflow: Pair the “Project Layout” chapter of the Technical Overview with the CLI Guide to understand how fft run, fft test, and fft dag fit together.
- Profiles & environments: Profiles & Environments covers executor profiles, environment overrides, credential handling, and engine-specific flags.
- Runtimes & observability flags: Logging & Verbosity explains log levels, JSON logs, progress indicators, and metrics toggles during fft run.
- Local runtimes & engines: Local Engine Setup walks through DuckDB, Postgres, Spark/Delta, BigQuery, and Snowflake Snowpark bootstrapping for the demos.
- CI-friendly workflows: CI Checks & Change-Aware Runs introduces fft ci-check and fft run --changed-since for structural validation and diff-aware pipelines.
Modeling & Configuration
- SQL + Python authoring model: API & Models documents the Python node decorators, HTTP helper (fastflowtransform.api.http), and how ref()/source() bindings work in both SQL and Python models.
- Templates, macros, and config keys: Configuration & Macros lists the config(...) options, reusable macros, helper functions, and naming rules for .ff.sql/.ff.py.
- Project-level metadata: Project Configuration describes project.yml, default materializations, tags, exposures, docs strings, and the models/ hierarchy.
- Sources & seeds: Sources shows how to register upstream tables/files, snapshots of raw data, and how state tracking interacts with sources.
Execution & State Management
- Parallelism, caching & rebuilds: Cache & Parallelism dives into the level-wise scheduler, fingerprint cache, and --rebuild/--no-cache behaviors.
- Incremental models: Incremental Processing explains merge vs append strategies, cleanup rules, and engine-specific hooks.
- Snapshots / history tables: Snapshots documents the materialized='snapshot' config, timestamp vs check strategies, and the dedicated fft snapshot run . --env <profile> entrypoint.
- Selective runs: State Selection covers --selector, --select, --exclude, --changed, and --results filters across DAGs.
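To make the terms "level-wise scheduler" and "fingerprint cache" more concrete, here is a minimal Python sketch of the general ideas. It is not FFT's implementation: the function names, the three-model project, and the choice of SHA-256 over a model's SQL text plus its parents' fingerprints are all assumptions made for illustration. The Cache & Parallelism guide remains the authority on what FFT actually hashes and how it schedules.

```python
import hashlib
from typing import Dict, List, Set


def topo_levels(deps: Dict[str, Set[str]]) -> List[List[str]]:
    """Group nodes into levels; each level depends only on earlier levels.

    Assumes every parent also appears as a key in ``deps``.
    """
    remaining = {node: set(parents) for node, parents in deps.items()}
    levels: List[List[str]] = []
    while remaining:
        # Nodes with no unscheduled parents can run in parallel.
        ready = sorted(n for n, parents in remaining.items() if not parents)
        if not ready:
            raise ValueError("cycle detected in DAG")
        levels.append(ready)
        for node in ready:
            del remaining[node]
        for parents in remaining.values():
            parents.difference_update(ready)
    return levels


def fingerprint(node: str, sql: Dict[str, str],
                deps: Dict[str, Set[str]], memo: Dict[str, str]) -> str:
    """Hash a node's own text together with its parents' fingerprints."""
    if node not in memo:
        h = hashlib.sha256(sql[node].encode("utf-8"))
        for parent in sorted(deps[node]):
            h.update(fingerprint(parent, sql, deps, memo).encode("utf-8"))
        memo[node] = h.hexdigest()
    return memo[node]


def all_fingerprints(sql: Dict[str, str],
                     deps: Dict[str, Set[str]]) -> Dict[str, str]:
    memo: Dict[str, str] = {}
    return {node: fingerprint(node, sql, deps, memo) for node in deps}


# Hypothetical three-model project (names are illustrative only).
deps = {
    "stg_users": set(),
    "stg_orders": set(),
    "mart_sales": {"stg_users", "stg_orders"},
}
sql = {
    "stg_users": "select * from raw_users",
    "stg_orders": "select * from raw_orders",
    "mart_sales": "select * from stg_users join stg_orders using (user_id)",
}

levels = topo_levels(deps)

# Editing one staging model changes its fingerprint and, transitively,
# the fingerprint of everything downstream of it.
before = all_fingerprints(sql, deps)
edited = dict(sql, stg_orders="select id, amount from raw_orders")
after = all_fingerprints(edited, deps)
stale = sorted(node for node in deps if before[node] != after[node])

print(levels)
print(stale)
```

In this sketch the two staging models land in the first level and could run concurrently, while only the edited model and its downstream consumer are marked stale; an unchanged model would be a candidate for skipping.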
Testing & Data Quality
- Schema-bound YAML tests: YAML Tests details how to define and run column-level constraints declared in .yml.
- Reusable data-quality suites: Data Quality Tests catalogs reconciliation, freshness, and anomaly rules that can attach to models or sources.
- Source freshness guard-rails: Source Freshness covers fft source freshness, metadata in sources.yml, and interpreting warn/error thresholds in the docs UI.
- Fast model unit tests: Unit Tests shows how to author .sql/.py assertions, seed fixtures, and run them via fft utest.
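The warn/error thresholds mentioned for source freshness can be read as a simple age comparison. The sketch below illustrates that reading in Python; the function name, parameter names, status strings, and the strict "greater than" comparison are assumptions for illustration and are not taken from FFT. The Source Freshness guide defines the actual keys in sources.yml and the exact semantics.

```python
from datetime import datetime, timedelta, timezone


def freshness_status(last_loaded_at: datetime, now: datetime,
                     warn_after: timedelta, error_after: timedelta) -> str:
    """Classify a source by the age of its most recent load."""
    age = now - last_loaded_at
    if age > error_after:
        return "error"
    if age > warn_after:
        return "warn"
    return "pass"


# Hypothetical thresholds: warn after 6 hours, error after 24 hours.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
warn_after = timedelta(hours=6)
error_after = timedelta(hours=24)

fresh = freshness_status(now - timedelta(hours=1), now, warn_after, error_after)
aging = freshness_status(now - timedelta(hours=8), now, warn_after, error_after)
stale = freshness_status(now - timedelta(days=2), now, warn_after, error_after)

print(fresh, aging, stale)
```

Checking the error threshold first keeps the function correct even when a source is older than both limits.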
Docs, Debugging & Operations
- Auto-generated docs & lineage: Auto Docs explains fft dag --html, fft docgen, JSON exports, and optional sync-db-comments for Postgres/Snowflake.
- Visibility & logging: Logging & Verbosity lists CLI flags for structured logs, progress bars, and verbose executor info.
- Troubleshooting: Troubleshooting & Error Codes enumerates the most common failures, retry strategies, and diagnostic commands.
Examples & Tutorials
- Core walkthroughs: Basic Demo and Materializations Demo cover the standard table/view/incremental builds and DAG navigation.
- Testing-focused: Data Quality Tests Demo and Macros Demo showcase advanced assertions and templating.
- Performance & state: Cache Demo, Environment Matrix Demo, and Incremental Demo highlight rebuilds and selective runs.
- API & integrations: API Demo illustrates Python HTTP models; Local Engine Setup provides engine-specific Makefiles.
- History tracking: Snapshot Demo demonstrates the snapshot materialization end-to-end with timestamp/check strategies.
All demos live in the top-level examples/ directory and ship with Makefiles plus runnable seeds.
Reference & Contribution
- API reference: Browse the generated API Reference (MkDocStrings) for public functions, classes, and executors under src/fastflowtransform.
- Architecture internals: The Technical Overview dives into registries, DAG building, validation, and engine abstractions.
- Contributing: Follow Contributing.md for dev environment setup (uv, pyproject.toml), coding standards, tests, and PR expectations.
- License: Apache 2.0; see License.md.
Need Help?
- Open an issue or PR with context — start with Contributing.md if you want to propose changes.
- Surface documentation gaps, bugs, or missing examples via GitHub issues in MirrorsAndMisdirections/FastFlowTransform.
- For roadmap highlights or planning threads, check the final section of the Technical Overview.