2026 · Personal project
NYC Parking Violations Analytics (dbt + DuckDB)
A dbt project that models NYC parking violations with a medallion architecture (bronze -> silver -> gold). The pipeline cleans raw data, enriches it with fee logic, and delivers analytics-ready tables for reporting.
ETL/ELT

Portfolio highlights
- End-to-end ELT pipeline in dbt with DuckDB.
- Layered modeling with clear lineage from raw to curated outputs.
- Data quality tests and documentation generation.
Architecture
- Bronze models stage raw tables without transformation.
- Silver models standardize columns, add flags, and join fee logic.
- Gold models deliver aggregated business metrics.
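The three layers above can be sketched end to end in a few lines. This is a minimal illustration, not the project's actual models: it uses Python's built-in sqlite3 as a stand-in for DuckDB so it runs without extra dependencies, plain SQL views in place of dbt models, and hypothetical column names (summons_number, state, fee_usd) with made-up sample rows.

```python
import sqlite3

# Stand-in for DuckDB: sqlite3 ships with Python, so the sketch runs anywhere.
con = sqlite3.connect(":memory:")

# Hypothetical raw tables. The real schemas and values are assumptions here.
con.execute(
    "CREATE TABLE parking_violations_2023 "
    "(summons_number TEXT, violation_code TEXT, state TEXT)"
)
con.executemany(
    "INSERT INTO parking_violations_2023 VALUES (?, ?, ?)",
    [("1001", "21", "ny"), ("1002", "21", "NJ"), ("1003", "46", "NY")],
)
con.execute(
    "CREATE TABLE parking_violation_codes (violation_code TEXT, fee_usd INTEGER)"
)
con.executemany(
    "INSERT INTO parking_violation_codes VALUES (?, ?)",
    [("21", 65), ("46", 115)],  # made-up fee amounts
)

# Bronze: expose the raw tables unchanged.
con.execute(
    "CREATE VIEW bronze_parking_violations AS "
    "SELECT * FROM parking_violations_2023"
)
con.execute(
    "CREATE VIEW bronze_parking_violation_codes AS "
    "SELECT * FROM parking_violation_codes"
)

# Silver: standardize a column, add a flag, and join the fee lookup.
con.execute("""
    CREATE VIEW silver_parking_violations AS
    SELECT
        v.summons_number,
        v.violation_code,
        UPPER(v.state)         AS registration_state,
        UPPER(v.state) <> 'NY' AS is_out_of_state,
        c.fee_usd
    FROM bronze_parking_violations AS v
    LEFT JOIN bronze_parking_violation_codes AS c
        ON c.violation_code = v.violation_code
""")

# Gold: aggregate to a reporting-ready metric per violation code.
con.execute("""
    CREATE VIEW gold_ticket_metrics AS
    SELECT
        violation_code,
        COUNT(*)     AS ticket_count,
        SUM(fee_usd) AS total_fee_usd
    FROM silver_parking_violations
    GROUP BY violation_code
""")

gold_rows = con.execute(
    "SELECT * FROM gold_ticket_metrics ORDER BY violation_code"
).fetchall()
print(gold_rows)
```

Because each layer only reads from the one below it, the lineage from raw table to gold metric stays easy to trace. In dbt, that dependency would be expressed with ref() calls between models rather than with hard-coded view names.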
Data
Core datasets and storage locations used by the pipeline.
- DuckDB database: data/nyc_parking_violations.db
- Raw tables: parking_violations_2023, parking_violation_codes
- Reference CSVs: files in data/
Model summary
The layered dbt models that power the medallion architecture.
- Bronze: bronze_parking_violations, bronze_parking_violation_codes
- Silver: silver_parking_violations, silver_parking_violation_codes, silver_violation_tickets, silver_violation_vehicles
- Gold: gold_ticket_metrics, gold_vehicle_metrics
Tests
Data quality coverage baked into the project.
- Built-in tests on key fields (unique, not_null).
- Custom generic test generic_not_null for column null checks.
- Singular test violation_codes_revenue (warning severity).
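The idea behind these tests can be sketched as follows: a dbt data test is a query that selects failing rows, it passes when the query returns nothing, and a warning severity reports failures without failing the run. The sketch below is an assumption-laden illustration, not the project's test code. It again uses sqlite3 in place of DuckDB, the helper names (run_test, not_null_sql, unique_sql) and the table columns are hypothetical, and it does not reproduce the logic of violation_codes_revenue.

```python
import sqlite3

con = sqlite3.connect(":memory:")

# Hypothetical table contents; one row has a missing violation_code.
con.execute(
    "CREATE TABLE silver_violation_tickets "
    "(summons_number TEXT, violation_code TEXT)"
)
con.executemany(
    "INSERT INTO silver_violation_tickets VALUES (?, ?)",
    [("1001", "21"), ("1002", None), ("1003", "46")],
)


def not_null_sql(table: str, column: str) -> str:
    """Build a query that selects rows where the column is NULL."""
    return f"SELECT * FROM {table} WHERE {column} IS NULL"


def unique_sql(table: str, column: str) -> str:
    """Build a query that selects values appearing more than once."""
    return (
        f"SELECT {column} FROM {table} "
        f"WHERE {column} IS NOT NULL "
        f"GROUP BY {column} HAVING COUNT(*) > 1"
    )


def run_test(failing_rows_sql: str, severity: str = "error") -> str:
    """Run a failing-rows query: no rows means pass, otherwise warn or fail."""
    failures = con.execute(failing_rows_sql).fetchall()
    if not failures:
        return "pass"
    return "warn" if severity == "warn" else "fail"


results = {
    "unique_summons_number": run_test(
        unique_sql("silver_violation_tickets", "summons_number")
    ),
    "not_null_summons_number": run_test(
        not_null_sql("silver_violation_tickets", "summons_number")
    ),
    # Warning severity: the NULL row is reported but does not count as a failure.
    "not_null_violation_code": run_test(
        not_null_sql("silver_violation_tickets", "violation_code"),
        severity="warn",
    ),
}
print(results)
```

In a real dbt project the built-in unique and not_null tests are declared in YAML next to the model, and severity is set in the test's config rather than passed as a function argument.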
Explore & Repo
Source code and documentation for the full dbt project.