My name is Josh Cooper, a
Wisconsin-based

Senior Data Engineer

Microsoft Fabric Builder

Cloud Data Platform Engineer

Fabric Platform Automation

Productionize Microsoft Fabric capacity, workspace, inventory, and migration automation at enterprise scale using Python, PySpark, Azure DevOps, and governed self-service pipelines.

Lakehouse & Warehouse Engineering

Design and operate lakehouse and warehouse architectures across Fabric, Snowflake, Azure, and AWS, with ETL/ELT frameworks, CDC patterns, data validation, and source-to-target lineage.

Governance & Analytics Operations

Build governed inventory, ACL, tenant-setting, MDM, and operational reporting pipelines while supporting secure SDLC, compliance controls, Power BI dashboards, and production analytics operations.

Senior Data Engineer with 10+ years of experience designing cloud data platforms, lakehouse and warehouse architectures, ETL/ELT frameworks, governance pipelines, and enterprise analytics infrastructure across Microsoft Fabric, Azure, AWS, Snowflake, and Power BI.

Jul '24 - Present

Fabric Data & Operations Engineer at PwC via Insight Global

  • Support PwC enterprise data platform operations for a financial services consulting environment, delivering automation through Agile teams using Azure DevOps.
  • Built and productionized Microsoft Fabric Capacity Management automation supporting 339 capacities, 1.18M workspaces, and 2.96M Fabric and Power BI items.
  • Optimized workspace migration tooling for 100K workspaces, reducing projected migration runtime from 7 days to 8 hours through improved API design and execution patterns.
  • Led Power BI Premium P SKU to F SKU migration automation across workspace discovery, validation, execution, logging, and post-migration verification.
  • Implemented Fabric inventory and governance pipelines landing capacities, workspaces, users, ACLs, tenant settings, and Fabric items into Lakehouse bronze, silver, and gold layers with CDC.
  • Developed ARMA, a JSON-driven Python automation framework for capacity lifecycle management across DEV, UAT, and PROD via Azure DevOps and Fabric self-service pipelines.
  • Automated governance controls including regional validation, BillingID enforcement, dynamic tagging, service principal execution, and configuration standardization.
  • Delivered Fabric Management dashboards and PySpark notebook utilities for capacity utilization and operational KPI tracking with Azure Log Analytics.
Dec '23 - Jul '24

Lead Data Engineer at Great Wolf Resorts

  • Led data engineering delivery for a 1,000+ employee hospitality organization, managing FTE and contract engineers in an Agile environment.
  • Led end-to-end redesign of the AWS and Snowflake data warehouse, migrating property management systems (PMS) to the cloud with S3 staging, IAM access controls, and Glue/Lambda pipelines.
  • Delivered same-day operational reporting by implementing near-real-time data flows for leadership dashboards.
  • Built the organization's first Snowflake data lineage model using LLM-assisted analysis to map source-to-target dependencies and improve pipeline governance.
  • Implemented GA4 and BigQuery integration to centralize analytics data and improve query performance.
  • Developed and validated data pipelines using Python and Snowflake Snowpark for large-scale dataset processing.
  • Provided technical direction and mentorship to FTE and contract engineers.
Apr '19 - Nov '23

Data Engineer at Wisconsin Department of Employee Trust Funds

  • Served as senior data engineer on a 15-person team supporting enterprise data warehousing, MDM, ETL, and reporting initiatives in an Agile/Jira environment.
  • Built the agency data warehouse from the ground up, establishing data lake infrastructure, storage systems, database design, ingestion pipelines, and governance policies.
  • Designed and implemented a multi-domain Semarchy MDM model mastering person, address, email, phone, and employment entities across CRM, legacy systems, and OnBase.
  • Built ETL/ELT pipelines with Alteryx, Python, and SSIS, including API-based integrations for real-time data flow and automated scheduling via Control-M and Python/Bash.
  • Established CI/CD pipelines for the MDM system and championed SDLC practices across ETL and dashboard development using Git.
  • Administered Tableau and Alteryx servers, including patching, configuration management, and dashboard development for agency-wide reporting.
  • Maintained strict HIPAA compliance, with a focus on PII/PHI handling across pipelines.
  • Mentored 8 junior developers across dashboard, ETL, and reporting projects.
May '15 - Apr '19

Business Intelligence Analyst/Developer at Acoustic Ceiling Products

  • Served as sole BI and data engineering hire for a 200-employee manufacturing company, reporting directly to the CEO and building the company's first centralized analytics platform.
  • Built the company's data infrastructure from scratch, ingesting internal and external operational data into centralized AWS RDS and Redshift warehouses.
  • Designed ETL pipelines integrating previously siloed systems and replaced manual processes with automated data flows using Lambda for containerized job execution.
  • Deployed and maintained AWS infrastructure with CloudFormation, CloudWatch monitoring, SES alerting, and IAM/VPC security controls.
  • Administered Tableau Server on Linux EC2 and developed the company's dashboard and reporting suite from the data warehouse.
  • Partnered directly with the CEO to define reporting priorities, translate business requirements, and deliver executive dashboards.

Microsoft Certifications

  • Microsoft Certified: Azure AI Engineer Associate
  • Microsoft Certified: Fabric Analytics Engineer Associate

AWS Certifications

  • AWS Certified Solutions Architect - Associate

Alteryx Certifications

  • Alteryx Designer Advanced Certification
  • Alteryx Designer Core Certified
  • Alteryx Server Administration
  • Alteryx Server Implementation

Snowflake Certification

  • SnowPro Core Certification

Wisconsin Lake Explorer

Filter more than 15,000 Wisconsin lakes by fish species, access amenities, and DNR regulations, all plotted on an interactive Leaflet map.

Built with curated DNR datasets, accessible controls, and dynamic clustering for fast exploration.

Grammarling

Smart flashcards that accelerate multilingual vocabulary growth through spaced repetition and active recall.

Supports multiple languages with customizable decks and progress tracking.

Key Skills

  • SQL
  • Python & PySpark
  • Microsoft Fabric
  • Azure & AWS
  • Snowflake
  • Power BI
  • ETL/ELT Frameworks
  • Lakehouse/Warehouse Architecture
  • MDM
  • CI/CD

Technical Skills

  • Languages: SQL, Python, PySpark, Bash
  • Platforms: Microsoft Fabric, Azure, AWS, Snowflake, Power BI, Tableau
  • Architecture: Lakehouse, warehouse, data lake, MDM, governance pipelines, CDC
  • Engineering: ETL/ELT, API integrations, data validation, lineage, operational reporting
  • DevOps: Azure DevOps, CI/CD, Git, CloudFormation, Control-M, IAM, VPC, CloudWatch
  • Delivery: Agile teams, documentation, mentoring, secure SDLC, production support

Hobbies

Aquaponics; learning ML/AI; using AWS and Azure for side projects.