The brief
A corporate L&D team needed an AWS Data Lake hands-on lab for incoming engineers — one lab, 30 students, a single AWS account, no risk of one student wrecking another's environment. Off-the-shelf platforms didn't cover the Glue + Athena pipeline they wanted to teach.
What we built
- Customer-managed IAM policy scoped to quicklabs-{username}-* resources, region-locked to us-west-2
- Glue service role with explicit Deny on writes outside the student's own catalog namespace
- Two Terraform modules — admin (IAM, workgroup) and student (S3, Glue, ETL job)
- Athena workgroup per student, results bucket isolated, query history scoped
- PySpark Glue ETL converting raw CSV → partitioned Parquet (snappy, by year)
- Admin walkthrough for end-to-end verification + student handout for delivery
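To make the scoping idea concrete, here is a minimal sketch of what a per-student policy document could look like, generated in Python rather than Terraform so it can be inspected directly. It is not the policy from the engagement: the function name `student_policy`, the chosen actions, and the account ID are illustrative assumptions. Only the `quicklabs-{username}-*` naming pattern and the `us-west-2` region lock come from the description above.

```python
import json

REGION = "us-west-2"  # region lock taken from the lab description


def student_policy(username: str, account_id: str) -> dict:
    """Build an illustrative IAM policy limited to one student's resources.

    Hypothetical helper: the action lists are assumptions, not the lab's
    actual policy.
    """
    prefix = f"quicklabs-{username}-"
    # aws:RequestedRegion is a global condition key; requests aimed at any
    # other region fail to match these Allow statements.
    region_lock = {"StringEquals": {"aws:RequestedRegion": REGION}}
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "StudentBuckets",
                "Effect": "Allow",
                "Action": "s3:*",
                # Buckets with the student's prefix, plus the objects in them.
                "Resource": [
                    f"arn:aws:s3:::{prefix}*",
                    f"arn:aws:s3:::{prefix}*/*",
                ],
                "Condition": region_lock,
            },
            {
                "Sid": "StudentGlueJobs",
                "Effect": "Allow",
                "Action": ["glue:GetJob", "glue:StartJobRun", "glue:GetJobRun"],
                "Resource": f"arn:aws:glue:{REGION}:{account_id}:job/{prefix}*",
                "Condition": region_lock,
            },
        ],
    }


if __name__ == "__main__":
    # Placeholder account ID, for illustration only.
    print(json.dumps(student_policy("alice", "123456789012"), indent=2))
```

Because every `Resource` entry starts with the student's own prefix, a request against another student's bucket or job matches no Allow statement and is denied by default.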
How it works
1. Admin runs the IAM Terraform once per student, provisioning user, role, policies, and workgroup
2. Student receives credentials and signs into a region-locked console
3. Student runs the lab Terraform under their own credentials, which proves the policy is correct
4. Student exercises the pipeline: crawl raw, run ETL, crawl curated, query in Athena
5. Cleanup is one Terraform destroy: IAM and infra teardown in one motion
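Step 4 can also be driven from a script instead of the console. The sketch below shows one way to sequence it with boto3 Glue and Athena clients passed in by the caller. The crawler, job, workgroup, database, and table names are hypothetical (only the `quicklabs-{username}` prefix comes from the lab), and the query is a placeholder.

```python
import time


def wait_for(poll, done_states, delay=5.0, attempts=120):
    """Call poll() until it returns a state in done_states, then return it."""
    for _ in range(attempts):
        state = poll()
        if state in done_states:
            return state
        time.sleep(delay)
    raise TimeoutError("resource did not settle in time")


def run_crawler(glue, name, delay=5.0):
    """Start a Glue crawler and block until it is idle again."""
    glue.start_crawler(Name=name)
    # A crawler reports READY once its run has finished.
    wait_for(lambda: glue.get_crawler(Name=name)["Crawler"]["State"],
             {"READY"}, delay)


def run_job(glue, name, delay=5.0):
    """Start a Glue job run and raise unless it ends in SUCCEEDED."""
    run_id = glue.start_job_run(JobName=name)["JobRunId"]
    state = wait_for(
        lambda: glue.get_job_run(JobName=name, RunId=run_id)["JobRun"]["JobRunState"],
        {"SUCCEEDED", "FAILED", "STOPPED", "TIMEOUT", "ERROR"},
        delay,
    )
    if state != "SUCCEEDED":
        raise RuntimeError(f"Glue job {name} ended in state {state}")


def run_pipeline(glue, athena, username, database, delay=5.0):
    """Crawl raw, run ETL, crawl curated, then submit one Athena query.

    All resource names below are assumed for illustration.
    """
    prefix = f"quicklabs-{username}"
    run_crawler(glue, f"{prefix}-raw-crawler", delay)
    run_job(glue, f"{prefix}-etl", delay)
    run_crawler(glue, f"{prefix}-curated-crawler", delay)
    resp = athena.start_query_execution(
        QueryString='SELECT "year", COUNT(*) AS n FROM curated GROUP BY "year"',
        QueryExecutionContext={"Database": database},
        WorkGroup=prefix,  # per-student workgroup, name assumed
    )
    return resp["QueryExecutionId"]
```

With real credentials the clients would come from `boto3.client("glue", region_name="us-west-2")` and `boto3.client("athena", region_name="us-west-2")`. Taking the clients as arguments keeps the sequencing logic testable without an AWS account.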
What this proves
- 30 students in one AWS account with zero cross-tenant access
- Iteration loop: every IAM gap surfaced as a 403, fixed in the policy, re-applied — under five minutes per cycle
- Total cost per cohort under $50 in AWS spend
- Whole lab re-runs from a clean account in under 15 minutes
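One way to spot-check the cross-tenant claim without waiting for a 403 is the IAM policy simulator. The sketch below is not the verification procedure used in the engagement. It assumes a boto3 IAM client is passed in, and the helper name `find_leaks` and the example ARNs are hypothetical. It asks the simulator whether one student's principal would be allowed to act on another student's resources and returns every pair that comes back as allowed.

```python
def find_leaks(iam, principal_arn, actions, foreign_arns, region="us-west-2"):
    """Return (action, resource) pairs the principal is allowed on foreign ARNs.

    An empty list means the simulator found no cross-tenant access for the
    combinations checked. Hypothetical helper, shown as a sketch.
    """
    kwargs = {
        "PolicySourceArn": principal_arn,
        "ActionNames": list(actions),
        "ResourceArns": list(foreign_arns),
        # Supply the region so region-conditioned statements are evaluated
        # as they would be for a real request in the lab region.
        "ContextEntries": [{
            "ContextKeyName": "aws:RequestedRegion",
            "ContextKeyValues": [region],
            "ContextKeyType": "string",
        }],
    }
    leaks = []
    while True:
        resp = iam.simulate_principal_policy(**kwargs)
        for result in resp["EvaluationResults"]:
            if result["EvalDecision"] == "allowed":
                leaks.append((result["EvalActionName"],
                              result.get("EvalResourceName", "*")))
        # Follow pagination if the result set was truncated.
        if not resp.get("IsTruncated"):
            return leaks
        kwargs["Marker"] = resp["Marker"]
```

A typical call might pass a student's user ARN, actions such as `s3:GetObject`, and another student's `arn:aws:s3:::quicklabs-bob-*` resources. The simulator evaluates identity-based policies only unless more inputs are supplied, so it complements live testing under student credentials and does not replace it.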