Class Overview
Why is this important?
The ability to perform ETL processes with regular expressions is a critical skill for accounting professionals in today's data-driven environment. Accounting graduate students who master these techniques will be better equipped to handle messy or inconsistent data, which is common in professional settings. These skills enhance their analytical abilities, enabling them to derive meaningful insights from complex datasets and improve decision-making processes. Additionally, automating data preparation tasks with tools like Alteryx increases efficiency and adds value in audit, advisory, and other accounting services where accurate data management is crucial.
What will we do?
In this class, students will use Alteryx to perform an Extract, Transform, and Load (ETL) process, focusing on the use of regular expressions to clean and restructure raw data. Students will extract specific patterns from unstructured text data, such as customer or transaction records, transform the extracted values into structured formats, and load the cleansed data into analytical workflows for further processing. Through hands-on exercises, students will learn how to automate and streamline data preparation tasks so that data is ready for analysis.
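The extract, transform, and load steps described above can be sketched outside Alteryx in a few lines of Python. This is only an illustration under stated assumptions: the sample records, the `E-` plus five digits employee ID format, and the function names `extract`, `transform`, and `load` are all hypothetical and do not come from the course cases. The email pattern is deliberately simplified and will not cover every valid address.

```python
import re

# Hypothetical raw records; the field layout is an assumption for illustration.
RAW_RECORDS = [
    "Emp ID: E-10234 | contact: jane.doe@example.com | hired 2021-03-15",
    "id=E-20991; Contact <John.Smith@Example.org>; hired 2019-11-02",
    "no identifiers in this line",
]

# Assumed formats: "E-" followed by five digits, a simplified email, an ISO date.
EMP_ID = re.compile(r"\bE-\d{5}\b")
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
DATE = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")


def extract(record):
    """Extract: take the first match of each pattern, or None if absent."""
    def first(pattern):
        match = pattern.search(record)
        return match.group(0) if match else None

    return {
        "emp_id": first(EMP_ID),
        "email": first(EMAIL),
        "hire_date": first(DATE),
    }


def transform(row):
    """Transform: lowercase the email; discard rows with no employee ID."""
    if row["emp_id"] is None:
        return None
    if row["email"] is not None:
        row = {**row, "email": row["email"].lower()}
    return row


def load(records):
    """Load: return the cleaned rows as a list of dicts, ready for analysis."""
    rows = (transform(extract(record)) for record in records)
    return [row for row in rows if row is not None]


cleaned = load(RAW_RECORDS)
for row in cleaned:
    print(row)
```

In Alteryx, the same patterns would typically be entered in the RegEx Tool rather than written as code; the sketch is meant only to show what each stage contributes.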
How this relates to other classes:
Building on the previous class, which introduced the fundamentals of Alteryx, this class goes deeper into the Extract, Transform, and Load (ETL) process, emphasizing the use of regular expressions to clean and reformat raw data. Students will apply their understanding of Alteryx to extract specific patterns from unstructured data, such as employee ID records and emails, and transform them into usable formats.
Materials and Preparation
Class Materials
- Case: Analytics_mindset_case_studies_ETL_Case2_Alteryx
 - Case: Analytics_mindset_case_studies_ETL_Case3_Alteryx
 - Case: Innovation_mindset_case_studies_Cybersecurity_Audit_Enron_Emails
 - Link: Online Regular Expressions Tool
 - Link: ChatGPT (or another LLM) for help with regular expressions
 - Link: Husky OnNet VPN Instructions (required for off-campus access to labs)
 - Slides: PowerPoint or PDF
 - Slide decks including additional in-class/after-class slides: PowerPoint or PDF
 - Analytics Tools: Alteryx: RegEx Tool and Join Tool
Suggested Pre-Class Preparation
- There is no required preparation for this class. The cases serve as a reference and will guide the labs.
 
Class Plan
- After a brief review of ETL Cases 2 and 3, we will work in the remote labs on the Join (ETL 4) case.
 - We plan to complete the Join case in the first half of class and then set up and work on an advanced ETL case: Innovation_mindset_case_studies_Cybersecurity_Audit_Enron_Emails.
 - We will also examine a database of emails that will provide more advanced regular expression challenges.
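For orientation before the Join case, the idea behind a join with three result sets (records found only on the left, matched records, and records found only on the right, comparable to the L, J, and R outputs of the Alteryx Join Tool) can be sketched in plain Python. The function name `join_on_key`, the `emp_id` key, and the sample rows are hypothetical and are not taken from the case materials.

```python
def join_on_key(left, right, key):
    """Return (left_only, joined, right_only) when matching rows on `key`."""
    # Index the right-hand rows by key so each left row is matched in one lookup.
    right_by_key = {}
    for r_row in right:
        right_by_key.setdefault(r_row[key], []).append(r_row)

    left_keys = {l_row[key] for l_row in left}

    joined, left_only = [], []
    for l_row in left:
        matches = right_by_key.get(l_row[key])
        if matches:
            # One output row per matching pair, with fields from both sides.
            joined.extend({**l_row, **r_row} for r_row in matches)
        else:
            left_only.append(l_row)

    right_only = [r_row for r_row in right if r_row[key] not in left_keys]
    return left_only, joined, right_only


# Hypothetical sample data for illustration only.
employees = [
    {"emp_id": "E-10234", "name": "Jane Doe"},
    {"emp_id": "E-20991", "name": "John Smith"},
]
emails = [
    {"emp_id": "E-10234", "email": "jane.doe@example.com"},
    {"emp_id": "E-77777", "email": "unknown@example.com"},
]

left_only, joined, right_only = join_on_key(employees, emails, "emp_id")
print("Left only: ", left_only)
print("Joined:    ", joined)
print("Right only:", right_only)
```

Reviewing the unmatched rows on each side is often as informative as the matched rows, since they can point to keys that were cleaned inconsistently in an earlier step.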