Class Overview
Why is this important?
Understanding and mastering the ETL process is essential for accounting graduate students because the quality of any data analysis depends on how well the data has been prepared. Poorly structured or unclean data can lead to faulty conclusions, negatively impacting decision-making in professional settings. By working with real-time financial data, students learn to handle common challenges such as inconsistent formats, missing data, and the integration of multiple data sources. This class equips students with the practical skills to ensure their data is reliable, enabling them to deliver accurate, data-driven insights critical for roles in auditing, financial reporting, and advisory services. The ability to effectively prepare and manage large datasets also provides a foundation for more advanced analytics and automation tasks.
What will we do?
This class combines discussion with practical exercises on the Extract, Transform, and Load (ETL) process. ETL is the first and most crucial step in data analytics: raw data is extracted, transformed into a suitable format, and loaded into analytical workflows. Students will focus on preparing data for analysis, recognizing that improper preparation can lead to inaccurate or misleading results. In this session, we will continue working with real-time financial data from the SEC and other financial APIs, applying the ETL process to ensure the data is ready for in-depth analysis.
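For students who want to see the three ETL stages outside of Alteryx, the following is a minimal Python sketch, not part of the assigned cases. The sample data, column names, and the extract, transform, and load function names are all hypothetical, and the choice to drop rows with missing revenue is an assumption made for illustration; other policies (such as flagging or imputing) may suit a given analysis better.

```python
import csv
import io
import re
import sqlite3

# Hypothetical raw extract; in practice this might be an API response or a file.
RAW = """company,period,revenue
Acme Corp ,2023-Q1,"$1,200.50"
Beta LLC,2023/Q2,
acme corp,2023-Q2,"$1,350.00"
"""


def extract(text):
    """Extract: read the raw rows exactly as delivered."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: standardize names and periods, parse amounts, skip missing values."""
    clean = []
    for row in rows:
        # Strip currency symbols and thousands separators with a regular expression.
        amount = re.sub(r"[^0-9.]", "", row["revenue"] or "")
        if not amount:
            continue  # assumption: rows with missing revenue are dropped
        # Split the period on either delimiter, similar in spirit to a text-to-column step.
        year, quarter = re.split(r"[-/]", row["period"].strip())
        clean.append({
            "company": row["company"].strip().title(),
            "year": int(year),
            "quarter": quarter,
            "revenue": float(amount),
        })
    return clean


def load(rows):
    """Load: write the cleaned rows into an in-memory SQLite table for analysis."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE revenue (company TEXT, year INTEGER, quarter TEXT, revenue REAL)"
    )
    conn.executemany(
        "INSERT INTO revenue VALUES (:company, :year, :quarter, :revenue)", rows
    )
    conn.commit()
    return conn


conn = load(transform(extract(RAW)))
totals = conn.execute(
    "SELECT company, SUM(revenue) FROM revenue GROUP BY company"
).fetchall()
print(totals)
```

Note that the two spellings of the company name only aggregate correctly because the transform step standardized them first, which is the kind of preparation issue this class focuses on.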
How this relates to other classes:
This class extends the prior session, which introduced the analytical mindset and skillset, by focusing on the practical application of those concepts through the ETL process. In the earlier class, students learned how to approach data with a critical, analytical perspective, identifying key patterns and trends. Now, by extracting real-time data from financial APIs and applying rigorous data preparation techniques, students deepen their understanding of how a properly structured dataset serves as the foundation for robust analysis. This class reinforces the analytical mindset by requiring students to actively engage with complex, unstructured data and refine it for accurate analysis, bridging the gap between conceptual knowledge and practical implementation. With these skills, students are better prepared to handle the data challenges they will encounter in professional accounting roles.
Materials and Preparation
Class Materials
- Case: Analytics_mindset_case_studies_ETL_Case2_Alteryx
 - Case: Analytics_mindset_case_studies_ETL_Case3_Alteryx
 - Case: Analytics_mindset_case_studies_ETL_Case4_Alteryx
 - Link: Online Regular Expressions Tool
 - Link: ChatGPT (or another LLM) for help with regular expressions
 - Link: Husky OnNet VPN Instructions (required for off-campus access to labs)
 - Slides: PowerPoint or PDF
 - Analytics Tools: Alteryx (Formula, Text To Columns, and RegEx tools)
Suggested Pre-Class Preparation
- There are no required readings for this class.
 
Class Plan
- We will continue to work on Extract, Transform, and Load exercises in Alteryx.
 - During class, we plan to work through all three ETL cases assigned to this class.
 - After a brief review, the bulk of the class time will be spent in the remote labs working with Alteryx.