This class combines discussion with practical exercises on the Extract, Transform, and Load (ETL) process. ETL is the first and most crucial step in data analytics: raw data is extracted, transformed into a suitable format, and loaded into analytical workflows. Students will focus on preparing data for analysis, recognizing that improper preparation can lead to inaccurate or misleading results. In this session, we will continue working with real-time financial data from the SEC and other financial APIs, applying the ETL process to ensure the data is ready for in-depth analysis.
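The three ETL stages can be sketched in a few lines of Python. This is only a minimal illustration, not the lab's actual workflow: the records in `RAW_FILINGS`, the `filings` table, and the function names are all hypothetical, and an in-memory SQLite database stands in for whatever analytical tool the data is loaded into.

```python
import sqlite3

# Hypothetical raw records standing in for an API response; all values are made up.
RAW_FILINGS = [
    {"ticker": " aapl ", "form": "10-K", "revenue": "1,000"},
    {"ticker": "MSFT", "form": "10-K", "revenue": "2,500"},
    {"ticker": "msft", "form": "10-K", "revenue": "2,500"},  # duplicate once cleaned
    {"ticker": "XYZ", "form": "10-K", "revenue": ""},         # missing value
]


def extract():
    """Extract: return the raw records exactly as received."""
    return list(RAW_FILINGS)


def transform(rows):
    """Transform: normalize tickers, parse numbers, drop blanks and duplicates."""
    cleaned, seen = [], set()
    for row in rows:
        ticker = row["ticker"].strip().upper()
        revenue_text = row["revenue"].replace(",", "")
        if not revenue_text:
            continue  # skip rows with no revenue rather than guessing a value
        key = (ticker, row["form"])
        if key in seen:
            continue  # keep only the first record per ticker and form
        seen.add(key)
        cleaned.append((ticker, row["form"], float(revenue_text)))
    return cleaned


def load(rows, conn):
    """Load: write the cleaned rows into a table ready for analysis."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS filings (ticker TEXT, form TEXT, revenue REAL)"
    )
    conn.executemany("INSERT INTO filings VALUES (?, ?, ?)", rows)
    conn.commit()


def run_etl():
    """Run all three stages and return the loaded rows, sorted by ticker."""
    conn = sqlite3.connect(":memory:")
    load(transform(extract()), conn)
    query = "SELECT ticker, form, revenue FROM filings ORDER BY ticker"
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for record in run_etl():
        print(record)
```

Of the four raw records, only two survive the transform step, which shows how skipping preparation (here, duplicates and blanks) would distort any totals computed later.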
Case: EDGAR Explorer
Slides: will be available for download by the beginning of class in either PowerPoint or PDF format.
Data: A data update may be required for this class. To ensure your files are up to date, navigate to the ACCTG522_Labs folder and run the command git pull.
Analytics Tools: Git and GitHub, API keys using Python, Alteryx download tool and other Alteryx ETL tools.