Selecting Sources

To investigate the impact of COVID-19 on education, we worked with data from the Stanford Education Data Archive (SEDA) Administrative District Annual Subgroup Test Score Dataset (seda_admindist_annualsub_cs_2024.2). This dataset provides district-level average test scores in math and reading/language arts (RLA) for U.S. public and charter school students in grades 3–8, and includes subgroup breakdowns by race, gender, and economic status. These disaggregated measures allow researchers to examine differences in academic outcomes across demographic groups and geographic districts (Reardon, 2024).

Our analysis focused on the period 2019–2024. We used 2019 as a pre-pandemic baseline to represent academic performance before COVID-19 disruptions. We then examined 2022 as the first reliable post-pandemic year, when most schools had returned to in-person instruction and statewide testing resumed more consistently. Finally, we used 2023 and 2024 to observe patterns of early and later recovery, allowing us to compare how different student subgroups recovered over time.

By analyzing these years together, we were able to examine learning loss associated with the pandemic, identify patterns of recovery, and explore whether racial and socioeconomic disparities widened or narrowed during the recovery period. This approach helped us better understand how the pandemic affected students differently across communities and school districts (Reardon, 2024).
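The comparison against the 2019 baseline can be sketched in pandas as follows. This is a minimal illustration on made-up numbers, not our exact notebook code: the column names `subgroup`, `year`, and `eb_score` are placeholders and may not match the SEDA variable names, and the simple unweighted mean is an assumption made for brevity.

```python
import pandas as pd

# Toy data; values and column names are illustrative assumptions, not SEDA data.
scores = pd.DataFrame({
    "subgroup": ["a", "a", "a", "b", "b", "b"],
    "year": [2019, 2022, 2024, 2019, 2022, 2024],
    "eb_score": [0.20, 0.05, 0.15, -0.10, -0.35, -0.30],
})


def change_from_baseline(df, baseline_year=2019):
    """Return each subgroup's mean score in every year minus its baseline-year mean."""
    # One row per subgroup, one column per year (unweighted mean of the scores).
    wide = df.pivot_table(
        index="subgroup", columns="year", values="eb_score", aggfunc="mean"
    )
    # Subtract the baseline column from every year column, row by row.
    return wide.sub(wide[baseline_year], axis=0)


change = change_from_baseline(scores)
print(change.round(2))
```

In the resulting table, a negative value in 2022 indicates a drop relative to 2019, and a value closer to zero in 2024 indicates partial recovery. Comparing rows shows whether the gap between two subgroups widened or narrowed.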

Processing Data

For our analysis, we focused on the 2019–2024 testing years to capture both the baseline period before COVID-19 and the years during and after recovery. We filtered and cleaned the SEDA dataset in a Jupyter Notebook using Python. First, we reduced the dataset to only the relevant years. From the full dataset of 1,198,268 rows, this produced two working files. The dataset covering 2019–2024 kept 293,018 rows, while a smaller dataset focusing specifically on the post-COVID recovery period (2022–2024) kept 200,624 rows.
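The year filter described above might look roughly like the sketch below. It runs on a small made-up table rather than the real file, and the column name `year` and the helper `filter_years` are assumptions introduced for illustration.

```python
import pandas as pd

# Toy stand-in for the SEDA file; column names are illustrative assumptions.
scores = pd.DataFrame({
    "district_id": [1, 1, 1, 2, 2, 2],
    "year": [2017, 2019, 2022, 2018, 2023, 2024],
    "eb_score": [0.10, 0.12, -0.05, 0.30, 0.21, 0.25],
})


def filter_years(df, first_year, last_year, year_col="year"):
    """Return a copy of df keeping rows whose year is in [first_year, last_year]."""
    mask = df[year_col].between(first_year, last_year)  # inclusive on both ends
    return df.loc[mask].copy()


# Two working tables: baseline through recovery, and recovery only.
baseline_to_recovery = filter_years(scores, 2019, 2024)
recovery_only = filter_years(scores, 2022, 2024)

print(len(scores), len(baseline_to_recovery), len(recovery_only))
```

Returning a copy keeps later cleaning steps from modifying the full table by accident.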

We used the empirical Bayes (EB) achievement scores provided by SEDA along with their adjusted standard errors, because SEDA recommends these measures for analysis. EB estimates improve reliability by reducing noise in districts with smaller sample sizes, while the adjusted standard errors provide more accurate uncertainty estimates.

To keep the analysis manageable and computationally efficient, we kept only the columns needed for our research questions and removed unrelated variables. We dropped rows with missing EB scores, because missing achievement values would prevent valid comparisons across districts and years. This follows the data-cleaning principles outlined in Data + Design by Trina Chiasson and Dyanna Gregory.
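A minimal sketch of the column selection and missing-value step, again on made-up data. The column names (`eb_score`, `eb_se`, `unused_flag`) and the helper `select_and_clean` are hypothetical placeholders, not the actual SEDA variable names.

```python
import pandas as pd

# Toy data; column names are illustrative assumptions, not SEDA variable names.
scores = pd.DataFrame({
    "district_id": [1, 2, 3, 4],
    "year": [2019, 2019, 2022, 2022],
    "subgroup": ["all", "ecd", "all", "ecd"],
    "eb_score": [0.12, None, -0.05, -0.40],
    "eb_se": [0.03, None, 0.04, 0.06],
    "unused_flag": [0, 1, 0, 1],
})

# Columns assumed to be needed for the research questions.
KEEP = ["district_id", "year", "subgroup", "eb_score", "eb_se"]


def select_and_clean(df, keep_cols, score_col="eb_score"):
    """Keep only the listed columns, then drop rows with a missing score."""
    trimmed = df[keep_cols]
    return trimmed.dropna(subset=[score_col]).reset_index(drop=True)


clean = select_and_clean(scores, KEEP)
print(clean)
```

Passing `subset` to `dropna` removes only the rows that lack a score, so a row is not discarded just because some other column happens to be empty.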

Presenting Our Narrative

We used WordPress to build our website because of its customizable themes, user-friendly interface, and wide range of plugins.

For our data visualizations, we used Tableau to create interactive charts and maps. Tableau enabled us to clearly present trends and patterns in our data while allowing users to explore the visuals dynamically. We embedded these visualizations directly into our WordPress pages to maintain a seamless user experience. Additionally, we ensured that each visualization included captions and contextual explanations so that viewers could understand the significance of the data without prior knowledge of the subject. It was critical that our visualizations be ethical, coherent, and cohesive, with minimal ambiguity, a principle we took from Alberto Cairo’s How Charts Lie.

Meet the Team

Robin Murphy – Project Manager

Hi, my name is Robin and I’m a third-year Cognitive Science major and Digital Humanities minor. As Project Manager, I was responsible for overseeing team operations and managing communications. I also supported the creation of data visualizations, helping connect our analyses to broader research questions.

Ava Fahn – Editor

Hi! I’m Ava and I’m a second-year Communications major with a minor in Digital Humanities. As Editor, it was my responsibility to ensure that all content on our website was accessible, coherent, and consistent, with no organizational, grammatical, or spelling errors.

Jessup Byun – Data Specialist

Hello, my name is Jessup and I’m a third-year Statistics and Data Science major. In my role as our data specialist, I cleaned and structured the data into an analysis-ready format. I also created clear documentation to explain key insights and features to teammates.

Lester Wang – Content Developer

Hi, my name is Lester. I’m a fourth-year Statistics major, and I served as the Content Developer for our project. I was responsible for writing and organizing the website’s content. I focused on making the information clear, consistent, and easy to understand for our readers.

Aly Tan – Data Visualization Specialist

Hi, my name is Aly. I’m a fourth-year Statistics and Data Science major. My role as a data visualization specialist is to identify meaningful patterns and trends and to design interactive visualizations using tools like Tableau.

Lilly Zhang – Web Designer

Hi, my name is Lilly and I’m a third-year Computer Engineering major! As web designer, I oversaw the structure and design of the website and ensured the site performed to the team’s expectations.

Acknowledgements

We’d like to thank Dr. Sabo for providing the resources, guidance, and support that gave us a strong foundation to conduct our research and analysis.

We’d also like to thank Kai Nham for his guidance throughout the project. His advice, feedback, and support on all aspects of our work were invaluable to our success.