Hi, I’m Sarah Jane Fullerton

A passionate Data Scientist and Business Intelligence professional with hands-on experience in SQL, Python, R, Tableau, BigQuery, and data visualization. I’ve built predictive models, dashboards, and data pipelines across the healthcare, tech, and aviation industries to support data-informed decisions.

What I Do:

  • Data Science: Creating models to predict outcomes and uncover hidden insights
  • Data Engineering: Designing pipelines and systems for efficient data flow
  • Visualization & BI: Building dashboards and reports to communicate key metrics

Explore my portfolio to see my projects, certifications, and publications in action.

Technical Skills

Category | Skills
Programming Languages | SQL, R, Python, JavaScript, BigQuery
Big Data & Machine Learning | PostgreSQL, BigQuery, Linux, Docker, Spark, Kafka, Hadoop, Data Modeling, Data Mining, ETL, Data Pipelines, Data Warehousing, Database Design, Data Integration, PyTorch, TensorFlow, scikit-learn
Visualization | Tableau, Power BI, Dashboard Reporting, Matplotlib, Plotly, D3.js, Bokeh, ggplot2
Other | SQLite, HTML, CSS, Node.js, DevOps, Git, ArcGIS, Stata, Salesforce, Google Gemini, Applied Epic, Power Automate/Flow

My Favorite Projects

Time Series Analysis on U.S. Natural Disasters and Mental Health Disorders

GitHub: Time-Series

Nov 2024

  • Performed exploratory data analysis (EDA) to uncover trends, correlations, and patterns in mental health data and extreme weather events from 1953 to 2023.
  • Implemented time-series analysis to predict the impact of future weather events on mental illness rates.
  • Visualized the data to communicate findings, and used sentiment analysis to explore public emotional responses to natural disasters as a proxy for mental health outcomes.
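
One common EDA step for this kind of question is a lagged correlation: checking whether one series tends to move with another a few periods later. The sketch below is only an illustration of that idea, not the project's code. The function names, the yearly counts, and the "distress index" are all made up, and the toy outcome is constructed to trail the event counts by one year.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def lagged_correlation(events, outcome, max_lag):
    """Correlate events[t] with outcome[t + lag] for lag = 0..max_lag."""
    results = {}
    for lag in range(max_lag + 1):
        x = events[: len(events) - lag]  # drop the last `lag` event values
        y = outcome[lag:]                # drop the first `lag` outcome values
        results[lag] = pearson(x, y)
    return results


# Made-up yearly counts, for illustration only (not the project's data).
disaster_counts = [3, 5, 2, 8, 6, 9, 4, 7, 10, 5]
# Toy outcome built to follow the disaster counts one year later.
distress_index = [20] + [2 * c + 10 for c in disaster_counts[:-1]]

by_lag = lagged_correlation(disaster_counts, distress_index, max_lag=3)
best_lag = max(by_lag, key=by_lag.get)
```

On real data the correlations would be far noisier, and a strong value at some lag would suggest an association worth modeling rather than a causal effect.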

Reddit Sentiment Analysis with API Web Scraper and ML Models

GitHub: Sentiment-Analysis

Nov 2024

  • Developed a Python-based API web scraper using PRAW to collect Reddit data on Hurricane Helene, followed by preprocessing and manual labeling of 200 samples.
  • Fine-tuned transformer models to automate sentiment labeling of the dataset, leveraging pandas for data manipulation and transformers for model training.
  • Used sentiment analysis to study how the public's emotional responses to the hurricane shifted over time, providing insights into community perceptions of natural disasters.
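
Once posts carry sentiment labels, tracking how responses shift over time can be as simple as computing each day's share of a given label. The sketch below assumes a hypothetical list of (date, label) pairs and the label names "negative", "neutral", and "positive". The data and the function name are illustrative and are not taken from the project.

```python
from collections import Counter, defaultdict
from datetime import date


def daily_sentiment_share(posts, label="negative"):
    """Return {day: fraction of that day's posts carrying `label`}."""
    counts = defaultdict(Counter)
    for day, sentiment in posts:
        counts[day][sentiment] += 1
    # Sort by day so the result reads as a timeline.
    return {
        day: day_counts[label] / sum(day_counts.values())
        for day, day_counts in sorted(counts.items())
    }


# Hypothetical labeled posts, for illustration only.
posts = [
    (date(2024, 9, 26), "negative"),
    (date(2024, 9, 26), "negative"),
    (date(2024, 9, 26), "neutral"),
    (date(2024, 9, 27), "negative"),
    (date(2024, 9, 27), "positive"),
    (date(2024, 9, 28), "positive"),
    (date(2024, 9, 28), "neutral"),
]

shares = daily_sentiment_share(posts)
```

Plotting the resulting shares by day gives a quick view of whether negative sentiment fades or persists after the event.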

Spotify Music Trends Analysis: ML Approach to Understanding Global Shifts

GitHub: Spotify-Analysis

Nov 2023

  • Conducted a comprehensive analysis of Spotify's top 200 global dataset (2017-2023), exploring trends in audio features, genre evolution, and the impact of the COVID-19 pandemic on music preferences.
  • Applied clustering techniques such as spectral clustering to analyze genre patterns and investigated seasonal and collaborative trends in music.
  • Used random forest regression to explore predictors of song popularity, providing actionable insights for improving user experience and recommendations on the platform.

Experience

Download my resume here!

I have hands-on experience in data science, analytics, and business intelligence through internships and project-based work across transportation, healthcare, tech, and non-profit sectors.

My work includes building predictive models, developing dashboards, and maintaining databases to support data-driven decision-making and operational improvements.

I am passionate about leveraging data to uncover insights that drive business growth, optimize processes, and improve outcomes.

Publications

Extremal Polynomial Norms of Graphs

Authors: Bouthat, L., Chávez, Á., Fullerton, S., LaFortune, M., Linarez, K., Liyanage, N., Son, J., Ting, T.

Published: Nov 8, 2023. Read the full paper

Developed new norms on Hermitian matrices using complete homogeneous symmetric (CHS) polynomials, enabling refined graph distinctions. Proved that CHS norms are minimized by paths and maximized by complete graphs and stars, and presented the findings so that they are accessible to a broad mathematical audience.

Exploring U.S. Natural Disasters and Psychological Distress: Time Series & Machine Learning Insights

Author: Sarah Jane Fullerton

Submitted: Dec 1, 2024. Read the full paper

Senior thesis investigating historical trends of psychological distress in the U.S. in relation to natural disasters. Applied exploratory data analysis (EDA) and time series analysis to the historical data, and machine learning models to real-time Reddit data to measure public sentiment. Built a custom Reddit web scraper and a labeled dataset for sentiment analysis, providing insights into emotional responses to disasters and potential applications in future crisis response.

Education

Claremont McKenna College - December 2024

Bachelor’s Degree in Data Science

  • Data Science: Big Data, Applied Machine Learning, Data Mining, Data Structures & Algorithms
  • Business Intelligence: Remote Databases, Probability, Statistical Inference, Capstone
  • Engineering and other: Linear Algebra, Discrete Mathematics, Calculus I–III

Certifications

Google Business Intelligence Professional Certificate

Platform: Coursera | Completed: Nov 2025

  • Completed 4-course program covering SQL, Tableau, BigQuery, ETL, and data visualization.
  • Module 1 Project: Applied BI skills to solve a business problem and prepared project planning documents.
  • Module 2 Project: Created a data pipeline to deliver data to a target table, developed reports, and ensured data quality safeguards.
  • Capstone Project: Built dashboards for Google Fiber and Cyclistic to showcase end-to-end BI skills.

My other interests and hobbies include:

  • Contortion & Acrobatics - Flexibility and strength-building practices.
  • Writing - Crafting reflections, research, and creative projects.
  • Baking - Experimenting with recipes and sharing delicious treats.
  • Dancing - An outlet for energy and self-expression.
  • Embroidery - Check out my embroidery projects.

Get in touch

I would love to connect with you! Feel free to send me an email or find me on LinkedIn :-)