Experience

  1. Research Fellow

    Stanford RegLab

    Responsibilities include:

    • Training a PyTorch Geometric GraphSAGE GNN to classify partnerships’ risk of noncompliance using networks of taxpayer structures
    • Implementing self-supervision tasks, such as link prediction, on a heterogeneous graph to improve GNN performance
    • Analyzing model predictions to ensure the businesses identified provide the greatest opportunity for increased revenue
    • Using current ML and accounting research to guide decisions that improve the IRS’s audit selection process
  2. Data Science Intern

    Mercury Insurance

    Responsibilities include:

    • Analyzed an XGBoost auto underwriting model to identify key areas for improvement, increasing the model's predicted profit by 28%
    • Created 15 new features using SQL queries, resulting in a 23% increase in predictive accuracy for high-risk policies
    • Optimized models using SHAP and XGBoost feature importances, maintaining performance after removing 60 features
    • Built an R Shiny dashboard with lift charts and profit improvement visualizations, facilitating decision making by stakeholders
  3. Software Engineering Intern

    JP Morgan Chase & Co.

    Responsibilities include:

    • Implemented a contract testing framework, allowing for scalable testing of all microservice applications within data pipelines
    • Developed 2 REST APIs and updated functionality of existing APIs responsible for handling $2 trillion in consumer payments daily
    • Built a new microservice to maintain logs using Java, Spring Boot, and Kafka
  4. Software Engineering Intern

    JP Morgan Chase & Co.

    Responsibilities include:

    • Automated 6 data pipelines using an ETL framework, ingesting and transforming consumer data using Spark SQL and JPMC libraries
    • Tested pipeline transformations, implemented step definitions using Cucumber files, and deployed to the DPL server

Education

  1. MSc Social Data Science

    University of Oxford

    Grade: Distinction (First Class Honours)

    Courses included:

    • Applied Machine Learning
    • Network Analysis
    • Applied Analytical Statistics
    • Data Analytics at Scale

    Thesis on the application of Siamese GNNs to the International Trade Network for the purpose of identifying financial crises. Also invited to Northeastern University London’s Networks and Time II Conference for a paper that used trade and migration networks to predict links in terrorism networks.

  2. BA Data Science and Economics

    Wellesley College

    GPA: 3.94/4.0

    Courses included:

    • Natural Language Processing
    • Machine Learning
    • Multivariate Data Analysis
    • Applied Data Analysis and Statistical Inference

    Thesis on fine-tuning DistilBERT for classifying forms of advocacy within 21 million tweets related to the Black Lives Matter movement. Achieved an F1-score of 0.89, a 25% increase over traditional NLP methods. Identified shifting trends within the movement, including an increased emphasis on disruptive rather than within-the-system forms of action.

Skills
Python
SQL
Java
R
Stata
Machine Learning
Deep Learning
NLP
Graph ML
Hypothesis Testing
Statistical Inference
Multi-level Modeling
Data Visualization
TensorFlow
PyTorch
PyTorch Geometric
Spark
NumPy
Scikit-Learn
Git
NetworkX