
All Projects

A collection of my work across data analytics, engineering, and product development.

  • Developed ETL workflows to process and validate CMS healthcare data, transforming raw datasets into structured fact and dimension tables for analysis.
  • Built SQL analyses and Tableau dashboards to evaluate hospital performance, uncovering insights on profitability, cost efficiency, and regional trends to support business decision-making.
SQL, ETL, Data Warehousing, Tableau
  • Processed and analyzed a 3M+ token dataset using Python and SQL, applying exploratory data analysis and model evaluation to improve response accuracy to 85%.
  • Developed end-to-end data and analytics pipelines for ingestion, inference tracking, monitoring, and optimization, reducing latency by 75% and infrastructure costs by 82%.
Python, SQL, Azure, AWS
  • Developed an automated data analytics pipeline to collect, clean, validate, and process 1,000+ new internship postings daily, delivering insights to hundreds of students.
  • Applied automated classification models (GPT-4o, Google Gemini) to categorize job data and filter low-quality entries, increasing pipeline efficiency by 90%.
Python, SQL, AWS, Data Pipelines
  • Analyzed 116,000+ global immunization records to identify patterns in vaccination coverage and support data-driven public health decisions.
  • Cleaned and transformed complex survey data, engineered features, and addressed class imbalance using SMOTE to improve detection of low-coverage regions.
  • Evaluated multiple models and identified XGBoost as the top performer (F1-score: 96.7%, AUC: 0.99), generating insights to prioritize high-risk regions and optimize resource allocation.
Python, XGBoost, scikit-learn, SMOTE, Machine Learning
  • Developed a skincare recommendation system using Python, TensorFlow, scikit-learn, and Streamlit that compares skincare products based on user preferences and skin concerns, with data sourced from Sephora via API integration.
  • Enhanced the recommendation engine with data analysis and machine learning, delivering a smooth user experience through Streamlit and preserving data integrity through thorough preprocessing, including handling 30% null values across several features.
Python, TensorFlow, scikit-learn, Streamlit, API Integration
  • Integrated 10+ global datasets in Tableau to analyze AI and labor trends, identifying a 2,700% growth in generative AI investment.
  • Developed dashboards using calculated fields and trend analysis, revealing a 25% drop in worker confidence despite AI hiring expansion.
  • Evaluated automation risk across occupations, highlighting 48% exposure in routine jobs and uneven regional workforce impact.
Tableau, Data Visualization, Analytics, Workforce Analytics