My Profile
Remove /mindpalace... to see my resume website! 👀
👋 Hi there, I’m Khoa Pham!
🎓 I’m a Master’s student majoring in Business Analytics and Data Science.
📊 I’m passionate about using data to uncover insights and support strategic decision-making.
📈 My experience in finance and healthcare has enabled me to strengthen my data analysis and research skills.
🧑💻 Proficient in SQL, Python, R, Tableau, and Git, and actively building expertise in Data Engineering and Cloud technologies.
Contact Info:
- 📍 Based in Los Angeles, CA
- 📞 Contact me: 714-858-7494
- 📧 kdpham@umass.edu (School)
- 💌 kdpham1002@gmail.com (Personal)
Education
University of Massachusetts - Amherst | Sep 2019 - May 2024
- M.B.A. in Business Analytics (Accelerated 4+1 Program)
- B.S. in Informatics Data Science | Minor: Statistics | Pre‑Med Track
- GPA: 3.9/4.0
- Dean’s List (all years), Chancellor’s Award Scholarship
- Relevant Courses: Database Management (SQL), Machine Learning & Data Science for Business (Python), Applied Statistics (R), Linear Algebra, Discrete Math, Data Structures & Algorithms (Java), Web Programming (HTML/CSS/JavaScript), Project Management, Media Marketing
Skills
- Programming & Tools: SQL, Python, R/RStudio, Tableau/Power BI, Git/GitHub; MS Office (Excel, Word, PowerPoint, Access, Outlook, Project)
- Cloud & Big Data: AWS (S3, Redshift, Glue, QuickSight), Snowflake, Spark (PySpark)
- Libraries & Frameworks: Pandas, NumPy, Matplotlib, Seaborn, Scikit‑learn, Statsmodels, SciPy, PyTorch
- Techniques: Statistical Analysis, Hypothesis Testing, Predictive Modeling, AI/ML, ETL/ELT Pipelines, Data Integration, Data Visualization, Data Warehousing
- Project Management: Agile/Scrum, SOP Documentation
- Healthcare Knowledge: EMR/EHR Systems, ICD-10/CPT Coding Familiarity, Claims Processing, 340B Drug Pricing Program, HIPAA Compliance, Medicaid/Medicare Regulations
Exams & Certifications
- Actuarial Exams: P, FM, SRM (in progress)
- VEEs: Economics, Accounting and Finance, Mathematical Statistics
Projects
Customer Purchase Behavior Analysis
GitHub | Report
Tech Stack: Python, Pandas, Scikit-Learn, Matplotlib, Seaborn, Plotly, Statistical Analysis, Git/GitHub
- Performed EDA on Walmart sales data to analyze spending trends by gender, age, and marital status.
- Applied statistical methods (CLT, confidence intervals) to compare demographic spending patterns.
- Developed visualizations using Matplotlib & Seaborn to showcase purchase behavior insights.
- Generated recommendations to improve customer acquisition, retention, and marketing strategies.
Ecommerce Data Pipeline and Forecasting Model
GitHub | Report
Tech Stack: Python (Pandas, NumPy, Matplotlib, Scikit-learn), Spark (PySpark), SQL, Git/GitHub
- Built an OOP-based inventory tracking system for 1,000+ products, ensuring efficient inventory control.
- Implemented a PySpark retail data pipeline to clean and transform large‑scale order records for analysis.
- Applied Random Forest and Regression models to predict future product demand and sales trends.
- Engineered features from time-series data, leveraging weekly sales trends for improved forecasting accuracy.
Banking Marketing and Investment Optimization
GitHub | Report
Tech Stack: Python (Pandas, NumPy, Matplotlib, Seaborn, Scikit-learn, PyPortfolioOpt), SQL, Git/GitHub
- Developed a credit card approval model with 81.8% accuracy, automating risk assessment for banks.
- Optimized data pipeline for bank marketing analysis, enabling more precise customer targeting.
- Analyzed financial ratios, identifying industry‑specific risk trends for investment decision‑making.
- Optimized a FAANG stock portfolio using mean‑variance optimization, balancing risk and returns.
Healthcare Employee Attrition Prediction
GitHub | Report
Tech Stack: Python (Pandas, Scikit-Learn, Matplotlib, Seaborn), Decision Tree, Logistic Regression, Git/GitHub
- Developed Decision Tree and Logistic Regression models to predict high‑risk employee attrition.
- Achieved classification accuracy of 80%, providing insights for targeted employee retention strategies.
- Engineered demographic, work-related, and compensation features, improving model interpretability.
- Conducted statistical analysis on work‑life balance, job involvement, and salary trends to support HR decisions.
Customer Subscriber Churn Prediction
GitHub | Report
Tech Stack: Python (Pandas, Scikit-Learn, Matplotlib, Seaborn), Logistic Regression, Decision Tree, Random Forest, K-Means Clustering, Git/GitHub
University Mental Health Research Study
GitHub | Report
Tech Stack: SQL, Python (Pandas, Seaborn, Scikit‑learn, SciPy, Statsmodels), Git/GitHub
- Conducted statistical analysis on 200+ international students, uncovering mental health trends.
- Developed a Random Forest model predicting student depression risk with 75% accuracy, supporting early intervention and informing recommendations to enhance peer‑support programs.
Pharmaceutical Market Share & Pricing Analysis
Vertex Pharmaceutical OA | Spring 2025
Technology: Excel (Pivot Tables), Python (Pandas, Seaborn), Tableau (dashboards)
- Analyzed US pharma sales data to assess market share, pricing, and revenue across regions and therapeutic areas; visualized trends to distinguish volume-driven vs. premium-priced drugs.
- Delivered insights and recommendations on pricing strategy and affordability, supporting market access decisions based on therapeutic value.
Finance Department’s New Hire Request
Isenberg School of Management | Fall 2024
Technology: SQL (MySQL), Tableau (dashboards), Excel & PowerPoint (reporting)
- Analyzed 40,000+ admissions from the university’s factsheets and data tables. Identified a 41% growth in undergraduate enrollment and gaps in faculty to support resource allocation for the Finance Department.
- Developed dashboards presenting 1,000+ enrollments and integrated forecasting models, leading to recommendations to hire faculty specialized in risk management to address industry demands and enhance program prestige.
Car Sales, Insurance Claims & Charging Behaviors
College of Information & Computer Science | Summer 2024
Technology: Python (Pandas, NumPy, Statsmodels), SQL (PostgreSQL, SQLAlchemy), Git/GitHub
- Designed and implemented an SPC‑based monitoring system, reducing manufacturing defects by identifying deviations from control limits.
- Built a predictive model for car insurance claims with 77.71% accuracy, pinpointing driving experience as the strongest predictor.
States Economic Dynamics Analysis
MGMT 601: Data Management | Spring 2024
Technology: SQL (MySQL), Tableau (dashboards), Excel & PowerPoint (reporting)
- Designed interactive dashboards analyzing state-level income, expenses, unemployment rates, cost of living, and population trends.
- Integrated and processed datasets covering diverse economic indicators (e.g., median income trends from 2012–2023, cost of living indices, and unemployment rates).
Experience
Center for Teaching and Learning | Amherst, MA
Data Management Coordinator | Sep - Dec 2023
- Managed and cleaned survey data for workshop participation analysis, ensuring data accuracy. Developed automated reports recognizing 300+ professors and lecturers for Distinguished Teaching Awards.
Biomedical NLP Processing Laboratory | Amherst, MA
Biomedical Research Assistant | Feb - May 2022
- Processed unstructured biomedical data, including EHR notes and scientific articles, to support NLP research. Assisted in developing text‑mining algorithms to identify key medical terms in clinical documentation.
- Optimized large-scale information retrieval by implementing MapReduce and Spark‑based inverted index tables, enhancing query efficiency.
Data Science Track | Amherst, MA
Program Mentor & Peer Tutor | Sep 2021 - May 2022
- Guided students in curriculum planning, career development, and technical preparation for data science and analyst roles, connecting them with mentors from the Computer Science Department.
CS326: Web Programming | Amherst, MA
Data Science Interview Prep Platform | Sep 2021 - May 2022
Technology: SQL, Python, HTML, CSS, JavaScript, Ruby, VSCode (Jupyter Notebook), nbconvert, Git/GitHub
- Developed a Q&A interview practice platform for Python, SQL, and cloud technologies, leveraging the Feynman Technique and Active Recall to improve retention of technical concepts and problem‑solving skills.
Five College Language Program | Amherst, MA
Teaching Assistant | Sep 2021 - Feb 2022
- Conducted weekly lessons to improve students’ language proficiency, customized the curriculum to meet diverse learning needs, prepared students for final exams, and provided detailed weekly progress reports.
Millennium Dance Complex | Orange, CA
Digital Media Manager | Summer 2024
- Filmed, edited, and managed video database for dance classes, implementing efficient categorization to enhance digital workflows.
- Analyzed social media performance, delivering weekly insights and recommendations that drove increased audience engagement and social media growth.
Humans of CICS | Amherst, MA
Social Media Outreach | Spring 2020
- Highlighted professional stories of the CS Department’s faculty, professors, and students. Collaborated with alumni and staff to expand and foster stronger connections within the CICS community.
Interests
- Photography: I like capturing dance movements. Check (VSCO)!
- Videography: I also film for dance classes! Check (@teenee_archives)!
- Dance: …a little at MDC Dance (@mdcdance)!
- Blogging: …work in progress… 🤧 -> Visit my first blog instead!
Add /mindpalace after github.io to have a peek at another me! 🙋♂️