About me

As a data scientist and machine learning engineer with a Ph.D. in Computer Science, I bring deep expertise in leading and advancing cutting-edge projects. My focus lies in applying advanced MLOps, statistical modeling, machine learning, and data analysis techniques to uncover actionable insights and deploy data-driven solutions in commercial cloud-based environments. With a deep commitment to data science and academic innovation, I actively contribute to prestigious conferences and journals such as NeurIPS and Pattern Recognition.

Professional Experience

Senior Data Scientist

AGL Energy (October 2022 - April 2024)

During my time at AGL Energy, one of Australia's leading utility companies with a 22.26% share of the residential electricity market, I led a team of seven data scientists, guiding them through the intricacies of end-to-end data science and MLOps solutions. From the initial stages of pitching ideas to the final stages of post-deployment support, I facilitated every aspect of the process. Over the course of my tenure, I spearheaded the design, development, and delivery of more than ten scalable cloud-based data-driven solutions, each tailored to meet the company’s evolving needs. To ensure alignment with business objectives, I communicated the results and insights from our work to stakeholders.

I thoroughly enjoyed my time at AGL and felt privileged to contribute to various projects that led to significant accomplishments. I initiated multiple Gen-AI proofs of concept, which yielded substantial enhancements across various facets of AGL’s operations. Additionally, I was involved in developing internal Python packages and toolkits, which resulted in a noticeable reduction in pilot-to-production duration. Through the implementation of robust logging and model monitoring pipelines, I significantly reduced the Mean Time to Resolve (MTTR) of maintenance tickets. Furthermore, I scoped and developed a Databricks feature store, integrating PySpark and deploying MLOps stacks, which led to the elimination of ingestion pipeline failures. Leveraging advanced Natural Language Processing (NLP) and Large Language Models (LLMs), I orchestrated a notable reduction in service desk time allocation, showcasing the transformative impact of cutting-edge techniques on our operational efficiency.

Data Scientist Researcher

RMIT University / Ford Company (June 2021 - October 2022)

At RMIT University and Ford Company, I worked in a team of five engineers and researchers to implement a versatile quality control technology. Our collaborative efforts focused on detecting defects in industrial parts, thereby increasing company revenue. Leveraging domain adaptation methods and deep neural networks with a ResNet50 backbone, I used PyTorch, OpenCV, and torchvision to achieve efficient defect detection.

Machine Learning Specialist

Keylead Health (January 2021 - June 2021)

During my tenure at Keylead Health, I collaborated with a multidisciplinary team to analyze healthcare data, aiming to enhance clinical trial and medical research processes. Working alongside data engineers and clinicians, I contributed to the development of strategies to improve healthcare outcomes.

Skills

Machine Learning and Data Science:

  • Machine learning concepts: Classification, Regression, Clustering, Transfer Learning, Time Series Forecasting.
  • Natural Language Processing: Foundational and Generative LLMs, Generative AI, LangChain, RAG.
  • Deep Neural Networks: CNNs, RNNs, Attention, Transformers (Text, Vision, Multi-modal), LSTM.
  • Python packages and frameworks: PyTorch, NLTK, Scikit-learn, spaCy, OpenCV, SciPy, NumPy, Pandas.

MLOps in Cloud Platforms:

  • Azure: Azure resources, Blob Storage, ML Studio, Monitoring, Logging, Online and Batch Endpoints, DevOps practices, Managed Feature Store, Azure Cognitive Services.
  • Databricks: MLOps-Stacks, Unity Catalog, Feature Store, MLflow.
  • Vector Databases: MongoDB, FAISS.

Software Engineering Practices and Tools: Git, Continuous Integration (CI), Agile, Design Patterns.

Big Data: SQL (MySQL), PySpark, Dask.

Soft Skills: Stakeholder Management, Critical Thinking, Agile Mindset, Teamwork.

Selected Projects

AI/ML Project Template Repository

This repository is designed to serve as an initial structured template for AI and machine learning projects. Whether you're a beginner or an experienced practitioner, this template provides a solid starting point for organizing your project files, documentation, and workflows.

Repo link - Python

LLM-serving

This repository is inspired by the DeepLearning.AI course Efficiently Serving LLMs and implements efficient LLM serving using Transformers and PyTorch. Several different versions of LLM serving are implemented, so one can be chosen based on preferences and capacity.

Techniques: KV caching, Batching, Continuous Batching

Technologies: Transformers, GPT-2, PyTorch, Tokenizer

Repo link - Python
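
To illustrate the KV caching technique listed above, here is a minimal sketch in plain Python. It is not the repository's code and does not use Transformers or GPT-2: it assumes a toy single-head attention layer with random weights, and every name in it (KVCache, decode_step, full_recompute) is hypothetical. It shows that processing only the newest token while reusing cached keys and values gives the same output as recomputing keys and values for the whole prefix.

```python
import math
import random


def project(vec, matrix):
    """Multiply a vector by a (len(vec) x d_out) matrix stored as a list of rows."""
    return [sum(x * row[j] for x, row in zip(vec, matrix))
            for j in range(len(matrix[0]))]


def attend(query, keys, values):
    """Scaled dot-product attention of one query over all given positions."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    peak = max(scores)  # subtract the max for a numerically stable softmax
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    return [sum(w * v[i] for w, v in zip(weights, values)) / total
            for i in range(len(values[0]))]


class KVCache:
    """Hypothetical cache holding one key and one value vector per past token."""

    def __init__(self):
        self.keys = []
        self.values = []

    def append(self, key, value):
        self.keys.append(key)
        self.values.append(value)


def decode_step(token_vec, w_q, w_k, w_v, cache):
    """Process only the newest token, reusing the cached keys and values."""
    cache.append(project(token_vec, w_k), project(token_vec, w_v))
    return attend(project(token_vec, w_q), cache.keys, cache.values)


def full_recompute(tokens, w_q, w_k, w_v):
    """Baseline without a cache: recompute K and V for the whole prefix."""
    keys = [project(t, w_k) for t in tokens]
    values = [project(t, w_v) for t in tokens]
    return attend(project(tokens[-1], w_q), keys, values)


random.seed(0)
DIM = 4  # toy embedding size, chosen only for illustration


def rand_matrix():
    return [[random.uniform(-1, 1) for _ in range(DIM)] for _ in range(DIM)]


w_q, w_k, w_v = rand_matrix(), rand_matrix(), rand_matrix()
tokens = [[random.uniform(-1, 1) for _ in range(DIM)] for _ in range(5)]

cache = KVCache()
cached_outputs = [decode_step(t, w_q, w_k, w_v, cache) for t in tokens]
baseline_outputs = [full_recompute(tokens[:i + 1], w_q, w_k, w_v)
                    for i in range(len(tokens))]
```

With the cache, each step projects a single token, whereas the baseline re-projects the whole prefix at every step. That saving is what makes caching attractive for autoregressive decoding.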

Income-Level-Prediction-US-Census-binary-classification

This repository was created for a technical assessment during an interview process. The goal is to deliver outcomes for a stakeholder presentation.

This repository contains the code and documentation for a data science project focused on predicting income levels based on demographic and economic data. The project aims to address a binary classification problem, determining whether an individual's income exceeds $50,000.

Techniques: Binary Classification, Feature selection, EDA, Model Selection, Hyperparameter tuning

Technologies: Scikit-learn, Pandas, SciPy, Seaborn, Matplotlib

Repo link - Python

Optimal visual search based on a model of target detectability in natural images
Hao Tang, Shima Rashidi, Krista Ehinger, Andrew Turpin, Lars Kulik
Advances in Neural Information Processing Systems, 2020
NeurIPS paper - Python code

Developed a biological model of human vision using deep neural networks and transfer learning (results published in NeurIPS 2020, Pattern Recognition Journal, and Scientific Reports Journal).

Techniques: Deep Neural Networks, CNNs (AlexNet, ResNet, VGG16), Transfer Learning, Bayesian methods
Technologies: PyTorch and OpenCV

For more info

Check my LinkedIn account: (Link)