Cancer subtype classification with sparse dimensionality reduction methods
This project compares the performance and usability of alternative sparse formulations derived from PCA, with an application to cancer subtype classification using high-dimensional gene expression data.
-
Machine Learning
I have a strong interest in Machine Learning, and I believe that building effective solutions with models starts from a fundamental understanding of the underlying mathematical and statistical concepts. Moreover, I thoroughly enjoy using this interest and expertise to give trainings and teach others on topics in this field, both to specialists and business professionals.
-
Data Science
Data Science is a field that brings my love of programming and my passion for solving complex problems together. I enjoy designing and building actionable solutions to solve real problems using data and Machine Learning.
-
Tutoring
During my time at university, I developed a strong affinity for mathematics. I found the math courses to be the most engaging and enjoyable part of the curriculum. This led me to start tutoring other students in math, and I discovered that the art of teaching and presenting complex material in an understandable manner gives me a lot of satisfaction.
-
Training & Presenting
Building upon my tutoring and Machine Learning experience, I started giving trainings to business professionals on ML and Data Science concepts. Through this work, I have developed strong presentation skills, allowing me to effectively communicate technical ideas to a non-technical audience. I have a passion for sharing my knowledge and helping others to understand the fun and the power of these technologies.
-
ML engineering
I have worked with many libraries, frameworks, and services used for ML engineering and Data Science, such as the standard Python ML stack, but also more specialized tooling such as MLflow, Kedro, AWS SageMaker, Azure ML, FastAPI, and more.
-
Data Engineering
During several projects, I have gained extensive hands-on experience creating and maintaining data pipelines using tools such as Spark, AWS Glue, dbt, and Polars, or platforms like Dataiku. I am familiar with best practices and principles of data engineering.
-
VCS
I have experience working with version control systems such as Git and their respective hosting services. I have used these tools extensively in all of my projects and professional work for collaboration, systematically keeping track of changes, and efficient integration.
-
CI/CD
I possess hands-on experience in establishing CI/CD pipelines for various projects and professional tasks. Through these pipelines, I've automated both testing and deployment processes, ensuring swift releases and maximizing application uptime.
-
Containerization
I have experience using Docker to containerize applications and services and to serve Machine Learning models. I have used Docker to create reproducible environments, facilitating easier deployment, scaling, and collaboration with team members.
-
Cloud
I've worked extensively with Machine Learning and Data pipelines in the cloud, primarily using AWS. My experience includes deploying, managing, and optimizing these pipelines to ensure efficiency and scalability. Additionally, I've employed Infrastructure as Code practices, using tools like Terraform, to automate and standardize cloud infrastructure.





