Hi, I am Thiago

Thiago F. Miranda

Data Scientist at Catho Online

I am a data scientist with 6 years of experience, specializing in statistics. My skills cover data analysis, extracting actionable insights, predictive modeling, and crafting innovative solutions.

I excel in decoding complex datasets to uncover valuable trends, and I’m adept at using predictive modeling to foresee future patterns and optimize strategies. My strength lies not just in understanding data but in turning insights into practical solutions.

Passionate about continuous learning, I stay updated on the latest in data science to ensure I bring the most effective strategies to the table. I approach my work with a clear focus on using data to drive precision, efficiency, and innovation.

Data-Driven
Problem Solving
Leadership
Communication
Team Work
Hard Working

Experiences

Catho Online

May 2019 - Apr 2024

São Paulo

Catho is a leading Brazilian online platform that connects job seekers with employers, offering a wide range of career opportunities across various industries.

Data Scientist

Sep 2021 - Apr 2024

Responsibilities:
  • Developed and supported machine learning models for tasks such as scoring, ranking, and content recommendations, using Python, R, and AWS tools.
  • Increased paying user conversions by 11% by identifying key features based on free user behaviours and recommending targeted features for paid users.
  • Boosted job applications from paying users by 30% by automating CV submissions through a job-CV scoring model.
  • Built a data framework that automated the end-to-end reporting process, from data extraction to delivering actionable business insights, using tools like Tableau, Power BI, and Miro to communicate insights clearly to stakeholders.
  • Streamlined data analysis and reporting workflows by integrating diverse data sources, including MySQL, AWS S3, and Google Analytics.
Data Analyst

May 2019 - Aug 2021

Responsibilities:
  • Created a user conversion propensity model to increase lead conversions, leveraging Python and AWS SageMaker.
  • Led A/B testing initiatives for product teams, optimising pricing strategies, business models, and new product launches with a focus on enhancing customer experience.
  • Performed in-depth quantitative analysis across multiple data sources to uncover marketplace insights and identify growth opportunities.
  • Collaborated with cross-functional teams, including product, engineering, and marketing, to ensure data-driven decision-making.

Avalia Educacional

Aug 2018 - Apr 2019

São Paulo

Avalia Educacional, part of Grupo Santillana, is a leading educational assessment provider in Latin America, having reached around 10 million students across 22 countries.

Data Analyst

Aug 2018 - Apr 2019

Responsibilities:
  • Conducted Item Response Theory (IRT) analyses to evaluate item characteristics and test reliability.
  • Designed and maintained databases for storing assessment data, ensuring data integrity and scalability.
  • Collaborated with psychometricians and educational experts to refine assessment models and improve test accuracy.
  • Automated data extraction and processing pipelines, reducing manual workload and improving efficiency.
  • Provided insights and recommendations based on data trends and IRT model outputs to enhance decision-making.
  • Presented data findings and technical results to both technical and non-technical stakeholders.

Vunesp Foundation

Dec 2017 - Jun 2018

São Paulo

Fundação Vunesp is responsible for organizing and conducting large-scale educational assessments and public selection processes in Brazil.

Data Analyst

Dec 2017 - Jun 2018

Responsibilities:
  • Conducted Item Response Theory (IRT) analysis to estimate student abilities and evaluate item performance for 3 large-scale assessments in the Brazilian states of São Paulo and Pará.
  • Developed and delivered detailed educational performance reports for 645 municipalities in the state of São Paulo, offering schools actionable insights into student achievement and areas for improvement.
  • Utilised statistical expertise to clean, analyse, and present results from large-scale student databases using MySQL and R, contributing to educational assessments post-degree.
  • Presented key findings to stakeholders, including government officials and educational institutions, ensuring clear communication of assessment outcomes and recommendations.
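
For readers unfamiliar with IRT, the ability estimation mentioned above can be illustrated with a minimal sketch. It assumes a two-parameter logistic (2PL) model and a simple grid search for the maximum-likelihood ability. The source does not state which IRT model or estimation method was used in these assessments, and the item parameters below are hypothetical values chosen only for illustration.

```python
import math

def p_correct(theta, a, b):
    """2PL item response function: probability of a correct answer
    given ability theta, discrimination a, and difficulty b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def log_likelihood(theta, responses, items):
    """Log-likelihood of a 0/1 response pattern at ability theta."""
    total = 0.0
    for x, (a, b) in zip(responses, items):
        p = p_correct(theta, a, b)
        total += math.log(p) if x == 1 else math.log(1.0 - p)
    return total

def estimate_ability(responses, items, lo=-4.0, hi=4.0, steps=801):
    """Grid-search maximum-likelihood estimate of theta on [lo, hi]."""
    grid = [lo + i * (hi - lo) / (steps - 1) for i in range(steps)]
    return max(grid, key=lambda t: log_likelihood(t, responses, items))

# Hypothetical item parameters: (discrimination a, difficulty b).
ITEMS = [(1.2, -1.0), (0.8, 0.0), (1.5, 0.5), (1.0, 1.5)]

if __name__ == "__main__":
    # A student who answered the two easiest items correctly.
    print(estimate_ability([1, 1, 0, 0], ITEMS))
```

Patterns that are all correct or all incorrect have no finite maximum-likelihood estimate, so the grid search returns the boundary of the search interval for them. Production analyses typically rely on dedicated packages and Bayesian or marginal estimation instead.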

Education

M.Sc. in Statistics
Courses Taken:
  • Data Science Introduction
  • Multivariate Analysis
  • Statistical Inference
  • Probability Theory
  • Regression Models
  • Latent Trait Models Introduction
B.Sc. in Statistics
Courses Taken:
  • Probability Theory
  • Statistical Inference
  • Mathematical Statistics
  • Linear Algebra
  • Calculus
  • Applied Statistics
  • Regression Analysis
  • Time Series Analysis
  • Experimental Design
  • Multivariate Analysis
  • Non-parametric Statistics
  • Statistical Computing
  • Survey Sampling
  • Optimization
  • Operations Research
  • Data Visualization
  • Data Mining
  • Mathematical Modeling
  • Statistical Programming

Projects

Machine Learning Models
Author Nov 2024 - Present

This project is dedicated to the study, training, and storage of machine learning models. It provides Jupyter notebooks for model exploration and training, as well as a structure for storing datasets and trained models ready for deployment. These models can be utilized by the Machine Learning API project for serving predictions and model-based services.

Machine Learning API on AWS
Author Nov 2024 - Present

This project builds a Python API with FastAPI and deploys it to AWS Lambda. It serves predictions from models imported from an external repository: Machine Learning Models Repository. The application is containerized with Docker to provide an isolated, reproducible development environment, and AWS SAM CLI is used for local testing, building, and deploying the serverless application to AWS. The project demonstrates how to build and deploy a scalable FastAPI service on AWS Lambda using Docker and AWS serverless services.
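
As a rough illustration of the request and response shape such a service handles, here is a minimal sketch of a prediction endpoint written as a plain Lambda handler using only the standard library. This is not the project's code: the project itself uses FastAPI, and the `predict` function, the `handler` name, and the `features` payload field are hypothetical stand-ins. The sketch assumes an API Gateway proxy-style event, where the request body arrives as a JSON string and the response carries a `statusCode` and a string `body`.

```python
import json

def predict(features):
    """Hypothetical stand-in for a trained model: returns the mean of the inputs."""
    return sum(features) / len(features)

def handler(event, context):
    """Handle a proxy-style event of the form {"body": "<json string>"}."""
    try:
        payload = json.loads(event.get("body") or "{}")
        features = [float(v) for v in payload["features"]]
        if not features:
            raise ValueError("features must not be empty")
    except (KeyError, TypeError, ValueError) as exc:
        # Malformed JSON, a missing field, or non-numeric values: client error.
        return {"statusCode": 400, "body": json.dumps({"error": str(exc)})}
    return {
        "statusCode": 200,
        "body": json.dumps({"prediction": predict(features)}),
    }

if __name__ == "__main__":
    # Local smoke test with a hand-built event; no AWS resources are needed.
    event = {"body": json.dumps({"features": [1, 2, 3]})}
    print(handler(event, None))
```

In a FastAPI-based deployment, the framework would take care of request parsing and validation, and the trained model would be loaded from the models repository in place of the placeholder `predict`.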

Machine Learning API
Author Nov 2024 - Present

This project provides a Machine Learning API that serves machine learning models through a robust and scalable web interface. The API is built using Docker and Docker Compose and is designed to serve predictions based on models imported from an external repository: Machine Learning Models Repository.

Machine Learning Observability Stack
Author Nov 2024 - Present

This project sets up a full observability stack to keep an eye on and analyze your Machine Learning (ML) Model API. Using Elasticsearch, Kibana, Logstash, Filebeat, Metricbeat, and Heartbeat, it makes it easy to collect, explore, and visualize logs and metrics. Everything runs with Docker Compose to manage all the containerized services smoothly, based on the Elastic Stack (ELK) on Docker. It’s designed to monitor and support the Machine Learning API Environment, giving you better insights and performance tracking.
