Thiago F. Miranda

Experiences

Catho Online

May 2019 - Apr 2024

São Paulo

Catho is a leading Brazilian online platform that connects job seekers with employers, offering a wide range of career opportunities across various industries.

Data Scientist

Sep 2021 - Apr 2024

Responsibilities:

Developed and supported machine learning models for tasks such as scoring, ranking, and content recommendations, using Python, R, and AWS tools.
Increased paying user conversions by 11% by identifying key features based on free user behaviours and recommending targeted features for paid users.
Boosted job applications from paying users by 30% by automating CV submissions through a job-CV scoring model.
Built a data framework that automated the end-to-end reporting process, from data extraction to delivering actionable business insights, using tools like Tableau, Power BI, and Miro to communicate insights clearly to stakeholders.
Streamlined data analysis and reporting workflows by integrating diverse data sources, including MySQL, AWS S3, and Google Analytics.

Data Analyst

May 2019 - Aug 2021

Responsibilities:

Created a user conversion propensity model to increase lead conversions, leveraging Python and AWS SageMaker.
Led A/B testing initiatives for product teams, optimising pricing strategies, business models, and new product launches with a focus on enhancing customer experience.
Performed in-depth quantitative analysis across multiple data sources to uncover marketplace insights and identify growth opportunities.
Collaborated with cross-functional teams, including product, engineering, and marketing, to ensure data-driven decision-making.

Avalia Educacional

Aug 2018 - Apr 2019

São Paulo

Avalia Educacional, part of Grupo Santillana, is a leading educational assessment provider in Latin America, having reached around 10 million students across 22 countries.

Data Analyst

Aug 2018 - Apr 2019

Responsibilities:

Conducted Item Response Theory (IRT) analyses to evaluate item characteristics and test reliability.
Designed and maintained databases for storing assessment data, ensuring data integrity and scalability.
Collaborated with psychometricians and educational experts to refine assessment models and improve test accuracy.
Automated data extraction and processing pipelines, reducing manual workload and improving efficiency.
Provided insights and recommendations based on data trends and IRT model outputs to enhance decision-making.
Presented data findings and technical results to both technical and non-technical stakeholders.

Vunesp Foundation

Dec 2017 - Jun 2018

São Paulo

Fundação Vunesp is responsible for organizing and conducting large-scale educational assessments and public selection processes in Brazil.

Data Analyst

Dec 2017 - Jun 2018

Responsibilities:

Conducted Item Response Theory (IRT) analysis to estimate student abilities and evaluate item performance for three large-scale assessments in the Brazilian states of São Paulo and Pará.
Developed and delivered detailed educational performance reports for 645 cities in São Paulo state, offering schools actionable insights into student achievement and areas for improvement.
Applied statistical expertise to clean, analyse, and present results from large-scale student databases using MySQL and R, contributing to educational assessments after graduation.
Presented key findings to stakeholders, including government officials and educational institutions, ensuring clear communication of assessment outcomes and recommendations.

Education

University of São Paulo - USP

2023 - 2025

M.Sc. in Statistics

Taken Courses:

Data Science Introduction
Multivariate Analysis
Statistical Inference
Probability Theory
Regression Models
Latent Trait Models Introduction

Federal University of Pará - UFPA

2013 - 2017

B.Sc. in Statistics

Taken Courses:

Probability Theory
Statistical Inference
Mathematical Statistics
Linear Algebra
Calculus
Applied Statistics
Regression Analysis
Time Series Analysis
Experimental Design
Multivariate Analysis
Non-parametric Statistics
Statistical Computing
Survey Sampling
Optimization
Operations Research
Data Visualization
Data Mining
Mathematical Modeling
Statistical Programming

Projects

Machine Learning Models

Author Nov 2024 - Present

This project is dedicated to the study, training, and storage of machine learning models. It provides Jupyter notebooks for model exploration and training, as well as a structure for storing datasets and trained models ready for deployment. These models can be utilized by the Machine Learning API project for serving predictions and model-based services.

professional, machine learning, api, cloud

Machine Learning API on AWS

Author Nov 2024 - Present

This project builds a Python API with FastAPI and deploys it to AWS Lambda. It is designed to serve predictions based on models imported from an external repository: Machine Learning Models Repository. The application is containerized using Docker to provide an isolated and reproducible development environment. AWS SAM CLI is used for local testing, building, and deploying the serverless application to AWS. The project demonstrates how to build and deploy a scalable API with FastAPI on AWS Lambda using Docker and AWS serverless services.

professional, machine learning, api, cloud

Machine Learning API

Author Nov 2024 - Present

This project provides a Machine Learning API that serves machine learning models through a robust and scalable web interface. The API is built using Docker and Docker Compose and is designed to serve predictions based on models imported from an external repository: Machine Learning Models Repository.

professional, machine learning, api, cloud

Machine Learning Observability Stack

Author Nov 2024 - Present

This project sets up a full observability stack to keep an eye on and analyze your Machine Learning (ML) Model API. Using Elasticsearch, Kibana, Logstash, Filebeat, Metricbeat, and Heartbeat, it makes it easy to collect, explore, and visualize logs and metrics. Everything runs with Docker Compose to manage all the containerized services smoothly, based on the Elastic Stack (ELK) on Docker. It’s designed to monitor and support the Machine Learning API Environment, giving you better insights and performance tracking.

professional, machine learning, observability, cloud

Hi, I am Thiago

Data Scientist at Catho Online

Skills

R

Python

Jupyter

SQL

ETL Process

Google Analytics

Git / GitHub

AWS

Docker

BI Tools

JIRA

MS Office

Agile Methodologies
