My role: Senior Software Engineer
Project description: Designed and implemented a serverless AI pipeline to cluster large-scale datasets efficiently, enabling scalable data analysis without managing infrastructure.
Overview
Designed and implemented a serverless AI pipeline that clusters large-scale datasets, processing over one million articles per day and supporting scalable, high-performance data analysis.

Problem
The system needed to process and cluster high-volume data with unpredictable workloads while keeping infrastructure costs low and scaling automatically.
Solution
Built a fully serverless architecture in which data preprocessing and clustering tasks were distributed across AWS Lambda functions and orchestrated through event-driven workflows.
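To illustrate the event-driven pattern, here is a minimal sketch of an SQS-triggered Lambda handler for the preprocessing stage. It is not the project's actual code. The message shape (a JSON body with `id` and `text` fields) and the `preprocess` and `process_records` helpers are assumptions made for the example. The handler returns the `batchItemFailures` structure that Lambda uses for partial batch responses, so only the failed messages are retried.

```python
import json


def preprocess(text: str) -> str:
    """Hypothetical cleanup step: collapse whitespace and lowercase."""
    return " ".join(text.split()).lower()


def process_records(records):
    """Preprocess each SQS record; return (processed articles, failed message IDs)."""
    processed, failed_ids = [], []
    for record in records:
        try:
            # Assumed message shape: {"id": "...", "text": "..."}
            article = json.loads(record["body"])
            processed.append(
                {"id": article["id"], "text": preprocess(article["text"])}
            )
        except (KeyError, TypeError, AttributeError, ValueError):
            # Malformed message: report it so only this item is retried.
            failed_ids.append(record["messageId"])
    return processed, failed_ids


def handler(event, context):
    """Lambda entry point for an SQS-triggered batch."""
    processed, failed_ids = process_records(event.get("Records", []))
    # A real pipeline would forward `processed` to the next stage here
    # (for example, another queue); that step is omitted in this sketch.
    return {
        "batchItemFailures": [{"itemIdentifier": mid} for mid in failed_ids]
    }
```

Partial batch responses only take effect when `ReportBatchItemFailures` is enabled on the event source mapping. Without it, any raised exception causes the whole batch to be retried.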
My Role
AI Engineer / Cloud Architect
Responsible for serverless system architecture, development, and performance optimisation.
Tech Stack
Python
AWS Lambda
Amazon SQS
OpenSearch
OpenAI embedding model
Machine learning clustering algorithms
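The clustering algorithm used in the project is not specified here. As one illustration of how embedding vectors can be grouped incrementally, which suits a streaming workload, the sketch below assigns each vector to the most similar existing centroid by cosine similarity and starts a new cluster when nothing reaches a threshold. The function names and the threshold value of 0.8 are assumptions. In a real pipeline the vectors would come from the embedding model rather than being hard-coded.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def assign_clusters(vectors, threshold=0.8):
    """Greedy one-pass clustering: return one cluster label per vector."""
    centroids = []  # running mean vector of each cluster
    counts = []     # number of members in each cluster
    labels = []
    for vec in vectors:
        best, best_sim = -1, threshold
        for i, centroid in enumerate(centroids):
            sim = cosine(vec, centroid)
            if sim >= best_sim:
                best, best_sim = i, sim
        if best == -1:
            # Nothing is similar enough: start a new cluster.
            centroids.append(list(vec))
            counts.append(1)
            labels.append(len(centroids) - 1)
        else:
            # Update the running mean of the matched cluster.
            n = counts[best]
            centroids[best] = [
                (c * n + x) / (n + 1) for c, x in zip(centroids[best], vec)
            ]
            counts[best] = n + 1
            labels.append(best)
    return labels
```

For example, `assign_clusters([[1, 0], [0.9, 0.1], [0, 1], [0.1, 0.9]])` groups the first two vectors together and the last two together. A one-pass scheme like this depends on input order, so a batch algorithm may give different groupings.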
Results
Clustering the data significantly reduced the client’s operational costs for handling high-volume datasets.
Client
Private project
