# Nikolas Bakalis

Senior Software Engineer, Applied AI & Backend  
New York, NY  
me@nikolasbakalis.com  
https://nikolasbakalis.com/

## Summary

I build production AI, ML, and backend systems end to end, from early product development through scaled production operations. That spans problem framing, evaluation, deployment, monitoring, and measuring business impact.

My work sits between applied AI, backend engineering, and product-facing delivery. At Channel Factory, I build applied AI, automation, and data systems across early product development and scaled production workflows, working with datasets spanning tens of millions of YouTube channels and billions of videos. Before that I shipped ML and full-stack systems for logistics and life-sciences workflows. This site is the deeper version of my resume: it shows how I evaluate tradeoffs, measure systems, and turn ambiguous technical problems into working software.

## Core Skill Areas

- Production AI systems
- Applied machine learning
- Backend engineering
- AWS data platforms
- RAG and agentic workflows
- MCP integrations
- LLM evaluation
- NLP and language detection
- Computer vision
- Adversarial machine learning
- Database serving architecture
- API performance benchmarking
- Data lakehouse architecture
- Model monitoring and evaluation

## Selected Technologies

- AWS
- AWS Bedrock
- AWS Glue
- S3
- Athena
- RDS
- PostgreSQL
- StarRocks
- OpenSearch
- Iceberg
- S3 Tables
- Redis
- Python
- TypeScript
- Go
- Node.js
- Bun
- Docker
- FastAPI
- Django
- Fastify
- Fiber
- BlackSheep
- Celery
- TensorFlow/Keras
- ResNet50
- Computer vision
- Adversarial ML
- RAG
- MCP
- LLMs
- NLP
- XLM-RoBERTa
- FastText
- Language detection
- Translation evaluation
- COMET-QE
- BERTScore
- LLM evaluation
- Model monitoring
- Backend performance

## Recruiter Search Fit

- Senior Software Engineer Applied AI
- Applied AI Engineer
- Backend Engineer AWS
- Machine Learning Engineer backend systems
- RAG engineer
- MCP engineer
- LLM evaluation engineer
- Data platform engineer
- StarRocks PostgreSQL OpenSearch engineer
- AI backend engineer New York

# Experience

## Channel Factory

**Senior Software Engineer, Applied AI & Backend**  
Nov 2024 - Present

- Build production AI, data, and backend systems for advertising workflows across tens of millions of YouTube channels and billions of videos.
- Deliver early-stage MVPs and scaled production systems with measurable revenue and operational impact.
- Work with foundation models, RAG, MCP, data lakehouse architecture, and service contracts.

## ACERTUS

**Software Engineer, ML Engineering & Full Stack**  
May 2022 - Nov 2024

- Built production ML for ETA, delay, and fraud prediction in logistics workflows.
- Owned cloud deployments with Python, AWS, Docker, and production monitoring.

## Vivpro

**Full Stack Engineer, ML Engineering**  
Jun 2021 - May 2022

- Built ML and full-stack systems for regulatory and clinical-trial intelligence workflows.
- Worked on natural-language interfaces before commercial LLM tooling became mainstream.

# Education

- M.S. Artificial Intelligence and Big Data, Anglia Ruskin University
- Integrated B.Eng. and M.Eng. Computer Engineering, University of Patras
- Thesis work across adversarial attacks, image classification, and database tooling

# Selected Work

## Translation Model Benchmark for Multilingual Video Transcripts

- URL: https://nikolasbakalis.com/work/translation-model-benchmark-video-transcripts/
- Type: Benchmark
- Date: 2026-05-25
- Topic: Translation
- Summary: A multilingual benchmark comparing Google Translate, DeepL, and Llama 4 Maverick on noisy video transcript data across 15 languages.
- Technologies: Python, Google Translate, DeepL, Llama 4 Maverick, Together AI, COMET-QE, BLEU, BERTScore
- Tags: Translation, LLMs, Evaluation, Transcripts

## RDS vs StarRocks 20M Serving and Aggregation Benchmark

- URL: https://nikolasbakalis.com/work/rds-starrocks-serving-aggregation-20m-benchmark/
- Type: Benchmark
- Date: 2026-05-22
- Topic: Data Systems
- Summary: A 20-million-row benchmark comparing RDS/Postgres serving tables with StarRocks OLAP tables and async materialized views.
- Technologies: AWS, RDS/PostgreSQL, StarRocks, Iceberg, S3 Tables, Materialized Views, Python
- Tags: RDS, StarRocks, OLAP, Benchmark

## Materialized View API Serving Benchmark

- URL: https://nikolasbakalis.com/work/materialized-view-api-serving-benchmark/
- Type: Benchmark
- Date: 2026-05-21
- Topic: Data Systems
- Summary: An API-level comparison of denormalized RDS tables and StarRocks async materialized views at 100K, 1M, and 10M rows.
- Technologies: PostgreSQL, StarRocks, Iceberg, S3 Tables, Materialized Views, REST APIs, Docker
- Tags: RDS, StarRocks, API, Materialized views

## IAB 3.0 Content Classifier Training Report

- URL: https://nikolasbakalis.com/work/iab-3-content-classifier-training-report/
- Type: Technical Note
- Date: 2026-03-20
- Topic: ML/NLP
- Summary: A training report for an in-house hierarchical IAB 3.0 content classifier built to replace external classification dependencies.
- Technologies: Python, NLP, IAB 3.0, Hierarchical Classification, Model Evaluation, Taxonomy Modeling
- Tags: IAB 3.0, Classification, NLP, Training

## API Framework Benchmark for Data-Intensive Services

- URL: https://nikolasbakalis.com/work/api-framework-benchmark-data-intensive-services/
- Type: Benchmark
- Date: 2026-03-16
- Topic: Backend
- Summary: A benchmark of API frameworks and datastore access patterns for data-intensive services spanning PostgreSQL, StarRocks, and OpenSearch.
- Technologies: Go, Python, Node.js, Bun, Docker, PostgreSQL, StarRocks, OpenSearch
- Tags: API, Backend, PostgreSQL, OpenSearch

## IAB 3.0 Classification and Language Detection Pipeline Proposal

- URL: https://nikolasbakalis.com/work/iab-3-classification-language-detection-pipeline-proposal/
- Type: Proposal
- Date: 2026-03
- Topic: ML/NLP
- Summary: A project proposal for replacing external classification APIs with in-house IAB 3.0 classification and language detection pipelines at media-corpus scale.
- Technologies: AWS Glue, Iceberg, S3, Celery, GCLD3, FastText, XLM-RoBERTa, IAB 3.0
- Tags: IAB 3.0, Language detection, Classification, Pipeline

## Adversarial Robustness in Medical Image Classification

- URL: https://nikolasbakalis.com/work/adversarial-ml-medical-image-classification/
- Type: Thesis
- Date: 2023-09-15
- Topic: Adversarial ML
- Summary: Master's thesis work evaluating adversarial attacks and defensive training strategies for cervical cancer screening image classification models.
- Technologies: Python, TensorFlow/Keras, ResNet50, Adversarial ML, FGSM, BIM, Gaussian noise, Medical imaging
- Tags: Adversarial ML, Medical imaging, ResNet50, Model robustness

# Links

- Email: mailto:me@nikolasbakalis.com
- GitHub: https://github.com/NikosBakalis
- LinkedIn: https://www.linkedin.com/in/nikolas-bakalis/
- Resume: https://nikolasbakalis.com/resume/Nikolas_Bakalis_CV.pdf
