Agnieszka Mikołajczyk-Bareła

PhD · Senior AI Engineer · Researcher

Senior AI Engineer building Reedy at Chaptr.AI — LLMs, RAG, and multimodal pipelines. Formerly led the NLP team at VoiceLab.AI that shipped TRURL, Poland's first large-scale generative model. Researcher, speaker, and mentor with a track record of leading AI-for-Good initiatives.

3,200+ citations across 30+ papers and 10+ open-source projects. Named among Top Women in AI Poland.

Skills

LLMs & AI Agents: LLM Training · Fine-tuning · RAG · LangChain · Agentic Pipelines · Prompt Engineering · Evaluation · Pydantic · OpenAI / Claude / Gemini / Llama
NLP & Speech: Transformers · BERT · GPT · TTS · ASR · Sentiment Analysis · NER
Computer Vision & Multimodal: EfficientDet · DETR · Mask R-CNN · GANs · Vision Transformers · Image Generation · Video Generation
Explainable AI: Attention Maps · Counterfactual Analysis · Bias Detection · Global Explanations
Data & Annotation: Dataset Curation · Annotation Pipelines · Data Cleaning · Benchmarking · Crowdsourcing
ML Engineering: Python · PyTorch · HuggingFace · ONNX · Model Training & Deployment

Projects

Research, open-source, and industry work

featured

Reedy — AI Metadata Platform for Publishers

LLM · RAG · NLP · Multimodal

Building Reedy at Chaptr.AI, an AI engine that generates keywords, THEMA codes, and SEO descriptions for book catalogs. The work spans RAG pipelines, trend analysis, and multimodal AI across TTS, ASR, and image/video generation.

featured

HearAI — Sign Language Recognition

Deep Learning · AI4Good

Making the world more accessible for the Deaf community through deep learning-based sign language recognition.

featured

Detect Waste in Pomerania

Computer Vision · AI4Good · Deep Learning

WiMLDS Trójmiasto AI4Good project training custom PyTorch detection & segmentation models (EfficientDet, DETR, Mask R-CNN, Faster R-CNN) to localize waste in the environment. Published in Waste Management (IF 6.1). 220+ GitHub stars.

featured

GEBI — Global Explanations for Bias Identification

XAI · Research Grant · Deep Learning

PhD research (NCN Preludium 18, 138k PLN). Built GEBI, an attention-based framework for detecting bias in data using global explanations, counterfactual bias insertion, and spectral clustering. Validated on skin lesion datasets.

featured

Data Augmentation Review

Deep Learning · Datasets · Open Source

Curated collection of data augmentation resources: techniques, libraries, papers, and code. A popular GitHub repository with 1.6k+ stars.

Skin Lesion Classification & GAN Debiasing

Deep Learning · Medicine · GANs · XAI

Trained CNNs, EfficientNets, and StyleGAN2-ADA for skin cancer detection and bias analysis. The resulting MICCAI 2022 paper studied how GAN-generated data amplifies or reduces biases in medical imaging.

Punctuation Restoration (PolEval 2021)

NLP · Datasets

Created the WikiPunct dataset and organized the first Polish punctuation restoration shared task at PolEval 2021.

Hack4Environment

Hackathon · AI4Good

Co-organized a WiMLDS Trójmiasto hackathon focused on waste detection and environmental AI solutions.

Waste Datasets Review

Datasets · Computer Vision · AI4Good

Comprehensive list of image datasets for litter, garbage, and waste detection and classification research.

Tiny Hero — Pixel Character Generation with GANs

Deep Learning · Datasets

Generating 64×64 retro-pixel characters using Generative Adversarial Networks.

Bird Song Classification

Audio · Deep Learning

WiMLDS Trójmiasto project: sound-based bird species classification using deep learning on audio spectrograms.

Machine Learning Acronyms

Open Source · Community

Community-maintained reference of ML and AI acronyms and abbreviations.

Get in touch

Want to collaborate or just say hello?

When not training models, you'll find me with a book, in the kitchen, or being supervised by two cats.

LinkedIn