Agnieszka Mikołajczyk-Bareła

PhD · Senior AI Engineer · Researcher

Senior AI Engineer working on Reedy at Chaptr.AI — LLMs, RAG, and multimodal pipelines. Previously part of the NLP team at VoiceLab.AI that shipped TRURL, Poland's first large-scale generative model. Researcher, speaker, and co-organizer of AI-for-Good initiatives.

3,200+ citations across 30+ papers and 10+ open-source projects. Named among Top Women in AI Poland.

Skills

LLMs & AI Agents: LLM Training · Fine-tuning · RAG · LangChain · Agentic Pipelines · Prompt Engineering · Evaluation · Pydantic · OpenAI / Claude / Gemini / Llama
NLP & Speech: Transformers · BERT · GPT · TTS · ASR · Sentiment Analysis · NER
Computer Vision & Multimodal: EfficientDet · DETR · Mask R-CNN · GANs · Vision Transformers · Image Generation · Video Generation
Explainable AI: Attention Maps · Counterfactual Analysis · Bias Detection · Global Explanations
Data & Annotation: Dataset Curation · Annotation Pipelines · Data Cleaning · Benchmarking · Crowdsourcing
ML Engineering: Python · PyTorch · HuggingFace · ONNX · Model Training & Deployment

Projects

Research, open-source, and industry work

featured

Schema-First Prompting [Claude Skill]

LLM · Prompt Engineering · Open Source

Modern prompt engineering, designed the way a human would write it: clean, minimal, elegant. An open-source skill for Claude and Cursor that encodes best practices for structured outputs with Pydantic models.

npx skills add AgaMiko/schema-first-prompting
featured

The Quiet Hours [Game]

Three.js · Web Game · JavaScript · Vibe Jam

Vibe Jam 2026: I shipped a 3D browser game. You're the cat, she has writer's block: fetch objects and watch her novel grow. Built with Three.js and vanilla JS; cozy until the building hints otherwise.

featured

Reedy — AI Metadata Platform for Publishers

LLM · RAG · NLP · Multimodal

Working with the team on Reedy at Chaptr.AI — an AI engine that generates keywords, THEMA codes, and SEO descriptions for book catalogs.

featured

Model training

Deep Learning · NLP · Computer Vision · Audio · Open Source

Trained models across computer vision (classification, detection, segmentation), NLP and generative text, and audio. With the VoiceLab team, shipped TRURL 7B/13B (including 8-bit and academic variants), vlt5-base-keywords, herbert-base-cased-sentiment, and datasets on Hugging Face.

featured

Data Augmentation

Deep Learning · Datasets · Bias · Open Source

Curated review of augmentation techniques, libraries, and papers (1.6k+ stars), plus research code for Targeted Data Augmentation (TDA) and Counterfactual Bias Insertion (CBI) on dermoscopy and face data.

featured

Biomedical imaging

Deep Learning · Medicine · XAI · Research

Dermoscopy / melanoma bibliography (classical features through NAS, self-supervision, augmentation, bias) plus deep learning for other biomedical imaging—microbleeds, blood smear, erythrocytes—with links added over time.

featured

Bias in Machine Learning

Bias · XAI · Research · Medicine · Deep Learning

PhD thesis on arXiv (bias, XAI, skin-lesion case studies, style transfer / targeted augmentation / attribution feedback); IEEE Access survey with M. Grochowski; GEBI; MICCAI 2022 GAN debiasing in dermoscopy.

Waste detection

Computer Vision · AI4Good · Deep Learning · Datasets · Hackathon · Open Source

WiMLDS Trójmiasto AI4Good project: PyTorch detection and segmentation (EfficientDet, DETR, Mask R-CNN, Faster R-CNN) on merged litter benchmarks, a Waste Management paper, and a curated list of waste image datasets (338+ stars). Also co-organized Hack4Environment (waste and environmental literacy, with DIH4.AI).

HearAI — Sign Language Recognition

Deep Learning · Computer Vision · AI4Good · Open Source

Making the world more accessible for the Deaf community through deep learning-based sign language recognition.

Punctuation Restoration [PolEval]

NLP · Datasets

Created the WikiPunct dataset and organized the first Polish punctuation restoration shared task at PolEval 2021.

Tiny Hero — Pixel Character Generation with GANs

Deep Learning · GANs · Datasets

Generating 64×64 retro-pixel characters using Generative Adversarial Networks.

Machine Learning Acronyms

Open Source · Community

Community-maintained reference of ML and AI acronyms and abbreviations.

Bird Song Classification

Audio · Deep Learning

WiMLDS Trójmiasto project: sound-based bird species classification using deep learning on audio spectrograms.

Get in touch

Want to collaborate or just say hello?

When not training models, you'll find me with a book, in the kitchen, or being supervised by two cats.

LinkedIn