2020 — 2025
Bias in machine learning
Thesis and follow-on papers: bias in ML pipelines, explainability for discovery, lesion case studies, and augmentation-based mitigation.
Fairness discussions often skip over why models pick up spurious cues in the first place. This thread ties together a taxonomy-oriented survey, an explainability stack aimed at bias discovery in vision, and controlled experiments on whether synthetic dermoscopy data helps or hurts bias metrics, with each piece feeding the next. A wider chronological list of the lesion-imaging line (citations, links TBD) lives under biomedical imaging.
Survey — IEEE Access (2025)
With Michał Grochowski: A Survey on Bias in Machine Learning Research (IEEE Access, vol. 14, pp. 3284–3311). We connect historical notions of bias as systematic error with today’s ML fairness literature, and organize over forty potential sources of bias and error across data and model stages—each with concrete examples—so mitigation and detection can target the right failure modes.
The framing from the abstract: understanding the sources and consequences of bias supports fairer, more transparent, and more accurate models, not just metric tuning on a fixed benchmark.
PhD dissertation — arXiv (2023)
Data augmentation and explainability for bias discovery and mitigation in deep learning (arXiv:2308.09464) develops a through-line from bias sources in data and models to mitigation. The first part builds a taxonomy of pipeline bias, surveys Explainable AI as a lens on predictions, walks through a manual skin-lesion inspection as a laborious baseline, and introduces global explanations for bias identification as a semi-automatic alternative, together with metrics that quantify how bias affects decisions.
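To make the decision-impact metrics concrete, here is a minimal counterfactual stress test in the spirit of the dissertation: paste a suspected bias cue into otherwise unchanged images and count how often predictions flip. A sketch only, assuming a PyTorch classifier; the function name and the `insert_artifact` callable are illustrative stand-ins, not the thesis code.

```python
import torch

@torch.no_grad()
def prediction_switch_rate(model, images, insert_artifact):
    """Counterfactual stress test: add a suspected bias cue (ruler, frame,
    gel bubble) to otherwise unchanged images and count label flips.

    `insert_artifact` is a hypothetical callable that pastes the artifact
    into a batch; a high switch rate means the cue alone sways decisions.
    Illustrative sketch of the idea, not the dissertation's exact metric.
    """
    model.eval()
    clean = model(images).argmax(dim=1)
    edited = model(insert_artifact(images.clone())).argmax(dim=1)
    return (clean != edited).float().mean().item()
```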
The second part turns to mitigation: style-transfer data augmentation to probe reliance on shape versus texture cues; targeted data augmentations that randomize inserted artifacts to weaken spurious correlations; and attribution feedback that fine-tunes away obvious mistakes by down-weighting irrelevant input via an attribution loss, sketched below. The aim is to reduce the influence of bias, not to claim it can be erased outright.
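A minimal sketch of the attribution-feedback idea, assuming a PyTorch classifier and a binary mask marking pixels that should not matter; the function name and the simple saliency-style attribution are hypothetical stand-ins for the dissertation's actual loss.

```python
import torch
import torch.nn.functional as F

def attribution_loss(model, images, labels, irrelevant_mask, lam=1.0):
    """Cross-entropy plus an input-gradient penalty on regions marked
    irrelevant (rulers, frames, gel bubbles).

    Illustrative sketch of an attribution-feedback loss, not the
    dissertation's code; `irrelevant_mask` is 1 where input should not matter.
    """
    images = images.clone().requires_grad_(True)
    logits = model(images)
    ce = F.cross_entropy(logits, labels)
    # Gradient of the true-class logit w.r.t. input pixels gives a simple
    # saliency-style attribution map.
    true_logit = logits.gather(1, labels.unsqueeze(1)).sum()
    grads, = torch.autograd.grad(true_logit, images, create_graph=True)
    # Penalize attribution mass falling on pixels the mask calls irrelevant,
    # nudging fine-tuning away from those cues.
    penalty = (grads.abs() * irrelevant_mask).mean()
    return ce + lam * penalty
```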
GEBI — NCN Preludium (2020)
PhD research (NCN Preludium 18, 138k PLN): Global Explanations for Bias Identification combines bulk attribution patterns with local evidence, counterfactual stress tests, and clustering in explanation space, validated extensively on skin-lesion imaging, where dataset bias is a known failure mode.
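The clustering step can be sketched in a few lines: embed each image by its attribution map, normalize, and group with k-means; a cluster whose members share one label far above the base rate flags a candidate spurious cue. The sketch below assumes precomputed attribution maps and binary labels, and is not the GEBI reference implementation.

```python
import numpy as np
from sklearn.cluster import KMeans

def cluster_explanations(attr_maps, labels, k=5, seed=0):
    """Group images by attribution pattern rather than raw pixels.

    attr_maps: (N, H, W) array of per-image attribution maps.
    labels: (N,) array of 0/1 class labels.
    Illustrative sketch, not the GEBI reference implementation.
    """
    X = attr_maps.reshape(len(attr_maps), -1)
    # Normalize so clusters reflect where attribution lands, not its scale.
    X = X / (np.linalg.norm(X, axis=1, keepdims=True) + 1e-8)
    cluster_ids = KMeans(n_clusters=k, random_state=seed, n_init=10).fit_predict(X)
    for c in range(k):
        members = labels[cluster_ids == c]
        # A positive rate far from the dataset base rate marks a cluster
        # worth inspecting for a shared artifact.
        print(f"cluster {c}: n={len(members)}, positive rate={members.mean():.2f}")
    return cluster_ids
```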
Open code accompanies the journal paper; collaborators span explainability, dermoscopic imaging, and robust evaluation of debiasing strategies.
Lesions & GAN debiasing — MICCAI (2022)
Automated melanoma screening from dermoscopy faces class imbalance, subtle inter-lesion variation, imaging artifacts, and distribution shift. We moved from classical ABCD-style features plus shallow nets to end-to-end CNNs and EfficientNets, then asked whether StyleGAN2-ADA-style synthetic data amplifies or mitigates spurious correlations on held-out populations, linking clinically careful dermatology modelling to the same explainability questions GEBI was built to probe.
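The experimental design reduces to a sweep over the share of synthetic images, holding the architecture and evaluation fixed. In the sketch below every callable and dataset is a placeholder (`train_fn`, `bias_metric_fn`, the synthetic pool); it shows only the shape of the controlled comparison, not the MICCAI paper's code.

```python
from torch.utils.data import ConcatDataset, DataLoader, Subset

def synthetic_ratio_sweep(real_train, synthetic_pool, train_fn, bias_metric_fn,
                          ratios=(0.0, 0.25, 0.5, 1.0), batch_size=64):
    """Train the same architecture with growing shares of GAN-generated
    dermoscopy images and record a bias metric on held-out data.

    All arguments are hypothetical stand-ins; this sketches the
    experimental design, not the paper's implementation.
    """
    results = {}
    for r in ratios:
        n_synth = int(r * len(real_train))
        # Take a fixed slice of the synthetic pool for reproducibility.
        synth = Subset(synthetic_pool, range(n_synth))
        loader = DataLoader(ConcatDataset([real_train, synth]),
                            batch_size=batch_size, shuffle=True)
        model = train_fn(loader)            # e.g. fine-tune an EfficientNet
        results[r] = bias_metric_fn(model)  # e.g. error gap across artifact groups
    return results
```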