2018 — 2025
Data augmentation
A long-running thread: community resources for practitioners, then research code for controlled bias in vision models.
Data augmentation is one of the cheapest levers for better generalization—but the design space is huge. This page groups two outputs: a curated review that grew with the field, and later targeted augmentation + evaluation aimed at bias and robustness in medical and face imagery.
Curated review
The data-augmentation-review repository lists techniques, libraries, papers, and code pointers—useful when you need a map of classical transforms, generative and hybrid approaches, and domain-specific recipes. It has stayed community-facing and open on GitHub since 2018.
Targeted augmentation & CBI
Targeted Data Augmentation (TDA) overlays controlled artifacts (e.g. frames and rulers on skin-lesion images, glasses on faces) during training with a tunable probability, so the model encounters spurious structure at a known, deliberate rate rather than by accident. Counterfactual Bias Insertion (CBI) reuses the same overlays at test time to measure how often predictions flip when the bias is inserted; training and evaluation share one PyTorch codebase.
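The training-time mechanism can be sketched as a simple probabilistic transform. This is an illustrative sketch, not the repository's actual API: the class name, constructor parameters, and top-left placement are assumptions; a real pipeline would randomize artifact placement and blend more carefully.

```python
import random

import numpy as np


class TargetedOverlay:
    """Illustrative TDA-style transform (hypothetical API, not the repo's):
    with probability p, paste a fixed artifact patch onto the image."""

    def __init__(self, artifact, p=0.5, seed=None):
        self.artifact = artifact        # H_a x W_a (x C) patch, e.g. a frame or ruler crop
        self.p = p                      # insertion probability -- the tunable knob
        self.rng = random.Random(seed)  # seeded for reproducible regimes

    def __call__(self, image):
        if self.rng.random() >= self.p:
            return image                # leave this sample clean
        out = image.copy()              # never mutate the caller's array
        h, w = self.artifact.shape[:2]
        out[:h, :w] = self.artifact     # paste at a fixed corner for simplicity
        return out


# Usage: overlay a black 4x4 patch on half of the training images, on average.
aug = TargetedOverlay(np.zeros((4, 4, 3), dtype=np.uint8), p=0.5, seed=0)
```

Because the transform is an ordinary callable, it can slot into a `torchvision.transforms.Compose` chain alongside the usual crops and flips.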
Scripts cover ISIC‑2020 and a gender-classification face dataset, with DenseNet121, EfficientNet‑B2, and ViT backbones. Masks and bias annotations live in the linked open research data deposit, not in the GitHub tree.
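The core CBI statistic from the paragraph above, the fraction of predictions that flip once the bias is inserted, reduces to a paired comparison of labels. A minimal sketch, with a hypothetical function name; the actual evaluation also tracks probability shifts, which this omits:

```python
def switch_rate(clean_preds, biased_preds):
    """Fraction of samples whose predicted label changes when the same
    image is re-scored with the bias artifact inserted (CBI-style metric;
    name and signature are illustrative, not the repository's)."""
    if len(clean_preds) != len(biased_preds):
        raise ValueError("need one biased prediction per clean prediction")
    flips = sum(c != b for c, b in zip(clean_preds, biased_preds))
    return flips / len(clean_preds)


# Usage: labels predicted on clean images vs. the same images with overlays.
rate = switch_rate([1, 0, 1, 1], [1, 1, 1, 0])  # two of four flip -> 0.5
```

A model that has learned to ignore the artifact keeps this rate near zero; a model leaning on the spurious cue flips often.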