Navigating the Digital Friction: Understanding Algorithmic Sabotage
- Recommender systems: poisoning reviews/ratings to promote harmful products or demote competitors.
- Moderation systems: crafting text/images that bypass content filters or trigger wrongful takedowns.
- Credit scoring: manipulating application fields or supporting documents to change automated risk decisions.
- Autonomous vehicles: printing stickers or patterns that cause misclassification of signs.
- Fraud detection: generating synthetic user behavior patterns that mimic benign users to evade detection.
- Hiring tools: injecting biased resumes/labels to make the model prefer or reject certain demographics.
- Medical diagnosis models: altering image metadata or training records to reduce detection of specific conditions.
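To make the first item concrete, here is a minimal sketch of rating poisoning against a toy product score. All names and numbers (`aggregate_scores`, `honest_ratings`, `fake_ratings`) are hypothetical, and the median comparison is an illustrative assumption rather than something the list above prescribes. It only shows how a handful of injected ratings can move a naive average.

```python
from statistics import mean, median

def aggregate_scores(ratings):
    """Return two candidate product scores for a list of 1-5 star ratings."""
    return {"mean": mean(ratings), "median": median(ratings)}

# Hypothetical data: ten genuine ratings for a mediocre product.
honest_ratings = [1, 2, 2, 2, 2, 2, 2, 3, 3, 3]

# Hypothetical attack: three fabricated 5-star ratings.
fake_ratings = [5, 5, 5]

clean = aggregate_scores(honest_ratings)
poisoned = aggregate_scores(honest_ratings + fake_ratings)

print("clean:   ", clean)
print("poisoned:", poisoned)
```

In this toy data the mean rises noticeably while the median stays put, which is one reason rank-based or trimmed statistics are sometimes preferred for user-submitted scores. A larger share of fake ratings would move the median too.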
Consider a sabotaged news aggregator. An attacker floods it with clicks on low-quality, fake articles. The algorithm learns that "fake news" is what users want, and it then aggressively seeks out more fake news to recommend. The sabotage doesn't just pollute the present; it corrupts future iterations of the model.
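The feedback loop in the news aggregator example can be sketched as a toy simulation. This is a simplified model under stated assumptions, not a description of any real aggregator: the function name, parameters, and update rule are all hypothetical. It assumes that each round the aggregator sets the share of fake articles it recommends to the share of clicks fake articles received, and that organic users click in proportion to what they are shown.

```python
def simulate_feedback_loop(total_rounds=6, attack_rounds=3,
                           organic_clicks=1000, bot_clicks=200,
                           initial_fake_share=0.05):
    """Track the share of fake articles recommended over several retraining rounds.

    Assumed update rule (hypothetical): next share = fake clicks / all clicks.
    """
    fake_share = initial_fake_share
    history = [fake_share]
    for round_index in range(total_rounds):
        # Organic users click on fake articles in proportion to their exposure.
        organic_fake_clicks = organic_clicks * fake_share
        # The attacker's bots click only on fake articles, and only while the attack lasts.
        injected = bot_clicks if round_index < attack_rounds else 0
        total_clicks = organic_clicks + injected
        # The aggregator "retrains" on the click data, polluted or not.
        fake_share = (organic_fake_clicks + injected) / total_clicks
        history.append(fake_share)
    return history

history = simulate_feedback_loop()
for round_index, share in enumerate(history):
    print(f"round {round_index}: fake share = {share:.3f}")
```

Under these assumptions the fake share climbs in every round of the attack and then stays at the elevated level after the bots stop, because the model's own recommendations now generate the clicks it learns from. That persistence is the sense in which the sabotage corrupts later iterations, not only the current one.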
The Inevitable Counter-Sabotage