X's Trust & Safety Controls: A Platform Teardown

Tech · 7 min read

X introduced granular mute and reporting flows, along with contextual prompts that encourage users to pause before sending potentially harmful replies. In many cases, the platform applies inline warnings and added friction to repeat offenders rather than issuing blanket temporary bans.
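The idea of graduated friction can be sketched in a few lines. This is a minimal illustration and not X's implementation: the function name `choose_friction`, the `DELAY_AFTER` threshold, and the cooldown values are all assumptions made up for the example, and the harm signal is assumed to come from some upstream classifier.

```python
from dataclasses import dataclass

# Hypothetical threshold: the article gives no real numbers.
DELAY_AFTER = 3  # prior flagged replies before a cooldown replaces the prompt


@dataclass
class ReplyDecision:
    action: str            # "send", "warn", or "delay"
    delay_seconds: int = 0


def choose_friction(prior_flags: int, looks_harmful: bool) -> ReplyDecision:
    """Pick a friction level for a reply instead of banning outright."""
    if not looks_harmful:
        return ReplyDecision("send")
    if prior_flags >= DELAY_AFTER:
        # Repeat offender: cooldown grows with history, capped at 5 minutes.
        cooldown = min(30 * (prior_flags - DELAY_AFTER + 1), 300)
        return ReplyDecision("delay", cooldown)
    # First or occasional offense: show a "pause before sending" prompt.
    return ReplyDecision("warn")
```

The design choice worth noting is that every branch still lets the reply through eventually; only the amount of friction changes with the account's history.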

The interface surfaces neighborhood moderation, allowing community-moderated muting and phrase-level filters at the account level. This decentralizes moderation but complicates the user experience when multiple filter sets interact across replies and DMs.
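To see why interacting filter sets get complicated, consider a rough sketch in which each set carries its own muted phrases and its own list of surfaces. The `FilterSet` type, the `hidden_by` function, and the surface names are hypothetical, chosen only to illustrate the interaction; nothing here reflects X's actual data model.

```python
from dataclasses import dataclass, field


@dataclass
class FilterSet:
    """A hypothetical phrase filter, personal or community-maintained."""
    name: str
    muted_phrases: set[str] = field(default_factory=set)
    # Surfaces this filter set applies to (assumed names).
    surfaces: set[str] = field(default_factory=lambda: {"replies", "dms"})


def hidden_by(text: str, surface: str, filter_sets: list[FilterSet]) -> list[str]:
    """Return the names of every filter set that would hide this text."""
    lowered = text.lower()
    return [
        fs.name
        for fs in filter_sets
        if surface in fs.surfaces
        and any(phrase.lower() in lowered for phrase in fs.muted_phrases)
    ]


personal = FilterSet("personal", {"spoiler"})
community = FilterSet("community-list", {"giveaway", "spoiler"}, {"replies"})

# The same message can be hidden by different sets depending on the surface.
print(hidden_by("Huge SPOILER ahead", "replies", [personal, community]))
print(hidden_by("Huge SPOILER ahead", "dms", [personal, community]))
```

Returning the names of the matching sets, rather than a single yes/no, is one way to tell a user why a message was hidden, which matters once several sets overlap.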

The teardown argues that X is experimenting with behavioral nudges instead of heavy-handed enforcement in order to preserve engagement. Early telemetry shows reductions in impulsive replies but mixed results on long-term toxicity, suggesting that escalation and transparency need further iteration.