Flag toxic, sexual, violent, or otherwise sensitive comments before they reach the moderation queue, using the OpenAI Moderation API.
How it works
Comment moderation hooks into the WordPress comment-submission pipeline and sends every incoming comment to OpenAI’s free Moderation API before the comment is saved. The API returns a category breakdown (hate, harassment, sexual content, violence, self-harm, and so on) along with per-category confidence scores, and ClassifAI uses those scores to decide whether to approve the comment, hold it for moderation, or mark it as spam. Flagged comments are stored with their moderation report, which is visible in the standard comments admin so that a human moderator can override the decision.
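
The score-to-decision step described above can be sketched in a few lines. This is an illustration only, not ClassifAI's implementation (the plugin itself runs inside WordPress): the function names `flagged_categories` and `moderate`, the single uniform threshold, and the score values are all assumptions made for the example. The `results[0].category_scores` layout follows the shape of a Moderation API response, trimmed to the fields used here.

```python
from typing import Dict, List

# A response-shaped dict, trimmed to the fields used below.
# The score values are made up for illustration.
sample_response = {
    "results": [
        {
            "flagged": True,
            "category_scores": {
                "hate": 0.02,
                "harassment": 0.91,
                "sexual": 0.01,
                "violence": 0.40,
                "self-harm": 0.00,
            },
        }
    ]
}


def flagged_categories(response: dict, threshold: float) -> List[str]:
    """Return, in sorted order, the categories whose score meets the threshold."""
    scores: Dict[str, float] = response["results"][0]["category_scores"]
    return sorted(name for name, score in scores.items() if score >= threshold)


def moderate(response: dict, threshold: float = 0.5) -> dict:
    """Build a small report: the decision plus the categories that triggered it.

    Hypothetical helper: a comment is held if any category crosses the
    threshold, and approved otherwise.
    """
    hits = flagged_categories(response, threshold)
    return {"decision": "hold" if hits else "approve", "categories": hits}


report = moderate(sample_response)
print(report["decision"], report["categories"])
```

Keeping the triggering categories alongside the decision mirrors the moderation report that a human moderator sees before overriding a flag.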

Configuration
- Per-category thresholds, configured independently, so a site can be aggressive about hate speech and lenient about strong language.
- Action taken on a flag (hold for moderation, mark as spam, or both).
- Allowed roles and an allowed-users list for granular access control.
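
To make the three settings concrete, here is a minimal sketch of how they might interact. Everything in it is hypothetical: the `ModerationSettings` class, its field names, the threshold values, and the role names are assumptions for illustration and do not reflect ClassifAI's actual option names or defaults.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, Iterable, Set


@dataclass(frozen=True)
class ModerationSettings:
    # Hypothetical setting names; the values are illustrative, not defaults.
    thresholds: Dict[str, float] = field(
        default_factory=lambda: {"hate": 0.2, "harassment": 0.8}
    )
    # Used for any category without its own threshold.
    fallback_threshold: float = 0.5
    # "hold", "spam", or both.
    actions: FrozenSet[str] = frozenset({"hold"})
    allowed_roles: FrozenSet[str] = frozenset({"administrator", "editor"})
    allowed_user_ids: FrozenSet[int] = frozenset()


def actions_for(scores: Dict[str, float], settings: ModerationSettings) -> Set[str]:
    """Return the configured actions if any category crosses its own threshold."""
    for category, score in scores.items():
        limit = settings.thresholds.get(category, settings.fallback_threshold)
        if score >= limit:
            return set(settings.actions)
    return set()


def has_access(user_id: int, roles: Iterable[str], settings: ModerationSettings) -> bool:
    """A user qualifies through the allowed-users list or through any allowed role."""
    return user_id in settings.allowed_user_ids or bool(
        settings.allowed_roles & set(roles)
    )


settings = ModerationSettings()
# The same score of 0.3 trips the strict "hate" threshold but not the
# lenient "harassment" one.
print(actions_for({"hate": 0.3}, settings))
print(actions_for({"harassment": 0.3}, settings))
```

Treating each category's threshold as an independent lookup with a fallback is one simple way to express "strict here, lenient there" without listing every category the API can return.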
Providers
Comment moderation is the only ClassifAI feature with a single supported provider:
- OpenAI Moderation — the only first-party provider whose moderation API exposes per-category scores in the format the feature consumes. The API is currently free of charge for OpenAI customers, which makes this one of the cheapest ClassifAI features to run at scale.
Use cases
Nearly every public-facing comment thread benefits from a first-line filter. Typical deployments are news sites, community blogs, and product-review threads where the comment volume exceeds what a human moderator can keep up with in real time.
