Pull readable text out of screenshots, scans, and embedded images so it shows up in search and can be reused as caption or description text.
How it works
Image text extraction (OCR) reads the text in an uploaded image and stores it in the attachment’s description field, or in another attachment field if you configure one. From there the text is searchable through the standard WordPress search index, ElasticPress, and any custom REST consumer. Extraction runs automatically on upload by default and can also be triggered from the attachment edit screen or in bulk via WP-CLI. It works on photographs of signage, screenshots of slides, scanned forms, and infographics: anywhere the image contains words you would otherwise have to retype.
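As a rough illustration of the "custom REST consumer" case, the sketch below builds a query against the core WordPress REST API media endpoint (`/wp-json/wp/v2/media`) and reads the description from a returned item. The helper names `media_search_url` and `description_text` are hypothetical, the site URL is a placeholder, and whether a given term actually matches extracted text depends on how search is configured on your site.

```python
from urllib.parse import urlencode, urljoin


def media_search_url(site_url: str, term: str, per_page: int = 10) -> str:
    """Build a REST URL that searches image attachments for a term.

    Hypothetical helper: it only assembles the URL and performs no request.
    """
    # Normalise the trailing slash so urljoin appends rather than replaces.
    base = urljoin(site_url.rstrip("/") + "/", "wp-json/wp/v2/media")
    query = urlencode({"search": term, "media_type": "image", "per_page": per_page})
    return f"{base}?{query}"


def description_text(media_item: dict) -> str:
    """Return the rendered description of one media item, or "" if absent."""
    return media_item.get("description", {}).get("rendered", "")


if __name__ == "__main__":
    # example.com is a placeholder; substitute your own site URL.
    print(media_search_url("https://example.com", "quarterly revenue"))
```

Fetching that URL with any HTTP client returns a JSON array of media items; passing each item to `description_text` yields the field that, by default, holds the extracted text.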
Configuration
- Which attachment field receives the extracted text.
- Confidence threshold for accepted results.
- Provider and model selection.
- Allowed roles and an allowed-users list for granular access control.
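To make the interaction between these settings concrete, here is a minimal sketch of how they might be modelled. Every name in it (`OcrSettings`, `can_run`, `accepted_text`), the default values, and the provider identifier are assumptions made for illustration, not the plugin's actual option keys or logic. It also assumes that confidence is reported per line of text and that a user qualifies through either an allowed role or the allowed-users list.

```python
from dataclasses import dataclass, field


@dataclass
class OcrSettings:
    """Hypothetical container mirroring the settings listed above."""
    target_field: str = "description"   # attachment field that receives the text
    min_confidence: float = 0.8         # assumed scale of 0.0 to 1.0
    provider: str = "azure-ai-vision"   # illustrative identifier only
    model: str = ""                     # provider-specific model name, if any
    allowed_roles: set = field(default_factory=lambda: {"administrator", "editor"})
    allowed_users: set = field(default_factory=set)


def can_run(settings: OcrSettings, user_id: int, roles: list) -> bool:
    """Allow a user who is listed explicitly or who holds an allowed role."""
    return user_id in settings.allowed_users or bool(settings.allowed_roles & set(roles))


def accepted_text(settings: OcrSettings, lines: list) -> str:
    """Keep (text, confidence) pairs at or above the threshold, joined by newlines."""
    kept = [text for text, confidence in lines if confidence >= settings.min_confidence]
    return "\n".join(kept)


if __name__ == "__main__":
    settings = OcrSettings(min_confidence=0.5, allowed_users={7})
    print(can_run(settings, 7, ["subscriber"]))
    print(accepted_text(settings, [("Hello", 0.9), ("x", 0.2), ("World", 0.5)]))
```

Raising the threshold trades recall for precision: fewer misread fragments reach the search index, at the cost of dropping some faint or stylised text.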
Providers
- Microsoft Azure AI Vision (the original integration, and the most accurate option for printed text)
- OpenAI ChatGPT (vision)
- Ollama (locally hosted vision models)
Use cases
- Documentation sites whose tutorials are full of screenshotted UI text that previously had no text representation.
- News archives where infographics carry headlines and captions only inside the image.
- Records management of scanned forms, contracts, and historical documents.
