Removing (cropping) text from image

Question

I am trying to implement an image recognition program, and I need to remove (or "crop") all text present in the image. For example, I want to go from this:

to this:

I already tried the Keras OCR method, but first, I don't need the background blur (I simply need to delete the text), and second, it takes a lot of time and CPU power. Is there an easier way to detect those text regions and simply crop them out of the picture?

Approach #1: load image, grayscale, Otsu's threshold, find contours, filter using contour area threshold, effectively remove all letters/characters by filling them in with drawContours — nathancy, Feb 28 '22 at 22:27
Approach #2: find horizontal and vertical contours. Load image, grayscale, Otsu's threshold, create horizontal and vertical structuring element, then isolate horizontal/vertical lines to remove letters/characters — nathancy, Feb 28 '22 at 22:28

score 0 · Answer 1 · answered Feb 28 '22 at 19:28

One way is to detect the text with findContours: contours with an area below a threshold are treated as letters. Then paint over those areas, and/or first find their bounding rectangles and paint over one larger rectangle.

Text Extraction from image after detecting text region with contours

pytesseract can also detect letters and their regions, but I expect it to be heavier than the contour approach.

Here is an example project where I worked with pytesseract: How to obtain the best result from pytesseract?
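If you do want to try pytesseract, something along these lines could work. It is a sketch, assuming a 3-channel image and a working Tesseract install. pytesseract.image_to_data with Output.DICT returns per-word boxes under the keys "left", "top", "width", "height", "conf" and "text". The helper names and the min_conf=50 cutoff are hypothetical.

```python
import numpy as np


def paint_over_words(image, data, min_conf=50, background=(255, 255, 255)):
    """Paint over every word box found in an image_to_data result dict."""
    result = image.copy()
    rows = zip(data["text"], data["conf"], data["left"],
               data["top"], data["width"], data["height"])
    for text, conf, x, y, w, h in rows:
        # Non-word rows have empty text and conf -1; conf may be a str or a number.
        if str(text).strip() and float(conf) >= min_conf:
            result[y:y + h, x:x + w] = background
    return result


def remove_text_with_tesseract(image, min_conf=50):
    """Run Tesseract word detection, then paint over the detected boxes."""
    import pytesseract  # imported here because it needs the tesseract binary installed

    data = pytesseract.image_to_data(image, output_type=pytesseract.Output.DICT)
    return paint_over_words(image, data, min_conf)
```

Keeping the painting step separate from the OCR call makes it easy to check the box handling without running Tesseract.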
