I am trying to implement an image recognition program, and I need to remove (or "crop") all text present in the image. For example, going from this:

enter image description here

to this:

enter image description here

I already tried the Keras OCR method, but it has two problems. First, I don't need the background blur; I simply need to delete the text. Second, it takes a lot of time and CPU power. Is there an easier way to detect those text regions and simply crop them out of the picture?

coffee-and-code
  • Approach #1: load image, grayscale, Otsu's threshold, find contours, filter using contour area threshold, effectively remove all letters/characters by filling them in with drawContours – nathancy Feb 28 '22 at 22:27
  • Approach #2: find horizontal and vertical contours. Load image, grayscale, Otsu's threshold, create horizontal and vertical structuring element, then isolate horizontal/vertical lines to remove letters/characters – nathancy Feb 28 '22 at 22:28

1 Answer

One way is to detect the text with findContours: contours with an area below some threshold are letters. Then paint over these areas, and/or first find their bounding rectangles and paint over one larger rectangle that covers them.

See also: Text Extraction from image after detecting text region with contours

There is also pytesseract, which can detect letters and their regions, but I guess it will be heavier than the contour approach.
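For comparison, a minimal sketch of the pytesseract route: `pytesseract.image_to_data` with `Output.DICT` returns per-word boxes (`left`, `top`, `width`, `height`) with `conf` and `text`, which can be painted over. The helper names and the `min_conf` cutoff are assumptions for illustration, and the Tesseract binary must be installed for the second function to run.

```python
import numpy as np


def paint_over_words(image, data, min_conf=60, background=(255, 255, 255)):
    """Paint over each word box in a pytesseract image_to_data dictionary."""
    result = image.copy()
    for i, text in enumerate(data["text"]):
        # conf may be a string or a number depending on the version;
        # entries that are not words are reported with conf -1.
        conf = float(data["conf"][i])
        if not text.strip() or conf < min_conf:
            continue
        left, top = data["left"][i], data["top"][i]
        width, height = data["width"][i], data["height"][i]
        result[top:top + height, left:left + width] = background
    return result


def remove_text_with_tesseract(image, min_conf=60):
    """Run Tesseract on the image and paint over every confident word box."""
    import pytesseract  # imported lazily so paint_over_words works without it

    data = pytesseract.image_to_data(image,
                                     output_type=pytesseract.Output.DICT)
    return paint_over_words(image, data, min_conf=min_conf)
```

Splitting the painting step from the OCR call makes it possible to cache or filter the detected boxes before modifying the image.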

Here is an example project where I worked with pytesseract: How to obtain the best result from pytesseract?

Twenkid