
I am trying to find out whether text metadata such as font size, font family, and bold/italic can be captured using Tesseract. Below is the code I tried, but it did not work and printed "None". I am using Tesseract version = 4.1.1, Tesseract-OCR engine version = 5.0.0.

import io

import tesserocr
from PIL import Image

# Read the image file into memory
with open(Image_file_location, "rb") as image:
    f = image.read()
    b = bytearray(f)

with tesserocr.PyTessBaseAPI() as api:
    image = Image.open(io.BytesIO(b))
    api.SetImage(image)
    api.Recognize()
    iterator = api.GetIterator()
    print(iterator.WordFontAttributes())  # prints None

With Tesseract I can currently capture the text correctly, but not the metadata. I have attached a sample image file and an example of the expected output.


Expected output:

[Font:"some_font", Font_family:"some_font_family", Bold, font_size:"some_font_size"] GCEO Review

[Font:"some_font", Font_family:"some_font_family", Bold, font_size:"some_font_size"] Dear Shareholders,

[Font:"some_font", Font_family:"some_font_family", Bold, font_size:"some_font_size"] TURNING THE....

[Font:"some_font", Font_family:"some_font_family", Bold, font_size:"some_font_size"] We have executed well and gained mobile share in our core.........

So, basically, wherever the metadata changes, I want to capture the new values and prepend them to the text that follows.
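To make the desired behaviour concrete, here is a minimal sketch of the grouping step only. It assumes the per-word attributes are already available as plain dicts; the keys `font`, `bold`, `italic`, and `size`, the helper names `format_attrs` and `tag_runs`, and the sample values are all hypothetical placeholders, not something Tesseract is confirmed to return.

```python
from itertools import groupby


def format_attrs(attrs):
    """Render a (hypothetical) attribute dict as a bracketed tag."""
    parts = [f'Font:"{attrs["font"]}"']
    if attrs.get("bold"):
        parts.append("Bold")
    if attrs.get("italic"):
        parts.append("Italic")
    parts.append(f'font_size:"{attrs["size"]}"')
    return "[" + ", ".join(parts) + "]"


def tag_runs(words):
    """Join consecutive (text, attrs) pairs that share the same attrs
    and prefix each run with its attribute tag."""
    lines = []
    # Dicts are not hashable, so group on a sorted tuple of their items.
    for _, run in groupby(words, key=lambda w: tuple(sorted(w[1].items()))):
        run = list(run)
        text = " ".join(word for word, _ in run)
        lines.append(f"{format_attrs(run[0][1])} {text}")
    return lines


# Placeholder attributes, made up for illustration only.
heading = {"font": "some_font", "bold": True, "size": 14}
body = {"font": "some_font", "bold": False, "size": 10}
words = [("GCEO", heading), ("Review", heading),
         ("Dear", body), ("Shareholders,", body)]

for line in tag_runs(words):
    print(line)
```

A new tagged line starts only when the attributes differ from those of the previous word, which is the "prepend on change" behaviour I am after. What I am missing is a reliable way to get those attributes out of Tesseract in the first place.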

Crusader