I am trying to find out whether text metadata such as font size, font family, and bold/italic styling can be captured using Tesseract. Below is the code I tried; it did not work and printed "None". I am using Tesseract version 4.1.1 and Tesseract-OCR engine version 5.0.0.

    import io

    import tesserocr
    from PIL import Image

    # Read the image file into memory
    with open(Image_file_location, "rb") as image:
        b = bytearray(image.read())

    with tesserocr.PyTessBaseAPI() as api:
        image = Image.open(io.BytesIO(b))
        api.SetImage(image)
        api.Recognize()
        iterator = api.GetIterator()
        # Prints None instead of the font attributes
        print(iterator.WordFontAttributes())

Currently, I am able to capture the text properly with Tesseract, but not the metadata. I have attached a sample image file and an example of the expected output.
Expected Output:
[Font:"some_font", Font_family:"some_font_family", Bold, font_size:"some_font_size"] GCEO Review
[Font:"some_font", Font_family:"some_font_family", Bold, font_size:"some_font_size"] Dear Shareholders,
[Font:"some_font", Font_family:"some_font_family", Bold, font_size:"some_font_size"] TURNING THE....
[Font:"some_font", Font_family:"some_font_family", Bold, font_size:"some_font_size"] We have executed well and gained mobile share in our core.........
Basically, wherever the metadata changes, I would like to capture the new metadata and prepend it to the text that follows.
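
To make the grouping I have in mind concrete, here is a minimal sketch in plain Python. It assumes I already had a list of (word, attributes) pairs, which is exactly the part I cannot get out of Tesseract yet. The function name `format_runs`, the attribute tuple layout, and the sample values are all hypothetical and only illustrate the desired output.

```python
from itertools import groupby


def format_runs(words):
    """Group consecutive words with identical attributes and prefix each run.

    `words` is a list of (text, attrs) pairs, where attrs is a tuple of
    (font, font_family, bold, font_size). This structure is an assumption
    made for illustration; it is not what Tesseract returns.
    """
    lines = []
    # groupby starts a new group every time the attribute tuple changes
    for attrs, run in groupby(words, key=lambda pair: pair[1]):
        font, family, bold, size = attrs
        parts = [f'Font:"{font}"', f'Font_family:"{family}"']
        if bold:
            parts.append("Bold")
        parts.append(f'font_size:"{size}"')
        text = " ".join(word for word, _ in run)
        lines.append(f"[{', '.join(parts)}] {text}")
    return lines


# Hypothetical sample data standing in for real OCR output
heading = ("some_font", "some_font_family", True, "some_font_size")
body = ("some_font", "some_font_family", False, "some_font_size")
sample = [("GCEO", heading), ("Review", heading),
          ("Dear", body), ("Shareholders,", body)]

for line in format_runs(sample):
    print(line)
```

Each run of words sharing the same attributes becomes one output line, and a new line starts as soon as any attribute changes.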