Question
Akshay Pandey · Sep 15

How to read a pdf file which has images in it

Hi team,

I have a PDF file which has some report data and images (like X-rays) in it. How can I read the text from a PDF file in Caché?

Thanks

Akshay

$ZV: IRIS for Windows (x86-64) 2020.1 (Build 215U) Mon Mar 30 2020 20:14:33 EDT [HealthConnect:2.1.0]

@Eduard Lebedyuk 

1. I have a PDF file which I need to read from a folder location as text, put the data from the PDF into an HL7 message, and send it to a downstream system.

2. I have a PDF file which I need to read from a folder location, encode in base64, and put in OBX.5 of an MDM message.

1. I have a PDF file which I need to read from a folder location as text, put the data from the PDF into an HL7 message, and send it to a downstream system.

Do you mean OCR/text layer extraction?

2. I have a PDF file which I need to read from a folder location, encode in base64, and put in OBX.5 of an MDM message.

Do it like this.
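
For illustration only, here is a minimal Python sketch of the encoding step, not the linked approach itself. The function names, the OBX-3 identifier `PDF^Report`, and the component values inside OBX-5 are assumptions made for this example; check what the downstream system expects before using any of them.

```python
import base64
from pathlib import Path


def pdf_to_base64(pdf_path: str) -> str:
    """Read the PDF as raw bytes and return a single-line Base64 string."""
    return base64.b64encode(Path(pdf_path).read_bytes()).decode("ascii")


def build_obx_ed(b64_data: str, set_id: int = 1) -> str:
    """Build one OBX segment carrying the document as an ED value in OBX-5.

    Hypothetical layout: the OBX-3 identifier and the type/subtype
    components are placeholders, not values taken from this thread.
    """
    obx5 = f"^application^pdf^Base64^{b64_data}"
    # OBX-1 set ID | OBX-2 value type | OBX-3 identifier | OBX-4 (empty) | OBX-5
    return f"OBX|{set_id}|ED|PDF^Report||{obx5}"
```

The Base64 alphabet does not contain the default HL7 delimiters (`|`, `^`, `~`, `\`, `&`), so the encoded string can go into the field without escaping. The file must be read as bytes, not as text, or the images in the PDF will be corrupted.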

Do you mean OCR/text layer extraction? Yes.

If there's a text layer, use LibreOffice to convert the PDF to txt (InterSystems IRIS wrapper). For OCR you'll need a third-party tool; Tesseract, for example, can be used easily with Embedded Python.
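
A rough sketch of the OCR route in Python, assuming the `pytesseract` and `pdf2image` packages are installed along with the Tesseract and Poppler binaries they depend on. The function names `ocr_pages` and `ocr_pdf` are made up for this example, and none of it is specific to IRIS; from Embedded Python it would be called like any other Python module.

```python
from typing import Callable, Iterable, Optional


def ocr_pages(pages: Iterable, ocr: Optional[Callable] = None) -> str:
    """Run OCR on each page image and join the page texts with form feeds."""
    if ocr is None:
        # Imported lazily so the module loads even without Tesseract installed.
        import pytesseract
        ocr = pytesseract.image_to_string
    return "\f".join(ocr(page).strip() for page in pages)


def ocr_pdf(pdf_path: str, dpi: int = 300) -> str:
    """Rasterize every PDF page with pdf2image, then OCR the page images."""
    from pdf2image import convert_from_path
    return ocr_pages(convert_from_path(pdf_path, dpi=dpi))
```

Calling `ocr_pdf` with the path to the report returns the recognized text of all pages as one string. OCR reads whatever is rendered on the page, so text burned into the X-ray images may end up in the output along with the report text.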