Question
· Dec 8, 2016

Storing a PDF in a Stream and Searching for Values within that Stream

Using the following...

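   // Get the PDF stream object that was saved earlier
   Set pdfStreamContainer = ##Class(Ens.StreamContainer).%OpenId(context.StreamContainerID)
   Try {
      // Raises <INVALID OREF> if %OpenId returned "" (no container with that ID)
      Set pdfStreamObj = pdfStreamContainer.StreamGet()
   }
   Catch {
      $$$TRACE("Error opening stream object ID = "_context.StreamContainerID)
      // Note: an argumentless Quit here only leaves the Catch block
      Quit
   }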

Would it be possible to search pdfStreamObj for certain values, such as a Medical Record or Patient Name?

Thanks

Scott

Discussion (1)

Hi Scott,

While it is generally possible to search a stream object for specific strings, whether that works really depends on the PDF you're putting in here.

PDF documents are notoriously weird ;) Sometimes the text is actually stored as text (similar to PostScript), but more often PDFs contain vectorized graphics. In that case you'd have to run OCR on the document first to get textual information out of it.

A good test would be to run the strings utility on the document and see if you can spot the information you need. Or you can open the PDF document in a text editor and take a look manually.

Caché doesn't have a PDF rendering engine itself, so from its perspective the document is just a binary stream of data that you need to interpret in your code.
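
If the text does turn out to be stored as plain text, a raw search over the stream could look roughly like the sketch below. It assumes pdfStreamObj was obtained as in the question and uses the FindAt() method that stream classes inherit from %Stream.Object. The search term "MRN" and the variable names target, tmp and pos are made up for illustration. Keep in mind that many PDFs compress their page content, so a miss on the raw bytes does not prove the value is absent from the document.

```objectscript
   // Sketch only: look for a literal string in the raw bytes of the stream.
   // "MRN" is a hypothetical search term; replace it with the value you need.
   Set target = "MRN"
   Do pdfStreamObj.Rewind()
   // FindAt(position, target, .tmpstr, caseinsensitive) returns the position
   // of the first match, or -1 if the target string is not found.
   Set pos = pdfStreamObj.FindAt(1, target, .tmp, 1)
   If pos = -1 {
      $$$TRACE("'"_target_"' not found in the raw PDF stream")
   } Else {
      $$$TRACE("'"_target_"' found at position "_pos)
   }
```

The last argument of 1 makes the search case-insensitive. Since this only matches uncompressed, unencoded text, treat a hit as a hint about where to look, not as a reliable extraction.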

Cheers,

Fab