Jun 2, 2017

Decoding Base64 PDF File

Hello,

I am writing some Caché code that will pick up a PDF file, Base64 encode the contents and then send them on to a third party system within a long string (via their API). I have been testing this and discovered that the PDFs do not open within the supplier system (I get an error saying that the file hasn't been decoded correctly).

I wanted to prove that the issue does not lie with the way that I have Base64 encoded it within Ensemble, so as a test I wanted to encode the PDF stream, then decode the stream and write it out to a new file. Unfortunately, after my code has created the file locally, I get the same error message when I open the file in Adobe. Therefore I am trying to work out whether I have not encoded it properly, or have not decoded it properly (or both!).

Below is a code snippet of how I am performing this test within a BPL code block:

~~~
//context.streamPDF is a %Stream.GlobalBinary containing the PDF stream.
//context.streamPDFbase64 is a %Stream.GlobalBinary
set stream2=##class(%Stream.GlobalBinary).%New()
do context.streamPDF.Rewind()
while 'context.streamPDF.AtEnd {
  set buffer = context.streamPDF.Read(4000)
  do context.streamPDFbase64.Write($System.Encryption.Base64Encode(buffer))
}
do context.streamPDFbase64.Rewind()
while 'context.streamPDFbase64.AtEnd {
  set temp=context.streamPDFbase64.Read(4000)
  set temp=$system.Encryption.Base64Decode(temp)
  do stream2.Write(temp)
}
//Output decoded to pdf file
do stream2.Rewind()
Set file=##class(%File).%New("C:\ANewPDF.pdf")
Do file.Open("WSN") // The W means Write, S means put the file in stream mode, and N means create if not there already
Set sc=file.CopyFrom(stream2)
Do file.%Save()
Do file.Flush()
Do file.Close()
~~~

Any help greatly appreciated!


By default, Base64Encode and Base64Decode are functions for encoding and decoding datatype values or, better said, strings.

Since you are encoding a stream chunk by chunk, the chunks must line up so that the decoder can carry on from where the previous chunk ended instead of treating each chunk as a new string; otherwise you'll get a corrupted result.

Here's how XML Writer outputs an encoded binary.

/// <method>WriteBase64</method> encodes the specified binary bytes as base64 and writes out the resulting text.
/// This method is used to write element content.<br>
/// Argument:<br>
/// - <var>binary</var> The binary data to output. Type of %Binary or %BinaryStream.
Method WriteBase64(binary) As %Status
{
  If '..InRootElement Quit $$$ERROR($$$XMLNotInRootElement)

  If ..OutputDestination'="device" {
    Set io=$io
    Use ..OutputFilename:(/NOXY)
  }

  If ..InTag Write ">" Set ..InTag=0

  If $isObject(binary) {
    Do binary.Rewind() Set len=12000
    While 'binary.AtEnd {
      Write $system.Encryption.Base64Encode(binary.Read(.len),'..Base64LineBreaks)
    }
  } Else {
    Write $system.Encryption.Base64Encode(binary,'..Base64LineBreaks)
  }

  If ..OutputDestination'="device" {
    Use io
  }

  Set ..IndentNext=0

  Quit $$$OK
}
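The same chunked approach can be applied to a stream-to-stream encoder. The following is only a minimal sketch: the method name `EncodeStream` and its arguments are hypothetical (not part of any InterSystems class), and it assumes that `Read()` returns full 12000-byte chunks until the end of the stream and that your version supports the second (flags) argument of `Base64Encode`.

~~~
/// Hypothetical helper, not a library method.
/// Encodes the binary stream "in" to Base64 and writes the text to "out".
ClassMethod EncodeStream(in As %Stream.Object, out As %Stream.Object) As %Status
{
    Set sc = $$$OK
    Do in.Rewind()
    While 'in.AtEnd {
        // 12000 is a multiple of 3, so only the final chunk can end in "=" padding
        Set len = 12000
        Set chunk = in.Read(.len, .sc)
        If $$$ISERR(sc) Quit
        // Flags value 1 suppresses the CR/LF otherwise inserted after every 76 characters
        Set sc = out.Write($system.Encryption.Base64Encode(chunk, 1))
        If $$$ISERR(sc) Quit
    }
    Quit sc
}
~~~

With line breaks suppressed and a chunk size divisible by 3, the concatenated output should match what a single `Base64Encode` call over the whole content would produce.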

Great answer Rubens.

The class documentation makes no mention of the second parameter and I was not aware that it existed.

Fortunately I've only had to deal with documents under the large string size to date, and did wonder how I might need to work around that limitation at some point.

Question: the length the XML writer uses is set to 12000. Would this solution work for 12001, or does the size have to be divisible by 3? I'm wondering because every 3 bytes are represented by 4 characters in Base64.
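A small terminal experiment illustrates why the chunk length matters. This is just a sketch using short literal strings in place of 12000-byte chunks; the second argument (1) turns off line breaks and is assumed to be available in your version.

~~~
// The whole string encoded in one call
Write $system.Encryption.Base64Encode("abcdef", 1), !
// YWJjZGVm

// Chunks of 4 and 2 bytes: padding appears in the middle of the output
Write $system.Encryption.Base64Encode("abcd", 1)_$system.Encryption.Base64Encode("ef", 1), !
// YWJjZA==ZWY=

// Chunks of 3 and 3 bytes: identical to the single call
Write $system.Encryption.Base64Encode("abc", 1)_$system.Encryption.Base64Encode("def", 1), !
// YWJjZGVm
~~~

Since 12001 is not divisible by 3, every full chunk of that size would end in "==" padding, and a decoder that treats the result as one continuous Base64 string is likely to reject it or produce corrupted output.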


Thanks Rubens/Sean/Dmitry,

I decided to switch to the long strings route as the documents are small. I couldn't get this working at first either, but that was because I wasn't reading the stream in chunks of a length divisible by 3 (I was just doing a .ReadLine()). I now perform a Read(12000) and it works a treat! The PDF opens in the supplier's system.

do context.streamPDF.Rewind()
while 'context.streamPDF.AtEnd {
  set line=context.streamPDF.Read(12000)
  set context.strDocument=context.strDocument_line
}
set context.strDocumentEncoded = $system.Encryption.Base64Encode(context.strDocument)

If you can't find the methods Vitaliy mentioned because (for example) your Cache version is too old, you can always reinvent the wheel and write your own encoder/decoder method. Where is the problem? This snippet could be a starting point

Parameter Base64 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

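ClassMethod Base64Enc(x)
{
    // f = number of zero bytes needed to pad x to a multiple of 3; wide characters are rejected
    s f=-$l(x)#3, x=x_$e($c(0,0),1,f), y=..#Base64, z="" zt:$ziswide(x) "WIDE"
    // pack each 3-byte group into a 24-bit number and emit four 6-bit alphabet indexes
    f i=1:3:$l(x) s a=$a(x,i)*256+$a(x,i+1)*256+$a(x,i+2), c=262144 f j=1:1:4 s z=z_$e(y,a\c+1), a=a#c, c=c\64
    s:f z=$e(z,1,$l(z)-f)_$e("==",1,f) q z
}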

Well we are in IRIS 2021 and that's the documentation I was looking in.  So I'm not sure what's going on.  

Here's where I landed:

Given a %Stream.FileBinary I calculate the size to read as such

s readSize=($J((stream.SizeGet()/12000),"",0)+1)*12000

And then 'cast' my stream to a %xsd.base64Binary datatype (ODBC type VARBINARY) as such

s base64string=##class(%xsd.base64Binary).LogicalToJSON(stream.Read(readSize,.sc))

In my command line testing I'm able to decode this base64string, write it to a file stream and save it, and I have a very much intact PDF. This is new to me however, so I hope I'm not tricking myself into thinking this is working correctly.

When I run w ##class(%xsd.base64Binary).IsValid(myVarBinaryData) I get 1 so I think it's working correctly! 

Wondering however about the reading of 12,000 at a time.  Just as long as the read len is divisible by four it should work?
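For reference, the two directions have different alignment requirements: when encoding, the read length counts raw bytes and should be a multiple of 3; when decoding, it counts Base64 characters and should be a multiple of 4. Below is a minimal decoding sketch. The method name `DecodeStream` is hypothetical, and it assumes the encoded text was produced without line breaks.

~~~
/// Hypothetical helper, not a library method.
/// Decodes the Base64 text in "in" and writes the bytes to "out".
/// Assumes the encoded text contains no CR/LF or other whitespace.
ClassMethod DecodeStream(in As %Stream.Object, out As %Stream.Object) As %Status
{
    Set sc = $$$OK
    Do in.Rewind()
    While 'in.AtEnd {
        // 16000 is a multiple of 4, so each chunk holds only whole Base64 groups
        Set len = 16000
        Set chunk = in.Read(.len, .sc)
        If $$$ISERR(sc) Quit
        Set sc = out.Write($system.Encryption.Base64Decode(chunk))
        If $$$ISERR(sc) Quit
    }
    Quit sc
}
~~~

If the encoded text does contain line breaks (the default output of `Base64Encode`), fixed-size reads no longer fall on group boundaries, so the breaks would need to be removed first.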

There are no such methods, because the size of the stream (file) can exceed the maximum length of the string, which at the moment is 3641144, so the type of string (%String, %VarString, %xsd.base64Binary, etc.) will not always be able to hold all the data.

But it is also not difficult to read the stream into a string:

ClassMethod StreamToStr(ByRef stream As %Stream.Object) As %String
  while 'stream.AtEnd {

PS: above I have given methods, some of which are available in older versions of Caché.

You can test this code; it will work:

'$IsObject(file) d
len=513 ; text must be divisible by 3 and 57
While 'file.AtEnd {