Decoding Base64 PDF File

Hello, I am writing some Caché code that will pick up a PDF file, Base64 encode the contents and then send them on to a third party system within a Long String (via their API).

I have been testing this and discovered that the PDFs do not open within the supplier system (I get an error saying that the file hasn't been decoded correctly). I wanted to prove that the issue does not lie with the way I have Base64 encoded it within Ensemble, so as a test I wanted to encode the PDF stream, then decode the stream and write it out to a new file.

Unfortunately, after my code has created the file locally, I get the same error message when I open the file in Adobe. Therefore I am trying to work out whether I have not encoded it properly, have not decoded it properly, or both!

Below is a code snippet showing how I am performing this test within a BPL code block:

~~~
//context.streamPDF is a %Stream.GlobalBinary containing the PDF stream.
//context.streamPDFbase64 is a %Stream.GlobalBinary
set stream2=##class(%Stream.GlobalBinary).%New()

do context.streamPDF.Rewind()
while 'context.streamPDF.AtEnd {
  set buffer = context.streamPDF.Read(4000)
  do context.streamPDFbase64.Write($System.Encryption.Base64Encode(buffer))
}

do context.streamPDFbase64.Rewind()
while 'context.streamPDFbase64.AtEnd {
  set temp=context.streamPDFbase64.Read(4000)
  set temp=$system.Encryption.Base64Decode(temp)
  do stream2.Write(temp)
}

//Output decoded to pdf file
do stream2.Rewind()
Set file=##class(%File).%New("C:\ANewPDF.pdf")
Do file.Open("WSN") // The W means Write, S means put the file in stream mode, and N means create if not there already
Set sc=file.CopyFrom(stream2)
Do file.%Save()
Do file.Flush()
Do file.Close()
~~~

Any help greatly appreciated!

Answers

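Base64Encode and Base64Decode are designed to encode and decode datatype values, or more precisely, strings.

Because you are encoding a stream chunk by chunk, each call treats its chunk as a complete, independent string. Unless every chunk boundary falls on a whole Base64 group (3 raw bytes, 4 encoded characters), the encoder emits padding in the middle of the output and the decoder cannot continue from where the previous chunk left off, so you get a corrupted result.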
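To make the chunk-boundary problem concrete, here is a minimal sketch in Python (used only because it is easy to run anywhere; it is not Caché code). The helper `encode_in_chunks` and the 10-byte `data` value are made up for illustration. It encodes the same bytes as one string, in 3-byte chunks, and in 4-byte chunks:

```python
import base64

# 10 arbitrary bytes standing in for the PDF content (illustrative only)
data = bytes(range(10))

def encode_in_chunks(payload: bytes, chunk_size: int) -> bytes:
    """Base64-encode each chunk independently and concatenate the results."""
    return b"".join(
        base64.b64encode(payload[i:i + chunk_size])
        for i in range(0, len(payload), chunk_size)
    )

whole = base64.b64encode(data)          # encode everything as a single string
aligned = encode_in_chunks(data, 3)     # chunk size is a multiple of 3
misaligned = encode_in_chunks(data, 4)  # chunk size is not a multiple of 3

print(whole)
print(aligned)
print(misaligned)
```

The 3-byte chunks concatenate to exactly the same text as the single-string encoding, while the 4-byte chunks produce `=` padding after every chunk, which is no longer one valid Base64 document.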

Here's how %XML.Writer outputs encoded binary data.

/// <method>WriteBase64</method> encodes the specified binary bytes as base64 and writes out the resulting text.
/// This method is used to write element content.<br>
/// Argument:<br>
/// - <var>binary</var> The binary data to output. Type of %Binary or %BinaryStream.
Method WriteBase64(binary) As %Status
{
  If '..InRootElement Quit $$$ERROR($$$XMLNotInRootElement)

  If ..OutputDestination'="device" {
    Set io=$io
    Use ..OutputFilename:(/NOXY)
  }

  If ..InTag Write ">" Set ..InTag=0

  If $isObject(binary) {
    Do binary.Rewind() Set len=12000
    While 'binary.AtEnd {
      Write $system.Encryption.Base64Encode(binary.Read(.len),'..Base64LineBreaks)
    }
  } Else {
    Write $system.Encryption.Base64Encode(binary,'..Base64LineBreaks)
  }

  If ..OutputDestination'="device" {
    Use io
  }

  Set ..IndentNext=0

  Quit $$$OK
}

Great answer Rubens.

The class documentation makes no mention of the second parameter and I was not aware that it existed.

Fortunately I've only had to deal with documents under the large string size to date, and I did wonder how I might need to work around that limitation at some point.

Question, the length the XML writer uses is set to 12000. Would this solution work for 12001 or does the size have to be divisible by 3? I'm wondering because 3 characters are represented by 4 characters in base64.

Sean.

The read size has to be divisible by 3, which is also why your original code doesn't work: it reads 4000 bytes at a time, and 4000 is not a multiple of 3.
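As a rough illustration of that rule (again in Python rather than ObjectScript, with hypothetical helper names `encode_stream` and `decode_stream`), a stream round trip works when the encoder reads a multiple of 3 bytes and the decoder reads a multiple of 4 characters. This sketch assumes the encoded text contains no line breaks, which is true for Python's `b64encode`; in Caché that appears to be what the second argument to Base64Encode controls.

```python
import base64
import io

def encode_stream(src, dst, chunk_size=12000):
    """Base64-encode src into dst, reading a multiple of 3 bytes at a time."""
    if chunk_size % 3:
        raise ValueError("chunk_size must be a multiple of 3")
    while chunk := src.read(chunk_size):
        dst.write(base64.b64encode(chunk))

def decode_stream(src, dst, chunk_size=16000):
    """Decode unbroken Base64 text from src, reading a multiple of 4 characters at a time."""
    if chunk_size % 4:
        raise ValueError("chunk_size must be a multiple of 4")
    while chunk := src.read(chunk_size):
        dst.write(base64.b64decode(chunk))

# 50,000 arbitrary bytes standing in for a PDF (illustrative only)
original = bytes(i % 251 for i in range(50_000))

encoded = io.BytesIO()
encode_stream(io.BytesIO(original), encoded)

encoded.seek(0)
decoded = io.BytesIO()
decode_stream(encoded, decoded)

round_trip_ok = decoded.getvalue() == original
print(round_trip_ok)
```

With these sizes the chunked output is identical to encoding the whole payload in one call. A read size of 12001 or 4000 is rejected by the guard, because each encoded chunk would then end in padding.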

Older Caché versions don't have that second parameter.
I also found this out the hard way: by breaking the build.

Thanks Rubens/Sean/Dmitry,

 

I decided to switch to the long strings route as the documents are small. I couldn't get this working at first either, but that was because I wasn't reading the stream in chunks divisible by 3 (I was just doing a .ReadLine()). I now perform a Read(12000) and it works a treat! The PDF opens in the supplier's system.

do context.streamPDF.Rewind()
while 'context.streamPDF.AtEnd
{
  set line=context.streamPDF.Read(12000)
  set context.strDocument =context.strDocument_line
}
set context.strDocumentEncoded = $system.Encryption.Base64Encode(context.strDocument)

See also the source code of the following methods:

  • ##class(%Net.MIMEWriter).EncodeStreamBase64()
  • ##class(%Net.SMTP).EncodeStreamBase64()
  • ##class(%Atelier.v1.Utils.General).Base64FromStream()
  • ##class(%XML.Writer).WriteBase64()