· Nov 14, 2022

Image Download from URL


I'm currently struggling with an HTTP request to a URL that serves a JPEG image file. 

Testing the request with a browser or Postman results in the image being shown normally. 

Using a %Net.HttpRequest with different configurations has resulted in a corrupted file. 

My code works for some URLs from other servers perfectly fine, but with some it produces corrupted file contents which do not represent a jpeg. 

 Set REQ=##class(%Net.HttpRequest).%New()
 Set REQ.Server=""
 set REQ.SSLConfiguration=""
 SET REQ.FollowRedirect=1
 SET REQ.ContentType="image/jpeg;charset=UTF-8"
 DO REQ.SetHeader("Connection","keep-alive")
 DO REQ.SetHeader("Accept-Encoding","gzip,deflate,br")
 SET REQ.Port=443
 set REQ.Https=1
 SET STATUS=REQ.Get("/Web/WebShopImages/landscape_medium/_t/if/sortimentsboxen-1.jpg")
 SET STREAM=REQ.HttpResponse.Data

I have tried different approaches, such as HTTP on port 80, different SSL configurations, and $system.Util.Decompress, but STREAM always contains less data than I can see in Postman or my browser. The image should be 18 KB but STREAM only contains 13 KB. 
Converting to Base64 also yielded no displayable result.


Here are the headers sent and received with Postman. 


Product version: Caché 2017.1
$ZV: Cache for Windows (x86-64) 2017.2 (Build 744U) Fri Sep 29 2017 10:58:27 EDT

To download that image, you need just a few lines of code

Class Some.Class Extends %RegisteredObject
{
ClassMethod GetImage() As %Net.HttpResponse
{
	s req=##class(%Net.HttpRequest).%New()
	s req.Server=""
	s req.SSLConfiguration="SSL" // use your SSL-Config-Name
	d req.Get("/Web/WebShopImages/landscape_medium/_t/if/sortimentsboxen-1.jpg")
	q req.HttpResponse
}
}

So your code is more or less OK, but something goes wrong with the response afterwards

set rsp=##class(Some.Class).GetImage()
zw rsp
rsp=7@%Net.HttpResponse  ; <OREF>
+----------------- general information ---------------
|      oref value: 7
|      class name: %Net.HttpResponse
| reference count: 3
+----------------- attribute values ------------------
|    ContentBoundary = ""
|        ContentInfo = "charset=UTF-8"
|      ContentLength = 17759
|        ContentType = "image/jpeg;charset=UTF-8"
|               Data = "8@%Stream.GlobalCharacter"
|Headers("CACHE-CONTROL") = "max-age=0"
|Headers("CONTENT-LENGTH") = 17759
|Headers("CONTENT-TYPE") = "image/jpeg;charset=UTF-8"
|    Headers("DATE") = "Mon, 14 Nov 2022 10:57:10 GMT"
|    Headers("ETAG") = "a919e895229c7883864aecbfa2717516"
|Headers("LAST-MODIFIED") = "Thu, 01 Jan 1970 00:00:01 GMT"
|Headers("SET-COOKIE") = "visid_incap_2373370=1U8JalxbRJOzRFviQpi05AYfcmMAAAAAQUIPAAAAAADRm"
|Headers("STRICT-TRANSPORT-SECURITY") = "max-age=31536000; includeSubDomains; preload"
|   Headers("X-CDN") = "Imperva"
| Headers("X-IINFO") = "5-19652015-0 0CNN RT(1668423430526 51) q(0 -1 -1 0) r(0 -1)"
|        HttpVersion = "HTTP/1.1"
|       ReasonPhrase = "OK"
|         StatusCode = 200
|         StatusLine = "HTTP/1.1 200 OK"

The server reports the content type as "image/jpeg", which is fine, but I think the charset=UTF-8 part is the problem. A JPEG image usually starts with the (hex) bytes:

FF D8 FF E0 00 10 4A 46 49 46 ...

The HTTP-Response gives us

do rsp.Data.Rewind()
zzdump rsp.Data.Read(10)

3F 3F 3F 10 4A 46 49 46 00 01                           ???.JFIF..

I'm by no means a web expert, but it seems to me that Caché tries to decode the incoming raw (JPEG) data as UTF-8, because the content type (image/jpeg; charset=UTF-8) tells it to. The first byte, FF, already causes an error (FF can never occur in valid UTF-8) and is replaced by a "?" character. The next two "?" characters (hex 3F) also come from failed decoding attempts. As for why the same URL works in Chrome or Firefox: I think they either ignore the charset=UTF-8 or just show the raw data after the first decoding error.
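The comparison against the usual JPEG start bytes (FF D8 FF, as listed above) can also be done in code instead of by eye with zzdump. The following is only a minimal sketch: the method name IsJpeg is made up for illustration, and it assumes it lives in a class such as Some.Class and receives the response's Data stream.

```objectscript
/// Hypothetical helper (not part of any library): returns 1 if the stream
/// starts with the JPEG start bytes FF D8 FF, otherwise 0.
ClassMethod IsJpeg(stream As %Stream.Object) As %Boolean
{
	// Read the first three bytes, then rewind so the caller can read the stream again
	do stream.Rewind()
	set head=stream.Read(3)
	do stream.Rewind()
	quit head=$char(255,216,255)
}
```

Called as ##class(Some.Class).IsJpeg(rsp.Data), this should return 0 for the response dumped above, because its first three bytes are 3F 3F 3F rather than FF D8 FF.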

I think I have a solution for you

ClassMethod GetImage() As %Net.HttpResponse
{
	s req=##class(%Net.HttpRequest).%New()
	s req.Server=""
	s req.SSLConfiguration="SSL"
	s req.ReadRawMode=1        //  <<---- this is your solution
	d req.Get("/Web/WebShopImages/landscape_medium/_t/if/sortimentsboxen-1.jpg")
	q req.HttpResponse
}

To get the image

s rsp=##class(Some.Class).GetImage()
i rsp.StatusCode=200 {
    s file="c:\temp\imageName.jpg"
    o file:"nwu":0
    i $t u file d rsp.Data.Rewind(),rsp.Data.OutputToDevice()
    c file
}

 That's all...
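
If you prefer not to handle the file device yourself, the response stream can probably also be copied into a file stream. This is only a sketch under assumptions: the method name SaveImage and its parameters are made up for illustration, and it relies on %Stream.FileBinary with its LinkToFile(), CopyFrom() and %Save() methods.

```objectscript
/// Hypothetical helper: writes the body of an HTTP response to a binary file.
/// rsp is the %Net.HttpResponse returned by GetImage(), file is the target path.
ClassMethod SaveImage(rsp As %Net.HttpResponse, file As %String) As %Status
{
	set out=##class(%Stream.FileBinary).%New()
	// Bind the stream to the target file
	set sc=out.LinkToFile(file)
	if $$$ISERR(sc) { quit sc }
	// Copy the raw response body from its beginning
	do rsp.Data.Rewind()
	set sc=out.CopyFrom(rsp.Data)
	if $$$ISERR(sc) { quit sc }
	// %Save() flushes the copied data to disk
	quit out.%Save()
}
```

Used as s sc=##class(Some.Class).SaveImage(rsp,"c:\temp\imageName.jpg"), with the returned status checked afterwards. ReadRawMode=1 still has to be set on the request, otherwise the copied data is just as corrupted as before.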