Hi Marc,

different languages have features like the annotations you are describing. Most of these have one thing in common: they are quite young, especially compared to M/COS.
That being said, some of the functionality you are envisioning is there already. Well, kinda, sorta ;)

You can use ZBREAK to add hooks to your existing code (even code that is already compiled). This is more comparable to introspection than annotation, as it happens at runtime. However, with the ZBREAK execute_code parameter it is easy to run arbitrary code at any point in your program.

To actually implement annotations the way you are thinking of here, you'd need to implement a wrapping solution around the compiler.
You might be able to hook into the source control utilities to do that (see extending Studio). However, modifying the source code before passing it to the compiler would be difficult.

Using the wrapper approach, you could programmatically create shadow classes of your actual code and insert your annotation's code.
This will come with the drawback that you'd have to invoke your shadow classes instead of the direct code.
On the other hand it would allow you to have the annotations optional, so you're not slowing down your application once you're done developing.

You could possibly also use generator methods to move the wrapping into the class and generate new methods based on the code you put in and the annotations you added.

HTH,
Fab

Hi Mack,

there isn't an easy answer for this. The amount of storage heavily depends on how your data and your global structure can be compressed. In storing your globals there are a number of mechanisms that optimize the storage used. It might be useful to have a look at some of the internal mechanisms for that: https://community.intersystems.com/post/internal-structure-cach%C3%A9-database-blocks-part-1.

As you can see from this article (which is still pretty high level), it won't be easy to create an accurate prediction mechanism.

As such, the best way to try this out would be to just take a small amount of your source data and store it in Caché. This will give you a baseline of how much overhead you can expect. Depending on your data structures, additional indices might be created as well.

So if you try to store 10 MB, 100 MB, 1 GB, 1 TB of your source data on a test system, you'll be able to get a pretty good prediction curve out of it with a low error rate.
Any other approach is either going to involve too much guesswork or need a lot more detailed work, so it would probably not be worth the time.
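
The extrapolation step could be sketched roughly like this. It is only an illustration in Python, not something Caché provides: the measurements are made-up placeholder numbers that you would replace with your own, and the straight-line fit is an assumption that only holds if the overhead grows roughly linearly with the amount of data.

```python
# Hypothetical measurements: (source data in MB, resulting database size in MB).
# These numbers are placeholders, not real Caché results.
samples = [(10, 14.0), (100, 128.0), (1000, 1270.0)]


def fit_line(points):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict(points, source_mb):
    """Estimate the database size for a given amount of source data."""
    slope, intercept = fit_line(points)
    return slope * source_mb + intercept


if __name__ == "__main__":
    slope, intercept = fit_line(samples)
    print(f"overhead factor ~ {slope:.2f}, fixed cost ~ {intercept:.1f} MB")
    print(f"estimate for 1 TB of source data: {predict(samples, 1_000_000):.0f} MB")
```

If the measured points clearly bend away from a straight line, that is a sign that the data compresses differently at scale and the fit should not be trusted far beyond the largest sample.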

I tried to include an actual path forward for you, so I hope this helps!

HTH,
Fab

As Dmitry already pointed out, it seems you should cover your basics first. 

A CSP page is first and foremost a static page which is rendered on the server. The data model of web pages has traditionally been dominated by a request-response approach. This comes down to the basic stateless design of HTTP: a client sends requests to GET some information from the server (very simplified).

This means it is quite difficult to "push" updates from the server to the client. One solution, or rather workaround, to this problem is to implement a timer on the client side which periodically asks the server if there is new data. This can easily be done with a JavaScript timeout.
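
The control flow of such a polling workaround looks roughly like the sketch below. In a browser this would be a JavaScript timer; Python is used here only to illustrate the logic. The names fetch and on_change are hypothetical callables standing in for the request to the server and the update of the page.

```python
import time


def poll(fetch, on_change, interval=1.0, rounds=3, sleep=time.sleep):
    """Ask the server `rounds` times and report only values that changed."""
    last = None
    for _ in range(rounds):
        current = fetch()          # one request-response round trip
        if current != last:        # only react when the data is new
            on_change(current)
            last = current
        sleep(interval)            # wait before asking again


if __name__ == "__main__":
    # Simulated server answers instead of real HTTP requests.
    answers = iter(["v1", "v1", "v2"])
    poll(lambda: next(answers), print, interval=0.1)
```

Most of the requests in such a loop come back with nothing new, which is exactly the overhead that makes this approach clumsy.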

Nowadays we have moved on a little bit from this rather clumsy approach. Dmitry already mentioned a possibility for this: websockets. Websockets give you a two-way communication channel between your client and your server (see RFC 6455).

This introduces the exciting possibility to actually push events from the server to the client without the need for periodic polling.

Have a look at this basic example of how to do that in Caché technology: asynchronous-websockets-quick-tutorial (shameless plug ;) )

There have been quite a few advances and developments in this area, building on the basic websocket connection and building frameworks for the efficient handling and caching of even bigger datasets. React/Flux are just some frameworks you could look at as you make your way through the jungle of web technology. Good Luck!

 

-Fab

Hi Jonathan,

unfortunately there is no easy way to do this from within Caché. The PDF format is a rather complex binary format and Caché doesn't have a library to access it.  There are a couple of tools that allow the annotation of PDF documents, but none of them would allow you to easily integrate with the engine. 

PDF is in itself a rather complex format, so to directly edit it, say via opening it as %BinaryStream, would require you to implement your own PDF rendering engine. (Have a look at [this post](https://blog.idrsolutions.com/2013/01/understanding-the-pdf-file-format-...) to get a glimpse of the problems you'd be facing)

Using the PDF as a background for rendering a new report via FOP might be a way out. But you'd need to know the layout of the incoming PDF and be sure it doesn't change. Only then could you get the ZEN report to print into specific fields.

-Fab

Here is a simple example to upload and store a file via a CSP page:

<html>
<head></head>
<body>

<form enctype="multipart/form-data" method="post" action="upload.csp">
    Enter a file to upload here: <input type=file size=30 name=FileStream>
    <hr />
    <ul><input type="submit" value="Upload file"></ul>
</form>

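<csp:if condition='($data(%request.MimeData("FileStream",1)))'>
    <h2>Saving file...</h2>
    <script language="Cache" runat="server">
        // Copy the uploaded stream to a file on the server.
        // The target path is only a placeholder; adjust it for your system.
        set upload=%request.MimeData("FileStream",1)
        set file=##class(%FileBinaryStream).%New()
        do file.LinkToFile("/tmp/upload.bin")
        do file.CopyFrom(upload)
        do file.SaveStream()
    </script>
</csp:if>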
</body>
</html>

Hi Sebastian,

so if I understand you correctly, you want to use your csp application to authenticate users for another application? 

In that case, I would recommend having a look at the OAuth articles over here and here. Using this SSO approach, you get rid of the problem of transmitting usernames and passwords in cleartext altogether. And it allows your two different applications to use the same credentials.

Hope this helps!

-Fab

Hi Scott,

while it is generally possible to search through a stream object for certain strings, it would really depend on the PDF you're putting in here.

PDF documents are notoriously weird ;) Sometimes the text is actually stored as text (similar to PostScript), but more often PDFs contain vectorized graphics. In that case you'd have to run OCR on the document first to get textual information out of it.

A good test would be to run the strings utility on the document and see if you can spot the information you need. Or you can open the PDF document in a text editor and take a look manually.
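
If the strings utility is not at hand, the same quick check can be approximated with a few lines of Python. This is only a rough sketch: printable_runs and contains_text are names made up for this example, and the file path is taken from the command line. Keep in mind that PDFs usually compress their content streams, so not finding your text this way does not prove it is stored as graphics.

```python
import re
import sys


def printable_runs(data: bytes, min_len: int = 4):
    """Yield runs of printable ASCII of at least min_len bytes, like `strings`."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    for match in pattern.finditer(data):
        yield match.group().decode("ascii")


def contains_text(data: bytes, needle: str) -> bool:
    """Return True if the needle shows up inside any printable run."""
    return any(needle in run for run in printable_runs(data))


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python find_text.py <file> <text to look for>
    with open(sys.argv[1], "rb") as handle:
        print(contains_text(handle.read(), sys.argv[2]))
```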

Caché doesn't have a PDF rendering engine itself, so from its perspective the document is just a binary stream of data, and you need to interpret it in your own code.

Cheers,

Fab

You can use %Net.HttpRequest to do that.

Something along these lines will work (of course you should check for errors, etc.):

USER>s request=##class(%Net.HttpRequest).%New()

USER>d request.Get("http://www.ecma-international.org/publications/files/ECMA-ST/ECMA-404.pdf")

USER>s file=##class(%FileBinaryStream).%New()

USER>d file.LinkToFile("/Users/kazamatzuri/temp/test.pdf")

USER>d file.CopyFrom(request.HttpResponse.Data)

USER>w file.%Close()
1
USER>

Hi Bapu,

I have not tested the actual performance difference between these approaches yet, but usually in a web service context the serialisation itself takes less time than the logic around it. The requests have to go through the webserver/CSP gateway, and you have your connection's latency and bandwidth to consider as well.

That being said, why are you trying to use %XML.DataSet? Is your SelectDoses a %ResultSet? See  http://docs.intersystems.com/latest/csp/documatic/%25CSP.Documatic.cls?APP=1&LIBRARY=%25SYS&CLASSNAME=%25XML.DataSet

Hi Nikita,

especially on a 'development mode' installation you need to be careful with messing with UnknownUser. UnknownUser is being used for many different things, and setting a startup routine like this would for example lock you out of the Management Portal. This is because in the default installation the CSP Gateway is using the UnknownUser to log into the instance. So you would have to set it up to use CSPSystem instead.  George has already pointed out how to get the user into programmer mode after running any of your code. The more supported approach is what Tim told you.

Cheers,

Fab

Hi Uri,

this comes down to the way computers represent numbers. Since computers are based on a binary system, you have to approximate (some) numbers. This leads to things like .3 actually being .29999999999999998889 (in IEEE binary double precision floating point representation). This actually comes up from time to time, so I took the liberty to use some of the examples I have gathered over time:

Various languages use different limits for rounding and/or display of numbers, and sometimes moving between languages leads to these artifacts coming up. One common example of that is moving from Java's double to Cache's decimal types.

Cache's decimal representation has almost 19 digits of decimal precision and it has a range where its largest positive number is exactly 9.223372036854775807E+145 and its smallest non-zero positive number is exactly 1E-128.

Cache's binary representation, as well as Java's "double" type representation (just as an example, since you asked about other languages as well), uses the 64-bit double precision representation defined by IEEE Std 754-1985 (the IEEE Standard for Binary Floating-Point Arithmetic). This representation has a precision of 53 binary bits, which is almost 16 digits of decimal precision. It has a range where its largest positive number is approximately 1.797693134862315708E+308 and its smallest normalized positive number is approximately 2.22507385850720138E-308.

The set of decimal fractions that can be stored exactly in binary floating-point representation (i.e., without using an approximation) is very limited. The set includes 0.5, 0.25, 0.75, 0.125, 0.375, 0.625, 0.875. The list I just wrote down shows all the 1 digit, 2 digit and 3 digit decimal fractions that are represented exactly in binary. The remaining three-digit decimal fractions (992 decimal fractions out of a total of 999 fractions) must be approximated.
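
The count of exactly representable fractions can be checked with a short script. This is a sketch in Python (whose float is the same IEEE double); the function name is made up for this example. A decimal fraction is exact in binary only if its reduced denominator is a power of two, and the script also compares against the value the double actually stores.

```python
from fractions import Fraction


def exact_three_digit_fractions():
    """Return all k (1..999) for which k/1000 is stored exactly as a double."""
    exact = []
    for k in range(1, 1000):
        wanted = Fraction(k, 1000)
        d = wanted.denominator
        power_of_two = d & (d - 1) == 0      # reduced denominator is 2**n
        stored = Fraction(k / 1000)          # exact value of the double
        if power_of_two and stored == wanted:
            exact.append(k)
    return exact


if __name__ == "__main__":
    exact = exact_three_digit_fractions()
    print([k / 1000 for k in exact])
    print(len(exact), "exact,", 999 - len(exact), "approximated")
```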

So looking at another example: 

The decimal value 4.2 is approximated in IEEE double precision binary floating-point representation by the decimal value 4.2000000000000001776356839400250464678... . The internal hex value is 4010CCCCCCCCCCCD. The next smaller binary value has hex value 4010CCCCCCCCCCCC and it is approximately 4.1999999999999992894572642398998141289... . Since the larger value is closest to 4.2, that is the IEEE double precision binary floating-point value we use to approximate the decimal value of 4.2. Both Cache and Java will use the same "double" value to approximate 4.2.

The Cache decimal floating-point representation can exactly represent the decimal value 4.2 with no approximation necessary. The COS function $DOUBLE(x) can be used to force the value "x" into the IEEE binary floating-point representation by generating the best approximation. The COS function $DECIMAL(x) can be used to force the value "x" into the Cache decimal floating-point representation. The COS function $DECIMAL(x,n) can be used to convert the numeric value "x" into a string representation using "n" significant digits (but "n" is limited to be between 1 and 38). Consider the following:

USER>set x=4.2

USER>set xd=$DOUBLE(x)

USER>write x

4.2

USER>write xd

4.2000000000000001776

USER>write $DECIMAL(xd,30)

4.20000000000000017763568394003

USER>write $DECIMAL(xd,16)

4.2

USER>write $DECIMAL(xd,17)

4.2000000000000002

Cache converts the numeric value $DOUBLE(4.2) to a default string representation with 20 significant digits. Asking for 30 significant digits shows that $DOUBLE(4.2) has more than 20 digits in its decimal representation. Asking for 16 and 17 decimal digits shows that $DOUBLE(4.2) does approximate the decimal value 4.2 with an accuracy of about 16 decimal digits. Consider the following computation, which removes the leading 4 and 2. (It also multiplies by 10, which does cause a little round off error, but you would get a lot more round off error if you computed xd-4.2. Computing xd-4.2 causes so much round off error that all significance is lost and the answer is 0.0.)

USER>WRITE ((xd-4)*10)-2

.0000000000000017763568394002504646
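
The hex values and the leftover after removing the 4 and the 2 can also be cross-checked outside of Cache. The following is a sketch in Python, whose float is the same IEEE double; the helper names double_hex and neighbour_below are made up for this example.

```python
import struct
from decimal import Decimal


def double_hex(value: float) -> str:
    """Return the 64-bit IEEE pattern of a double as 16 hex digits."""
    return struct.pack(">d", value).hex().upper()


def neighbour_below(value: float) -> float:
    """Return the next smaller double (valid for positive, finite values)."""
    bits = struct.unpack(">Q", struct.pack(">d", value))[0]
    return struct.unpack(">d", struct.pack(">Q", bits - 1))[0]


if __name__ == "__main__":
    xd = 4.2
    lower = neighbour_below(xd)
    print(double_hex(xd), Decimal(xd))        # stored value, all digits
    print(double_hex(lower), Decimal(lower))  # next smaller double
    # The same computation as above: strip the leading 4 and 2.
    print(Decimal(((xd - 4) * 10) - 2))
```

Decimal(xd) prints every digit of the stored binary value, so it plays a role similar to $DECIMAL(xd,n) with a large n.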

You can try the above computation in Java using its "double" type and you should get the same answer (with a few fewer decimal digits printed). Doing this in Java will give you another way (besides using the BigDecimal package) to demonstrate that "double xd=4.2;" does not produce an exact representation of 4.2 in Java.

Another question is why Cache prints $DOUBLE(4.2) using as many digits as 4.2000000000000001776. The reason is that Cache can exactly represent other nearby decimal values such as 4.200000000000000177 and 4.200000000000000178. These are adjacent values in the Cache decimal floating-point representation and the value $DOUBLE(4.2) falls in between these two values. If we convert $DOUBLE(4.2) to decimal then we will get the larger of these values because that is the closest Cache decimal floating-point value.

USER>WRITE $DECIMAL($DOUBLE(4.2))

4.200000000000000178

This conversion from binary to decimal involves an approximation as the following comparisons show:

USER>write $DECIMAL($DOUBLE(4.2))=$DOUBLE(4.2)

0

USER>write $DECIMAL($DOUBLE(4.2))>$DOUBLE(4.2)

1

and the default conversions of these values to string representation shows why the comparisons give these results.

USER>write $DECIMAL($DOUBLE(4.2)),!,$DOUBLE(4.2)

4.200000000000000178

4.2000000000000001776

You should note that 4.2 and $DOUBLE(4.2) are not really close to each other: Cache can represent an additional 177 decimal floating-point values between 4.2 and $DOUBLE(4.2).

The default conversion of a $DOUBLE value to a string will usually have 20 significant digits. If the default representation has fewer than 20 significant digits then that $DOUBLE value exactly equals the corresponding decimal value represented by the string. If you want to format Cache $DOUBLE values as strings so that they look like Java conversions to strings then you can consider using $DECIMAL(xd,15) or $FNUMBER(xd,"G",14), which are two examples of formatting functions that only print 15 significant digits.

If you take a Java "double" type value, move it into a Cache database, and later extract the value to send it back to a Java "double" variable then the value originally sent to Cache will be identical to the value that is returned. As long as the value is between 9.22E+145 and 1E-110 in magnitude, it will not matter whether the value stored internally in Cache uses decimal or binary representation. The approximations involved in converting between binary and decimal will not be large enough to change the "double" value. If the Java "double" data involves values that are outside this range then you must be careful to use the %Double type in Cache in order to eliminate conversions that might cause overflow or underflow.

We usually recommend that customers use the default decimal representations in Cache and avoid the %Double type and the COS $DOUBLE function. The default decimal representation rarely involves unexpected approximations. The only time to use $DOUBLE(), %Double and binary floating-point is when the values are not directly entered and read by humans but instead involve a machine-to-machine transfer of data stored using the representation defined by IEEE Std 754.
One additional note: when you write 1.9f in Java, the suffix "f" means you are asking to use the 32-bit single precision binary floating-point representation defined by IEEE Std 754-1985. It has a much smaller range and only 24 bits of precision. Its approximation of decimal fractions is only good to about 7 decimal digits. This smaller precision explains the results you are seeing when using this value.
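
The effect of the "f" suffix can be reproduced without Java. The sketch below uses Python's struct module to round a double to single precision and widen it back; to_single is a name made up for this example.

```python
import struct
from decimal import Decimal


def to_single(value: float) -> float:
    """Round a double to the nearest 32-bit single and widen it back."""
    return struct.unpack(">f", struct.pack(">f", value))[0]


if __name__ == "__main__":
    single = to_single(1.9)
    print(Decimal(single))        # what 1.9f really stores
    print(Decimal(1.9))           # what the double 1.9 really stores
    print(abs(single - 1.9))      # error introduced by the narrower type
```

The single precision value already deviates from 1.9 around the eighth significant digit, while the double only deviates around the sixteenth or seventeenth.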

So all in all, you are not asking to get more accurate calculations, but you are rather asking for less accuracy, as you only want to see the rounded values. As shown in the above example you can use $double for some of these calculations. 

-Fab

Please note that this example uses the old (deprecated) dot-syntax for the loop. It also loops through your data twice. Nowadays you would write the same more like this:

#include %occInclude
ListDir
 set statement=##class(%SQL.Statement).%New()
 set status=statement.%PrepareClassQuery("%File","FileSet")
 if $$$ISERR(status) { do $system.OBJ.DisplayError(status) quit }
 set resultset=statement.%Execute("c:\temp","*","",1)
 while resultset.%Next() {
    write:resultset.%Get("Type")="D" !, resultset.%Get("Name")
 }