Sean Connelly · Apr 20, 2017 go to post

The error message is heavily escaped. Unescaped, it would look like this...

{"Info":{"Error":{"ErrorCode":"5001","ErrorMessage":"ERROR #5001: Cannot find Subject Area: 'SampleCube'"} } }

This error is only raised in the %ParseStatement method of the %DeepSee.Query.Parser class.

I'm at the limits of what I know on DeepSee, but if I read this as it looks, there is a missing cube called SampleCube?

Sean Connelly · Apr 20, 2017 go to post

Check that you have not lost connection just before the Store() method...

ftp.Connected

Also check the value of

ftp.ReturnMessage

just after the Store() method. If there was a failure, this will give you something useful to go on.
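
A minimal sketch of both checks, assuming ftp is a %Net.FtpSession that has already connected and stream holds the content to upload (the remote file name is a placeholder):

```objectscript
// Assumption: ftp is a connected %Net.FtpSession, stream is the content to send
if 'ftp.Connected {
    // the session dropped before we even tried to store the file
    write !,"Connection lost before Store(): ",ftp.ReturnMessage
} elseif 'ftp.Store("remote-file.txt",stream) {
    // Store() returns false on failure; the server reply usually explains why
    write !,"Store() failed: ",ftp.ReturnCode," ",ftp.ReturnMessage
}
```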

Sean.

Sean Connelly · Apr 20, 2017 go to post

Hi Everardo,

There are a couple of extra compilation steps required for the web method.

Each web method requires its own separate message descriptor class. This class contains the arguments of your method as properties of the class, e.g.
 

Property file As %Library.String(MAXLEN = "", XMLIO = "IN");
Property sql As %Library.String(MAXLEN = "", XMLIO = "IN");


This extra class is required to provide a concrete API to your web method. The web service description will project this class as a complex type that the calling service needs to adhere to.

What I think is happening is that when you have an argument called args... the compiler is trying to compile
 

Property args... As %Library.String(MAXLEN = "", XMLIO = "IN");


Which would fail with an invalid member name error (which correlates with the 5130/5030 error code you have).

I think the main issue here is that there is nothing (to the best of my knowledge) in the SOAP specification that allows for variadic types.

Instead what you want is an argument type that can be projected as a list or an array, e.g.
 

ClassMethod GenerateFileFromSQL(file As %String, sql As %String, delimiter As %String = "", args As %ListOfDataTypes) As %String [ WebMethod ]


That will then be projected in the WSDL as a complex type with an unbounded max occurs, allowing the client to send any number of repeating XML elements for the property args.

If you pass args as %ListOfDataTypes to your non web method then you will need to decide if that method should have the same formal spec, or overload it, something like...
 

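// $get guards against a call with no variadic arguments at all
if $IsObject($get(args(1))),args(1).%IsA("%Library.ListOfDataTypes") {
  set list=args(1)
  for i=1:1:list.Count() {
    write !,list.GetAt(i)
  }
} else {
  // args holds the number of variadic arguments passed
  for i=1:1:args {
    write !,args(i)
  }
}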
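
For illustration, a caller inside Caché might build the list like this. It is only a sketch: MyApp.FileService, the file path and the query are placeholders, not names from your code.

```objectscript
// Hypothetical caller; class name, path and SQL are placeholders
set args=##class(%ListOfDataTypes).%New()
do args.Insert("Smith")
do args.Insert("London")
set sql="select Name from Sample.Person where Name %STARTSWITH ? and Home_City = ?"
set result=##class(MyApp.FileService).GenerateFileFromSQL("/tmp/out.csv",sql,",",args)
```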


Sean.

Sean Connelly · Apr 19, 2017 go to post

Hi Scott,

The %Stream package superseded the stream classes in the %Library package. If you look at the class documentation you will see in the descriptions that the %Library stream classes have been deprecated in favour of the %Stream variants. The only reason they still exist would be for legacy implementations.

The other difference is that one is a character stream and the other is a binary stream. As a general rule you should only write text to the character stream and non-text (e.g. images) to the binary stream. The main reason for this is to do with Unicode characters. You may not have seen issues writing text to %FileBinaryStream, but that might well be because your text didn't have any Unicode conversions going on.
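
A rough sketch of writing text through the character stream, assuming you want UTF-8 on disk. The method name and arguments are made up for the example:

```objectscript
/// Hypothetical helper: writes one line of text to a file as UTF-8
ClassMethod WriteText(pFile As %String, pText As %String) As %Status
{
    set stream=##class(%Stream.FileCharacter).%New()
    set sc=stream.LinkToFile(pFile)
    if $$$ISERR(sc) quit sc
    // translate characters to UTF-8 on the way out to disk
    set stream.TranslateTable="UTF8"
    set sc=stream.WriteLine(pText)
    if $$$ISERR(sc) quit sc
    quit stream.%Save()
}
```

The binary stream class has no translation step, so it writes whatever bytes it is given.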

Performance wise I'm not sure there would be much in it between the two. You can access the source code of both and they both use the same underlying raw code for reading and writing to files. If you benchmarked them then I guess you would see a marginal difference, but not enough to question which one to use for best performance.

I wonder, how did you determine that the logIt code was the reason for messages slowing down? On the surface it should only have a small impact on the message throughput. If messages are queueing up then it almost feels like this is just the first observation of an overall performance issue going on. I guess you have monitored overall IO performance. If it's already under strain then this could be the straw that breaks the camel's back.

On a curious note, whilst you might have needed to log messages in eGate, I wonder why this would be necessary in Ensemble. Unless you are using in-memory messaging, all of your messages will be automatically logged internally, as well as being tailed to the transaction logs. By adding your own logging you are effectively writing the same message to disk not twice but three times. If you also have IO logging enabled on your operation then it will be four times. Not to mention how many times the message was logged before the operation. On top of that, if you have log trace events enabled in production then the IO overhead for just one message is going to thrash the disks more than it needs to. Multiply that across your production(s) and how well IO is (or is not) spread over disks and it would be easy to see how a peak flow of messages can start to queue.

Another reason I see for messages queuing (due to IO thrashing) is poor indexes elsewhere in the production. A data store that worked fast in development will now be so large that even simple lookups will hog the disks and flush out memory cache, putting an exponential strain on everything else. Suddenly a simple bespoke logger feels like it's writing at the speed of a ZX Spectrum to a tape recorder.

Of course you may well have a highly tuned system and production and all of this is a rambling spam from me. In which case, nine times out of ten if I see messages queuing it's just because the downstream system can't process messages as quickly as Ensemble can send them.

Sean.

Sean Connelly · Apr 19, 2017 go to post

I just tried the above out and it works.

You can do it more succinctly, but you must use "set" on $property as a "do" will throw a compile error for some reason...

set sc=$property(parentObject,"childRefProperty").Insert(childObject)
Sean Connelly · Apr 19, 2017 go to post

Have you tried...

set p=$property(parentObject,"childRefProperty")
do p.Insert(childObject)
Sean Connelly · Apr 19, 2017 go to post

I was trying to figure out if you had found a secret zip command on Windows, but realised from your code you are using...

http://gnuwin32.sourceforge.net/packages/zip.htm

7zip has always been rock solid for me on Windows, and is well maintained. The above zip lib looks like it's almost 10 years old now?

Perhaps use a combination of both as per the HS.Util.Zip.Adapter class.

Sean Connelly · Apr 19, 2017 go to post

Hi Greg,

The only zip utility that I have come across is in Healthshare (core 10+).

If you have Healthshare then take a look at...

HS.Util.Zip.Adapter


If you don't have Healthshare then it's still easy enough to do via the command line with $zf...

https://docs.intersystems.com/latest/csp/docbook/DocBook.UI.Page.cls?KEY=RCOS_fzf-1

First, if you are on Windows then there is no built-in zip command line outside of PowerShell. You will need to install 7zip (btw, Healthshare defaults to 7zip on Windows as well). If you are on Linux then there is a built-in zip command, but you might also choose to install 7zip as well.

Couple of trip hazards.

If you are building the command line on Windows then 7zip will be installed in "Program Files" with a space, so you will need to wrap quotes around the exe path, which will need double quoting in a Caché string.

If you are unzipping to a directory, the directory needs to exist first. Take a look at CreateDirectoryChain on the %File class to make this easier to do.

A simple untested example...

ClassMethod ZipFile(pSourceFile As %String, pTargetFile As %String) As %Status
{
    set cmd="""C:\Program Files\7-Zip\7z.exe"" a "_pTargetFile_" "_pSourceFile
    set status=$zf(-1,cmd)
    if status=0 quit $$$OK
    quit $$$ERROR($$$GeneralError,"Failed to zip, reason code: "_status)
}
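
An equally untested sketch for the unzip direction, covering the directory trip hazard above. It assumes the same default 7-Zip install path and that neither path argument contains spaces; UnzipFile and its arguments are names made up for this example:

```objectscript
ClassMethod UnzipFile(pZipFile As %String, pTargetDir As %String) As %Status
{
    // the target directory must exist before extracting
    if '##class(%File).CreateDirectoryChain(pTargetDir) {
        quit $$$ERROR($$$GeneralError,"Failed to create directory: "_pTargetDir)
    }
    // 7-Zip switches: x = extract with full paths, -o = output directory, -y = assume yes
    set cmd="""C:\Program Files\7-Zip\7z.exe"" x "_pZipFile_" -o"_pTargetDir_" -y"
    set status=$zf(-1,cmd)
    if status=0 quit $$$OK
    quit $$$ERROR($$$GeneralError,"Failed to unzip, reason code: "_status)
}
```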


For anyone landing here who is happy just to use gzip, there was a recent discussion here...

https://community.intersystems.com/post/there-option-export-globals-archive

Hope that helps.

Sean.

Sean Connelly · Apr 18, 2017 go to post

Here is an expanded example...
 

Method logProps(parent) As %String [ CodeMode = objectgenerator ]
{
  set x=%code
  for i=1:1:%compiledclass.Properties.Count() {
    #dim p As %Dictionary.CompiledProperty
    set p=%compiledclass.Properties.GetAt(i)
    // skip system properties, which start with %
    if $extract(p.Name)'="%" do x.WriteLine(" do ..logProp("""_p.Name_""",.."_p.Name_")")
  }
  // the generator itself reports success to the class compiler
  quit $$$OK
}
Method logProp(propName, propValue)
{
  // call your macro logger here, or replace call to logProp with your macro above
}


Let's say you had a property...

  Property firstName As %String;


When you compile the class, your logProps method would look like this...

zlogProps(parent) public {
  do ..logProp("firstName",..firstName) }


There is an article on DC that explains method generators in more depth..

https://community.intersystems.com/post/exploring-code-generation-cach%C3%A9-method-generators

Sean Connelly · Apr 18, 2017 go to post

Hi Alexandr,

If property is in the context of this class then you could try

set value=$property($THIS,propName)

If you make the code block's outer method a generator, e.g. [ CodeMode = objectgenerator ]

then you can bake the property accessor into the underlying INT code, e.g.

do %code.WriteLine(" set value=.."_propName)

this approach means you don't have to make any repetitive IO calls to dictionary at run time.

If you need an expanded example then I can bash something out.

Sean.

Sean Connelly · Apr 18, 2017 go to post

Hi Tomas,

It's a good point, but performance doesn't have to be an issue.

In the main, the lint tool would only need to check the class that is being edited and compiled at that time. If coded optimally (e.g. memoize dictionary lookups etc) then it should only add milliseconds to a compilation cycle. This would hardly be noticed by the developer.

There is a use case where the edited class could break implementations of that class. In this instance a "lint all" process could be hooked into various other non disruptive steps, such as running a full unit test.

Losing milliseconds at compile time is a small trade off to the collective time lost on supporting these types of errors.

Sean Connelly · Apr 15, 2017 go to post

All of the use cases are around the implementation of instance methods and static methods.

In all cases class A either extends or uses class B. For class A to compile, class B must be present and compiled first, no matter what module or namespace it belongs to. If class B implements a generic interface, then all variants of class B should adhere to the same interface (despite this being a manual verification check at the moment).

The types of examples you have raised sound like you are hot calling functional code, in which case I agree it would be very hard to lint this style of coding.

Sean Connelly · Apr 14, 2017 go to post

Hi Clark,

Can you provide a more concrete example?

In particular, why would mappings change the formal spec or return type of a method?

Are you using mappings in some kind of environmental polymorphism?

For me the implementation should never be affected by mappings. Mappings are just a convenience, not a logical influence.

I'm struggling to understand the limitation that you are suggesting?

Sean Connelly · Apr 13, 2017 go to post

wow, that's interesting, never seen it used in methods that way, I guess it's one way to work around the command error

so as a fringe case, if the lint tool found a $quit in a method then it would assume the developer knows what they are doing and ignore the whole method, that way no false positives

Sean Connelly · Apr 13, 2017 go to post

That's very true, rawContent is limited to 10,000 characters.

If you have messages that are larger then you could do it this way...

ClassMethod DisplaySegmentStats()
{
  write !!,"Segment Statistics...",!!
  &sql(declare hl7 cursor for select id into :id from EnsLib_HL7.Message)
  &sql(open hl7)
  &sql(fetch hl7)
  while SQLCODE=0
  {
    set msg=##class(EnsLib.HL7.Message).%OpenId(id)
    set raw=msg.getSegsAsString(id)
    for i=1:1:$l(raw,$C(13))-1
    {
      set seg=$p($p(raw,$c(13),i),"|")
      set stats(seg)=$G(stats(seg))+1
    }
    &sql(fetch hl7)
  }
  &sql(close hl7)
  zw stats
}
Sean Connelly · Apr 13, 2017 go to post

I'm still trying to get my head around this.

Are you saying that developers actively use $quit inside method code?

Sean Connelly · Apr 13, 2017 go to post

> QUIT from inside a loop is considered quitting a loop rather then a function, so it should always be without a value.

100% agreed.

My point was, if the compiler will go as far as protecting the developer from this type of quit mishap, then could the compiler not also warn on potential quit <COMMAND> errors?

Sean Connelly · Apr 13, 2017 go to post

It does look interesting. I might find a few of the rules a little querulous, but there are some gems in there, this one's priceless :)

Property of type %String without a MAXLEN

For what I want a 100 line linter would suffice for studio output. But I can see value in cachéquality if I was back managing a large team of developers again.

Sean Connelly · Apr 13, 2017 go to post

Various Options...

1. Call out to Java on the command line using $ZF

https://docs.intersystems.com/latest/csp/docbook/DocBook.UI.Page.cls?KE…

2. Access POJOs directly using Jalapeño

https://docs.intersystems.com/latest/csp/docbook/DocBook.UI.Page.cls?KE…

3. Consume a web service using the Cache soap wizard...

https://docs.intersystems.com/latest/csp/docbook/DocBook.UI.Page.cls?KE…

4. Publish a web service from Cache...

https://docs.intersystems.com/latest/csp/docbook/DocBook.UI.Page.cls?KE…

Sean Connelly · Apr 13, 2017 go to post

Code reviews are a great suggestion. Having a style guide is a good supplement to this.

I was wondering if anyone had attempted a community driven style guide yet?

As an example, I have adopted this style guide for all of my JavaScript development...

https://github.com/airbnb/javascript

Sean Connelly · Apr 13, 2017 go to post

Hi James,

Nothing simple that I can think of (perhaps DeepSee?).

Alternatively, I normally bash out a few lines of code, something like this... 

ClassMethod DisplaySegmentStats()
{
  write !!,"Segment Statistics...",!!
  &sql(declare hl7 cursor for select rawContent into :raw from EnsLib_HL7.Message)
  &sql(open hl7)
  &sql(fetch hl7)
  while SQLCODE=0
  {
    for i=1:1:$l(raw,$C(13))-1
    {
      set seg=$p($p(raw,$c(13),i),"|")
      set stats(seg)=$G(stats(seg))+1
    }
    &sql(fetch hl7)
  }
  &sql(close hl7)
  zw stats
}
Sean Connelly · Apr 12, 2017 go to post

Yes, excellent addition!

Rubber ducking is a great tool, and if no one is around then I will actually talk the problem out loud to myself. There is something about the vocal feedback loop to the brain that really helps. I've even asked my two dogs, to very curious looks lol.

Sean Connelly · Apr 12, 2017 go to post

Just wondering if you are missing a trick here.

If you are using try catch then your catch will return %Exception.AbstractException, this has a Location property.
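
A small sketch of what that gives you, using a deliberate divide by zero as the error:

```objectscript
try {
    set result=1/0    // deliberately raises a <DIVIDE> error
} catch ex {
    // ex is a subclass of %Exception.AbstractException
    write !,"Name: ",ex.Name
    write !,"Location: ",ex.Location
    write !,"Message: ",ex.DisplayString()
}
```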

Sean Connelly · Apr 12, 2017 go to post

You could create a macro that conditionally compiles the break points into your code based on the existence of a global variable.
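
Something along these lines might do it. This is an untested sketch; the macro name, the global ^MyAppDebug and the method are all made up for the example:

```objectscript
ClassMethod Process()
{
    // evaluated at compile time: expands to "break" only if ^MyAppDebug is set,
    // otherwise to nothing at all
    #define DebugBreak ##expression($select($get(^MyAppDebug):"break",1:""))
    set total=0
    $$$DebugBreak
    set total=total+1
    write !,total
}
```

Because the global is read when the class is compiled, you would need to recompile after changing it.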

Sean Connelly · Apr 12, 2017 go to post

Sounds like you might be adding unnecessary complexity.

> What is the most efficient way to process this large file?

That really depends on your definition of efficiency.

If you want to solve the problem with the least amount of watts then solving the problem with a single process would be the most efficient.

If you add more processes then you will be executing additional code to co-ordinate responsibilities. There is also the danger that competing processes will flush data blocks out of memory in a less efficient way.

If you want to solve the problem with speed then it's important to understand where the bottlenecks are before trying to optimise anything (avoid premature optimisation).

If your process is taking a long time (hours not minutes) then you will most likely have data queries that have a high relative cost. It's not uncommon to have a large job like this run 1000x quicker just by adding the right index in the right place.

Normally I would write a large (single) process job like this and then observe it in the management portal (System>Process>Process Details). If I see it's labouring over a specific global then I can track back to where the index might be needed.

You will then get further efficiencies / speed gains by making sure the tables are tuned and that Cache has as much configured memory cache as you can afford.

If you are writing lots of data during this process then also consider using a temporary global that won't hit the transaction files. If the process is repeatable from the file then there is no danger of losing these temp globals during a crash as you can just restart the job after the restore.
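
As a rough sketch of that idea, assuming the default mapping where globals named ^CacheTemp* live in the non-journaled CACHETEMP database. The method name and the ^CacheTemp.MyImport global are placeholders:

```objectscript
/// Hypothetical loader: stages each line of the file in a temporary global
ClassMethod LoadFile(pFile As %String) As %Status
{
    set stream=##class(%Stream.FileCharacter).%New()
    set sc=stream.LinkToFile(pFile)
    if $$$ISERR(sc) quit sc
    // clear anything left over from a previous run of this process
    kill ^CacheTemp.MyImport($job)
    set row=0
    while 'stream.AtEnd {
        set line=stream.ReadLine()
        // staged per process id so parallel runs do not collide
        set ^CacheTemp.MyImport($job,$increment(row))=line
    }
    quit $$$OK
}
```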

Lastly, I would avoid using Ensemble for this. The last thing you want to do is generate 500,000 Ensemble messages if there is no need to integrate the rows of data with anything other than internal data tables.

Correction. It's perfectly fine (for Ensemble) to ingest your file and process it as a single message stream. What I wouldn't do is split the file into 500,000 messages when there is no need to do this. Doing so would obviously cause additional IO. 

Sean Connelly · Apr 12, 2017 go to post

As an alternative to...

   s cn=##Expression($$$quote(%classname))

You could just do...

  s cn=$CLASSNAME()

Sean Connelly · Apr 12, 2017 go to post

##SafeExpression will not work for you.

The method Abs() will get baked into A with the class name A.

B will no longer get its own baked method, so when Abs() is called on B, it will incorrectly return A.