Right. Forgot about it.

You can use Ghostscript; here's how. In your case the command would probably look like this:

Parameter COMMAND = "%1 -dBATCH -dNOPAUSE -sDEVICE=txtwrite -sOutputFile=%2 %3";

ClassMethod pdf2txt(pdf, txt) As %Status
{
    set cmd = $$$FormatText(..#COMMAND, ..getGS(), txt, pdf)
    return ..execute(cmd)
}

/// Get gs binary
ClassMethod getGS()
{
    if $$$isWINDOWS {
        set gs = "gswin64c"
    } else {
        set gs = "gs"
    }
    return gs
}

The `execute` method (not shown here) simply runs the OS command and returns a %Status.
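A minimal sketch of such an `execute` helper, assuming it shells out via `$zf(-100)` (the flag and error text are illustrative):

```objectscript
/// Run an OS command and convert the exit code into a %Status
ClassMethod execute(cmd As %String) As %Status
{
    // $zf(-100) with /SHELL runs the command through the OS shell
    set code = $zf(-100, "/SHELL", cmd)
    return:code'=0 $$$ERROR($$$GeneralError, "Command failed with exit code: " _ code)
    return $$$OK
}
```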

Also note that a PDF can contain only images instead of text; in that case you'd need OCR.

Yes, to skip exporting storage you need to specify the compilation flag:

/skipstorage=1

Description

Name: /skipstorage
Description: In class Export, if true do not export storage definition.
Type: logical
Default Value: 0

You can set it:

  • System-wide
  • As a Namespace default
  • For Atelier only: Project -> Compile Configuration

System and namespace defaults can be set via:

Set sc = $System.OBJ.SetQualifiers(qualifiers, system)
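For example, to apply the flag for a single export call, or to make it the default (class and file names are illustrative):

```objectscript
// One-off: pass the qualifier directly to the export call
set sc = $System.OBJ.Export("MyPackage.MyClass.cls", "C:\temp\export.xml", "/skipstorage=1")

// Or set it as a default (second argument: 1 = system-wide, 0 = current namespace)
set sc = $System.OBJ.SetQualifiers("/skipstorage=1", 1)
```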

If you want to enable/disable/modify several Ensemble hosts, it's better to apply all the changes without updating the production, and only then update the production once. Your error may be caused by racing production updates. Also use a longer timeout on the production update.

set sc = ##class(Ens.Director).EnableConfigItem("Item1", 1, 0)
write:'sc $System.Status.GetErrorText(sc)
set sc = ##class(Ens.Director).EnableConfigItem("Item2", 1, 0)
write:'sc $System.Status.GetErrorText(sc)
set sc = ##class(Ens.Director).UpdateProduction(60)
write:'sc $System.Status.GetErrorText(sc)

If you want to compare two in-memory objects, you can use method generators; there are several related articles and discussions on that:

A simple comparator on GitHub: note that it's a runtime comparator and therefore slow. A better solution would be method generators.

If you're comparing objects of different classes you need to find their common ancestor class and compare using that.

If you're comparing stored objects you can calculate hashes and compare that.

All in all it's a very complex topic and you need to determine what requirements you have:

  • Streams?
  • Lists? Arrays? Position change?
  • Loops/relationships strategy
  • How many levels to compare?
  • Different classes? Do they have common superclass?
  • Do you need to compare dynamic objects/objects from unrelated classes?

And design your comparator based on that.

Here's a simple hasher on GitHub.
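A minimal method-generator sketch (illustrative; it generates a per-class comparator over simple, non-calculated properties, and does not follow object references):

```objectscript
Class Util.Comparable [ Abstract ]
{

/// Generated per subclass: compares all non-private, non-calculated properties
ClassMethod %IsSame(obj1, obj2) As %Boolean [ CodeMode = objectgenerator ]
{
    for i=1:1:%compiledclass.Properties.Count() {
        set prop = %compiledclass.Properties.GetAt(i)
        continue:prop.Private
        continue:prop.Calculated
        continue:$extract(prop.Name)="%"
        // Emit one comparison line per property into the generated method
        do %code.WriteLine(" quit:obj1." _ prop.Name _ "'=obj2." _ prop.Name _ " 0")
    }
    do %code.WriteLine(" quit 1")
    quit $$$OK
}

}
```

Any class inheriting from this superclass gets its own compiled `%IsSame`, so the property loop runs at compile time rather than at every comparison.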

That works only in a CSP context, for CSP pages. You could write a wrapper, I suppose, but I think it would be easier to just write your own querybuilder code:

ClassMethod Link(server = "www.example.com")
{
    try {
        set cspContext = $data(%request)
        if 'cspContext {
          set %request = {} // ##class(%CSP.Request).%New()  
          set %response = ##class(%CSP.Response).%New()
          set %session = {} //##class(%CSP.Session).%New(-1,0)
        }
        set query("param") = 1
        set page = "/abcd.csp"
        set url = ##class(%CSP.Page).Link(page,.query)
        set url = $replace(url, page, server)
        write url
        kill:'cspContext %request,%response,%session
    } catch {
        kill:'$g(cspContext) %request,%response,%session
    }
}

With querybuilder:

ClassMethod Link(server = "www.example.com")
{
    set query("param") = 1

    set data = ""
    set param = $order(query(""),1,value)
    while (param'="") {
        set data=data _ $lb($$$URLENCODE(param)_"="_$$$URLENCODE(value))
        set param = $order(query(param),1,value)          
    }
    write server _ "?" _ $lts(data, "&")
}

You're doing two separate operations:

  1. Syncing the data
  2. Syncing the cube

They can both be system tasks, with one task dependent on the other, or even a single task altogether.

If you're using persistent objects to store data you can specify DSTIME:

Parameter DSTIME = "AUTO";

and ^OBJ.DSTIME would be maintained automatically.
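With DSTIME in place, the periodic task only needs to call the cube synchronization API (the cube name is illustrative):

```objectscript
// Sync only the facts changed since the last sync (driven by ^OBJ.DSTIME)
set sc = ##class(%DeepSee.Utils).%SynchronizeCube("MyCube")
write:'sc $System.Status.GetErrorText(sc)
```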

UPD. Read your other comment. DSTIME is relevant only for syncing; it does not affect full-build behaviour.

For higher performance it's better to keep the data in InterSystems platform and sync it with remote db periodically.

To download the data via xDBC you have two main approaches:

  • Interoperability (Ensemble) SQL inbound adapter
  • "Raw" access via %SQLGatewayConnection or %Net.Remote.Java.JDBCGateway

The Interoperability approach is better as it solves most problems out of the box; the user only needs to supply the query, etc. "Raw" access can be faster and allows for fine-tuning.

Now, to keep data synced there are several approaches available:

  • If source table has UpdatedOn field, track it and get rows updated only after last sync.
  • Journals: some databases provide parsable journals, use them to determine which rows changed in relevant tables.
  • Triggers: sometimes the source table has triggers (i.e. Audit) which, while they do not provide an explicit UpdatedOn field, can nonetheless be used to determine row update time.
  • Hashing: hash incoming data and save the hash, update the row only if hash changed.
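The hashing approach can be sketched like this (table, global, and method names are illustrative):

```objectscript
/// Update a local row only if its content hash changed since the last sync
ClassMethod SyncRow(id As %String, rowData As %String)
{
    // Hash the raw incoming row data (SHA-1 here; any stable hash works)
    set hash = $system.Encryption.SHAHash(160, rowData)
    if $get(^RowHash(id)) '= hash {
        set ^RowHash(id) = hash
        // ... update or insert the local row here ...
    }
}
```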

If you can modify the source database, add an UpdatedOn field; it's the best solution.

Linked tables allow you to avoid storing the data permanently, but the cube would have to be rebuilt each time. With the other approaches, syncing the cube is enough.

Also check this guide on DeepSee Data Connectors.