Replies by Timur Safin for InterSystems Developer Community

Timur Safin · Feb 3, 2017

Yes, proper dependency tracking is tough. And some modern package managers allow you to "lock" the versions you have downloaded and stick with them forever. I do not see how multiple versions of the same component could be used in the same namespace, but, on the other hand, for local installations it's pretty easy to have multiple versions installed in different namespaces. A developer could then lock some versions for selected namespaces/applications.

In any case, there are many corner cases here, which is why we decided to postpone implementation of dependency tracking until later releases.


Timur Safin · Feb 3, 2017

And you are very brave if you want to install webterminal from inside a webterminal session :)

The 1st side effect I foresee - the web socket will be disconnected, and if anything goes wrong, you might lose it forever. (Such recursive usage of a component is an interesting scenario.)

BTW, when I was preparing my curated set of initial components in the repository, and loaded, and reloaded, and reloaded webterminal into my testing namespace, I noticed an interesting side effect - due to the way compiler projections work in webterminal, it installs successfully only every second time. (I assume there is some ordering issue between RemoveProjection for uncompile and CreateProjection for the immediate compile of classes.)

P.S.

BTW, why not be a "role model" and show others how to work with GitHub issues :) [I mean, please do not hesitate to open an issue in the CPM project repository. This helps.]


Timur Safin · Feb 3, 2017

Keeping each new project in its own special namespace is one of the problems I want to address with CPM. It's OK to mix many tools in a single namespace and combine them into the final solution. CPM will make such a project manageable. (Because you know all installed assets, with their exact versions, and can wipe them out at any time.)


Timur Safin · Feb 3, 2017

1. No, the IDE is not a show-stopper - JavaScript/NodeJS got developers' attention regardless of weak JS support in most popular editors (e.g. only with the introduction of TypeScript, many years later, did editors start getting convenient refactoring support for JS/TS). Yes, an IDE could win some hearts, but there should be something more in the ecosystem, beyond the editor, to trigger wide adoption;

[Though, yes, Studio is a rudimentary editor]

2. Instability of modules is not a blocker either. Given enough eyes in a vibrant, fast-moving community, instability of some components can be resolved relatively easily. You need a friendly committer policy (i.e. pull requests should be welcomed, without any NIH syndrome), and the toolset should be mature (i.e. role-model components should use modern tools and methodologies like unit tests, CI, peer review and all that stuff).

P.S.

Have you submitted issues about problems you've discovered in components you mentioned?


Timur Safin · Feb 3, 2017

Vitaly, thanks for the hint :)

But did you look into CPM sources?


Timur Safin · Jan 17, 2017

That's exactly the point I want to discuss in the next part (i.e. What should be in the package metadata? What format is used for metadata serialization? How do we mark a dependency, if any? How do we describe anything to be run before/after installation of a package, and so on). This is very similar to what we have in multiple different implementations elsewhere, and it is worth discussing once again if we want to have an easy and usable package manager in the Cache environment.


Timur Safin · Nov 26, 2016

This looks nice, and might get even more convenient if you added a "shell" for interactive work and not only an API (you do have READ, WRITE, KILL entry points, but why not wrap them in an interactive shell?)

But here is the bigger concern - I've scanned through your documentation, and haven't found any mention of security. You just open ports 5000, 5001, 5002 on each respective system, accept all incoming requests, do not check any logins or passwords, or challenge phrases, or security tokens, and hope that there are no evil people in the world?
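To illustrate the kind of check I mean, here is a minimal sketch of a challenge plus security token scheme. It is Python rather than your actual code, and every name in it (SHARED_SECRET, new_challenge, sign, accept) is hypothetical, not something from your project. It only shows the idea: the server hands out a random challenge, the client signs challenge and request with a shared secret, and the server rejects anything whose signature does not match.

```python
import hashlib
import hmac
import secrets

# Hypothetical shared secret; in a real setup it would come from protected
# configuration, never from source code.
SHARED_SECRET = b"replace-me"


def new_challenge() -> bytes:
    """Server side: a fresh random challenge for each incoming connection."""
    return secrets.token_bytes(16)


def sign(secret: bytes, challenge: bytes, payload: bytes) -> str:
    """Client side: HMAC-SHA256 over challenge + request body."""
    return hmac.new(secret, challenge + payload, hashlib.sha256).hexdigest()


def accept(secret: bytes, challenge: bytes, payload: bytes, signature: str) -> bool:
    """Server side: recompute the signature and compare in constant time."""
    expected = sign(secret, challenge, payload)
    return hmac.compare_digest(expected, signature)


challenge = new_challenge()
request = b"WRITE key=value"

good = accept(SHARED_SECRET, challenge, request, sign(SHARED_SECRET, challenge, request))
wrong_secret = accept(SHARED_SECRET, challenge, request, sign(b"guess", challenge, request))
tampered = accept(SHARED_SECRET, challenge, b"KILL key", sign(SHARED_SECRET, challenge, request))

print(good, wrong_secret, tampered)
```

This is not a complete security design (no transport encryption, no user accounts), just the minimum that stops an anonymous client from issuing WRITE or KILL.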


Timur Safin · Nov 25, 2016

OTOH their performance numbers are very impressive, even today (especially for Micron part) - http://www.tomshardware.com/reviews/3d-xpoint-guide,4747-6.html


Timur Safin · Nov 25, 2016

Very useful article, thanks, but at the moment I'd be very cautious about recommending Intel/Micron 3D XPoint memory technology. It looks like the real numbers are very far from the original claims (especially the endurance improvements) - http://semiaccurate.com/2016/09/12/intels-xpoint-pretty-much-broken/


Timur Safin · Nov 24, 2016

Docker is a cool way to deploy and use a versioned application, but, IMVHO, it's only applicable when you have a single executable or a single script which sets up the environment and is invoked for some particular functionality. Like running a particular version of a compiler for a build scenario. Or running a continuous integration scenario. Or running a web front-end environment.

1 function - 1 container with 1 interactive executable is easy to convert to Docker. But not Cache, which is inherently multi-process. Luca has done a great thing in his https://hub.docker.com/r/zrml/intersystems-cachedb/ Docker container, where he has wrapped the whole environment (including control daemon, write daemon, journal daemon, etc.) in 1 handy Docker container with a single entry point implemented in Go as ccontainermain, but this is, hmm, ... not a very efficient way to use Docker.

Containers are all about density of CPU/disk resources, and all the beauty of Docker is based upon the simplicity of running multiple users on a single host. Given this way of packing the Cache' configuration into a single container (each user runs the whole set of Cache' control processes), you will get the worst scalability.

It would be much, much better if there were 2 kinds of Cache' Docker containers (run via Swarm, for example): a single control container, and multiple user containers (each connecting to its own port and its own namespace). But today, with the current security implementation, there would be a big, big problem - each user would see the whole configuration, which is rather unexpected in the case of Docker container hosting.

Once these security issues are resolved, there will be an efficient way to host Cache' under Docker, but not before.


Timur Safin · Nov 22, 2016

Thanks for the debugging advice - that was the biggest missing piece which prevented me from using generators more widely. I've never understood how to debug them easily. Now I see!

[Goodbye readability! Hello performance, but hard to read!]


Timur Safin · Oct 12, 2016

I concur with the warning about the security implications once you've exposed terminal access via an unauthorized CSP application.

This is a very big and very wide security hole! One could escalate their permissions to the localsystem superuser (when you run Windows) and do pretty much anything with your system, if you didn't properly lock down all the layers involved.


Timur Safin · Oct 12, 2016

Putting aside the legality of this task (let's assume this is all your code), your case sounds even more complicated: you are apparently running a newer version of the engine, with updated tokens and bytecode interpreter, but need to restore code for an older version of the bytecode.

It did not happen often, but the bytecode token map did change over time, so you actually need 2 decompilers (one for the older version, and one for the version currently in use), not one. The chance of having both is pretty much zero. Sorry.


Timur Safin · Oct 12, 2016

[This is slightly overdue, but a very welcome change in any case!]

Did I interpret this correctly: were the search improvements mostly implemented using iFind capabilities?


Timur Safin · Sep 28, 2016

There is a Russian proverb for such cases: "Rumors about my death are slightly exaggerated"^[1]. Be it BigData, which is declared dead by Gartner, but is here to stay, in a slightly wider form and in more scenarios, or be it MapReduce. Yes, Google marketers claim not to use it anymore, after they moved to interfaces better suited to search, and yes, in the Java world Apache Hadoop is not the best MapReduce implementation nowadays, with Apache Spark being the better, more modern implementation of the same or similar concepts.

But life is more complicated than what marketing shows us. There are still some big players which use their own C++ implementation of MapReduce in their search infrastructure - like the Russian search giant Yandex. And this is big enough for me to still count it as relevant.

^[1] As Eduard has pointed out, it was Mark Twain who originally said "The report of my death was an exaggeration." Thanks for the correction, @Eduard!


Timur Safin · Sep 8, 2016

For this particular usage scenario, running x86 code through a binary translator would not be a very good idea. The Raspberry Pi itself does not have a very fast ARM processor, and adding JIT overhead would make the emulation layer work extremely slowly.

OTOH, initial porting to any new hardware platform, especially for an OS which is already supported (Debian), might be quite easy (especially if you disable assembler optimizations for the moment, and compile the full C kernel). [Jose might correct me here, but at least that was my impression from InterSystems times]

The problem, though, is whether it is worth all the pain. What is the reasonable return any vendor could get from such a hobby device owner? 50¢? $5? OK, we are talking about the educational market, so we assume there won't be any money stream, but rather ecosystem enablement. Why do we think that a Raspberry Pi build would not repeat the failed GlobalDB experiment? Why would it be different this time (in a smaller hardware market, and with less powerful hardware)?


Timur Safin · Sep 8, 2016

Nice catch, Daniel!

I wonder, though, have you opened a prodlog to change the behavior of the PutLine() method?


Timur Safin · Sep 6, 2016

Good point - the less traffic there is, the better the final result.
This would not be very canonical from the MapReduce point of view, but the more aggregation can be done on a single node/worker, the better for the reducer.
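
A rough illustration of that trade-off, as a Python sketch rather than code from the cache-map-reduce project (map_words, combine and reduce_counts are made-up names, and word count is just the usual toy example): each worker sums its own pairs locally before anything is sent, so the reducer receives fewer pairs but produces the same totals.

```python
from collections import Counter


def map_words(chunk):
    """Mapper: emit one (word, 1) pair per word in the chunk."""
    return [(word, 1) for word in chunk.split()]


def combine(pairs):
    """Local aggregation on the worker: sum counts per key before sending."""
    local = Counter()
    for key, value in pairs:
        local[key] += value
    return list(local.items())


def reduce_counts(pairs):
    """Reducer: merge whatever the workers sent."""
    total = Counter()
    for key, value in pairs:
        total[key] += value
    return dict(total)


chunks = ["a b a", "b b c"]                    # one chunk per worker
raw = [map_words(chunk) for chunk in chunks]   # what mappers emit
combined = [combine(pairs) for pairs in raw]   # what workers send after combining

sent_raw = sum(len(pairs) for pairs in raw)
sent_combined = sum(len(pairs) for pairs in combined)

result_raw = reduce_counts([kv for pairs in raw for kv in pairs])
result_combined = reduce_counts([kv for pairs in combined for kv in pairs])

print(sent_raw, sent_combined)        # 6 4
print(result_raw == result_combined)  # True
```

This only works so directly because summing is associative and commutative; an aggregation without that property would need a different local step.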


Timur Safin · Aug 26, 2016

And, as usual, pull requests welcome! https://github.com/tsafin/cache-map-reduce


Timur Safin · Aug 26, 2016

Very good question. The push operation of our FIFO is safe, even in its "lock-free" form, because of the $increment/$sequence usage and their guarantees. But the pop operation is troublesome if multiple workers retrieve the same head at the same moment.

So, yes, there is no "exactly one" guarantee, and if the reduction phase ever runs concurrently (this is not planned yet), then we have to lock each read-delete operation.

This is going to be a problem for multiple-node scenarios, so we will talk about it when we approach remote execution and multiple nodes. Thanks for the note!
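
To make the difference concrete, here is a small Python model of the idea. It is not the ObjectScript implementation: the Fifo class and its fields are hypothetical, a dict stands in for the global, and an internal lock merely imitates the atomicity that $increment gives us for free. Push only needs the atomic counter, because every producer gets its own slot; pop guards the read-delete of the head as a single unit.

```python
import threading


class Fifo:
    def __init__(self):
        self._slots = {}                      # index -> item, stands in for a global
        self._tail = 0                        # last index handed out to a producer
        self._head = 1                        # next index to be consumed
        self._tail_lock = threading.Lock()    # imitates the atomic $increment
        self._pop_lock = threading.Lock()     # the lock pop cannot do without

    def _increment_tail(self):
        with self._tail_lock:
            self._tail += 1
            return self._tail

    def push(self, item):
        # Every producer receives a unique index, so writes never collide.
        self._slots[self._increment_tail()] = item

    def pop(self):
        # Read and delete the head as one unit; without the lock two workers
        # could both read the same head before either deletes it.
        with self._pop_lock:
            if self._head not in self._slots:
                return None                   # empty, or producer has not stored yet
            item = self._slots.pop(self._head)
            self._head += 1
            return item


queue = Fifo()


def producer(worker_id):
    for n in range(250):
        queue.push((worker_id, n))


def consumer(out):
    while True:
        item = queue.pop()
        if item is None:
            return
        out.append(item)


producers = [threading.Thread(target=producer, args=(i,)) for i in range(4)]
for thread in producers:
    thread.start()
for thread in producers:
    thread.join()

popped = []
consumers = [threading.Thread(target=consumer, args=(popped,)) for _ in range(4)]
for thread in consumers:
    thread.start()
for thread in consumers:
    thread.join()

print(len(popped), len(set(popped)))
```

In this model each item is delivered to exactly one consumer; removing _pop_lock is what reopens the "same head, two workers" window described above.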