Announcement
Jeff Fried · Mar 31, 2020
GA releases are now published for the 2020.1 version of InterSystems IRIS, IRIS for Health, and IRIS Studio!
A full set of kits and containers for these products are available from the WRC Software Distribution site, including community editions of InterSystems IRIS and IRIS for Health.
The build number for these releases is 2020.1.0.215.0.
InterSystems IRIS Data Platform 2020.1 makes it even easier to develop and deploy real-time, machine learning-enabled applications that bridge data and application silos. It has many new capabilities including:
Kernel Performance enhancements, including reduced contention for blocks and cache lines
Universal Query Cache - every query (including embedded & class ones) now gets saved as a cached query
Universal Shard Queue Manager - for scale-out of query load in sharded configurations
Selective Cube Build - to quickly incorporate new dimensions or measures
Security improvements, including hashed password configuration
Improved TSQL support, including JDBC support
Dynamic Gateway performance enhancements
Spark connector update
MQTT support in ObjectScript
InterSystems IRIS for Health 2020.1 includes all of the enhancements of InterSystems IRIS. In addition, this release includes:
In-place conversion to IRIS for Health
HL7 Productivity Toolkit including Migration Tooling and Cloverleaf conversion
X12 enhancements
FHIR R4 base standard support
As this is an EM (extended maintenance) release, many customers may want to know the differences between 2020.1 and 2019.1. These are listed in the release notes:
InterSystems IRIS 2020.1 release notes
IRIS for Health 2020.1 release notes
Documentation can be found here:
InterSystems IRIS 2020.1 documentation
IRIS for Health 2020.1 documentation
InterSystems IRIS Studio 2020.1 is a standalone development image supported on Microsoft Windows. It works with InterSystems IRIS and IRIS for Health version 2020.1 and below, as well as with Caché and Ensemble.
The platforms on which InterSystems IRIS and IRIS for Health 2020.1 are supported for production and development are detailed in the Supported Platforms document. The Community Editions can also be found in the Docker Store:
docker pull store/intersystems/iris-community:2020.1.0.215.0
docker pull store/intersystems/irishealth-community:2020.1.0.215.0
Will there be an update to Caché 2018 as well? Hi @Kurt.Hofman - yes, there is a 2018.1.4 in the works for Caché and Ensemble, though not for a couple of months. It is a maintenance release. Thanks, Jeff! Are there related updates of Docker images? Thank you, Steve!
Announcement
Anastasia Dyubaylo · May 12, 2020
Hi Community!
We are pleased to invite you to the upcoming webinar in Spanish: "How to implement integrations with .NET or Java on InterSystems IRIS" / "Cómo implementar integraciones con .NET o Java sobre InterSystems IRIS" on May 20 at 4:00 PM CEST!
What will you learn?
PEX (Production EXtension Framework), its architecture and its API, and how to develop an integration with Java or .NET in order to enrich the InterSystems IRIS pre-built components
Some simple examples in .NET
A more complex example, using PEX to add an existing client library and access external services
Speaker: @Pierre-Yves.Duquesnoy, Senior Sales Engineer, InterSystems Iberia
Note: The language of the webinar is Spanish.
We are waiting for you at our webinar! ✌🏼
PLEASE REGISTER HERE! Is the video of this webinar available for viewing? Hi David,
This webinar recording is already available on InterSystems Developer Community en español.
Enjoy watching this video)
Article
Mikhail Khomenko · Mar 12, 2020
Imagine you want to see what InterSystems can give you in terms of data analytics. You studied the theory and now you want some practice. Fortunately, InterSystems provides a project that contains some good examples: Samples BI. Start with the README file, skipping anything associated with Docker, and go straight to the step-by-step installation. Launch a virtual instance, install IRIS there, follow the instructions for installing Samples BI, and then impress the boss with beautiful charts and tables. So far so good.
Inevitably, though, you’ll need to make changes.
It turns out that keeping a virtual machine on your own has some drawbacks, and it’s better to keep it with a cloud provider. Amazon seems solid, and you create an AWS account (free to start), read that using the root user identity for everyday tasks is evil, and create a regular IAM user with admin permissions.
Clicking a little, you create your own VPC network, subnets, and a virtual EC2 instance, and also add a security group to open the IRIS web port (52773) and ssh port (22) for yourself. Repeat the installation of IRIS and Samples BI. This time, use Bash scripting, or Python if you prefer. Again, impress the boss.
But the ubiquitous DevOps movement leads you to start reading about Infrastructure as Code and you want to implement it. You choose Terraform, since it’s well-known to everyone and its approach is quite universal—suitable with minor adjustments for various cloud providers. You describe the infrastructure in HCL language, and translate the installation steps for IRIS and Samples BI to Ansible. Then you create one more IAM user to enable Terraform to work. Run it all. Get a bonus at work.
Gradually you come to the conclusion that in our age of microservices it’s a shame not to use Docker, especially since InterSystems tells you how. You return to the Samples BI installation guide and read the lines about Docker, which don’t seem to be complicated:
$ docker pull intersystemsdc/iris-community:2019.4.0.383.0-zpm
$ docker run --name irisce -d --publish 52773:52773 intersystemsdc/iris-community:2019.4.0.383.0-zpm
$ docker exec -it irisce iris session iris
USER>zpm
zpm: USER>install samples-bi
After directing your browser to http://localhost:52773/csp/user/_DeepSee.UserPortal.Home.zen?$NAMESPACE=USER, you again go to the boss and get a day off for a nice job.
You then begin to understand that “docker run” is just the beginning, and you need to use at least docker-compose. Not a problem:
$ cat docker-compose.yml
version: "3.7"
services:
  irisce:
    container_name: irisce
    image: intersystemsdc/iris-community:2019.4.0.383.0-zpm
    ports:
      - 52773:52773
$ docker rm -f irisce # We don’t need the previous container
$ docker-compose up -d
So you install Docker and docker-compose with Ansible, and then just run the container, which will download an image if it’s not already present on the machine. Then you install Samples BI.
You certainly like Docker, because it’s a cool and simple interface to various kernel features. You start using Docker elsewhere and often launch more than one container, and you find that containers often need to communicate with each other, which leads to reading about how to manage multiple containers.
And you come to Kubernetes.
One option to quickly switch from docker-compose to Kubernetes is to use kompose. Personally, I prefer to simply copy Kubernetes manifests from manuals and then edit for myself, but kompose does a good job of completing its small task:
$ kompose convert -f docker-compose.ymlINFO Kubernetes file "irisce-service.yaml" createdINFO Kubernetes file "irisce-deployment.yaml" created
Now you have the deployment and service files that can be sent to some Kubernetes cluster. You find out that you can install a minikube, which lets you run a single-node Kubernetes cluster and is just what you need at this stage. After a day or two of playing with the minikube sandbox, you’re ready to use a real live Kubernetes deployment somewhere in the AWS cloud.
Getting Set Up
So, let’s do this together. At this point we'll make a couple of assumptions:
First, we assume you have an AWS account, you know its ID, and you don’t use root credentials. You create an IAM user (let's call it “my-user”) with administrator rights and programmatic access only and store its credentials. You also create another IAM user, called “terraform,” with the same permissions:
On its behalf, Terraform will go to your AWS account and create and delete the necessary resources. The extensive rights of both users are explained by the fact that this is a demo. You save credentials locally for both IAM users:
$ cat ~/.aws/credentials
[terraform]
aws_access_key_id = ABCDEFGHIJKLMNOPQRST
aws_secret_access_key = ABCDEFGHIJKLMNOPQRSTUVWXYZ01234567890123
[my-user]
aws_access_key_id = TSRQPONMLKJIHGFEDCBA
aws_secret_access_key = TSRQPONMLKJIHGFEDCBA01234567890123
Note: Don’t copy and paste the credentials from above. They are provided here as an example and no longer exist. Edit the ~/.aws/credentials file and introduce your own records.
Second, we’ll use the dummy AWS Account ID (01234567890) for the article, and the AWS region “eu-west-1.” Feel free to use another region.
Third, we assume you’re aware that AWS is not free and you’ll have to pay for resources used.
Next, you’ve installed the AWS CLI utility for command-line communication with AWS. You can try to use aws2, but you’ll need to specifically set aws2 usage in your kube config file, as described here.
You’ve also installed the kubectl utility for command-line communication with AWS Kubernetes.
And you’ve installed the kompose utility for converting docker-compose.yml into Kubernetes manifests.
Finally, you’ve created an empty GitHub repository and cloned it to your host. We’ll refer to its root directory as <root_repo_dir>. In this repository, we’ll create and fill three directories: .github/workflows/, k8s/, and terraform/.
Note that all the relevant code is duplicated in the github-eks-samples-bi repo to simplify copying and pasting.
Let’s continue.
AWS EKS Provisioning
We already met EKS in the article Deploying a Simple IRIS-Based Web Application Using Amazon EKS. At that time, we created a cluster semi-automatically. That is, we described the cluster in a file, and then manually launched the eksctl utility from a local machine, which created the cluster according to our description.
eksctl was developed for creating EKS clusters and it’s good for a proof-of-concept implementation, but for everyday usage it’s better to use something more universal, such as Terraform. A great resource, AWS EKS Introduction, explains the Terraform configuration needed to create an EKS cluster. An hour or two spent getting acquainted with it will not be a waste of time.
You can play with Terraform locally. To do so, you’ll need a binary (we’ll use the latest version for Linux at the time of writing, 0.12.20) and the IAM user “terraform” with sufficient rights for Terraform to go to AWS. Create the directory <root_repo_dir>/terraform/ to store Terraform code:
$ mkdir <root_repo_dir>/terraform
$ cd <root_repo_dir>/terraform
You can create one or more .tf files (they are merged at startup). Just copy and paste the code examples from AWS EKS Introduction and then run something like:
$ export AWS_PROFILE=terraform
$ export AWS_REGION=eu-west-1
$ terraform init
$ terraform plan -out eks.plan
You may encounter some errors. If so, play a little with debug mode, but remember to turn it off later:
$ export TF_LOG=debug
$ terraform plan -out eks.plan
<many-many lines here>
$ unset TF_LOG
This experience will be useful, and most likely you’ll get an EKS cluster launched (use “terraform apply” for that). Check it out in the AWS console:
Clean up when you get bored:
$ terraform destroy
Then go to the next level and start using the Terraform EKS module, especially since it’s based on the same EKS introduction. In the examples/ directory you’ll see how to use it. You’ll also find other examples there.
We simplified the examples somewhat. Here’s the main file in which the VPC creation and EKS creation modules are called:
$ cat <root_repo_dir>/terraform/main.tf
terraform {
  required_version = ">= 0.12.0"
  backend "s3" {
    bucket         = "eks-github-actions-terraform"
    key            = "terraform-dev.tfstate"
    region         = "eu-west-1"
    dynamodb_table = "eks-github-actions-terraform-lock"
  }
}

provider "kubernetes" {
  host                   = data.aws_eks_cluster.cluster.endpoint
  cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster.certificate_authority.0.data)
  token                  = data.aws_eks_cluster_auth.cluster.token
  load_config_file       = false
  version                = "1.10.0"
}

locals {
  vpc_name             = "dev-vpc"
  vpc_cidr             = "10.42.0.0/16"
  private_subnets      = ["10.42.1.0/24", "10.42.2.0/24"]
  public_subnets       = ["10.42.11.0/24", "10.42.12.0/24"]
  cluster_name         = "dev-cluster"
  cluster_version      = "1.14"
  worker_group_name    = "worker-group-1"
  instance_type        = "t2.medium"
  asg_desired_capacity = 1
}

data "aws_eks_cluster" "cluster" {
  name = module.eks.cluster_id
}

data "aws_eks_cluster_auth" "cluster" {
  name = module.eks.cluster_id
}

data "aws_availability_zones" "available" {}

module "vpc" {
  source = "git::https://github.com/terraform-aws-modules/terraform-aws-vpc?ref=master"

  name                 = local.vpc_name
  cidr                 = local.vpc_cidr
  azs                  = data.aws_availability_zones.available.names
  private_subnets      = local.private_subnets
  public_subnets       = local.public_subnets
  enable_nat_gateway   = true
  single_nat_gateway   = true
  enable_dns_hostnames = true

  tags = {
    "kubernetes.io/cluster/${local.cluster_name}" = "shared"
  }

  public_subnet_tags = {
    "kubernetes.io/cluster/${local.cluster_name}" = "shared"
    "kubernetes.io/role/elb"                      = "1"
  }

  private_subnet_tags = {
    "kubernetes.io/cluster/${local.cluster_name}" = "shared"
    "kubernetes.io/role/internal-elb"             = "1"
  }
}

module "eks" {
  source           = "git::https://github.com/terraform-aws-modules/terraform-aws-eks?ref=master"
  cluster_name     = local.cluster_name
  cluster_version  = local.cluster_version
  vpc_id           = module.vpc.vpc_id
  subnets          = module.vpc.private_subnets
  write_kubeconfig = false

  worker_groups = [
    {
      name                 = local.worker_group_name
      instance_type        = local.instance_type
      asg_desired_capacity = local.asg_desired_capacity
    }
  ]

  map_accounts = var.map_accounts
  map_roles    = var.map_roles
  map_users    = var.map_users
}
Let’s look a little more closely at the “terraform” block in main.tf:
terraform {
  required_version = ">= 0.12.0"
  backend "s3" {
    bucket         = "eks-github-actions-terraform"
    key            = "terraform-dev.tfstate"
    region         = "eu-west-1"
    dynamodb_table = "eks-github-actions-terraform-lock"
  }
}
Here we indicate that we require Terraform syntax version 0.12 or higher (much has changed compared with earlier versions), and also that Terraform shouldn’t store its state locally, but rather remotely, in an S3 bucket.
It’s convenient if the Terraform code can be updated from different places by different people, which means we need to be able to lock the state while it is being changed, so we added a lock using a DynamoDB table. Read more about locks on the State Locking page.
Since the name of the bucket should be unique throughout AWS, the name “eks-github-actions-terraform” won’t work for you. Please think up your own and make sure it’s not already taken (you should get a NoSuchBucket error when checking):
$ aws s3 ls s3://my-bucket
An error occurred (AllAccessDisabled) when calling the ListObjectsV2 operation: All access to this object has been disabled
$ aws s3 ls s3://my-bucket-with-name-that-impossible-to-remember
An error occurred (NoSuchBucket) when calling the ListObjectsV2 operation: The specified bucket does not exist
Having come up with a name, create the bucket (we use the IAM user “terraform” here. It has administrator rights so it can create a bucket) and enable versioning for it (which will save your nerves in the event of a configuration error):
$ aws s3 mb s3://eks-github-actions-terraform --region eu-west-1
make_bucket: eks-github-actions-terraform
$ aws s3api put-bucket-versioning --bucket eks-github-actions-terraform --versioning-configuration Status=Enabled
$ aws s3api get-bucket-versioning --bucket eks-github-actions-terraform
{
    "Status": "Enabled"
}
With DynamoDB, uniqueness is not needed, but you do need to create a table first:
$ aws dynamodb create-table \
  --region eu-west-1 \
  --table-name eks-github-actions-terraform-lock \
  --attribute-definitions AttributeName=LockID,AttributeType=S \
  --key-schema AttributeName=LockID,KeyType=HASH \
  --provisioned-throughput ReadCapacityUnits=5,WriteCapacityUnits=5
Keep in mind that, in case of a Terraform failure, you may need to remove the lock manually from the AWS console. But be careful when doing so.
With regard to the eks/vpc module blocks in main.tf, the way to reference a module available on GitHub is simple:
git::https://github.com/terraform-aws-modules/terraform-aws-vpc?ref=master
Now let’s look at our other two Terraform files (variables.tf and outputs.tf). The first holds our Terraform variables:
$ cat <root_repo_dir>/terraform/variables.tf
variable "region" {
  default = "eu-west-1"
}

variable "map_accounts" {
  description = "Additional AWS account numbers to add to the aws-auth configmap. See examples/basic/variables.tf for example format."
  type        = list(string)
  default     = []
}

variable "map_roles" {
  description = "Additional IAM roles to add to the aws-auth configmap."
  type = list(object({
    rolearn  = string
    username = string
    groups   = list(string)
  }))
  default = []
}

variable "map_users" {
  description = "Additional IAM users to add to the aws-auth configmap."
  type = list(object({
    userarn  = string
    username = string
    groups   = list(string)
  }))
  default = [
    {
      userarn  = "arn:aws:iam::01234567890:user/my-user"
      username = "my-user"
      groups   = ["system:masters"]
    }
  ]
}
The most important part here is adding the IAM user “my-user” to the map_users variable, but you should use your own account ID here in place of 01234567890.
What does this do? When you communicate with EKS through the local kubectl client, it sends requests to the Kubernetes API server, and each request goes through authentication and authorization processes so Kubernetes can understand who sent the request and what they can do. So the EKS version of Kubernetes asks AWS IAM for help with user authentication. If the user who sent the request is listed in AWS IAM (we pointed to its ARN here), the request goes to the authorization stage, which EKS processes itself, but according to our settings. Here, we indicated that the IAM user “my-user” is very cool (group “system:masters”).
Finally, the outputs.tf file describes what Terraform should print after it finishes a job:
$ cat <root_repo_dir>/terraform/outputs.tf
output "cluster_endpoint" {
  description = "Endpoint for EKS control plane."
  value       = module.eks.cluster_endpoint
}

output "cluster_security_group_id" {
  description = "Security group ids attached to the cluster control plane."
  value       = module.eks.cluster_security_group_id
}

output "config_map_aws_auth" {
  description = "A kubernetes configuration to authenticate to this EKS cluster."
  value       = module.eks.config_map_aws_auth
}
This completes the description of the Terraform part. We’ll return soon to see how we’re going to launch these files.
Kubernetes Manifests
So far, we’ve taken care of where to launch the application. Now let’s look at what to run.
Recall that we have docker-compose.yml (we renamed the service and added a couple of labels that kompose will use shortly) in the <root_repo_dir>/k8s/ directory:
$ cat <root_repo_dir>/k8s/docker-compose.yml
version: "3.7"
services:
  samples-bi:
    container_name: samples-bi
    image: intersystemsdc/iris-community:2019.4.0.383.0-zpm
    ports:
      - 52773:52773
    labels:
      kompose.service.type: loadbalancer
      kompose.image-pull-policy: IfNotPresent
Run kompose and then edit the output: add the lifecycle/postStart section shown below and delete the annotations (to make things more intelligible):
$ kompose convert -f docker-compose.yml --replicas=1
$ cat <root_repo_dir>/k8s/samples-bi-deployment.yaml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  labels:
    io.kompose.service: samples-bi
  name: samples-bi
spec:
  replicas: 1
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        io.kompose.service: samples-bi
    spec:
      containers:
      - image: intersystemsdc/iris-community:2019.4.0.383.0-zpm
        imagePullPolicy: IfNotPresent
        name: samples-bi
        ports:
        - containerPort: 52773
        resources: {}
        lifecycle:
          postStart:
            exec:
              command:
              - /bin/bash
              - -c
              - |
                echo -e "write\nhalt" > test
                until iris session iris < test; do sleep 1; done
                echo -e "zpm\ninstall samples-bi\nquit\nhalt" > samples_bi_install
                iris session iris < samples_bi_install
                rm test samples_bi_install
      restartPolicy: Always
We use the Recreate update strategy, which means that the pod will be deleted first and then recreated. This is permissible for demo purposes and allows us to use fewer resources.
We also added the postStart hook, which will trigger immediately after the pod starts. We wait until IRIS starts up and install the samples-bi package from the default zpm-repository.
Now we add the Kubernetes service (also without annotations):
$ cat <root_repo_dir>/k8s/samples-bi-service.yaml
apiVersion: v1
kind: Service
metadata:
  labels:
    io.kompose.service: samples-bi
  name: samples-bi
spec:
  ports:
  - name: "52773"
    port: 52773
    targetPort: 52773
  selector:
    io.kompose.service: samples-bi
  type: LoadBalancer
Yes, we’ll deploy in the “default” namespace, which will work for the demo.
Okay, now we know where and what we want to run. It remains to see how.
The GitHub Actions Workflow
Rather than doing everything from scratch, we’ll create a workflow similar to the one described in Deploying InterSystems IRIS solution on GKE Using GitHub Actions. This time we don’t have to worry about building a container. The GKE-specific parts are replaced by those specific to EKS. The parts related to receiving the commit message and using it in conditional steps are the "Get commit message" steps and the if: conditions in the workflow below:
$ cat <root_repo_dir>/.github/workflows/workflow.yaml
name: Provision EKS cluster and deploy Samples BI there
on:
  push:
    branches:
    - master

# Environment variables.
# ${{ secrets }} are taken from GitHub -> Settings -> Secrets
# ${{ github.sha }} is the commit hash
env:
  AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
  AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
  AWS_REGION: ${{ secrets.AWS_REGION }}
  CLUSTER_NAME: dev-cluster
  DEPLOYMENT_NAME: samples-bi

jobs:
  eks-provisioner:
    # Inspired by:
    ## https://www.terraform.io/docs/github-actions/getting-started.html
    ## https://github.com/hashicorp/terraform-github-actions
    name: Provision EKS cluster
    runs-on: ubuntu-18.04
    steps:
    - name: Checkout
      uses: actions/checkout@v2

    - name: Get commit message
      run: |
        echo ::set-env name=commit_msg::$(git log --format=%B -n 1 ${{ github.event.after }})

    - name: Show commit message
      run: echo $commit_msg

    - name: Terraform init
      uses: hashicorp/terraform-github-actions@master
      with:
        tf_actions_version: 0.12.20
        tf_actions_subcommand: 'init'
        tf_actions_working_dir: 'terraform'

    - name: Terraform validate
      uses: hashicorp/terraform-github-actions@master
      with:
        tf_actions_version: 0.12.20
        tf_actions_subcommand: 'validate'
        tf_actions_working_dir: 'terraform'

    - name: Terraform plan
      if: "!contains(env.commit_msg, '[destroy eks]')"
      uses: hashicorp/terraform-github-actions@master
      with:
        tf_actions_version: 0.12.20
        tf_actions_subcommand: 'plan'
        tf_actions_working_dir: 'terraform'

    - name: Terraform plan for destroy
      if: "contains(env.commit_msg, '[destroy eks]')"
      uses: hashicorp/terraform-github-actions@master
      with:
        tf_actions_version: 0.12.20
        tf_actions_subcommand: 'plan'
        args: '-destroy -out=./destroy-plan'
        tf_actions_working_dir: 'terraform'

    - name: Terraform apply
      if: "!contains(env.commit_msg, '[destroy eks]')"
      uses: hashicorp/terraform-github-actions@master
      with:
        tf_actions_version: 0.12.20
        tf_actions_subcommand: 'apply'
        tf_actions_working_dir: 'terraform'

    - name: Terraform apply for destroy
      if: "contains(env.commit_msg, '[destroy eks]')"
      uses: hashicorp/terraform-github-actions@master
      with:
        tf_actions_version: 0.12.20
        tf_actions_subcommand: 'apply'
        args: './destroy-plan'
        tf_actions_working_dir: 'terraform'

  kubernetes-deploy:
    name: Deploy Kubernetes manifests to EKS
    needs:
    - eks-provisioner
    runs-on: ubuntu-18.04
    steps:
    - name: Checkout
      uses: actions/checkout@v2

    - name: Get commit message
      run: |
        echo ::set-env name=commit_msg::$(git log --format=%B -n 1 ${{ github.event.after }})

    - name: Show commit message
      run: echo $commit_msg

    - name: Configure AWS Credentials
      if: "!contains(env.commit_msg, '[destroy eks]')"
      uses: aws-actions/configure-aws-credentials@v1
      with:
        aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
        aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
        aws-region: ${{ secrets.AWS_REGION }}

    - name: Apply Kubernetes manifests
      if: "!contains(env.commit_msg, '[destroy eks]')"
      working-directory: ./k8s/
      run: |
        aws eks update-kubeconfig --name ${CLUSTER_NAME}
        kubectl apply -f samples-bi-service.yaml
        kubectl apply -f samples-bi-deployment.yaml
        kubectl rollout status deployment/${DEPLOYMENT_NAME}
Of course, we need to set the credentials of the “terraform” user (take them from the ~/.aws/credentials file) as GitHub repository secrets (Settings -> Secrets) so the workflow can use them:
Notice the conditional steps in the workflow. They enable us to destroy the EKS cluster by pushing a commit message that contains the phrase “[destroy eks]”. Note that the Kubernetes apply steps won’t run with such a commit message.
Run the pipeline, but first create a .gitignore file:
$ cat <root_repo_dir>/.gitignore
.DS_Store
terraform/.terraform/
terraform/*.plan
terraform/*.json
$ cd <root_repo_dir>
$ git add .github/ k8s/ terraform/ .gitignore
$ git commit -m "GitHub on EKS"
$ git push
Monitor the deployment process on the "Actions" tab of the GitHub repository page and wait for successful completion.
When you run a workflow for the very first time, it will take about 15 minutes on the “Terraform apply” step, approximately as long as it takes to create the cluster. At the next start (if you didn’t delete the cluster), the workflow will be much faster. You can check this out:
$ cd <root_repo_dir>
$ git commit -m "Trigger" --allow-empty
$ git push
Of course, it would be nice to check what we did. This time you can use the credentials of IAM “my-user” on your laptop:
$ export AWS_PROFILE=my-user
$ export AWS_REGION=eu-west-1
$ aws sts get-caller-identity
$ aws eks update-kubeconfig --region=eu-west-1 --name=dev-cluster --alias=dev-cluster
$ kubectl config current-context
dev-cluster

$ kubectl get nodes
NAME                                        STATUS   ROLES    AGE     VERSION
ip-10-42-1-125.eu-west-1.compute.internal   Ready    <none>   6m20s   v1.14.8-eks-b8860f

$ kubectl get po
NAME                          READY   STATUS    RESTARTS   AGE
samples-bi-756dddffdb-zd9nw   1/1     Running   0          6m16s

$ kubectl get svc
NAME         TYPE           CLUSTER-IP      EXTERNAL-IP                                                               PORT(S)           AGE
kubernetes   ClusterIP      172.20.0.1      <none>                                                                    443/TCP           11m
samples-bi   LoadBalancer   172.20.33.235   a2c6f6733557511eab3c302618b2fae2-622862917.eu-west-1.elb.amazonaws.com   52773:31047/TCP   6m33s
Go to http://a2c6f6733557511eab3c302618b2fae2-622862917.eu-west-1.elb.amazonaws.com:52773/csp/user/_DeepSee.UserPortal.Home.zen?$NAMESPACE=USER (substitute your own EXTERNAL-IP in the link), then log in as “_system” with password “SYS” and change the default password. You should see a bunch of BI dashboards:
Click on each one’s arrow to deep dive:
Remember, if you restart the samples-bi pod, all your changes will be lost. This is intentional behavior, as this is a demo. If you need persistence, I've created an example in the k8s/statefulset.tpl file of the github-gke-zpm-registry repository.
When you’re finished, just remove everything you’ve created:
$ git commit -m "Mr Proper [destroy eks]" --allow-empty$ git push
Conclusion
In this article, we replaced the eksctl utility with Terraform to create an EKS cluster. It’s a step forward to “codify” all of your AWS infrastructure.
We showed how you can easily deploy a demo application with git push using GitHub Actions and Terraform.
We also added kompose and a pod’s postStart hooks to our toolbox.
We didn’t show how to enable TLS this time. That’s a task we’ll undertake in the near future.
💡 This article is considered as InterSystems Data Platform Best Practice.
Article
Evgeny Shvarov · Jun 24, 2020
Hi Developers!
Suppose you have a persistent class with data and you want to have a simple Angular UI for it to view the data and make CRUD operations.
Recently @Alberto.Fuentes described how to build Angular UI for your InterSystems IRIS application using RESTForms2.
In this article, I want to tell you how you can get a simple Angular UI to CRUD and view your InterSystems IRIS class data automatically in less than 5 minutes.
Let's go!
To make this happen you need:
1. InterSystems IRIS
2. ZPM
3. RESTForms2 and RESTForms2-UI modules.
I'll take a Data.Countries class which I generated and imported via csvgen using this command:
d ##class(community.csvgen).GenerateFromURL("https://raw.githubusercontent.com/datasciencedojo/datasets/master/WorldDBTables/CountryTable.csv",",","Data.Countries")
To make an Angular UI we need to expose REST API for this class, which will service CRUD operations.
Let's use restforms2 module for this.
This command in dockerfile installs restforms2 into IRIS container:
zpm "install restforms2" \
To add a REST API we need to derive the class from Form.Adaptor:
Class Data.Countries Extends (%Library.Persistent, Form.Adaptor)
Add restforms2 parameters to the persistent class to manage the general behavior: sorting parameter, display name, etc:
// Form name, not a global key so it can be anything
Parameter FORMNAME = "Countries";
/// Default permissions
/// Objects of this form can be Created, Read, Updated and Deleted
/// Redefine this parameter to change permissions for everyone
/// Redefine checkPermission method (see Form.Security) for this class
/// to add custom security based on user/roles/etc.
Parameter OBJPERMISSIONS As %String = "CRUD";
/// Property used for basic information about the object
/// By default getObjectDisplayName method gets its value from it
Parameter DISPLAYPROPERTY As %String = "name";
Perfect. Next, we can use restforms2 syntax to tell restforms2 which properties of the class we want to expose for CRUD. You do this by adding a "DISPLAYNAME = ..." parameter to the properties you want to expose in restforms2-ui. Example:
Property code As %Library.String(MAXLEN = 250) [ SqlColumnNumber = 2 ];
Property name As %Library.String(DISPLAYNAME = "Name", MAXLEN = 250) [ SqlColumnNumber = 3 ];
Property continent As %Library.String(DISPLAYNAME = "Continent", MAXLEN = 250) [ SqlColumnNumber = 4 ];
Property region As %Library.String(DISPLAYNAME = "Region", MAXLEN = 250) [ SqlColumnNumber = 5 ];
Property surfacearea As %Library.Integer(DISPLAYNAME = "Surface Area", MAXVAL = 2147483647, MINVAL = -2147483648) [ SqlColumnNumber = 6, SqlFieldName = surface_area ];
Property independenceyear As %Library.Integer(DISPLAYNAME = "Independence Year", MAXVAL = 2147483647, MINVAL = -2147483648) [ SqlColumnNumber = 7, SqlFieldName = independence_year ];
Great! Now let's introduce the UI layer. This command in the dockerfile installs restforms2-ui, which is an Angular UI for RESTForms2:
zpm "install restforms2-ui" \
That's it! Let's examine the UI for your class, which you can find at the URL server:port/restforms2-ui:
RESTForms2 comes with test classes Person and Company, and you can use them to examine the features of restforms2-ui. Currently it can edit string, number, boolean, date and look-up fields.
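For reference, a look-up field corresponds to a reference property pointing at another Form.Adaptor class. A hedged sketch (Data.Cities is a hypothetical class, not part of the demo):

/// Plain date field, following the DISPLAYNAME convention shown earlier
Property independenceday As %Library.Date(DISPLAYNAME = "Independence Day");

/// Hypothetical look-up: Data.Cities would be another Form.Adaptor persistent class
Property capital As Data.Cities(DISPLAYNAME = "Capital");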
You can test all this on your laptop if you clone and build this repository:
docker-compose up -d --build
And then open the URL:
localhost:port/restforms2-ui/index.html
or if you use VSCode, select this menu item:
Happy coding and stay tuned! It's great! I tried the application, and I liked the interface and how easy it is to create a simple CRUD using RESTForms. 💡 This article is considered as InterSystems Data Platform Best Practice. Love the accelerator concept for quick and easy CRUD :)
Article
Eduard Lebedyuk · Aug 3, 2020
InterSystems IRIS currently limits classes to 999 properties.
But what to do if you need to store more data per object?
This article answers this question (with an additional cameo of Community Python Gateway and how you can transfer wide datasets into Python).
The answer is actually very simple: InterSystems IRIS currently limits classes to 999 properties, but not to 999 primitives. A property in InterSystems IRIS can itself be an object with 999 properties, and so on, so the limit can easily be worked around.
Approach 1.
Store 100 properties per serial property. First create a serial class that stores a hundred properties.
Class Test.Serial Extends %SerialObject
{
Property col0;
...
Property col99;
}
And in your main class add as many properties as you need:
Class Test.Record Extends %Persistent
{
Property col00 As Test.Serial;
Property col01 As Test.Serial;
...
Property col63 As Test.Serial;
}
This immediately raises your limit to 99900 properties.
This approach offers uniform access to all properties via both the SQL and object layers (we always know a property's reference by its number).
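For example, a small hypothetical helper (assuming the colNN holder / colN slot naming used above) can translate a flat column number into the right serial slot:

Class Test.WideHelper [ Abstract ]
{

/// Hypothetical sketch: return flat column n (0-based) from a Test.Record,
/// assuming holders col00..col63 each containing slots col0..col99 as above
ClassMethod GetColumn(rec As Test.Record, n As %Integer)
{
    set holder = "col" _ $translate($justify(n \ 100, 2), " ", 0)  // e.g. "col05"
    set slot = "col" _ (n # 100)                                   // e.g. "col42"
    quit $property($property(rec, holder), slot)
}

}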
Approach 2.
One $lb property.
Class Test.Record Extends %Persistent
{
Property col As %List;
}
This approach is simpler but does not provide explicit column names.
Use SQL $LIST* Functions to access list elements.
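For instance, a hedged sketch against the Test.Record class above, reading the fifth element both ways:

    // Object access: element 5 of the %List property
    set rec = ##class(Test.Record).%OpenId(1)
    write $list(rec.col, 5), !

    // Same value via embedded SQL and the SQL $LIST function
    &sql(SELECT $LIST(col, 5) INTO :val FROM Test.Record WHERE ID = 1)
    write val, !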
Approach 3.
Use Collection (List Of/Array Of) property.
Class Test.Record Extends %Persistent
{
Property col As List Of %Integer;
}
This approach also does not provide explicit column names for individual values (but do you really need them?). Use property parameters to project the property as an SQL column/table.
Docs for collection properties.
Approach 4.
Do not create properties at all and expose them via SQL Stored procedure/%DispatchGetProperty.
Class Test.Record Extends %Persistent
{
Parameter GLVN = {..GLVN("Test.Record")};
/// SELECT Test_Record.col(ID, 123)
/// FROM Test.Record
///
/// w ##class(Test.Record).col(1, )
ClassMethod col(id, num) As %Decimal [ SqlProc ]
{
#define GLVN(%class) ##Expression(##class(Test.Record).GLVN(%class))
quit $lg($$$GLVN("Test.Record")(id), num + 1)
}
/// Refer to properties as: obj.col123
Method %DispatchGetProperty(Property As %String) [ CodeMode = expression ]
{
..col(..%Id(), $e(Property, 4, *))
}
/// Get data global
/// w ##class(Test.Record).GLVN("Test.Record")
ClassMethod GLVN(class As %Dictionary.CacheClassname = {$classname()}) As %String
{
return:'$$$comClassDefined(class) ""
set strategy = $$$comClassKeyGet(class, $$$cCLASSstoragestrategy)
return $$$defMemberKeyGet(class, $$$cCLASSstorage, strategy, $$$cSDEFdatalocation)
}
The trick here is to store everything in the main $lb and use unallocated schema storage spaces to store your data. Here's an article on global storage.
With this approach, you can also easily transfer the data into Python environment with Community Python Gateway via the ExecuteGlobal method.
This is also the fastest way to import CSV files due to the similarity of the structures.
Conclusion
The 999-property limit can easily be extended in InterSystems IRIS.
Do you know other approaches to storing wide datasets? If so, please share them!
The question is how csvgen could be upgraded to consume csv files with 1000+ cols. While I always advertise CSV2CLASS methods for generic solutions, wide datasets often possess an (un)fortunate characteristic of also being long.
In that case custom object-less parser works better.
Here's how it can be implemented.
1. Align storage schema with CSV structure
2. Modify this snippet for your class/CSV file:
Parameter GLVN = {..GLVN("Test.Record")};
Parameter SEPARATOR = ";";
ClassMethod Import(file = "source.csv", killExtent As %Boolean = {$$$YES})
{
set stream = ##class(%Stream.FileCharacter).%New()
do stream.LinkToFile(file)
kill:killExtent @..#GLVN
set i=0
set start = $zh
while 'stream.AtEnd {
set i = i + 1
set line = stream.ReadLine($$$MaxStringLength)
set @..#GLVN($i(@..#GLVN)) = ..ProcessLine(line)
write:'(i#100000) "Processed:", i, !
}
set end = $zh
write "Done",!
write "Time: ", end - start, !
}
ClassMethod ProcessLine(line As %String) As %List
{
set list = $lfs(line, ..#SEPARATOR)
set list2 = ""
set ptr=0
// NULLs and numbers handling.
// Add generic handlers here.
// For example translate "N/A" value into $lb() if that's how source data rolls
while $listnext(list, ptr, value) {
set list2 = list2 _ $select($g(value)="":$lb(), $ISVALIDNUM(value):$lb(+value), 1:$lb(value))
}
// Add specific handlers here
// For example convert date into horolog in column4
// Add %%CLASSNAME
set list2 = $lb() _ list2
quit list2
}
Thanks, Ed! Could you make a PR? I have no concrete ideas on how to automate this.
This is more of a case-by-case thing. After more than 42 years of M-programming and a total of 48 years of programming experience, I would say that if you need a class with about 1000 or more properties, then something is wrong with your (database) design. There is nothing more to say. Period. Wide datasets are fairly typical for:
Industrial data
IoT
Sensors data
Mining and processing data
Spectrometry data
Analytical data
Most datasets after one-hot-encoding applied
NLP datasets
Any dataset where we need to raise dimensionality
Media featuresets
Social Network/modelling schemas
I'm fairly sure there are more areas, but I have not encountered them myself.
Recently I have delivered a PoC with classes more than 6400 columns wide and that's where I got my inspiration for this article (I chose approach 4).
@Renato.Banzai also wrote an excellent article on his project with more than 999 properties.
Overall I'd like to say that a class with more than 999 properties is a correct design in many cases. You're probably right for the majority of tasks. But how do you manage AI tasks which NEED thousands of features per entity? And features are properties/fields from a data storage perspective.
Anyway, I'm really curious how you deal with AI/ML tasks in IRIS or Caché. The Entity–attribute–value model is usually used for this purpose.
I already wrote about this at the time: SQL index for array property elements. That's all well and good for sparse datasets (where, say, you have a record with 10,000 possible attributes but on average only 50 are filled).
EAV does not help in dense cases where every record actually has 10,000 attributes. My EAV implementation is the same as your Approach 3, so it will work fine even with 4,000,000 fully filled attributes.
Since a string has a limit of 3,641,144 characters, the serial and %List approaches are ruled out.
All other things being equal, everything depends on the specific technical task: speed, support for Objects/SQL, the ability to name each attribute, the number of attributes, and so on.
Announcement
Anastasia Dyubaylo · Dec 12, 2018
Hi Community!
We are pleased to invite you to the upcoming webinar "Using Blockchain with InterSystems IRIS" on the 20th of December at 10:00 (Moscow time)!
Blockchain is a technology of distributed information storage with mechanisms to ensure its integrity. Blockchain is becoming more common in various areas, such as the financial sector, government agencies, healthcare and others.
InterSystems IRIS makes it easy to integrate with one of the most popular blockchain networks – Ethereum. At the webinar we will talk about what a blockchain is and how you can start using it in your business. We will also demonstrate the capabilities of the Ethereum adapter for creating applications that use the Ethereum blockchain.
The following topics are planned to be covered:
Introduction to Blockchain
Ethereum
Smart contracts in Ethereum
InterSystems IRIS adapter for Ethereum
Application example using the adapter
Presenter: @Nikolay.Soloviev
Audience: The webinar is designed for developers.
Note: The language of the webinar is Russian.
We are waiting for you at our webinar! Register now! It is tomorrow! Don't miss it, register here! And now this webinar recording is available in a dedicated Webinars in Russian playlist on InterSystems Developers YouTube: Enjoy it!
Question
Nikhil Pawaria · Jan 25, 2019
How can we reduce the size of a CACHE.DAT file? Even after deleting the globals of a particular database from the Management Portal, the size of its CACHE.DAT file is not reduced. This is the way to do it, but make sure you are on a version where this won't cause problems. See:
https://www.intersystems.com/support-learning/support/product-news-alerts/support-alert/alert-database-compaction/
https://www.intersystems.com/support-learning/support/product-news-alerts/support-alerts-2015/
https://www.intersystems.com/support-learning/support/product-news-alerts/support-alerts-2012/
You need to do these three steps in order:
Compact Globals in a Database (optional)
Compact a Database
Truncate a Database
It can be done via the ^DATABASE utility or in the Management Portal. A CACHE.DAT or IRIS.DAT can only grow during normal work, but you can shrink it manually. It is not as easy as it may sound, though, and it depends on the version you use: only the past few versions include the compact tool. On very old versions you have to copy the data from the old database to a new one. You can read my articles about the internal structure of CACHE.DAT, just to know what is inside, and about the database with visualization, where you can see how to compact a database and how it actually works.
Question
Stephan Gertsobbe · Jul 13, 2019
Hi all, we are wondering if anybody has a reporting tool that is capable of using IRIS objects. I know there are things like Crystal Reports and others out there which can read the SQL data through ODBC, but we need the capability of using object methods while running the report. Until now we were using a Java-based report generator (ReportWeaver), but since the object binding for Java doesn't exist anymore in the IRIS data platform, do any of you have an alternative report generator? Looking forward to any answers. Cheers, Stephan No, that's not really what I meant. My question was much more generic, about the projection of object data like listOfObjects. When you look at the projected data of those "object" collections, they are projected as $LISTBUILD lists in SQL. So the question was: is there a reporting tool out there in use that can handle the object data from IRIS, as for IRIS there is no object binding anymore like there was for Caché. For Java there is the cachedb.jar, and that binding doesn't exist for IRIS. "using object methods while running the report" This is a rather generic statement. If you are using CLASS METHODS (as I'd assume) you can project each class method as a Stored SQL Procedure too. By this, you can make them available to be used over JDBC. This could be an eventual workaround.
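To illustrate that suggestion, here is a hedged sketch (Demo.Customer and its CalculateScore method are hypothetical, not part of any shipped API) of wrapping object logic in a class method projected as an SQL procedure:

Class Demo.Reporting Extends %RegisteredObject
{

/// Hypothetical example: expose object-method logic through SQL
/// so an ODBC/JDBC-based reporting tool can call it
ClassMethod CustomerScore(id As %String) As %Numeric [ SqlProc ]
{
    // Demo.Customer is an assumed persistent class with a CalculateScore() method
    set cust = ##class(Demo.Customer).%OpenId(id)
    quit:'$isobject(cust) ""
    quit cust.CalculateScore()
}

}

The reporting tool can then call the generated stored procedure (its exact SQL name is visible in the SQL catalog) over plain ODBC/JDBC, while the object method still runs inside IRIS.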
Question
Evgeny Shvarov · Feb 12, 2019
Hi Community! What's the limit for Namespaces and Databases for one InterSystems IRIS installation? Yes, I checked the documentation but cannot find it at once. To my understanding, there is no technical limit, though I seem to remember that it used to be ~16,000 some time in the past. Class SYS.Database maps to ^SYS("CONFIG","IRIS","Databases",<DBNAME>) and has NO limit there; similarly, Namespaces are stored in ^SYS("CONFIG","IRIS","Namespaces",<NSPCE>) and are covered by %SYS.Namespace. If there is any limit, it must be related to internal memory structures (gmheap??).
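By the way, a quick hedged sketch using the %SYS.Namespace class mentioned above (run it in the %SYS namespace) to count what is currently defined:

    // List all namespaces into a local array and count them
    do ##class(%SYS.Namespace).ListAll(.ns)
    set count = 0, n = ""
    for {
        set n = $order(ns(n))
        quit:n=""
        set count = count + 1
    }
    write "Namespaces defined: ", count, !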
Announcement
Anastasia Dyubaylo · Dec 6, 2018
Hey Developers!
Good news! Just in time for the holidays, Gartner Peer Insights is offering customers a $25 digital Visa Gift Card for an approved review of InterSystems IRIS or Caché this month!
We decided to support and double the stakes. So! In December '18 you can get a second $25 digital Visa Gift Card for a Gartner review of Caché or InterSystems IRIS on InterSystems Global Masters Advocacy Hub!
See the rules below.
Step #1: To get the $25 Visa Card from Gartner Peer Insights, follow this unique link and submit a review. Make a screenshot for Step #2 so that we can see that you reviewed InterSystems IRIS or Caché.
Note: The survey takes about 10-15 minutes. Gartner will authenticate the identity of the reviewer, but the published reviews are anonymous. You can check the status of your review and gift card in your Gartner Peer Insights reviewer profile at any time.
Step #2: To get the $25 Visa Card from InterSystems, complete a dedicated challenge on InterSystems Global Masters Advocacy Hub: upload a screenshot from Step #1.
Don't forget:
• This promotion is only for reviews entered in the month of December 2018.
• InterSystems IRIS and Caché reviews only.
• Use the unique link mentioned above in order to qualify for the gift cards.
Done? Awesome! Your card is on its way! To join Global Masters leave a comment to the post and we'll send the invite!
Hurry up to get your $100 from the December Caché and IRIS campaign from Gartner and InterSystems! ;) Only 12 days left! The recipe is the following:
1. You are our current customer of Caché and/or InterSystems IRIS.
2. Make the review using this link.
3. Get your $25 for a Caché or InterSystems IRIS review ($50 for both).
4. Save the screenshots of the reviews and submit them in Global Masters to get another $25 for every Caché and InterSystems IRIS review.
5. Merry Christmas and have a great new year 2019!
This is a good idea, hopefully everyone will do this, but I did have a problem. Perhaps I have done this incorrectly, but I could not see a way to submit screenshots in the challenge, and when you click the "lets review" button, or whatever the actual text was, it closes it as completed and there seems no way to submit a screenshot. Also, the link to the challenge is for the same challenge number as it appears in, and it takes you to the Global Masters front page. Also, you don't seem able to review both as suggested: if you use the link again or search for the platform you haven't reviewed yet, it will simply state you have already submitted a review. I suspect this is because using the link you have to choose between IRIS or Caché, and so the offer is for one or the other but not both.
Hi David! Thanks for reporting this. Our support team will contact you via GM direct messaging.
Dear Community Members! Thank you so much for making reviews! You made InterSystems Data Platforms Caché and InterSystems IRIS a Gartner Customers' Choice 2019 in Operational Database Management Systems!
Announcement
Anastasia Dyubaylo · Apr 12, 2019
Hi Community!
We're pleased to invite you to the DockerCon 2019 – the #1 container industry conference for all things Kubernetes, microservices, and DevOps. The event will be held at the Moscone Center in San Francisco from April 29 to May 2.
In addition, there will be a special session "Containerized Databases for Enterprise Applications" presented by @Thomas.Carroll, Product Specialist at InterSystems.
See the details below.
Containerized Databases for Enterprise Applications | Wednesday, May 1, 12:00 PM - 12:40 PM – Room 2001
Session Track: Using Docker for IT Infra & Ops
Containers are now being used in organizations of all sizes. From small startups to established enterprises, data persistence is necessary in many mission critical applications. “Containers are not for database applications” is a misconception and nothing could be further from the truth. This session aims to help practitioners navigate the minefield of database containerization and avoid some of the major pitfalls that can occur. Discussion includes traditional enterprise database concerns surrounding data resilience, high availability, and storage and how they mesh with a containerized deployment.
Speaker Bio
Joe is a Product Specialist at InterSystems, a passionate problem solver, and a container evangelist. He started his career as a solution architect for enterprise database applications before transitioning to product management. Joe is in the trenches of InterSystems transformation to a container-first, cloud-first product strategy. When he isn’t at a Linux shell he enjoys long walks on the beach, piña coladas (hold the rain), woodworking, and BBQ.
Be the first to register now! It's really great news. And so cool that InterSystems started to participate more in developer conferences. I wish I could attend all of them :)
Announcement
Anastasia Dyubaylo · Oct 26, 2020
Hey Community,
We're pleased to invite you all to the Virtual Summit 2020 session dedicated to InterSystems online programming contests, best winning projects, and their developers! Please join:
⚡️ "Best applications of InterSystems programming contest series: Best IntegratedML, FHIR, REST API, Native API, ObjectScript solutions" session ⚡️
Please check the details below.
We will talk about the series of online contests for InterSystems developers. This session will focus on the contest winners and the top applications. Our developers will share their experience of participating in the exciting InterSystems coding marathon and will show demos of their winning projects.
Speakers: 🗣 @Anastasia.Dyubaylo, Community Manager, InterSystems 🗣 @Henrique.GonçalvesDias, System Management Specialist / Database Administrator, Sao Paulo Federal Court🗣 @José.Pereira, Business Intelligence Developer, Shift Consultoria e Sistemas Ltda🗣 @Henry.HamonPereira, System Analyst, BPlus Tecnologia🗣 @Dmitry.Maslennikov, Co-founder, CTO and Developer Advocate, CaretDev Corp🗣 @Renato.Banzai, Machine Learning Engineer Coordinator, Itaú Unibanco
Date & Time:
➡️ Day 1: Tuesday, October 27 (Boston starts Monday, October 26)
APAC
Best Applications of InterSystems Programming Contest Series
UTC Time: 2:50 AM
Boston Time: 10:50 PM

NA/LATAM/EMEA
Best Applications of InterSystems Programming Contest Series
UTC Time: 3:50 PM
Boston Time: 11:50 PM
So!
We will be happy to answer your questions in a virtual chat on the conference platform – please join! We'll start in 15 minutes! Please join!
📍 https://intersystems.6connex.com/event/virtual-summit/en-us/contents/433176/share?rid=FocusSessions&nid=804450 💥 Join us NOW here: https://intersystems.6connex.com/event/virtual-summit/en-us/contents/433253/share?rid=FocusSessions&nid=804450
Article
John Murray · Oct 27, 2020
Now that 1.0 has shipped and is featured in various sessions at Virtual Summit 2020, it seems like a good time to offer some guidance on how to report problems.
InterSystems ObjectScript for VS Code consists of three collaborating VS Code extensions. For ease of installation and management there's a fourth entity, the InterSystems ObjectScript Extension Pack. It's a great way to get started with minimum clicks, and handy to have even if you have already installed the other extensions.
This modular architecture also means there are three different GitHub repositories where issues can be created. Fortunately VS Code itself helps with the task. Here's how to use it:
1. From the Help menu in VS Code choose Report Issue. Alternatively, open the Command Palette (I typically do this by pressing the F1 key) and run Help: Report Issue... (Pro Palette Tip: try typing just hri and see how fast it gets you to the right command)
2. A dialog like this appears:
3. Use the first field to classify your issue:
Bug Report
Feature Request
Performance Issue
4. In the second field pick "An extension".
5. The third dropdown lets you pick one of your installed extensions. You can also type a few characters to find the right entry. For example, isls quickly selects "InterSystems Language Server" for me.
Which one to choose? Here's a rough guide:
InterSystems Language Server
code colo(u)ring
Intellisense
InterSystems ObjectScript
export, import and compile
ObjectScript Explorer (browsing namespace contents)
direct server-side editing using isfs:// folders in a workspace
integration with server-side source control etc
InterSystems Server Manager
password management in local keychain
definition and selection of entries in `intersystems.servers`
If you can't decide, pick InterSystems ObjectScript.
6. Type a descriptive one-line summary of your issue. The dialog may offer a list of existing issues which could be duplicates. If you don't find one that covers yours, proceed.
7. Begin to enter details. At this stage I usually type just one character, then click "Preview on GitHub" to launch a browser page where I can use the familiar GH issue UI to complete my report. Tips for use there:
Paste images from your clipboard directly into the report field on GH. For hard-to-describe issues an animated GIF gets bonus points.
Link to other issues by prefixing the target number with #
Remember that whatever you post here is visible to anyone on the Internet. Mask/remove confidential information. Be polite.
8. When you are happy with what you have written (tip - use the Preview tab) click "Submit new issue".
Using Help: Report Issue... means your version numbers are automatically added.
Announcement
Anastasia Dyubaylo · Oct 27, 2020
Hey Developers,
We remind you about a great opportunity to have a live conversation with the InterSystems Product Managers team during the Live Q&A Sessions at Virtual Summit 2020!
🗓 TODAY at 12:40 PM EDT at https://intersystems.6connex.com/event/virtual-summit/en-us/contents/434370/share?rid=FocusSessions&nid=804450
And now we've added more options to make it even easier for you to ask questions upfront:
✅ Submit your questions in the comments to this post
✅ Submit your question to our Discord Channel: discord.gg/WqVjtD
✅ Submit your questions to VS2020questions@InterSystems.com
✅ Send your question personally to @Anastasia.Dyubaylo or @Evgeny.Shvarov in Direct Messages on the community
✅ Submit your question to Q&A Chat on the conference platform during the session
Note: We will pass all your questions to the PM team, and you'll receive answers during the Live Q&A Sessions.
And let me introduce the whole InterSystems Product Managers Team:
@Jeffrey.Fried, Director of Product Managers @Andreas.Dieckow, Principal Product Manager@Robert.Kuszewski, Product Manager - Developer Experience @Raj.Singh5479, Product Manager - Developer Experience @Carmen.Logue, Product Manager - AI and Analytics @Thomas.Dyar, Product Specialist - Machine Learning @Steven.LeBlanc, Product Specialist - Cloud Operations @Patrick.Jamieson3621, Product Manager - Health Informatics Platform@Benjamin.DeBoe, Product Manager @Stefan.Wittmann, Product Manager@Luca.Ravazzolo, Product Manager @Craig.Lee, Product Specialist
So!
Please don't hesitate to ask your questions! Our PM team will be happy to answer you!
➡️ Our Live Q&A Sessions last from November 27 to 29! Schedule in this post. Please join us now!
📍 https://intersystems.6connex.com/event/virtual-summit/en-us/contents/433280/share?rid=FocusSessions&nid=804450 🗓 TODAY at 12:40 PM EDT at https://intersystems.6connex.com/event/virtual-summit/en-us/contents/434195/share?rid=FocusSessions&nid=804450
Please feel free to submit your questions to our PMs team! Don't miss today's Live Q&A Session:
🗓 TODAY at 12:40 PM EDT at https://intersystems.6connex.com/event/virtual-summit/en-us/contents/434370/share?rid=FocusSessions&nid=804450
Don't hesitate to ask your questions!
Article
Yuri Marx · Dec 21, 2020
Today, it is important to analyze the content of portals and websites to stay informed, analyze competitors, and assess trends and the richness and scope of website content. To do this, you can allocate people to read thousands of pages and spend a lot of money, or you can use a crawler to extract website content and run NLP on it. You will get all the necessary insights to analyze and make precise decisions in a few minutes.
Gartner defines web crawler as: "A piece of software (also called a spider) designed to follow hyperlinks to their completion and to return to previously visited Internet addresses".
There are many web crawlers that extract all relevant website content. In this article I present to you Crawler4J. It is one of the most widely used tools for extracting website content and it has an MIT license. Crawler4J needs only the root URL, the depth (how many levels of child pages will be visited), and the total pages (if you want to limit the number of pages extracted). By default only textual content is extracted, but you can configure the engine to extract all website files!
I created a PEX Java service that allows you to use an IRIS production to extract the textual content of any website. The content is stored in a local folder, and IRIS NLP reads these files and shows you all the text analytics insights!
To see it in action follow these procedures:
1 - Go to https://openexchange.intersystems.com/package/website-analyzer and click Download button to see app github repository.
2 - Create a local folder on your machine and execute: git clone https://github.com/yurimarx/website-analyzer.git.
3 - Go to the project directory: cd website-analyzer.
4 - Execute: docker-compose build (wait some minutes)
5 - Execute: docker-compose up -d
6 - Open your local InterSystems IRIS: http://localhost:52773/csp/sys/UtilHome.csp (user _SYSTEM and password SYS)
7 - Open the production and start it: http://localhost:52773/csp/irisapp/EnsPortal.ProductionConfig.zen?PRODUCTION=dc.WebsiteAnalyzer.WebsiteAnalyzerProduction
8 - Now, go to your browser to initiate a crawler: http://localhost:9980?Website=https://www.intersystems.com/ (to analyze intersystems site, any URL can be used)
9 - Wait between 40 and 60 seconds. A message will be returned (extracted with success). See the sample above.
10 - Now go to Text Analytics to analyze the content extracted: http://localhost:52773/csp/IRISAPP/_iKnow.UI.KnowledgePortal.zen?$NAMESPACE=IRISAPP&domain=1
11 - Return to the production and see the Depth and TotalPages parameters; increase the values if you want to extract more content. Change Depth to analyze sub-links and change TotalPages to analyze more pages.
12 - Enjoy! And if you liked it, vote (https://openexchange.intersystems.com/contest/current) for my app: website-analyzer
I will write a part 2 with implementation details, but all source code is available on GitHub. Hi Yuri! Very interesting app! But as I am not a developer, could you please tell us more about the results the analyzer will give to a marketer or a website owner? Which insights could be extracted from the analysis? Hi @Elena.E
I published a new article about marketing and this app: https://community.intersystems.com/post/marketing-analysis-intersystems-website-using-website-analyzer
About the possible results, the app allows you to:
1. Get the most popular words, terms and sentences written on the website, so you discover the business focus, editorial line and marketing topics.
2. Run sentiment analysis on the sentences: does the content have a positive or negative focus?
3. Get a rich word cloud for the whole website. Rich because it is a semantic analysis, with links between words and sentences.
4. Do dominance and frequency analysis, to analyze trends.
5. See connection paths between sentences, to analyze the depth and coverage of editorial topics.
6. Use a search engine for the topics covered: does the website discuss a topic, and how many times?
7. Do product analysis: the app segments product names and links them to all the other analyses, so you can know whether the website talks about your products and services, and how frequently. Hi Yuri!
This is a fantastic app!
And works!
But the way to set up the crawler is not that convenient and not very user-friendly.
You never know if the crawler works and if you placed the URL right.
Is it possible to add a page which will let you place the URL, start/stop crawler and display some progress if any?
Maybe I ask a lot :)
Anyway, this is a really great tool to perform IRIS NLP vs ANY site: