The invention and popularization of Large Language Models (such as OpenAI's GPT-4) have launched a wave of innovative solutions that can leverage large volumes of unstructured data that, until recently, were impractical or even impossible to process manually.

TL;DR

This article introduces the langchain framework, as supported by IRIS, for implementing a Q&A chatbot, with a focus on Retrieval Augmented Generation (RAG). It explores how IRIS Vector Search within langchain-iris handles storage, retrieval, and semantic search of data, enabling precise and up-to-date responses to user queries. Through indexing and retrieval/generation, RAG applications powered by IRIS bring the capabilities of GenAI systems to InterSystems developers.
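
The indexing and retrieval/generation steps can be illustrated with a minimal, library-free Python sketch. This is not the langchain-iris API: the in-memory list stands in for IRIS Vector Search, the bag-of-words `embed` function stands in for a real embedding model, and `build_prompt` only assembles the text that would be handed to an LLM. All function names and sample documents are hypothetical.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: a bag-of-words count vector (stand-in for a real model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def index(documents):
    """Indexing step: keep each document next to its vector."""
    return [(doc, embed(doc)) for doc in documents]


def retrieve(store, question, k=1):
    """Retrieval step: return the k documents most similar to the question."""
    q = embed(question)
    ranked = sorted(store, key=lambda item: cosine(q, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]


def build_prompt(question, context):
    """Generation step (input only): ground the question in the retrieved text."""
    joined = "\n".join(context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {question}"


# Hypothetical sample documents, used only to exercise the functions.
store = index([
    "vector search finds documents by embedding similarity",
    "interns join the team every summer",
])
question = "how does vector search work"
top = retrieve(store, question)
prompt = build_prompt(question, top)
```

In a real application the store, the embedding model, and the final LLM call would each be replaced by the corresponding component; only the order of the steps carries over.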

I'm facing an issue while converting an ORU_R01 HL7 message to XML, specifically with group elements such as <pidgrpgrp>. When I call the getvalueat() method before the conversion, the XML includes <pidgrpgrp> and the other <grp> elements, but when I don't call getvalueat(), the XML is generated without these <grp> elements.

What is Unstructured Data?
Unstructured data refers to information lacking a predefined data model or organization. In contrast to structured data found in databases with clear structures (e.g., tables and fields), unstructured data lacks a fixed schema. This type of data includes text, images, videos, audio files, social media posts, emails, and more.

Article
· Aug 2, 2022 8m read
Data models in InterSystems IRIS

Before we start talking about databases and different data models that exist, first we'd better talk about what a database is and how to use it.

A database is an organized collection of data stored and accessed electronically. It is used to store and retrieve structured, semi-structured, or raw data which is often related to a theme or activity.

At the heart of every database lies at least one model used to describe its data. And depending on the model it is based on, a database may have slightly different characteristics and store different types of data.

To write, retrieve, modify, sort, transform or print the information in a database, software called a Database Management System (DBMS) is used.

The size, capacity, and performance of databases and their respective DBMSs have increased by several orders of magnitude. This has been made possible by technological advances in various areas, such as processors, computer memory, computer storage, and computer networks. In general, the development of database technology can be divided into four generations based on the data model or structure: navigational, relational, object and post-relational.

Article
· Jan 13, 2022 4m read
How to find the dataset you need?

Hey community! How are you doing?

I hope to find everyone well, and a happy 2022 to all of you!

Over the years, I've been working on a lot of different projects, and I've been able to find a lot of interesting data.

But most of the time, the datasets I worked with were customer data. When I started joining contests over the past couple of years, I began to look for specific web datasets.

I've curated some data myself, but I kept wondering, "Is this dataset enough to help others?"

I have multiple files with different numbers of columns. The first 9 values are fixed: I want to ignore the first value and combine the next 8 values into one value using the ^ sign.

Current Format

|||||||||||^^||||||^^|||||||||||||||||
|||||||||||^^||||^^|||||||||||||||||||||||
|||||||||||^^|||^^||||||||

Desired Format

^^^^^^|||^^||||||^^|||||||||||||||||
^^^^^^|||^^||||^^|||||||||||||||||||||||
^^^^^^|||^^|||^^||||||||

I read each line from the file using the code below.
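
The poster's own code is not reproduced in this excerpt. Purely as an illustration of the transformation described in words above (drop the first value, join the next 8 with `^`, keep the rest pipe-delimited), here is a hypothetical Python sketch. The function names and the `skip`/`combine` parameters are assumptions, and the counts may need adjusting to match the sample lines exactly.

```python
def reshape_line(line, skip=1, combine=8, sep="|", joiner="^"):
    """Drop the first `skip` fields, merge the next `combine` fields with
    `joiner`, and leave the remaining fields separated by `sep`."""
    fields = line.rstrip("\r\n").split(sep)
    head = fields[skip:skip + combine]      # values to merge into one field
    rest = fields[skip + combine:]          # values passed through unchanged
    return sep.join([joiner.join(head)] + rest)


def reshape_file(src_path, dst_path):
    """Apply reshape_line to every line of src_path and write to dst_path."""
    with open(src_path, encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            dst.write(reshape_line(line) + "\n")


example = reshape_line("a|b|c|d|e|f|g|h|i|j|k")
```

Because the number of trailing columns is never inspected, the same function works for files with different column counts.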

This is the second post of a series explaining how to create an end-to-end Machine Learning system.

Exploring Data

InterSystems IRIS already has what we need to explore the data: an SQL engine! For people used to exploring data in CSV or text files, this can speed up the step considerably. Basically, we explore all the data to understand the intersections (joins), which should help create a dataset ready to be used by a machine learning algorithm.
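
As a rough sketch of this kind of exploration, the Python snippet below uses the standard-library `sqlite3` module as a stand-in for the IRIS SQL engine. The `patients` and `visits` tables and their contents are hypothetical, invented only to show how checking a join and then flattening it yields one row per entity.

```python
import sqlite3

# In-memory database standing in for an IRIS namespace (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE patients (id INTEGER PRIMARY KEY, age INTEGER);
    CREATE TABLE visits (id INTEGER PRIMARY KEY, patient_id INTEGER, cost REAL);
    INSERT INTO patients VALUES (1, 34), (2, 58), (3, 41);
    INSERT INTO visits VALUES (10, 1, 120.0), (11, 1, 80.0), (12, 2, 200.0);
""")

# Step 1: how many patients actually intersect with the visits table?
matched = conn.execute("""
    SELECT COUNT(DISTINCT p.id)
    FROM patients p JOIN visits v ON v.patient_id = p.id
""").fetchone()[0]

# Step 2: flatten the join into one row per patient, keeping patients
# without visits, so the result can be used as a training dataset.
dataset = conn.execute("""
    SELECT p.id, p.age, COUNT(v.id) AS n_visits,
           COALESCE(SUM(v.cost), 0) AS total_cost
    FROM patients p LEFT JOIN visits v ON v.patient_id = p.id
    GROUP BY p.id, p.age
    ORDER BY p.id
""").fetchall()
```

The inner join shows how much of the data overlaps, while the left join keeps unmatched rows visible instead of silently dropping them.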

A More Industrial-Looking Global Storage Scheme

In the first article in this series, we looked at the entity–attribute–value (EAV) model in relational databases, and took a look at the pros and cons of storing those entities, attributes and values in tables. We learned that, despite the benefits of this approach in terms of flexibility, there are some real disadvantages, in particular a basic mismatch between the logical structure of the data and its physical storage, which causes various difficulties.
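
To make the trade-off concrete, here is a minimal sketch of an EAV table, using Python's standard-library `sqlite3` as a stand-in for a relational database. The table name, entities, and attributes are hypothetical. It shows the flexibility (entities with different attributes share one table) and the mismatch (an entity has to be reassembled from several rows, and every value is stored as text).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# One generic table holds every entity, whatever its attributes are.
conn.execute("CREATE TABLE eav (entity TEXT, attribute TEXT, value TEXT)")
conn.executemany("INSERT INTO eav VALUES (?, ?, ?)", [
    ("book1", "title", "Sample book"), ("book1", "pages", "300"),
    ("lamp1", "title", "Desk lamp"), ("lamp1", "watts", "40"),
])


def load_entity(conn, entity):
    """Rebuild one logical entity from its scattered attribute rows."""
    rows = conn.execute(
        "SELECT attribute, value FROM eav WHERE entity = ?", (entity,)
    )
    return dict(rows.fetchall())


book = load_entity(conn, "book1")
lamp = load_entity(conn, "lamp1")
```

Note that `pages` comes back as the string `"300"`: the physical table no longer knows the logical type of each attribute.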

Introduction

In the first article in this series, we’ll take a look at the entity–attribute–value (EAV) model in relational databases to see how it’s used and what it’s good for. Then we'll compare the EAV model concepts to globals.

Hi Developers,

Please welcome a new video on the InterSystems Developers YouTube Channel:

Multi-Model Development

https://www.youtube.com/embed/fiMPWhnE8hY

Hi, Community!

This week we have two videos.

Please find the second Developer Community Video of the week on the InterSystems Developers YouTube Channel:

Turning Accountants into Explorers

https://www.youtube.com/embed/2wG17vzXTy4

Hi Community!

Check the new video of the week on the InterSystems Developers YouTube Channel:

Adding Alchemy to Unstructured Data

https://www.youtube.com/embed/GAN8l8hQGXc

Article
· May 25, 2017 2m read
The Interns are Coming!

The Data Platforms department here at InterSystems is gearing up for this year's crop of interns, and I for one am very excited to meet them all next week!

We've got folks from top technical colleges with diverse specialties, from hardcore engineers to pure computer scientists to mathematicians to business professionals. They come from countries around the world, such as Vietnam, China, and Finland, and they all have impressive backgrounds. We're sure they will do very well this summer.

This article contains the tutorial document for a Global Summit academy session on Text Categorization and provides a helpful starting point to learn about Text Categorization and how iKnow can help you to implement Text Categorization models. This document was originally prepared by Kerry Kirkham and Max Vershinin and should work based on the sample data provided in the SAMPLES namespace.

A group of students at the Chalmers University of Technology (Gothenburg, Sweden) tried different approaches to automatically rating the quality of emergency calls, including iKnow.

Excerpt: "The most impressive results produced by iKnow is its ability to correctly classify 100% of the calls using the Average algorithm. This is quite surprising since iKnow only compares low-level concepts, how words relates to each other."

Presenter: Danny Wijnschenk
Task: Help people make better decisions by letting the application deal with all the data.
Approach: As an example, we’ll extend a demo asset management application for portfolio and trade compliance, using iKnow technology to translate agreements into rules that ensure portfolio compliance prior to trade execution.

In this session, we’ll discuss how easy it is to extend a classic application that deals with straightforward transactions so that it also offers insights and actions based on more complex, unstructured data. We’ll present a use case on portfolio compliance from the financial services industry.

Content related to this session, including slides, video and additional learning content can be found here.

Presenter: Benjamin De Boe
Task: Extract specialized information from your unstructured data
Approach: Combine InterSystems iKnow technology with third-party and custom text-processing tools

This session explains how you can easily combine ISC, third-party and custom text-processing tools to get the broadest insights into your unstructured data.

Content related to this session, including slides, video and additional learning content can be found here.

Presenter: Dirk Van Hyfte
Task: Leverage unstructured data to improve how clinicians deliver care
Approach: Give real-world examples of organizations that are benefiting from using their unstructured data

This session will feature real-world examples of how healthcare organizations can benefit from exposing unstructured data to clinicians at the point of care as well as to clinical informaticists building predictive models. Presenters are Wesley Williams, PhD, Vice President and Chief Information Officer, Mental Health Center of Denver; Augie Turano, PhD, IT Director, Veterans Informatics and Computer Infrastructure (VINCI); and Dirk Van Hyfte, MD, PhD, Senior Research Consultant.

Content related to this session, including slides, video and additional learning content can be found here.

Presenter: Misha Bouzinier
Task: Gain an understanding of natural language processing and the current state of the art
Approach: Discuss how InterSystems iKnow technology fits into the NLP ecosystem and complements the output of other components such as Lucene and Stanford NLP tools

A 101 session on Natural Language Processing that positions InterSystems tools in the broader ecosystem. Problem: we’ve been touting “unstructured data” for five years, but many people, both internally and externally, still don’t know what it means to “process natural language” in general, or how iKnow and our upcoming UIMA capabilities fit into this NLP ecosystem. This session will describe what a number of common technologies offer and how bare-bones NLP output typically needs to be complemented with more classic analytics or inference tooling to get the value out.

Content related to this session, including slides, video and additional learning content can be found here.

Hi,

I created an iKnow domain, where I supplied dictionaries, a blacklist, metadata and stemming. The data source is a table.

I would like to use the iFind semantic search feature. The documentation says that iFind uses iKnow semantic analysis, but I want iFind to use the iKnow domain configuration I created earlier. How can I do that?

Regards,

Jack Abdo.

Introduction - Analyzing Textual Big Data

Big Data for Enriching Analytical Capabilities - Big data is revolutionizing the world of business intelligence and analytics. Gartner predicts that big data will drive $232 billion in spending through 2016, Wikibon claims that by 2017 big data revenue will have grown to $47.8 billion, and McKinsey Global Institute indicates that big data has the potential to increase the value of the US health care industry by $300 billion and to increase the industry value of Europe's public sector administration by €250 billion.

Article
· Oct 21, 2015 1m read
Use Cases for Unstructured Data

Introduction

Experts estimate that 85% of all data exists in unstructured formats – held in e-mails, documents (contracts, memos, clinical notes, legal briefs), social media feeds, etc. Where structured data typically accounts for quantitative facts, the more interesting and potentially more valuable expert opinions and conclusions are often hidden in these unstructured formats. And with massive volumes of text being generated at unprecedented speed, there’s very little chance this information can be made useful without some process of synthesis or automation.
