#Vector Search

2 Followers · 131 Posts

Vector search is a method used in information retrieval and machine learning to find similar items based on their mathematical representations as vectors. In this approach, each item is represented as a high-dimensional vector, with each dimension corresponding to a feature or characteristic of the item. Vector search algorithms then compare these vectors to find similar items, such as having similar features or being close together in the vector space. Read more here.

Article Muhammad Waseem · Jul 4, 2023 6m read

As an AI language model, ChatGPT is capable of performing a variety of tasks like language translation, writing songs, answering research questions, and even generating computer code. With its impressive abilities, ChatGPT has quickly become a popular tool for various applications, from chatbots to content creation.
But despite its advanced capabilities, ChatGPT is not able to access your personal data. So in this article, I will demonstrate below steps to build custom ChatGPT AI by using LangChain Framework:

0
1 13185
Article Benjamin De Boe · Feb 13, 2023 4m read

With InterSystems IRIS 2022.2, we introduced Columnar Storage as a new option for persisting your IRIS SQL tables that can boost your analytical queries by an order of magnitude. The capability is marked as experimental in 2022.2 and 2022.3, but will "graduate" to a fully supported production capability in the upcoming 2023.1 release. 

The product documentation and this introductory video, already describe the differences between row storage, still the default on IRIS and used throughout our customer base, and columnar table storage and provide high-level guidance on choosing the appropriate storage layout for your use case. In this article, we'll elaborate on this subject and share some recommendations based on industry-practice modelling principles, internal testing, and feedback from Early Access Program participants. 

2
3 808
Article Benjamin De Boe · Jan 10, 2023 4m read

As you may well remember from Global Summit 2022 or the 2022.2 launch webinar, we're releasing an exciting new capability for including in your analytics solutions on InterSystems IRIS. Columnar Storage introduces an alternative way of storing your SQL table data that offers an order-of-magnitude speedup for analytical queries. First released as an experimental feature in 2022.2, the latest 2022.3 Developer Preview includes a bunch of updates we thought were worth a quick post here.

2
3 844
Article Henry Pereira · Apr 6, 2022 7m read

so... where's my money?

All of us know that money is important. We constantly need to monitor all expenses to avoid looking back to the bank statement and thinking: “So, where’s my money?”

To evade financial stress, we must keep an eye on the inflow and outflow of money into our accounts.It is also important to tack when and how we spend and earn. Manually recording all transactions in order to understand where our money goes requires an effort. It demands consistency, and it is boring. Today there is a bunch of mobile or SaaS options that help you manage your finances.

0
0 2086
Article Sergey Lukyanchikov · Jul 22, 2021 11m read

Fixing the terminology

A robot is not expected to be either huge or humanoid, or even material (in disagreement with Wikipedia, although the latter softens the initial definition in one paragraph and admits virtual form of a robot). A robot is an automate, from an algorithmic viewpoint, an automate for autonomous (algorithmic) execution of concrete tasks. A light detector that triggers street lights at night is a robot. An email software separating e-mails into “external” and “internal” is also a robot. Artificial intelligence (in an applied and narrow sense, Wikipedia interpreting it differently again) is algorithms for extracting dependencies from data. It will not execute any tasks on its own, for that one would need to implement it as concrete analytic processes (input data, plus models, plus output data, plus process control). The analytic process acting as an “artificial intelligence carrier” can be launched by a human or by a robot. It can be stopped by either of the two as well. And managed by any of them too.

0
0 453
Article Renato Banzai · Jul 19, 2020 3m read

This is the third post of a series explaining how to create an end-to-end Machine Learning system.

Training a Machine Learning Model

When you work with machine learning is common to hear this work: training. Do you what training mean in a ML Pipeline? Training could mean all the development process of a machine learning model OR the specific point in all development process that uses training data and results in a machine learning model.

picture Source

So Machine Learning Models are not equal Common Applications?

In the very last point it looks like a normal application.

10
2 448
Article Renato Banzai · Jul 17, 2020 3m read

This is the second post of a series explaining how to create an end-to-end Machine Learning system.

Exploring Data

The InterSystems IRIS already has what we need to explore the data: an SQL Engine! For people who used to explore data in csv or text files this could help to accelerate this step. Basically we explore all the data to understand the intersection (joins) which should help to create a dataset prepared to be used by a machine learning algorithm.

0
1 374
Article Niyaz Khafizov · Oct 8, 2018 16m read

Hi all. We are going to find duplicates in a dataset using Apache Spark Machine Learning algorithms.

Note: I have done the following on Ubuntu 18.04, Python 3.6.5, Zeppelin 0.8.0, Spark 2.1.1

Introduction

In previous articles we have done the following:

In this series of articles, we explore Machine Learning and record linkage.

0
1 779
Article Niyaz Khafizov · Jul 19, 2018 4m read

Hi all. Today we are going to use k-means algorithm on the Iris Dataset.

Note: I have done the following on Ubuntu 18.04, Apache Zeppelin 0.8.0, python 3.6.5.

Introduction

K-Means is one of the simplest unsupervised learning algorithms that solves the clustering problem. It groups all the objects in such a way that objects in the same group (group is a cluster) are more similar (in some sense) to each other than to those in other groups. For example, assume you have an image with a red ball on the green grass. K-Means will split all pixels into two clusters.

0
3 9834
Article David E Nelson · Mar 9, 2017 9m read

Apache Spark has rapidly become one of the most exciting technologies for big data analytics and machine learning. Spark is a general data processing engine created for use in clustered computing environments. Its heart is the Resilient Distributed Dataset (RDD) which represents a distributed, fault tolerant, collection of data that can be operated on in parallel across the nodes of a cluster. Spark is implemented using a combination of Java and Scala and so comes as a library that can run on any JVM.

5
1 2857
Question Benjamin Eriksson · Mar 14, 2016

Hello! 

My group and I are currently doing a research project on natural language processing and iKnow plays a big role in this project.  I am aware that the algorithms iKnow use aren't public, and I respect that.

My question is, are there any public documents/research that explains, at least part of, the algorthims iKnow uses and the motivations for using them?  

Here is a concrete example:  We are using GetSimilar() for many of our results and it works very well.

2
0 471