#Unstructured Data

0 Followers · 28 Posts

Unstructured data (or unstructured information) is information that either does not have a pre-defined data model or is not organized in a pre-defined manner. Unstructured information is typically text-heavy but may contain data such as dates, numbers, and facts as well.

Question Kanishk Mittal · Jul 28

Schema Design Best Practices for Cross-Departmental Data Lakes in IRIS

We’re building out a data lake in IRIS 2025.1 that aggregates data across multiple business systems and departments. I’m trying to establish best practices for schema design and separation.

Right now, I’m thinking of using a separate schema for each distinct system of record feeding into the data lake - for example, one schema per upstream source system, rather than splitting based on function (e.g. staging, raw, curated). The idea is that this would make it easier to manage source ownership, auditing, and pipeline logic, especially when multiple domains are contributing data.

#InterSystems IRIS #InterSystems IRIS BI (DeepSee) #Access control #Big Data #Databases #Unstructured Data

0 0

Article Maxim Gorshkov · Feb 14, 2024 4m read

Data Tagging in IRIS Using Embedded Python and the OpenAI API

The invention and popularization of Large Language Models (such as OpenAI's GPT-4) has launched a wave of innovative solutions that can leverage large volumes of unstructured data that, until recently, were impractical or even impossible to process manually. Such applications include data retrieval (see Don Woodlock's ML301 course for a great intro to Retrieval Augmented Generation), sentiment analysis, and even fully autonomous AI agents, to name just a few!

#InterSystems IRIS #Artificial Intelligence (AI) #Analytics #API #Best Practices #Embedded Python #Large Language Model (LLM) #ObjectScript #Python #Unstructured Data

19 2

0 754

Article José Pereira · May 14, 2024 11m read

Q&A Chatbot with IRIS and langchain

TL;DR

This article introduces using the langchain framework supported by IRIS for implementing a Q&A chatbot, focusing on Retrieval Augmented Generation (RAG). It explores how IRIS Vector Search within langchain-iris facilitates storage, retrieval, and semantic search of data, enabling precise and up-to-date responses to user queries. Through seamless integration and processes like indexing and retrieval/generation, RAG applications powered by IRIS enable the capabilities of GenAI systems for InterSystems developers.
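The retrieval step of RAG can be illustrated without any IRIS or langchain dependency. The following is a minimal sketch in plain Python, assuming toy three-dimensional embeddings and a hypothetical `retrieve` helper. It is not the langchain-iris API; it only shows the idea of ranking stored text chunks by cosine similarity to a query vector.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve(query_vec, store, k=2):
    """Return the k stored texts most similar to query_vec (hypothetical helper)."""
    ranked = sorted(store, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

# Toy "vector store": (text, embedding) pairs. Real embeddings would come
# from an embedding model and have hundreds of dimensions.
store = [
    ("IRIS supports vector search", [0.9, 0.1, 0.0]),
    ("HL7 messages carry lab results", [0.0, 0.2, 0.9]),
    ("Embeddings enable semantic search", [0.8, 0.3, 0.1]),
]

top = retrieve([1.0, 0.2, 0.0], store)
```

In a full RAG pipeline, the retrieved chunks in `top` would then be passed to the language model as context for the user's question.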

#InterSystems IRIS #Artificial Intelligence (AI) #Generative AI (GenAI) #Python #Tools #Unstructured Data #Vector Search

Open Exchange

0 0

0 415

Question Nimisha Joseph · Feb 29, 2024

Issue Encountered When Calling getValueAt() on ORU_R01 HL7 Message to XML Conversion

I'm facing an issue while converting an ORU_R01 HL7 message to XML, specifically with the <pidgrpgrp>-type elements. When I call getValueAt() before the conversion, the XML includes the <pidgrpgrp> and other <grp> elements, but when I don't call getValueAt(), the XML is generated without these <grp> elements.

I've attempted to debug the issue using zwrite on the HL7 message before and after calling getValueAt(). Before the call, the content appears different; after the call, the content shows buildmap=1, etc. Please see the XML generated in the two cases.

#Ensemble #Debugging #HL7 #Unstructured Data

0 0

0 185

Article Veerarajan Karunanithi · Feb 27, 2024 4m read

Insights from unstructured data using SQL Text Search

What is Unstructured Data?
Unstructured data refers to information lacking a predefined data model or organization. In contrast to structured data found in databases with clear structures (e.g., tables and fields), unstructured data lacks a fixed schema. This type of data includes text, images, videos, audio files, social media posts, emails, and more.

Why Are Insights from Unstructured Data Important?
According to an IDC (International Data Corporation) report, 80% of worldwide data is projected to be unstructured by 2025, posing a significant concern for 95% of businesses (source: Forbes article).

#InterSystems IRIS #Artificial Intelligence (AI) #Databases #Generative AI (GenAI) #iFind #SQL #Tutorial #Unstructured Data

0 0

0 417

Article Iryna Mykhailova · Aug 2, 2022 8m read

Data models in InterSystems IRIS

Before we start talking about databases and different data models that exist, first we'd better talk about what a database is and how to use it.

A database is an organized collection of data stored and accessed electronically. It is used to store and retrieve structured, semi-structured, or raw data which is often related to a theme or activity.

At the heart of every database lies at least one model used to describe its data. And depending on the model it is based on, a database may have slightly different characteristics and store different types of data.

#InterSystems IRIS #Best Practices #Columnar Storage #Databases #Data Model #Document Data Model (NoSQL) #Multi-model #Object Data Model #Relational Tables #Unstructured Data

14 5

3 1770

Article Henrique Dias · Jan 13, 2022 4m read

How to find the dataset you need?

Hey community! How are you doing?

I hope to find everyone well, and a happy 2022 to all of you!

Over the years, I've been working on a lot of different projects, and I've been able to find a lot of interesting data.

But most of the time, the datasets I worked with were customer data. When I started joining the contests over the past couple of years, I began to look for specific web datasets.

I've curated some data myself, but I kept wondering, "Is this dataset enough to help others?"

#InterSystems IRIS #InterSystems IRIS for Health #Contest #Data Import and Export #Data Model #Unstructured Data

Open Exchange

5 4

0 391

Question Ahmad Bukhtiar · Nov 19, 2020

Help with string function reading data from files

I have multiple files with different columns. The first 9 values are fixed: I want to ignore the first value and combine the next 8 values into one value, using the ^ sign as the separator.

Current Format

|||||||||||^^||||||^^|||||||||||||||||
|||||||||||^^||||^^|||||||||||||||||||||||
|||||||||||^^|||^^||||||||

Desired Format

^^^^^^|||^^||||||^^|||||||||||||||||
^^^^^^|||^^||||^^|||||||||||||||||||||||
^^^^^^|||^^|||^^||||||||

I read each line from the file using the code below.

#dim line as %String = tInput.ReadLine(, .status)

"here I want to put some string function to change the format of the data in the line variable"
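One way to express the requested transformation is sketched below in Python (which IRIS can run through Embedded Python). The `reshape_line` function and its parameter names are hypothetical, and the defaults follow the prose description: skip one field, then join the next eight with `^`. The sample rows in the question show a run of six carets, which suggests the real counts may differ from the description, so `skip` and `combine` are left adjustable.

```python
def reshape_line(line, skip=1, combine=8, sep="|", joiner="^"):
    """Drop the first `skip` fields, join the next `combine` fields with
    `joiner`, and keep the remaining fields separated by `sep`."""
    fields = line.split(sep)
    head = fields[skip:skip + combine]   # fields merged into one value
    tail = fields[skip + combine:]       # fields left untouched
    return sep.join([joiner.join(head)] + tail)

# Illustrative input only; the field values are made up.
example = reshape_line("ID|a|b|c|d|e|f|g|h|x^y|z")
```

Fields that already contain `^` (such as `x^y` here) pass through unchanged because the line is split only on `|`.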

#Caché #Ensemble #Code Snippet #Coding Guidelines #Unstructured Data

0 11

0 995

Article Renato Banzai · Jul 17, 2020 3m read

Using Machine Learning to Organize the Community - 2

This is the second post of a series explaining how to create an end-to-end Machine Learning system.

Exploring Data

InterSystems IRIS already has what we need to explore the data: an SQL engine! For people used to exploring data in CSV or text files, this can speed up the step considerably. Basically, we explore all the data to understand how the tables intersect (the joins), which helps us build a dataset ready to be used by a machine learning algorithm.

Posts table (provided by the InterSystems team)

Tags table (provided by the InterSystems team)
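The kind of join-based exploration described above can be sketched as follows. This uses Python's built-in `sqlite3` module as a stand-in for the IRIS SQL engine, and the table names, column names, and rows are all hypothetical; the actual Posts and Tags tables are not shown in this excerpt.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE post_tags (post_id INTEGER, tag TEXT);
    INSERT INTO posts VALUES (1, 'Vector search intro'), (2, 'HL7 debugging');
    INSERT INTO post_tags VALUES (1, 'Python'), (1, 'Vector Search'), (2, 'HL7');
""")

# Join posts to their tags and count tags per post: a simple check of how
# the two tables intersect before assembling a training dataset.
rows = conn.execute("""
    SELECT p.title, COUNT(t.tag) AS tag_count
    FROM posts p
    JOIN post_tags t ON t.post_id = p.id
    GROUP BY p.id, p.title
    ORDER BY p.id
""").fetchall()
conn.close()
```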

#InterSystems IRIS #IntegratedML #Machine Learning (ML) #Python #Unstructured Data #Vector Search

Open Exchange

1 0

1 337

Article Sergey Kamenev · May 28, 2020 7m read

Entity-attribute-value model in relational databases. Should globals be emulated on tables? Part 2.

A More Industrial-Looking Global Storage Scheme

In the first article in this series, we looked at the entity–attribute–value (EAV) model in relational databases, and took a look at the pros and cons of storing those entities, attributes and values in tables. We learned that, despite the benefits of this approach in terms of flexibility, there are some real disadvantages, in particular a basic mismatch between the logical structure of the data and its physical storage, which causes various difficulties.

#InterSystems IRIS #InterSystems IRIS for Health #Databases #Globals #Performance #Relational Tables #SQL #Unstructured Data

Open Exchange

2 0

0 939

Article Sergey Kamenev · May 11, 2020 8m read

Entity-attribute-value model in relational databases. Should globals be emulated on tables? Part 1.

Introduction

In the first article in this series, we’ll take a look at the entity–attribute–value (EAV) model in relational databases to see how it’s used and what it’s good for. Then we'll compare the EAV model concepts to globals.

Sometimes you have objects with an unknown number of fields, or perhaps hierarchically nested fields, for which, as a rule, you need to search.
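To make the idea concrete, here is a minimal sketch of an EAV table, using Python's built-in `sqlite3` module rather than IRIS. The table layout, the entity names, and the helper functions `find_entities` and `load_entity` are assumptions for illustration and are not taken from the article.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# One row per (entity, attribute, value) triple, so each entity can carry
# a different, open-ended set of fields.
conn.execute("CREATE TABLE eav (entity TEXT, attribute TEXT, value TEXT)")
conn.executemany(
    "INSERT INTO eav VALUES (?, ?, ?)",
    [
        ("laptop-1", "brand", "Acme"),
        ("laptop-1", "ram_gb", "16"),
        ("phone-1", "brand", "Acme"),
        ("phone-1", "screen_in", "6.1"),
    ],
)

def find_entities(attribute, value):
    """Return entities whose given attribute equals the given value."""
    cur = conn.execute(
        "SELECT entity FROM eav WHERE attribute = ? AND value = ? ORDER BY entity",
        (attribute, value),
    )
    return [row[0] for row in cur]

def load_entity(entity):
    """Reassemble one entity's sparse attributes into a dict."""
    cur = conn.execute(
        "SELECT attribute, value FROM eav WHERE entity = ?", (entity,)
    )
    return dict(cur.fetchall())

acme = find_entities("brand", "Acme")
phone = load_entity("phone-1")
```

Searching by any attribute is a single query, but reading one logical object back requires collecting several rows, which hints at the gap between logical structure and physical storage that EAV designs tend to run into.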

#InterSystems IRIS #Databases #Globals #Performance #Relational Tables #SQL #Unstructured Data

Open Exchange

3 0

4 4330

Article Alex Litkovets · Apr 10, 2017 5m read

iKnow Review Analyzer (iKRA)

Introduction

We used the InterSystems iKnow technology to create a review assessment system called iKnow Reviews Analyzer (iKRA). Some information about the prototype of the system can be found here. iKRA analyzes users’ text reviews and automatically rates the object being reviewed. This functionality may come in handy on e-commerce sites, forums, or collections of media content: anywhere people discuss products, places, or services.

What does the solution do?

#InterSystems Natural Language Processing (NLP, iKnow) #Databases #InterSystems IRIS BI (DeepSee) #Unstructured Data

4 5

0 2017

Announcement Anastasia Dyubaylo · Sep 18, 2019

New Video: Multi-Model Development

Hi Developers,

Please welcome a new video on the InterSystems Developers YouTube Channel:

Multi-Model Development

#InterSystems IRIS #Data Model #Global Summit 2018 #Multi-model #Unstructured Data #Video

0 1

0 382

Announcement Anastasia Dyubaylo · Sep 7, 2017

The Second Developer Video of the Week: Turning Accountants into Explorers

Hi, Community!

This week we have two videos.

Please find the second Developer Community Video of the week on the InterSystems Developers YouTube Channel:

Turning Accountants into Explorers

#Summit #Analytics #Global Summit 2016 #Unstructured Data #Video

1 0

0 327

Announcement Anastasia Dyubaylo · Jun 22, 2017

Developer Video of the Week: Adding Alchemy to Unstructured Data

Hi Community!

Check the new video of the week on the InterSystems Developers YouTube Channel:

Adding Alchemy to Unstructured Data

#Summit #Global Summit 2016 #UIMA #Unstructured Data #Video

2 0

0 392

Article Michelle Stolwyk · May 25, 2017 2m read

The Interns are Coming!

The Data Platforms department here at InterSystems is gearing up for this year's crop of interns, and I for one am very excited to meet them all next week!

We've got folks from top technical colleges with diverse specialties from hard core engineers to pure computer scientists to mathematicians to business professionals. They come from countries around the world like Vietnam, China, and Finland and they all come with impressive backgrounds. We're sure they will do very well this summer.

#InterSystems Data Platform Blog #Analytics #Caché #Interoperability #SQL #Unstructured Data

6 0

0 591

Article Benjamin De Boe · Nov 3, 2016 16m read

Getting started with Text Categorization

This article contains the tutorial document for a Global Summit academy session on Text Categorization and provides a helpful starting point to learn about Text Categorization and how iKnow can help you to implement Text Categorization models. This document was originally prepared by Kerry Kirkham and Max Vershinin and should work based on the sample data provided in the SAMPLES namespace.

#InterSystems Data Platform Blog #Analytics #Best Practices #Studio #Terminal #InterSystems Natural Language Processing (NLP, iKnow) #Management Portal #Tutorial #Unstructured Data

5 0

1 758

Article Otto Medin · Nov 1, 2016 1m read

Bachelor thesis: Automated quality rating of emergency calls using NLP

A group of students at the Chalmers University of Technology (Gothenburg, Sweden) tried different approaches to automatically rating the quality of emergency calls, including iKnow.

Excerpt: "The most impressive results produced by iKnow is its ability to correctly classify 100% of the calls using the Average algorithm. This is quite surprising since iKnow only compares low-level concepts, how words relates to each other."

Full story: http://publications.lib.chalmers.se/records/fulltext/244534/244534.pdf

#InterSystems Natural Language Processing (NLP, iKnow) #Caché #Unstructured Data

7 1

0 492

Article Daniel Wijnschenk · Apr 7, 2016 1m read

Global Summit 2016 - Turning Accountants into Explorers

Presenter: Danny Wijnschenk
Task: Help people make better decisions by letting the application deal with all the data.
Approach: As an example, we’ll extend a demo asset management application for portfolio and trade compliance, using iKnow technology to translate agreements into rules that ensure portfolio compliance prior to trade execution.

#InterSystems Natural Language Processing (NLP, iKnow) #Unstructured Data

0 1

0 380

Article Benjamin De Boe · Apr 8, 2016 1m read

Global Summit 2016 - Adding Alchemy to Unstructured Data

Presenter: Benjamin De Boe
Task: Extract specialized information from your unstructured data
Approach: Combine InterSystems iKnow technology with third-party and custom text-processing tools

This session explains how you can easily combine ISC, third-party, and custom text-processing tools to get the broadest insights into your unstructured data.

Content related to this session, including slides, video and additional learning content can be found here.

#InterSystems Natural Language Processing (NLP, iKnow) #UIMA #Unstructured Data

0 0

0 391

Article Jacquelyn Gentile · Apr 8, 2016 1m read

Global Summit 2016 - Advancing Health through Unstructured Data

Presenter: Dirk Van Hyfte
Task: Leverage unstructured data to improve how clinicians deliver care
Approach: Give real-world examples of organizations that are benefiting from using their unstructured data

#InterSystems Natural Language Processing (NLP, iKnow) #Unstructured Data

0 0

0 329

Article Misha Bouzinier · Apr 7, 2016 1m read

Global Summit 2016 - How Computers Learn to Read

Presenter: Misha Bouzinier
Task: Gain an understanding of natural language processing and the current state of the art
Approach: Discuss how InterSystems iKnow technology fits into the NLP ecosystem and complements the output of other components such as Lucene and Stanford NLP tools

#InterSystems Natural Language Processing (NLP, iKnow) #Unstructured Data

0 0

0 325

Article Developer Community Admin · Oct 21, 2015 3m read

Analytics of Textual Big Data: Text Exploration of the Big Untapped Data Source

Introduction - Analyzing Textual Big Data

Big Data for Enriching Analytical Capabilities - Big data is revolutionizing the world of business intelligence and analytics. Gartner predicts that big data will drive $232 billion in spending through 2016, Wikibon claims that by 2017 big data revenue will have grown to $47.8 billion, and McKinsey Global Institute indicates that big data has the potential to increase the value of the US health care industry by $300 billion and to increase the industry value of Europe's public sector administration by €250 billion.

#Caché #Unstructured Data #InterSystems Natural Language Processing (NLP, iKnow)

0 2

0 315

Announcement Janine Perkins · Mar 8, 2016

Featured InterSystems Online Courses: Introduction to the Document Data Model and Getting Started with Using the Document Data Model

Find out what sets the InterSystems Document Data Model apart in the industry.

#Learning Portal #Caché #Unstructured Data

2 0

0 314

Question Jack Abdo · Feb 2, 2016

IFind french and stemming

Hi,

I created with Studio a persistent class with the following field and index:

Property DescriptionDemande As %String(MAXLEN = "");
Index IDXBASDescriptionDemande On (DescriptionDemande) As %iFind.Index.Basic(INDEXOPTION = 1, LANGUAGE = "fr", LOWER = 1);

INDEXOPTION is set to 1 to activate stemming. I'm indexing French documents. I have set LOWER to 1 because I want case-insensitive search.

I inserted a single French word, "élément", into the DescriptionDemande field for testing purposes using this query: insert into my_table(DescriptionDemande) values(' élément')

#Caché #Beginner #Development Environment #Studio #Unstructured Data #iFind

1 2

1 407

Question Jack Abdo · Jan 15, 2016

Using iKnow domain configuration from iFind

Hi,

I created an iKnow domain, where I supplied dictionaries, blacklist, metadata and stemming. The datasource is a table.

I would like to use the iFind semantic search feature. The documentation says that iFind uses iKnow semantic analysis, but I want iFind to use the iKnow domain configuration I created earlier. How can I do that?

Regards,

Jack Abdo.

#InterSystems Natural Language Processing (NLP, iKnow) #Unstructured Data #iFind

0 7

0 434

Question Scott Beeson · Jan 21, 2016

[SOLVED (KIND OF)] Missing something obvious trying to do a lookup in a method

So calling this lookup manually from the console works as expected:

PHR>set key = "WMMC_IMM"
PHR>w ##class(Ens.Util.FunctionSet).Lookup("BlockFeed",key)
1

However, calling it from a method with some concatenation to build the key is giving me problems:

ClassMethod canSendToState(iParticipant As %String, iFeed As %String) As %Boolean
{
set k = iParticipant _ "_" _ iFeed
w "Looking up " _ k,!
set x = ..Lookup("BlockFeed",k,"not found")
w "x = " _ x,!
}

PHR>w ##class("Custom.MHC.Common.Functions").canSendToState("WMMC","IMM")
Looking up WMMC_IMM
x = not found

#Ensemble #Beginner #Code Snippet #Unstructured Data

0 8

0 430

Article Developer Community Admin · Oct 21, 2015 1m read

Use Cases for Unstructured Data

Introduction

Experts estimate that 85% of all data exists in unstructured formats – held in e-mails, documents (contracts, memos, clinical notes, legal briefs), social media feeds, etc. Where structured data typically accounts for quantitative facts, the more interesting and potentially more valuable expert opinions and conclusions are often hidden in these unstructured formats. And with massive volumes of text being generated at unprecedented speed, there’s very little chance this information can be made useful without some process of synthesis or automation.

#Caché #Unstructured Data #InterSystems Natural Language Processing (NLP, iKnow)

1 0

0 290
