Benjamin De Boe · Nov 5, 2018 go to post

The SwaggerDoc for the iKnow REST APIs is also available online. Attribute information is relatively fine-grained and associated with entity occurrences aka sentence parts, which you can retrieve per sentence from the /domain/{domain}/sentences/{id}/details endpoint. The /domain/{domain}/entities series of endpoints are for unique entities, with which no occurrence-level information such as attributes is associated. Of course, you can also let attributes get rendered with highlighting rules, but I believe you want the attribute information in a more raw format, right? As an alternative to the REST API, you can get it from the SentenceAttribute and SourceAttribute views for your domain, as configured through the objectsPackage attribute in your domain definition (which you can of course wrap in a simple REST API of your own).

You might already have seen this article which, despite dating back two years, provides a full overview of how to feed and use sentiment attributes.

Benjamin De Boe · Dec 3, 2018 go to post

Hi Sean,

IRIS uses different port numbers than Caché and Ensemble so port clashes are not an issue, but there are a few components that are typically shared across instances (e.g. ISCAgent) where consecutive installations of IRIS and Caché might cause trouble. We're documenting these and also other compatibility items of note (such as accessing one platform with the other's xDBC driver) in a guide that will be published shortly.

The general recommendation remains to stick to instances of the same platform (so either all IRIS or all Caché) on a single server. Note that the use of VMs or Containers of course ensures a proper separation of libraries and enables you to run all your favourite cluster setups from the same physical server.

Benjamin De Boe · Mar 12, 2019 go to post

plain variables don't really have a type, so it's worth knowing what you want to use your date for (storing as %Date, %PosixTime, %Timestamp,...). If you're starting from a string and want to convert to IRIS' internal date format, use the $zdateh() function and use 4 as the second argument for European date format. See the docs for that function for more info.

Benjamin De Boe · Mar 21, 2019 go to post

There is a simple regression calculator that is used internally for similar trend line work, iirc. The class reference is not spectacularly elaborate, but it's fairly straightforward to use. First you use the add() function to load up points and then the result() function will calculate a simple trend line and populate Slope and Intercept properties:

USER>s stat = ##class(%DeepSee.extensions.utils.SimpleRegression).%New()

USER>w stat.add(0,1)
1
USER>w stat.add(1,2)
1
USER>w stat.result(.b,.y0,.r)
1
USER>zw b,y0,r
b=1
y0=1
r=1
USER>w stat.Slope
1

you can keep adding data and re-calculate:

USER>w stat.add(1,1)
1
USER>w stat.result(.b,.y0,.r)
1
USER>zw b,y0,r
b=.5
y0=1
r=.5
Benjamin De Boe · Apr 15, 2019 go to post

No, iKnow doesn't extract text from RTF prior to its NLP task. Besides the LibreOffice suggestion, I've also heard people who've worked with simple Java RTF extractors (part of regular JDK) and Tika in the past. 

Benjamin De Boe · Jun 17, 2019 go to post

Hi Guillaume,

iFind indices, like bitmap indices before, require a bitmap-friendly ID key (positive integer). When you make a table the child in a parent-child relationship, the underlying storage structure will use a composite key that therefore no longer satisfies the bitmap friendliness. We do plan to lift this limitation in a future release, as it's already the case for bitmap indices, but for now you'll have to review your schema and see if a one-to-many or (preferred) foreign key would work for you.

Thanks,
benjamin

Benjamin De Boe · Sep 6, 2019 go to post

InterSystems IRIS (and Caché before that) will indeed make this decision for you. The SQL optimizer will analyze all the conditions in your query and select the best query plan based on the available table statistics, which includes column selectivity. See also this article on collecting those stats with the TuneTable command.

As a matter of fact, our development team is making some exciting enhancements to the cost functions used to turn those table statistics into the actual cost estimates for the possible query plans. More about that at our upcoming Global Summit!

Benjamin De Boe · Jan 2, 2020 go to post

In addition to the suggestions made earlier (to provide more context such as full query & plan), you may also want to consider POSIXTIME as the data type for your date / time columns. It encodes dates as a simple 64-bit integer, which makes for efficient storage, indices and queries.

Benjamin De Boe · Feb 2, 2020 go to post

Hi Conor,

We added a "cancel" button to the SMP in IRIS 2019.3 and are looking at an equally convenient mechanism from the shell. 

Benjamin De Boe · Feb 6, 2020 go to post

I didn't mind being lured into reading this excellent article, but the concept of "DataOps" is entirely different from what's being discussed here :-)

You could also add the failed Ariana V  501 launch as an example of a case where conversion between units / data types caused somewhat costly trouble

Benjamin De Boe · Feb 20, 2020 go to post

Hi Darko,

LONGVARCHAR is actually storing the text as a stream, so to make LIKE work, you'll need to convert it to a string, e.g. using SUBSTRING(), in the expression you're serving to the LIKE operator. The following works fine for me:

SELECT SUBSTRING(text, 1, 999999) FROM table WHERE SUBSTRING(text, 1, 9999999) LIKE '%abc%'

This looks a little clumsy, but streams are really meant for huge data that wouldn't fit in a VARCHAR. Note that you can get a massive amount of text in VARCHAR (%String) columns these days, so maybe that's worth considering.

Separately, when working with iFind, that'll provide faster search results and it also transparently takes care of the stream aspect.

Benjamin De Boe · Mar 9, 2020 go to post

Indeed, that's the recommended SQL way of achieving what Eduard described about PPGs. Drawbacks are that queries on these tables cannot be parallelized (as that implies multiple processes, of course).

Our TSQL support is meant for Sybase customers wishing to redeploy their TSQL applications on IRIS (especially now that SAP/Sybase is terminating support for those platforms). Just temporary table support by itself wouldn't be a reason to start building TSQL applications and abandon IRIS SQL/ObjectScript, of course :-). However, for a recent TSQL migration we did some work on our TSQL temp table support and were considering to roll that out to regular IRIS SQL, so this thread is a good place to share your experiences and requirements so we can make sure to do that properly, as needed. yes

Benjamin De Boe · Mar 11, 2020 go to post

FWIW, you can just put as many conditions as you want in the same SQL query. InterSystems IRIS includes an excellent SQL optimizer that will consider all the conditions (and other clauses) in your query and decide on the most efficient path to access the data.

Benjamin De Boe · Mar 13, 2020 go to post

and if you prefer something more database-independent, you can use the standard INFORMATION_SCHEMA package:

SELECT TABLE_SCHEMA, TABLE_NAME, COLUMN_NAME FROM INFORMATION_SCHEMA.COLUMNS WHERE COLUMN_NAME LIKE 'PATID%'
Benjamin De Boe · Mar 15, 2020 go to post

indeed, system schemas (whose name starts with a % sign) require certain privileges (that you quickly get used to once you have them :-) ). That's why I suggested querying the INFORMATION_SCHEMA

Benjamin De Boe · Mar 15, 2020 go to post

In addition to Eduard's response: sharding is a feature of our SQL / Objects layer. Raw globals are at a level below what we'd need for hooking in the workload distribution you're referring to.

Benjamin De Boe · Apr 15, 2020 go to post

coming late to the party, but just as a small and likely irrelevant note: if you use a PPG for storage and at some point (because the table appears big) the query optimizer thinks it's worth using parallel query execution, it won't see a thing in those worker processes. You should be able to avoid that using the %NOPARALLEL keyword though.

Benjamin De Boe · Apr 22, 2020 go to post

Hi Hansel,

at Global Summit 2019, we already showed some early prototypes of our forthcoming System Alerting and Monitoring tool that also leverages Prometheus and Grafana to monitor your IRIS deployments (recording is here). It's still a few weeks away, but this Tech Talk will discuss this subject in more detail.

Thanks,
benjamin

Benjamin De Boe · Apr 27, 2020 go to post

If you need a programmatic entry point for dealing with complex filter conditions, take a look at %SQL.AbstractFind. It allows you to invoke ObjectScript code to populate bitmaps based on (any number of) user-provided arguments through a %FIND predicate. It is built on by iFind and this older (but worthwhile) community article on custom spatial indices, although in both cases it works hand-in-hand with a custom index (using %FunctionalIndex). That isn't required, but usually the flip side of the same use case coin.

Benjamin De Boe · Apr 27, 2020 go to post

I don't want to discourage creativity, but this approach feels very risky to me. When you issue any SQL against this table before setting the % variable, it's going to cause ugly errors or unpredictable behaviour. I also wouldn't bet my money on this working in all possible parallel query execution scenarios (likely some, likely not all).

FWIW, within InterSystems development, we typically call % variables that survive between method calls a leak rather than single-rivet-keeping-your-skyscraper-together :-). Again, apologies for putting it a little strong here, but I think most use cases asking for flexibility can be addressed with more robust solutions, such as temporary tables, class inheritance (with the NoExtent keyword), etc. 

Benjamin De Boe · May 11, 2020 go to post

Depends on how you're set up. There is a setting to allow specifying those values that is meant for this type of bulk load scenario.

@Daniel Buxton : the multi-row insert syntax you're asking about is currently not supported by IRIS SQL (as you figured by now :-)). Robert's approach using UNION all is a creative workaround, but depending on where your big INSERT is coming from you might also just be able to generate / rewrite it into separate INSERTs? Once you have that in place, you can use the SQL Shell's RUN command or the ObjectScript $SYSTEM.SQL.DDLImport() method to execute your script file.

Benjamin De Boe · May 20, 2020 go to post

For more about where the source code for embedded SQL has gone in 2020.1, please see this article on the Universal Query Cache

Indeed funny the doc doesn't call out table names while it does talk about field names, probably because it silently assumes the one implies the other. I'll ask our doc writer to fix that. Thanks for the notification.

Benjamin De Boe · May 24, 2020 go to post

Access Denied errors can stem from a variety of reasons. As a sanity check, running the windows ODBC connection manager's test function never hurts to rule out connectivity issues. By any means, you can consult with the WRC on support issues like this.

Benjamin De Boe · Jun 8, 2020 go to post

You can force embedded SQL to be compiled along with the embedding application code using a compile flag /compileembedded=1. This is not recommended though, so we'd really like to know more about why the current model doesn't suit you. In fact this first-run compiling would allow your customers to leverage the statistics for their data rather than what you baked into it when compiling the embedded SQL. See also this article.

Benjamin De Boe · Jun 8, 2020 go to post

The universal query cache has very little if anything to do with sharding (see here)

I also wouldn't recommend using custom class queries for anything for anything you can achieve with SQL. It leaves you with much more flexibility and transparent benefits (presumed ;-) ) as you update your table statistics or upgrade to a new version.