Question Goran Stefanovic · Dec 12, 2018

GROUP BY causes NativeError: [10053]

A very simple SQL

SELECT field1 F1, count(field2) CntF2
FROM GCT.PA_Data
where field1 is not null
group by field1

gives an error back:

 [SQLCODE: <-461>:<Communication link failure>]

  [%msg: <Connection 'TRAKCARELIVE': Execute: SQLState: (HY000) NativeError: [10053] Message: [Cache ODBC][State : HY000][Native Code 10053] [c:\intersystems\ensemble\bin\cache.exe] Software caused connection abort> ]

It does work in another (test) environment, but not on live. Any idea why?

Comments

Robert Cemper · Dec 12, 2018

Software caused connection abort

As you didn't supply many details the is more than 1 reason

  • you have a network problem connecting to your live server
  • your live server hs not enabled ODBC access
  • the account you use has no access privileges
  • your GCT.PA_Data is so big or untuned that you fall into a connection timeout.

Try to run your query on the live server and see its performance.

Try a simple query like "SELECT top 3 field1,field2 FROM GCT.PA_Data2"  just to see if you have access rights at all.

0
Eduard Lebedyuk  Dec 13, 2018 to Robert Cemper

If that doesn't work too, try

SELECT 1

If it fails - connection/config problems.

If it succeeds it's something else.

0
Lucas Fernandes  Dec 13, 2018 to Robert Cemper

If it is when you add Group By, it can be:

  •  your GCT.PA_Data is so big or untuned that you fail a connection timeout.
0
Goran Stefanovic  Dec 13, 2018 to Robert Cemper

It is giving a result for either top n or without it, but only when "group by" is not there.

As soon as I add that line it spits out the error.

0
Robert Cemper  Dec 13, 2018 to Goran Stefanovic

This indicates that GROUP BY is to slow to answer in time.

If the selectivity of field1 is very low this may take quite a while.
The closer field1 is to unique the longer it takes as it produces a large resultset.

You may allow a longer timeout. But pls. don't ask me how. Some other experts may know. 

If your GCT.PA_Data is a huge thing you may slice it into pieces by year, id,  ... whatever seems useful:

Example:

SELECT field1 F1, count(field2) CntF2
FROM (
     SELECT filed1,field2 from GCT.PA_Data
     where field1 is not null and ID between 1 and 100000
) group by field
1

This is not the final solution but a way to understand the limits of your server.

BTW:
an index on field1 might be useful anyhow
Example

Index ff on field1 [ DATA = filed2 ] ;

with this construct, your query will only access this special index
instead of the full record and NULL fields are all grouped in advance.

It is kind of a "materialized view" that other DBs offer.

0
Martin Fukátko  Dec 14, 2018 to Goran Stefanovic

Groran, to make sure that problem is the connection timeout you can test the SQL in terminal. Terminal connection doesn't have timeout. and you can see how much it costs. Here is example from documentation:

SAMPLES>SET result=$SYSTEM.SQL.Execute("SELECT TOP 5 name,dob,ssn FROM Sample.Person")

SAMPLES>DO result.%Display()
0