What is the best way to get the size and other storage details of individual Globals in a namespace?
Thanks,
Mary
Monitoring is the process of controlling and managing the performance and availability of software applications.
Prometheus is one of the monitoring systems adapted for collecting time series data.
Its installation and initial configuration are relatively easy. The system has a built-in graphic subsystem called PromDash for visualizing data, but developers recommend using a free third-party product called Grafana. Prometheus can monitor a lot of things (hardware, containers, various DBMSs), but in this article, I would like to take a look at the monitoring of a Caché instance (to be exact, it will be an Ensemble instance, but the metrics will be from Caché). If you are interested – read along.
Hello Developers!
Previously, I shared with you all a handy operational analytics dashboard you can build to visualize key message processing metrics, such as number of inbound/outbound messages, average processing times, etc.
This time around, I’d like to walk you through an enhanced log monitor using a workflow many of you are already familiar with – working with alerts as messages inside a production, creating routing rules to filter and route alerts, and using pre-built components like the email adapter to send notifications at a granular level.
Whenever the Windows SNMP Service restarts, the snmpdbg log says the following.
13:08:59 :Attempting initial TCP connection(s) with 1 Cache instances ...
13:08:59 :Get connection with ENSEMBLE on port 1972
13:08:59 :Connection refused on port 1972, check if Cache instance ENSEMBLE is started.
13:08:59 :Cache iscsnmp.dll initialized for 1 configs
Ensemble and all productions are running. I've set up Caché SNMP agent on many other servers in our company and those are working fine. However this one server won't budge.
Does anyone have any idea what the problem may be here?
Regards,
Glenn
Hi, Community!
You know that your productions need to be monitored. But what should you be monitoring, and how?
Let me invite you to join Michael Brady, Technical Trainer with InterSystems Learning Services, to learn about message volume monitoring tools, what really happens when you purge a message and how you can monitor your disk space from afar.
This webinar is valuable for anyone managing Ensemble or HealthShare productions.
It will take place on Thursday, May 4, 2017 10:30 am Eastern Daylight Time (New York, GMT-04:00)
Hi,
I created a task in the Management Portal Task Manager to use the Ens.Util.Tasks.Purge task. The task setup includes email notifications for both the completion email and the error email.
This task is giving an error and no email is generated:
| <CLASS DOES NOT EXIST>zSendMail+22^%SYS.TaskSuper.1 *Security.SSLConfigs |
I tested all other task types available from Ens.Util.task but all are giving the same error.
I'm not sure whether this is a bug or some missing configuration in the task setup. Has anyone noticed a similar issue, or does anyone have an idea how to fix it?
Thank you for your help.
Regards,
Mary
Hi,
As part of our continuous efforts to expand and improve the InterSystems IRIS Data Platform, we’ve set up a brief survey around SQL monitoring. Your feedback will help us in designing and developing the right tools for the job and improve the platform’s overall ease-of-use. Please use the link below to access the survey, which should only take around 5 minutes to complete.
Hello,
I just watched the recording of Michael Brady's presentation on Ensemble Disk Free Space Monitoring. Is the sample code for the Task definition class still available? How can I obtain a copy?
Thanks
Hi Developers!
As you know, application errors live in the ^ERRORS global. They appear there if you call:
d e.Log() in a Catch section of Try-Catch.
With @Robert Cemper's approach, you can now use SQL to examine it.
Inspired by Robert's module I introduced a simple IRIS Analytics module which shows these errors in a dashboard:
Hey Developers,
Check out the latest video on FHIR API Management:
⏯ FHIR API Management: Basic Configuration
⏯ FHIR API Management: FHIR Dev Portal
GA releases are now available for the first version (v1.0) of InterSystems System Alerting and Monitoring (InterSystems SAM for short). InterSystems SAM v1.0 provides a modern monitoring solution for InterSystems IRIS based products. It allows high-level views of clusters and single-node drilled down metrics-visualization together with alerts notifications. This first version provides visualization for more than one hundred InterSystems IRIS kernel metrics, and users can extend the default-supplied Grafana template to their liking. V1.0 is meant to be a simple and intuitive baseline. Help us
Hi All,
With this article, I would like to show you how easily and dynamically System Alerting and Monitoring (or SAM for short) can be configured. The use case could be that of a fast and agile CI/CD provisioning pipeline where you want to run your unit-tests but also stress-tests, and you want to be able to see quickly whether those tests are successful or how they are stressing the systems and your application (the InterSystems IRIS backend SAM API is extendable for your APM implementation).
Hi,
During the implementation of iris-history-monitor using ZPM, I've run into the following scenario:
My Installer.cls has a call to the Custom Sensors class method. The custom information works like a charm, as I described in this article:
IRIS History Monitor using custom built-in REST API /api/monitor/metrics
But, now I'm trying to replicate the same behavior using the module.xml to work with ZPM.
Preview releases are now available for the first version (v1.0) of InterSystems System Alerting and Monitoring (InterSystems SAM for short). InterSystems SAM v1.0 provides a modern monitoring solution for InterSystems IRIS-based products. It allows high-level views of clusters and single-node drilled down metrics-visualization together with alerts notifications. This first version provides visualization for more than one hundred InterSystems IRIS kernel metrics, and users can extend the default-supplied Grafana template to their liking. V1.0 is meant to be a simple and intuitive baseline.
Note (October 2022): yape has been deprecated and replaced by YASPE; there is no more development on yape.
Note (June 2019): A lot has changed; for the latest details go here
Note (Sept 2018): There have been big changes since this post first appeared. I suggest using the Docker container version. The project and details for running as a container are still published in the same place on GitHub, so you can download, run, and modify it if you need to.
One of the topics that comes up often when managing Ensemble productions is disk space:
The database (the CACHE.DAT file) grows at an unexpected rate; or the journal files build up at a fast pace; or the database grows continuously even though the system has a scheduled purge of the Ensemble runtime data.
It would be better if these kinds of phenomena were observed and accounted for at the development and testing stage rather than on a live system.
For this purpose I created a basic framework that could aid in this task.
Hi everyone,
The project IRIS History Monitor received an update, using ZPM and the built-in REST API /api/monitor/metrics.
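For readers who want to inspect what an endpoint like /api/monitor/metrics returns, here is a minimal sketch in Python, assuming the response is plain text in the Prometheus exposition format (one `name{labels} value` sample per line, with `#` comment lines). The function name `parse_metrics` and the sample payload are made up for illustration; they are not taken from the project.

```python
from typing import Dict, Tuple


def parse_metrics(text: str) -> Dict[Tuple[str, str], float]:
    """Parse Prometheus text-format samples into {(name, labels): value}.

    Simplified on purpose: it ignores optional timestamps and does not
    unescape label values.
    """
    samples: Dict[Tuple[str, str], float] = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        if "{" in line:
            # Labeled sample: name{label="x",...} value
            name, rest = line.split("{", 1)
            labels, tail = rest.rsplit("}", 1)
        else:
            # Unlabeled sample: name value
            name, tail = line.split(None, 1)
            labels = ""
        value = tail.split()[0]
        samples[(name.strip(), labels)] = float(value)
    return samples


# Hypothetical payload, shaped like Prometheus text output.
SAMPLE = """\
# illustrative sample, not captured from a real instance
iris_cpu_usage 12
iris_db_size_mb{id="USER",dir="/data/user/"} 11
"""

metrics = parse_metrics(SAMPLE)
for (name, labels), value in metrics.items():
    print(name, labels, value)
```

Keying the result by `(name, labels)` keeps samples that share a metric name but differ by label (for example, one per database) from overwriting each other.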
Hi all.
A long time ago I enabled Activity Monitoring to save myself headaches in the future when looking at the performance of various message routes through our productions. It has served its purpose of answering questions such as how many messages we process a week, but I had not had the chance to really dig down into the stats for specific message types or destinations to pinpoint issues.
That time has come, as I have an outbound that periodically queues up without much rhyme or reason.
Hi Community!
The new video from Global Summit 2019 is already on InterSystems Developers YouTube:
Hi Community,
The new video from Global Summit 2019 is already on InterSystems Developers YouTube:
Hello! This article continues the article "Making Prometheus Monitoring for InterSystems Caché". We will take a look at one way of visualizing the results of the work of the ^mgstat tool. This tool provides the statistics of Caché performance, and specifically the number of calls for globals and routines (local and over ECP), the length of the write daemon’s queue, the number of blocks saved to the disk and read from it, amount of ECP traffic and more. ^mgstat can be launched separately (interactively or by a job), and in parallel with another performance measurement tool, ^pButtons.
Off the back of the Interface Monitoring post I had created a class that queries the Ens.AlertRequest global and returns the entries between 6pm the night before and 6am in the morning.
I tested this build in our T&D environments and the build worked very well.
However, in our production environment the query is being truncated by what I believe to be a timeout, and I get partial query output.
In the System>SQL pages my 12 hour query times out.
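One possible workaround, offered as a sketch rather than a confirmed fix, is to split the 6pm-to-6am window into shorter sub-ranges and run the query once per sub-range. This assumes the timeout applies to each query individually. The helper names `overnight_window` and `split_window` are hypothetical, and the Python below only computes the time ranges; it does not touch Ens.AlertRequest.

```python
from datetime import date, datetime, time, timedelta


def overnight_window(today: date):
    """Return (start, end): 6pm the previous day to 6am on `today`."""
    start = datetime.combine(today - timedelta(days=1), time(18, 0))
    end = datetime.combine(today, time(6, 0))
    return start, end


def split_window(start: datetime, end: datetime, step: timedelta):
    """Split [start, end) into consecutive chunks no longer than `step`."""
    chunks = []
    cur = start
    while cur < end:
        nxt = min(cur + step, end)
        chunks.append((cur, nxt))
        cur = nxt
    return chunks


# Example: six 2-hour ranges covering one night.
start, end = overnight_window(date(2017, 5, 4))
for lo, hi in split_window(start, end, timedelta(hours=2)):
    print(lo.isoformat(), "->", hi.isoformat())
```

Each `(lo, hi)` pair can then be passed as the lower and upper time bounds of a smaller query, and the partial results concatenated.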
APM normally focuses on the activity of the application, but gathering information about system usage gives you important background that helps you understand and manage the performance of your application, so I am including the IRIS History Monitor in this series.
In this article I will briefly describe how you start the IRIS or Caché History Monitor to build a record of the system level activity to go with the application activity and performance information you gather. I will also give examples of SQL to access the information.
This post is dedicated to the task of monitoring a Caché instance using SNMP. Some users of Caché are probably doing it already in some way or another. Monitoring via SNMP has been supported by the standard Caché package for a long time now, but not all the necessary parameters are available “out of the box”. For example, it would be nice to monitor the number of CSP sessions, get detailed information about license usage, track particular KPIs of the system being used, and so on. After reading this article, you will know how to add your parameters to Caché monitoring using SNMP.
Hi all,
I recently discovered the Monitoring Activity Volume feature in IRIS and I was amazed by it. So, I put it to work in one of our productions. It is nice how easy it is to set up, and how many possibilities come with it.
But there's something weird: the numbers. Actually, one of the BPs reports a time of more than 6 seconds to process:
But that is not really possible, as our production runs at a pace of about 40 msg/second, and this BP is the first step. So my question is: how is this avg. duration calculated? What does this time include? Is it in seconds?
Thanks a lot,
Just wanted to share my Zabbix template for monitoring InterSystems IRIS on Linux servers.
It monitors irisusr (configurable) memory consumption:
How to use:
Hello,
I want to create a dashboard with a line graph that shows system availability over time. I used this code to create a Dashboard:
Set tItem = ##class(%DeepSee.UserLibrary.Link).%New()
Set tItem.fullName = "Availability"
Set tPage = "Availability.UI.CSVImport.zen"
Set tItem.href = $system.CSP.GetPortalApp($namespace,tPage)_tPage
Set tItem.title = "Availability"
Set tSC = tItem.%Save()
I have a process in place that stores the data in a SQL table with three properties:
I'm a DBA and support Caché databases on AIX. I coded shell scripts for monitoring journaling status, database sizes, and the license end date.
We recently got a new instance of Caché on Windows. I'm just curious to know whether anyone coded database monitoring scripts on Windows using PowerShell or any other scripting language.
If yes, please share the details.
Thanks & Regards,
Bharath Nunepalli.
Hi all,
I'm looking to set up monitoring for several interfaces. I understand that I can set an Inactivity Timeout. However, obviously there are messages coming through more frequently during certain hours than other hours.
Is there a way to set an Inactivity Timeout for each hour of the day instead of one value that is used all day long?
Best,
Erin