Question Gabriel Silva dos Santos · Jan 17

Problems with data replication in Caché 2016.2

Hello everyone,

I’m facing issues with replicating data from my Caché 2016.2 database to a PostgreSQL database. I need to handle around 300 data updates per minute, and whenever certain tables are modified, those changes must be reflected in other databases.

So far, I’ve tried various approaches, including:

  • Setting up an intermediary API,
  • Using Azure Service Bus,
  • Leveraging Caché Jobs.

All of these rely on table triggers as the entry point.

However, each of these solutions has led to performance bottlenecks and system lockups. I’m running out of ideas on how to proceed, as none of the implementations have been stable. Could anyone offer some guidance or share best practices for this type of integration?

Thanks in advance for your help!

Product version: Caché 2016.2
$ZV: Cache for Windows (x86-64) 2016.2.1 (Build 803_1) Mon Feb 27 2017 09:47:46 EST

Comments

Yaron Munz · Jan 17

Hello Gabriel,

It seems that updates to the other database (PostgreSQL) need to be "close to real-time," though a slight delay is acceptable. What matters most to you is ensuring stability and preventing any loss of updates.

I would consider the following:
1. Use the SQL Gateway Connection capability to link the remote tables directly to Caché. The benefit is that all of the logic stays on the Caché side (a remote REST API would also need remote logic to report the status of each operation when a local update fails).
2. Loosely couple the local (Caché) updates from the remote updates:
a. Create a "staging area" (a global or a class/table) that holds every update destined for the remote table. The entries are written by a trigger, so updating the local object/table in Caché is never delayed by the remote database. The staging area deletes an entry only after a successful remote update (failed entries are kept), so you may want a monitor process that alerts when the staging area is not being cleared (e.g. the remote DB is down or there are network issues).
b. Use a separate component to handle the updates. If you have interoperability (Ensemble), this is easier to implement, but it isn't mandatory; a background job (or a scheduled task) can periodically scan the staging area and perform the updates. A rough sketch of both options follows below.
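
A minimal sketch of how 1 and 2 can fit together, assuming a local class Demo.Patient, a staging global ^ReplQueue, and a PostgreSQL table already linked through the SQL Gateway (Link Table Wizard) and projected locally as Gateway.Patients; all of these names are placeholders, not existing code:

    Class Demo.Patient Extends %Persistent
    {
    Property Name As %String;

    /// Row trigger: only record the change in a staging global and return,
    /// so the local filer is never delayed by the remote database.
    Trigger LogChange [ Event = INSERT/UPDATE/DELETE, Foreach = row/object, Time = AFTER ]
    {
        set ^ReplQueue($increment(^ReplQueue)) = $listbuild({%%ID}, {%%OPERATION}, $horolog)
    }
    }

    Class Demo.Replicator Extends %RegisteredObject
    {
    /// Drain the staging global. Run it periodically from the Task Manager,
    /// or keep it running in the background:  job ##class(Demo.Replicator).Drain()
    ClassMethod Drain()
    {
        set seq = ""
        for {
            set seq = $order(^ReplQueue(seq))
            quit:seq=""
            set id = $list(^ReplQueue(seq), 1)
            set op = $list(^ReplQueue(seq), 2)
            if ..SendToPostgres(id, op) {
                kill ^ReplQueue(seq)   // delete the entry only after a successful remote update
            }
            else {
                quit                   // keep it for the next run; alert if ^ReplQueue keeps growing
            }
        }
    }

    /// Push one change through the SQL Gateway. Gateway.Patients is the local
    /// projection of the PostgreSQL table created by the Link Table Wizard.
    /// Only UPDATE and DELETE are shown; an INSERT would be handled the same way.
    ClassMethod SendToPostgres(id As %String, op As %String) As %Boolean
    {
        set stmt = ##class(%SQL.Statement).%New()
        if op = "DELETE" {
            do stmt.%Prepare("DELETE FROM Gateway.Patients WHERE ID = ?")
            set rs = stmt.%Execute(id)
        }
        else {
            set src = ##class(Demo.Patient).%OpenId(id)
            quit:'$isobject(src) 0
            do stmt.%Prepare("UPDATE Gateway.Patients SET Name = ? WHERE ID = ?")
            set rs = stmt.%Execute(src.Name, id)
        }
        quit (rs.%SQLCODE = 0) || (rs.%SQLCODE = 100)   // 0 = ok, 100 = no matching row
    }
    }

The key point is that the trigger never waits for PostgreSQL; only the drain job does, and a failed remote write simply leaves the entry in the staging global.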
 

Gabriel Silva dos Santos · Jan 17 · to Yaron Munz

The option I’m looking into is the first one, where I can directly access PostgreSQL and perform updates almost in real time without causing significant bottlenecks. However, I’m facing technical challenges in implementing this approach. Do you have any tips or guidance on how I can make this work successfully?

Robert Cemper · Jan 17

Almost 6 years back I wrote an article on data synchronization using DSTIME, and a related example in OEX.
It is focused on detecting and optimizing inserts, changes, and deletes of specified tables/classes, and it takes care of processing cycles to avoid duplication, so the output is kept to a minimum.
Transmission speed to PostgreSQL is not part of the example.
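
The mechanism, roughly, for readers who do not follow the link: adding the DSTIME class parameter makes the filer record every insert/update/delete of that class in the ^OBJ.DSTIME global, which a synchronization job can then walk with $ORDER. A hedged sketch; Demo.Customer is an illustrative class name, and the exact meaning of the action codes is covered in the article and the DeepSee documentation:

    /// Illustrative class: the DSTIME parameter asks the filer to record
    /// every insert/update/delete of instances of this class in ^OBJ.DSTIME.
    Class Demo.Customer Extends %Persistent
    {
    Parameter DSTIME = "AUTO";

    Property Name As %String;
    }

    Class Demo.DSTimeSync Extends %RegisteredObject
    {
    /// Walk the recorded changes: ^OBJ.DSTIME(class, version, id) = action
    ClassMethod Scan(class As %String = "Demo.Customer")
    {
        set version = ""
        for {
            set version = $order(^OBJ.DSTIME(class, version))
            quit:version=""
            set id = ""
            for {
                set id = $order(^OBJ.DSTIME(class, version, id), 1, action)
                quit:id=""
                // action distinguishes insert/update/delete (the codes are explained in the article);
                // this is where the change for id would be pushed to PostgreSQL
                write class, " ", id, " -> action ", action, !
            }
        }
    }
    }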
 

Gabriel Silva dos Santos · Jan 17 · to Robert Cemper

Interesting example! I’ll read the article to understand it better… Does this work for Caché 2016.2?

Robert Cemper · Jan 20 · to Gabriel Silva dos Santos

Yes, I'm quite sure!
DSTIME was introduced with, or shortly after, the release of DeepSee more than a decade ago and hasn't changed since.

Gabriel Silva dos Santos · Jan 21 · to Robert Cemper

Robert, thank you so much for your help and guidance. I managed to implement a solution by setting up a .NET service running on the server. It listens on localhost, receives the data that needs to go to the other database, and takes responsibility for forwarding it to the service bus. This resolved the bottleneck and handled the required load.
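
In case it helps someone implementing something similar, the Caché side of that hand-off can be as small as an HTTP POST to the local service with %Net.HttpRequest. A hedged sketch; the class name, host, port, path, and JSON payload are placeholders for whatever the .NET service actually expects:

    Class Demo.Relay Extends %RegisteredObject
    {
    /// Post one change to the local .NET relay service, which forwards it
    /// to the service bus. Host, port, path and payload shape are illustrative.
    ClassMethod Send(id As %String, op As %String) As %Boolean
    {
        set req = ##class(%Net.HttpRequest).%New()
        set req.Server = "127.0.0.1"
        set req.Port = 5000
        set req.ContentType = "application/json"
        do req.EntityBody.Write("{""id"":"""_id_""",""op"":"""_op_"""}")
        set sc = req.Post("/replicate")
        if $$$ISERR(sc) {
            quit 0    // service unreachable: keep the change and retry later
        }
        quit (req.HttpResponse.StatusCode = 200)
    }
    }

The important part is that Caché only talks to localhost and returns immediately; the .NET service owns the slow call to the service bus.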

Best regards, and thank you again!
