Is there a function similar to Oracle's CDC in the cache database?

Question

Question

xiao jikang · Sep 17, 2018

After enabling archiving log functions, Oracle can get real-time change data through CDC.

Can Cache database realize such functions?

My requirement is to take incremental data from the cache database regularly every day and synchronize it to other relational databases.

Incremental data can be text files.

Discussion (16)1

Log in or sign up to continue

Pete Greskoff · Sep 18, 2018

From what I've gathered from Oracle, CDC is just a way to replicate data onto another system. If that's what you're looking for, Caché mirroring is the answer you're looking for. Take a look at:

https://docs.intersystems.com/latest/csp/docbook/DocBook.UI.Page.cls?KEY...

That's the general mirroring information for how it works. Contained in that chapter is information about reporting asyncs, which is what it sounds like Oracle's CDC is for:

https://docs.intersystems.com/latest/csp/docbook/DocBook.UI.Page.cls?KEY...

Though, it looks like Oracle replicates data to a different database on the same system, which mirroring will not allow. If you want to do replication to a different database on the same instance (though I can't think of a use case for that), you could use shadowing instead:

https://docs.intersystems.com/latest/csp/docbook/DocBook.UI.Page.cls?KEY...

1 0

Robert Cemper · Sep 20, 2018

with ^OBJ.DSTIME you get the registered changes.
It is up to you to process it and to generate whatever format or style you prefer when and how often you like

So it's nothing ready and YOU have to write some code and manage it.

DeepSee does nothing for you.

0 0

score 3 · Answer 1 · 2018-09-18T05:55:35-04:00

Hi Xiao.

Many people here know Caché. Not so many know Oracle.

If you would explain what is Oracle CDC. What it does, and what features do you need, then we would be able to answer how you can do this in Caché.

Thank you,
Alexander.

score 0 · Answer 2 · 2018-09-19T23:23:28-04:00

Sorry, I didn't expect that.

My requirement is to take incremental data from the cache database regularly every day and synchronize it to other relational databases.

Incremental data can be text files.

score 0 · Answer 3 · 2018-12-12T05:45:08-05:00

I will add it. For example, the amount of data in my cache database has reached 10Tb, I have imported this 10Tb data into hadoop hdfs, and then analyze it through hbase. However, business data continues to be generated. The cache does not support direct docking of hdfs to store data. Therefore, I need the same cdc record table as oracle to store all dml and ddl operations. I only need to extract the data from the cdc record table. Do the corresponding operation in hdfs. The journal records the information of the global. If it is out of the cache, the global information is useless. So many times we need data at the table level.

score 0 · Answer 4 · 2018-09-20T04:56:35-04:00

I would like to generate text files from the daily increments of the cache database to facilitate synchronization with other relational databases, such as oracle, db2, mysql...

Do not consider real-time, synchronize once a day.

Because the amount of data is too large, we can not use the full amount of synchronization.

score 0 · Answer 5 · 2018-09-20T04:58:21-04:00

xiao jikang · Sep 20, 2018

You say shadow and mirror are operations between cache and cache.

0 0

score 0 · Answer 6 · 2018-09-20T05:00:49-04:00

I would like to generate text files from the daily increments of the cache database to facilitate synchronization with other relational databases, such as oracle, db2, mysql...

Do not consider real-time, synchronize once a day.

Because the amount of data is too large, we can not use the full amount of synchronization.

Can DeepSee solve my problem?

score 0 · Answer 7 · 2018-09-20T12:21:00-04:00

Journal is definitely a valuable source.
BUT: It documents Global changes. It doesn't know about objects nor rows in tables.

score 0 · Answer 8 · 2018-09-20T20:25:05-04:00

I would recommend the DSTIME approach as well. We use this for syncing up information from our cache ERP to our BI tool. DeepSea is actually using DSTIME to show changes as well.

With using DSTIME you are allowed to write code and determine what DSTime batches you want to synchronize with, allowing you to fully control how often you want to sync up data.

score 0 · Answer 9 · 2018-09-20T22:04:33-04:00

xiao jikang · Sep 20, 2018

How to get the complete journal?

0 0

score 0 · Answer 10 · 2018-09-21T04:37:06-04:00

Robert Cemper · Sep 21, 2018

setting your terminal width to 132 will show the rest

0 0

score 1 · Answer 11 · 2018-09-20T12:32:15-04:00

Even SQL operations are captured in the journals as far as my understanding goes. But I'd agree any solution would have to check on any scenarios such as data not being journaled since processes can disable journaling "at will". Journaling can also be disabled at the db level.

Going under the assumption that "everything" is captured in the journals it should be possible to write a process that reads the journal file and convert its entries to another format.

score 0 · Answer 12 · 2018-09-20T05:08:20-04:00

Robert Cemper · Sep 20, 2018

right.

0 0

score 2 · Answer 13 · 2018-09-20T03:07:09-04:00

According to your explanation - the feature you are looking for is available in Caché

For the use with DeepSee there was a class parameter introduced to allow synchronization of tables.

DSTIME = 1 creates a log in ^OBJ.DSTIME to trace INSERT, UPDATE and DELETE of rows / objects.
see the general description and details on ^OBJ.DSTIME

score 1 · Answer 14 · 2018-09-20T09:53:54-04:00

For Cache/IRIS, the JOURNAL files would be the closest option to CDC. There isn't any "generic" product out there that can read and publish the content on journal files for things like streaming or other uses. You can certainly write something to read and publish/stream the changes but it'll be a customized option.

Cache/IRIS can do it internally for Cache-to-Cache sync/mirroring so it is possible to follow the same logic for other uses. Maybe check with Intersytems if they have had request lake this before. Either they already have a utility or can help you build one.