Question Adel Elsayed · Jul 21, 2021

What is the best way to export a global or a table to a file?

in order to do analysis on huge data volumes, it is better to take you data to a separate machine for analysis away from the operational machine, so trying to write huge globals or tables into files as is would take a huge space, what could be a solution or best practice ?..in python for instance, there is the pickling option (serializes data to byte string and saves it to a file) to save space, what could be best in object script ?

Comments

Karunanithi Veerarajan · Jul 21, 2021

from my point of view, writing and reading from a flat-file is a good one. but instead of that if you take backup whole database and use it wherever you want.  because it would be a generic one. do you face any problems with the backup?

0
Robert Cemper · Jul 21, 2021
  1. Whatever you do it affects the operational machine as you have to read through the whole Table/Global
  2. For Table it can be a simple SELECT * FROM TABLE  with the option to select columns of interest
  3. For Global  use $system.OBJ.Export(<globalname>.GBL,<filename>)  as XML.

 Of course, if you have a Shadow instance of your operational machine you can do it there and avoid the load on the primary machine.
Eventually, you can run there all analyses without export at all.

0
Anna Golitsyna  Jul 26, 2021 to Robert Cemper

Can one export a specific node as XML via $system.OBJ.Export or otherwise?

0
Bill Sherman  Jul 26, 2021 to Anna Golitsyna

Anna - yes, when you build your array of items to export, you can include expressions like:

set items("myGlobal(""x"").GBL") = ""

set status = $system.OBJ.Export(.items, myFile, myOptions)

... and this will export ^myGlobal("x").

0
Anna Golitsyna  Jul 26, 2021 to Bill Sherman

Very useful, just tested. Thanks!

0
Anna Golitsyna  Sep 21, 2021 to Bill Sherman

One more question if I may: I try to exclude global subnodes but so far unsuccessfully. Is it possible?
This correctly exports the corresponding node only: items("SOMEGLOBAL(""ABC"",""XYZ"",""USERFORM_ATEST"").GBL") = ""
Now if I try to exclude this node from the SOMEGLOBAL like this, items("SOMEGLOBAL.GBL",'"SOMEGLOBAL(""ABC"",""XYZ"",""USERFORM_ATEST"").GBL") = "", it does not exclude it. There is a barely visible single negating quote in the beginning of the second item.

0
Bill Sherman  Sep 21, 2021 to Anna Golitsyna

I don't think this syntax works. I think the exclusion is for matching against wild cards, for example if you wanted to export all the globals called some*.gbl but exclude one or more of them. I don't think it deals with subscripts.

You may need to merge your global into a temporary global, kill the subnodes you want to exclude, export, and then merge the temporary global back to restore them. (or something similar)

I understand that this is probably not an option if the data is huge. Maybe InterSystems support (WRC) can help.

0
Erik Hemdal · Jul 21, 2021

You can also use a reporting async or perhaps an ECP application server, depending on what analysis you are doing. 

0
Marcio Jose Pedroso Dias · Jul 21, 2021

Hi.

As the main objective is not to overload the server any of the solutions presented will overload the production server. Whether performing a backup, exporting data to file, etc. ​Especially if it is a repetitive, constant task with a large volume of data.

If your product license allows it, it is recommended to use a shadow server. Adopting a shadow server will redirect the processing effort from any adopted solution to another machine.
https://docs.intersystems.com/latest/csp/docbook/Doc.View.cls?KEY=GCDI_… 

0
Evgeny Shvarov · Jul 21, 2021

If you plan to have IRIS analytics server one of the options is to setup the second server as disaster recovery mirror member - it's an async mirror member.

This will let you to have a server that is almost always automatically up-to-date  and the server is ready for analytics queries load

0
David Hockenbroch · Jul 21, 2021

If your other system is also a Cache or IRIS server, there is a mirroring options called a reporting async that exists for this specific purpose. If you've got multiple servers and need to consolidate that data for reporting purposes, a reporting async can be a part of up to 10 mirrors to help you bring that data together, too.

0
Scott Roth · Jul 17

Is there a way to make the file, pipe delimited vs comma delimited?

0