I think Vic's links are a good place to start. Every application is such a snowflake that it is very hard to make blanket recommendations. The community link will give you a good idea of how system resources are used and what to monitor, so you know whether your app is near its limits or has resources to spare that could be rightsized. The documentation links are good for general Caché / IRIS guidelines as well.

Hi, yes, you can import using the System Management Portal: System > Classes, then import into %SYS.

Here is the version information before the import:

%SYS>write $$version^SystemPerformance()
14

After the import, you can see that the version information has changed. Note also that a conversion was run. The custom profile I had created before the import still existed after the update.

%SYS>write $$version^SystemPerformance()
$Id: //iris/2020.1.0/databases/sys/rtn/diagnostic/systemperformance.mac#1 $
%SYS>d ^SystemPerformance
Re-creating command data for new ^SystemPerformance version.
Old command data saved in ^IRIS.SystemPerformance("oldcmds").
Current log directory: /path/path/iris/mgr/
Available profiles:
     1  12hours     - 12 hour run sampling every 10 seconds
     2  24hours     - 24 hour run sampling every 10 seconds
     3  30mins      - 30 minute run sampling every 1 second
     4  4hours      - 4 hour run sampling every 5 seconds
     5  5_mins_1_sec- 5 mins 1 sec
     6  8hours      - 8 hour run sampling every 10 seconds
     7  test        - A 5 minute TEST run sampling every 30 seconds

Select profile number to run: 5

Collection of this sample data will be available in 420 seconds.
The runid for this data is 20200518_094753_5_mins_1_sec.

%SYS>

You can also import from the command line:

USER>zn "%SYS"

%SYS>do $system.OBJ.Load("/path/SystemPerformance-IRIS-All-2020.1.0-All.xml","ck")

Load started on 05/18/2020 10:02:13
Loading file /path/SystemPerformance-IRIS-All-2020.1.0-All.xml as xml
Imported object code: SystemPerformance
Load finished successfully.

%SYS>

If you mean, are there metrics for Ensemble, HealthShare, etc., then no, not at the moment. However, this is on the roadmap.

You can add custom metrics, though; see the section "Create Application Metrics" in the IRIS documentation.

This will be very powerful when you start to combine telemetry from all the services that make up an application; from the OS, IRIS, and the application.

Hi Ron, I should have been clearer. The metrics are in a format to be consumed by Prometheus (or SAM). Once in Prometheus they go into a database that Grafana connects to as a Prometheus datasource. You want to do it this way to get the full functionality of Prometheus queries plus Grafana visualisation. We did try using a connector directly to IRIS (SimpleJSON), but that really limits the functionality. I will be publishing some example Grafana templates specific to IRIS soon. In the meantime, Mikhail's post has an example of connecting to Grafana near the end.
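To give a rough idea of what "a format to be consumed by Prometheus" looks like, here is a minimal Python sketch that parses a few lines of Prometheus-style text into a dictionary. The sample text and the metric names in it are made up for illustration and are not taken from a real IRIS instance. The parser is deliberately naive: it ignores timestamps and other details that Prometheus itself handles.

```python
def parse_metrics(text):
    """Parse simple Prometheus text-format lines into {name_with_labels: value}."""
    metrics = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # The value is whatever follows the last space on the line.
        name, _, value = line.rpartition(" ")
        metrics[name] = float(value)
    return metrics


# Hypothetical sample, for illustration only.
SAMPLE = """\
# HELP example_cpu_usage Hypothetical CPU usage percentage
# TYPE example_cpu_usage gauge
example_cpu_usage 12
example_db_size_mb{id="USER"} 11
"""

if __name__ == "__main__":
    for name, value in parse_metrics(SAMPLE).items():
        print(name, value)
```

In practice you would not parse this yourself; Prometheus scrapes it and Grafana queries Prometheus. The sketch is only meant to show how simple the underlying text format is.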

Hi Ashley, I don't do much with Windows, but a colleague offered the following as 'quick and dirty' examples. Perhaps, as this gets bumped to the front page of the community because of the answer, someone else can contribute a more functional example.

For production use you will need to substitute your own paths, add logging, and perhaps enhance the error checking. So, with all the usual caveats about testing before use in production and so on:

Freeze Script

D:
CD D:\InterSystems\T2017\mgr
..\bin\cache -s. -B -V -U%%SYS ##Class(Backup.General).ExternalFreeze() 

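rem "if errorlevel N" is true when the exit code is N or higher, so test the highest value first
rem ExternalFreeze exits with 5 on success and 3 on failure
if errorlevel 5 goto NEXT
if errorlevel 3 goto FAIL
goto END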

:NEXT
CD D:\InterSystems\HS20152\mgr
..\bin\cache -s. -B -V -U%%SYS ##Class(Backup.General).ExternalFreeze() 

rem test the highest exit code first: 5 is success, 3 is failure
if errorlevel 5 goto OK
if errorlevel 3 goto FAIL
goto END

:OK
Echo SYSTEM IS FROZEN
exit 0

:FAIL
echo ERROR
exit 1

:END
exit 1

Thaw Script

D:
CD D:\InterSystems\HS20152\mgr
..\bin\cache -s. -B -V -U%%SYS ##Class(Backup.General).ExternalThaw()

CD D:\InterSystems\T2017\mgr
..\bin\cache -s. -B -V -U%%SYS ##Class(Backup.General).ExternalThaw()

exit 0
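As one possible way to "enhance the error checking", here is a hedged Python sketch of the same freeze logic. It freezes each instance in turn and, if any freeze fails, thaws the instances that were already frozen so nothing is left suspended. The function names are my own, and the sketch assumes that an exit code of 5 means success (anything else is treated as failure). Note that the doubled %% in the batch files is only batch-file escaping, so the argument here is plain -U%SYS.

```python
import os
import subprocess

FREEZE = "##Class(Backup.General).ExternalFreeze()"
THAW = "##Class(Backup.General).ExternalThaw()"
SUCCESS = 5  # assumption: exit code 5 means success, anything else is a failure


def run_cache(mgr_dir, method):
    """Run one Backup.General method for the instance that owns mgr_dir."""
    exe = os.path.join(mgr_dir, "..", "bin", "cache")
    cmd = [exe, "-s.", "-B", "-V", "-U%SYS", method]
    return subprocess.run(cmd, cwd=mgr_dir).returncode


def freeze_all(mgr_dirs, run=run_cache):
    """Freeze every instance; if one fails, thaw those already frozen."""
    frozen = []
    for mgr_dir in mgr_dirs:
        if run(mgr_dir, FREEZE) == SUCCESS:
            frozen.append(mgr_dir)
            continue
        # Roll back so no instance stays frozen after a partial failure.
        for done in reversed(frozen):
            run(done, THAW)
        return False
    return True
```

With the paths from the scripts above, a call would look like freeze_all([r"D:\InterSystems\T2017\mgr", r"D:\InterSystems\HS20152\mgr"]). The run argument exists so the logic can be tested with a stand-in function without touching a real instance.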

Hi, good question. The answer is the typical consultant answer: it depends. The temptation is to offer a "Best Practice" answer, but really there are no best practices, just what is best for your situation or your customer's. If your storage performance is OK, keep monitoring; you don't have to change anything. If you are having storage performance problems, or your capacity planning says you will need to scale and optimise, then you need to start looking at the strategies available to you. Direct I/O is one, but there are others. What you have prompted me to do is think about a community post bringing together storage options, especially now that we have moved into a time of all-flash SSD, NVMe, Optane... So, I got this far without giving any answer at all...

A quick summary, because it will take a while to write a new post. Direct I/O is a feature of the file system whereby file reads and writes go directly from the application to the storage device, bypassing the operating system read and write caches. Direct I/O is used only by applications (such as databases) that manage their own caches. 

For Caché and IRIS, journals already use direct I/O to ensure the journal really is persisted to disk, not held in a buffer.

InterSystems does recommend Direct I/O in some specific situations, for example for HCI on Linux, because we do need to optimise I/O on these platforms (vSAN, Nutanix, etc.). See the HCI Post. Direct I/O is enabled for reads AND writes with the [config] setting wduseasyncio=1. This also enables asynchronous writes for the write daemon. There can be situations where the OS write cache is an advantage, such as a Caché online backup that is doing a lot of sequential writes, or continuous database writes such as a database build with a lot of database expansions. So don't think Direct I/O is the answer to every situation. If you are using a modern backup technology like snapshots then you will be fine, though.

MO

On Linux, use asynchronous I/O (asyncio) for RANREAD testing.
asyncio enables direct I/O for database reads and writes, which bypasses the file cache at the OS and LVM layers.

NOTE: Because direct IO bypasses filesystem cache, OS file copy operations including Caché Online Backup will be VERY slow when direct IO is configured.


Add the following to [config] section of the cache.cpf and restart Caché/HealthShare/IRIS: 

wduseasyncio=1
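If you want to confirm the setting before restarting, a small check like the following Python sketch may help. It scans the text of a CPF file for wduseasyncio=1 inside the [config] section. It assumes the file is a plain INI-style layout of [section] headers and key=value lines; the function name is my own.

```python
def async_io_enabled(cpf_text):
    """Return True if wduseasyncio=1 appears inside the [config] section."""
    in_config = False
    for raw in cpf_text.splitlines():
        line = raw.strip()
        if line.startswith("[") and line.endswith("]"):
            # Track whether the lines that follow belong to [config].
            in_config = line.lower() == "[config]"
        elif in_config and "=" in line:
            key, _, value = line.partition("=")
            if key.strip().lower() == "wduseasyncio":
                return value.strip() == "1"
    return False


if __name__ == "__main__":
    # Hypothetical fragment, for illustration only.
    sample = "[config]\nwduseasyncio=1\n[Startup]\n"
    print(async_io_enabled(sample))
```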
 


It might be helpful to run a 15-minute pButtons while RANREAD runs, so you can see operating system I/O stats, e.g. iostat.

From the %SYS namespace (zn "%SYS"):

%SYS>set rc=$$addprofile^pButtons("15_minute","15 minute only", "1", "900")

%SYS>d ^pButtons
Current log directory: /trak/backup/benchout/pButtonsOut/
Available profiles:
     1  12hours     - 12 hour run sampling every 10 seconds
     2  15_minute   - 15 minute only
     3  24hours     - 24 hour run sampling every 10 seconds
     4  30mins      - 30 minute run sampling every 1 second
     5  4hours      - 4 hour run sampling every 5 seconds
     6  8hours      - 8 hour run sampling every 10 seconds
     7  test        - A 5 minute TEST run sampling every 30 seconds

select profile number to run:
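The last two arguments in the addprofile call above read as a sampling interval in seconds and a number of samples: 1 second x 900 samples = 15 minutes. As a small worked example under that reading, this Python sketch builds the command for other durations. The helper name is my own.

```python
def addprofile_command(name, description, interval_seconds, duration_seconds):
    """Build an addprofile^pButtons command for a run of the given length."""
    # Number of samples needed to cover the duration at the chosen interval.
    count = duration_seconds // interval_seconds
    return 'set rc=$$addprofile^pButtons("%s","%s","%d","%d")' % (
        name, description, interval_seconds, count)


if __name__ == "__main__":
    # 15 minutes sampled every second, as in the example above.
    print(addprofile_command("15_minute", "15 minute only", 1, 15 * 60))
```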

Hi Rich, the short story is that I have fixed this and pushed to GitHub. A new container version will appear very soon.

The problem is caused by an unexpected date format in the "Profile" section of the pButtons HTML file: the year is in yy format instead of the expected yyyy.

As a workaround, you can open the HTML file in a text editor and edit the date manually to the expected format. For example:

Profile run "24hours_5" started at 00:00:00 on Jun 08 19.

change to:

Profile run "24hours_5" started at 00:00:00 on Jun 08 2019.
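If you have several files to fix, the same manual edit can be scripted. This is a minimal Python sketch, not the fix that went into the tool. It assumes the line has exactly the shape shown above and that a two-digit year belongs to the 2000s. The function name is my own.

```python
import re

# Matches the "Profile run" line only when the year has exactly two digits.
_PROFILE_RE = re.compile(
    r'(Profile run ".*?" started at \d{2}:\d{2}:\d{2} on [A-Z][a-z]{2} \d{2}) (\d{2})\.'
)


def fix_profile_year(text):
    """Expand a two-digit year in the Profile line to four digits (assumes 20yy)."""
    return _PROFILE_RE.sub(r"\g<1> 20\g<2>.", text)


if __name__ == "__main__":
    line = 'Profile run "24hours_5" started at 00:00:00 on Jun 08 19.'
    print(fix_profile_year(line))
```

Lines that already have a four-digit year do not match the pattern and are left unchanged, so it is safe to run the function more than once on the same text.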

Thanks for bringing this to my attention!