Ahh, OK, I don't know why I had primary/alternate in my head... so the example is the standard message:

Journaling switched to:  ...

Simpler to watch the cconsole.log?

I can't find an example, but could you test whether the primary and alternate are on different paths?

06/23/18-19:37:30:760 (19971) 0 CACHE JOURNALING SYSTEM MESSAGE
Journaling switched to: /trak/site/live/jrnpri/MIRROR-TCMIRROR-20180623.010
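A minimal sketch of that check, assuming the "Journaling switched to:" message keeps the format shown above. The function name journal_dirs is illustrative only; it reads console log text on stdin and prints each distinct journal directory once.

```shell
# Hypothetical helper: list the distinct directories that appear in
# "Journaling switched to: <path>" messages. Reads log text on stdin.
journal_dirs() {
    # Keep only the switch messages, strip the journal file name,
    # then de-duplicate the remaining directory paths.
    sed -n 's|^Journaling switched to: *\(.*\)/[^/]*$|\1|p' | sort -u
}
```

Run it as `journal_dirs < cconsole.log` from the directory holding the log. If more than one directory is printed, journaling has switched between different paths at some point.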

Hi, good question. The answer is the typical consultant answer... it depends. The temptation is to offer a "Best Practice" answer, but really there are no best practices, just what's best for your or your customer's situation. If your storage performance is OK then keep monitoring; you don't have to change anything. If you are having storage performance problems, or your capacity planning says you will need to scale and optimise, then you need to start looking at the strategies available to you. Direct I/O is one, but there are others. What you have prompted me to do is think about a community post bringing together storage options, especially now that we have moved into a time of all-flash SSD, NVMe, Optane... So, I got this far without giving any answer at all...

A quick summary, because it will take a while to write a new post. Direct I/O is a feature of the file system whereby file reads and writes go directly from the application to the storage device, bypassing the operating system read and write caches. Direct I/O is used only by applications (such as databases) that manage their own caches. 

For Caché and IRIS, journals already use direct I/O to ensure the journal really is persisted to disk, not held in a buffer.

InterSystems does recommend Direct I/O in some specific situations, for example for HCI on Linux, because we do need to optimise I/O on these platforms (vSAN, Nutanix, etc.). See the HCI post. Direct I/O is enabled for reads AND writes with the [config] setting wduseasyncio=1. This also enables asynchronous writes for the write daemon. There can be situations where the OS write cache is an advantage, like Caché online backup, which does a lot of sequential writes, or where there are continuous database writes, such as a database build with a lot of database expansions. So don't think Direct I/O is the answer to every situation. If you are using a modern backup technology like snapshots then you will be fine though.

MO

On Linux, use asynchronous I/O (asyncio) for RANREAD testing.
asyncio enables direct I/O for database reads and writes, which bypasses the file cache at the OS and LVM layers.

NOTE: Because direct IO bypasses filesystem cache, OS file copy operations including Caché Online Backup will be VERY slow when direct IO is configured.


Add the following to [config] section of the cache.cpf and restart Caché/HealthShare/IRIS: 

wduseasyncio=1
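Before restarting, it can be worth confirming the setting really landed in the [config] section. This is a rough sketch, assuming the CPF is laid out as `[section]` headers followed by `key=value` lines; cpf_get is a hypothetical helper name, not an InterSystems utility.

```shell
# Hypothetical helper: print the value of KEY inside [SECTION] of a CPF
# supplied on stdin. Prints nothing if the key is not in that section.
cpf_get() {
    awk -v section="$1" -v key="$2" '
        # A line starting with "[" opens a new section; note whether it is ours.
        /^\[/ { in_section = (tolower($0) == ("[" tolower(section) "]")); next }
        in_section {
            split($0, kv, "=")
            # Compare key names case-insensitively, print everything after "=".
            if (tolower(kv[1]) == tolower(key)) print substr($0, length(kv[1]) + 2)
        }
    '
}
```

For example, `cpf_get config wduseasyncio < cache.cpf` should print 1 once the line above has been added.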
 


It might be helpful to have a 15-minute pButtons profile to run while RANREAD runs, so you can see operating system I/O stats, e.g. iostat.

From the %SYS namespace (zn "%SYS"):

%SYS>set rc=$$addprofile^pButtons("15_minute","15 minute only", "1", "900")

%SYS>d ^pButtons
Current log directory: /trak/backup/benchout/pButtonsOut/
Available profiles:
     1  12hours     - 12 hour run sampling every 10 seconds
     2  15_minute   - 15 minute only
     3  24hours     - 24 hour run sampling every 10 seconds
     4  30mins      - 30 minute run sampling every 1 second
     5  4hours      - 4 hour run sampling every 5 seconds
     6  8hours      - 8 hour run sampling every 10 seconds
     7  test        - A 5 minute TEST run sampling every 30 seconds

select profile number to run:

Hi all, I have been advised that the rtkaio library has been discontinued in the SUSE distribution since SUSE 9, so you cannot use the rtkaio library on SUSE. Specifically for SUSE, do NOT add the following to the cache.cpf file. Also note that the rtkaio library is not needed for IRIS -- only Caché.

LibPath=/lib64/rtkaio/

Hi Rich, the short story is that I have fixed this and pushed to GitHub. A new container version will appear very soon.

The problem is caused by an unexpected date format in the "Profile" section of the pButtons HTML file: the year is in yy format instead of the expected yyyy.

As a workaround, you can open the HTML file in a text editor and manually edit the date to the expected format, e.g.

Profile run "24hours_5" started at 00:00:00 on Jun 08 19.

change to:

Profile run "24hours_5" started at 00:00:00 on Jun 08 2019.
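If there are several files to fix, the same edit can be scripted. This is only a sketch: it assumes the line looks exactly like the example above and that every two-digit year belongs to 20xx. The function name fix_profile_year is made up for illustration.

```shell
# Hypothetical helper: expand a two-digit year in the pButtons
# "Profile run ... started at ... on Mon DD YY." line to 20YY.
# Reads the HTML on stdin and writes the corrected text to stdout.
fix_profile_year() {
    # \1 is everything up to and including the day, \2 is the two-digit year.
    # Lines that already carry a four-digit year do not match and pass through.
    sed 's/\(started at [0-9:]* on [A-Za-z]* [0-9][0-9]* \)\([0-9][0-9]\)\./\120\2./'
}
```

Write the result to a new file rather than overwriting the original, for example `fix_profile_year < in.html > fixed.html`, where both file names are placeholders.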

Thanks for bringing this to my attention!

For newer storage, especially all-flash, 10,000 iterations will be too quick. Change this to 100,000 for sustained read activity -- each step should take less than a minute on SSD storage. For example, adapting the example above:

for i in `seq 2 2 30`; do echo "do ##class(PerfTools.RanRead).Run(\"/db/RANREAD\",${i},100000)" | csession CACHEINSTNAME -U "%SYS"; done

Hmmm... it worked for me just now. I did notice that when I exported from the markdown editor I used, it came across as "\<!--break--\>" with the characters backslash-escaped, but editing it to "<!--break-->" worked OK on the post about the minimum monitoring and alerting solution. To be honest, I had not tried until you mentioned it. Haha, yes, switching MD to WYSIWYG is a big mistake :(