Thanks for your questions. I agree that an outage is difficult to define, but I'll mention a few things you should keep an eye on to give you a starting place. If you're not sure about a particular concept below, please take a look at the HealthShare/Ensemble Documentation for more information.
1) The Red Hat Server: A HealthShare/Ensemble instance runs on top of a server, so it's important to set up server-level monitoring to report any outages. This might be as simple as a secondary server that pings your HealthShare server and notes when it doesn't respond, or it might be something more complex. Since you're running on Red Hat, work with your Linux admin to get more information on this.
2) Your HealthShare/Ensemble Instance(s): You can check whether your instance is running from the Linux shell by running "ccontrol list" (run "ccontrol help" for additional information). Once again, your Linux admin should be able to write an OS script that continuously checks whether HealthShare/Ensemble is running and sends an email if something is wrong. A number of screens in the Management Portal, including the 'System Dashboard' and 'Ensemble System Monitor', also list a value called "System Up Time", which shows how long the instance has been running. Note also that the "cconsole.log" file contains phrases like "Recovery started at Wed Aug 15 07:03:04 2018", which denote when the instance starts.
3) Your Ensemble Production(s) within an Instance: You will have one or more Ensemble Namespaces on your instance, each of which can support exactly one running Ensemble Production. A running Ensemble Production has a status of "Running", so if you define an outage as your Ensemble Production being down, you can look for a status other than "Running". There are a number of class methods in the 'Ens.Director' class which will report this information [like GetProductionStatus() and IsProductionRunning()].
4) Your Individual Business Host(s): An Ensemble production will have a number of Business Services, Business Processes, and Business Operations which might be in an Error or other bad state. Depending on how you define "outage", you might want to check on the status of these items using the "Ensemble Production Monitor" to confirm they're running properly.
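If it's useful as a starting point, here is a rough Python sketch of the checks in points 1 and 2. It is only an illustration under a few assumptions of mine: the function names are made up for this example, the reachability check uses a TCP connection attempt rather than a true network ping, and the log check relies only on the "Recovery started at ..." phrase quoted above (with English day and month names). Your Linux admin may well prefer a different tool or approach.

```python
import re
import socket
from datetime import datetime

# Matches the phrase quoted in point 2, for example:
#   "Recovery started at Wed Aug 15 07:03:04 2018"
RECOVERY_RE = re.compile(
    r"Recovery started at (\w{3} \w{3} +\d{1,2} \d{2}:\d{2}:\d{2} \d{4})"
)


def host_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Assumption: a TCP connect stands in for a network ping here, because
    it needs no special privileges. host and port are whatever you choose
    to probe on your server.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def instance_start_times(log_text):
    """Return one datetime per 'Recovery started at ...' phrase in log_text."""
    starts = []
    for match in RECOVERY_RE.finditer(log_text):
        # Collapse runs of spaces so a padded day ("Aug  5") still parses.
        stamp = " ".join(match.group(1).split())
        starts.append(datetime.strptime(stamp, "%a %b %d %H:%M:%S %Y"))
    return starts
```

To use it, you would call something like host_reachable("healthshare.example.org", 1972) from your secondary server on a schedule (both the hostname and the port are placeholders), and pass the contents of your cconsole.log to instance_start_times() to get a list of the times the instance started. An unexpected start time is a hint that the instance went down shortly before it.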
I hope this helps.