Jon Astle · Jul 26, 2018

Database Size Getting Out Of Control

I work for a large NHS Trust in the UK. We use HealthShare and process thousands of messages each day. Many of these are standard HL7 messages, but for several months now we have also been picking up and dropping off thousands of PDF files.

Our message purge is set to 365 days because we have to keep a year's worth of messages: we have a retrieve-and-send process that lets us replay any set of messages to any destination, which we use to prepopulate end systems with activity and result history.

The issue is that our database is growing out of control, and I am not sure of the best way to remediate this.

Is there a way I could set up a custom purge to get rid of some of the stream data within the database? I realise that I could move the PDF pickup and drop-off projects to another production and decrease the purge period on the new production; however, that will only shrink the main production day by day over the next 12 months.

Any suggestions would be gratefully received.

  • Declaring Stream Properties
    You can store only links to the files in the database rather than the files themselves, which keeps the database small.
  • Managing Caché
    You can create a scheduled custom task that programmatically cleans up old files and then compacts and truncates CACHE.DAT.
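
To illustrate the "clean up old files" part of the second suggestion, here is a minimal sketch of an age-based cleanup for externally stored stream files. It is written in Python purely as an illustration, not as Caché task code; the function names, the directory argument and the 365-day figure are assumptions chosen to match the retention period in the question. Note that deleting a file the database still links to would break replay, so the retention window here would need to be at least as long as the message purge.

```python
import time
from pathlib import Path


def find_expired(directory, retention_days, now=None):
    """Return files under `directory` last modified before the retention cutoff."""
    now = time.time() if now is None else now
    cutoff = now - retention_days * 86400  # 86400 seconds per day
    return sorted(
        p for p in Path(directory).rglob("*")
        if p.is_file() and p.stat().st_mtime < cutoff
    )


def purge_expired(directory, retention_days, now=None, dry_run=True):
    """Delete expired files; with dry_run=True only report what would go."""
    expired = find_expired(directory, retention_days, now)
    if not dry_run:
        for path in expired:
            path.unlink()
    return expired
```

Running it with `dry_run=True` first (the default) lists the candidates without touching anything, which seems the safer way to check a retention setting before letting a scheduled job delete files.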

Depending on the PDF type (character-based), using a GZIP file stream may provide additional size savings.
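
A quick way to check whether GZIP is worth it for a given kind of file is to measure the saving on a sample. The sketch below is Python rather than Caché code, and both payloads are synthetic stand-ins, not real PDFs: one is repetitive text of the kind found in uncompressed PDF content, the other is random bytes standing in for content that is already compressed. The `gzip_saving` helper is a name made up for this example.

```python
import gzip
import os


def gzip_saving(data: bytes) -> float:
    """Fraction of space saved by gzip; negative if the output is larger."""
    if not data:
        return 0.0
    return 1 - len(gzip.compress(data)) / len(data)


# Stand-in payloads (assumptions for illustration, not real PDF files).
text_like = b"BT /F1 12 Tf 72 720 Td (Result: normal) Tj ET\n" * 2000
already_compressed = os.urandom(len(text_like))

print(f"text-like payload:       {gzip_saving(text_like):.1%} saved")
print(f"already-compressed data: {gzip_saving(already_compressed):.1%} saved")
```

Swapping in the bytes of a few real PDFs from the feed would show which case they fall into; files whose content streams are already compressed are unlikely to shrink much further.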

Hi, I also help support an NHS trust using Ensemble, and it also has ever-growing PDF files in messages. We store our incoming PDFs as external file streams, which helps, though you have to bear in mind that the files are not going to be part of the Caché backup for Disaster Recovery, etc. (Not sure about mirroring. I'd assume they don't get mirrored either, as the contents are not in the journal.)

As yet, we don't have as big a problem as you (fewer messages, and we only keep 92 days), which is just as well, because the PDF files are converted to base64 and embedded in HL7 v2 messages. They then take up space in the database, the journal, and the backup, which recently forced us to expand the disk space. I can recommend keeping Ensemble on a virtual server with disk expansion on demand.

I tend to think the problem is not going to go away whatever you do. I assume, like us, the PDFs come from 3rd party applications and they are always going to be producing ever more and prettier documents as time goes by. So I recommend looking at more disk. :-)  / Mike
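
On the base64 point in the reply above: standard base64 turns every 3 bytes of input into 4 characters, so an embedded PDF takes roughly a third more space than the original file, before any journal or backup copies. A small Python sketch of the arithmetic follows; the 250,000-byte size is an arbitrary stand-in rather than a figure from this thread, and `base64_size` is a helper name invented for the example.

```python
import base64


def base64_size(raw_bytes: int) -> int:
    """Length of standard padded base64 output for `raw_bytes` bytes of input."""
    return 4 * ((raw_bytes + 2) // 3)


pdf = bytes(250_000)  # stand-in for a PDF of 250,000 bytes (assumed size)
encoded = base64.b64encode(pdf)

print(len(pdf), "bytes raw")
print(len(encoded), "bytes as base64")
print(f"overhead: {len(encoded) / len(pdf) - 1:.1%}")
```

The same ratio applies whatever the file size, so it can be used to estimate how much of the database growth comes from the encoding alone.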