From time to time we get questions about Log Insight on a variety of topics. Recently, many of them have been about data retention. The following comes from a discussion between a customer and Jon Herlocker, Log Insight expert and engineering lead (used with permission):
Here is how retention works.
We have two kinds of data:
1. indexed/searchable data
2. archive data
Log Insight reviews all the VMDKs that are attached to the Log Insight virtual machine and pools them, creating a single pool of storage. Log Insight then keeps as much indexed/searchable data as will fit on the pool of storage. When that pool of storage fills up, the oldest data is removed to make space for the new data. If archiving is enabled, instead of the data being deleted, the indexes are removed and the raw/compressed data is copied to an NFS server archive. The archived data is approximately 1/10 the size of the indexed data. At any point in time the archived data can be reloaded into a separate offline Log Insight instance if analysis of historical data is needed.
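To make the oldest-first behavior concrete, here is a minimal sketch in Python. It is a toy model, not Log Insight's actual implementation: the class name `RetentionPool`, the idea of labeled chunks of data, and the sizes are all hypothetical, and the 1/10 archive ratio is just the approximation quoted above.

```python
from collections import deque

# Assumption taken from the text: archived data is roughly 1/10
# the size of the indexed data it came from.
ARCHIVE_DIVISOR = 10


class RetentionPool:
    """Toy model of a fixed-size storage pool with oldest-first removal."""

    def __init__(self, capacity_gb, archiving_enabled=False):
        self.capacity_gb = capacity_gb
        self.archiving_enabled = archiving_enabled
        self.indexed = deque()  # (label, size_gb), oldest on the left
        self.archive = []       # (label, approximate archived size_gb)

    def used_gb(self):
        return sum(size for _, size in self.indexed)

    def ingest(self, label, size_gb):
        # Remove the oldest data until the new data fits in the pool.
        while self.indexed and self.used_gb() + size_gb > self.capacity_gb:
            old_label, old_size = self.indexed.popleft()
            if self.archiving_enabled:
                # Stand-in for "drop the indexes, copy raw data to NFS".
                self.archive.append((old_label, old_size / ARCHIVE_DIVISOR))
        self.indexed.append((label, size_gb))


pool = RetentionPool(capacity_gb=100, archiving_enabled=True)
for day in range(1, 6):
    pool.ingest(f"day{day}", 30)

print([label for label, _ in pool.indexed])  # still searchable
print(pool.archive)                          # moved to the archive
```

With a 100 GB pool and 30 GB arriving per day, only the three most recent days stay searchable in this model; the two oldest days end up in the archive at about a tenth of their indexed size.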
There is a configuration setting called the Retention Notification Threshold. The customer should set this to the required length of retention for *immediately searchable* data. Log Insight continually measures the rate of incoming data and estimates how long data can be retained with the currently available pool of storage. If the estimate drops below the Retention Notification Threshold, the administrator is immediately notified that they are likely to drop below their desired retention period in the near future. They can then take action to add more storage to Log Insight.
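The check described here can be sketched roughly as follows. This assumes the simplest possible estimate (pool size divided by a steady daily ingest rate); the function names and the numbers are hypothetical, and Log Insight's real estimation may well be more sophisticated.

```python
def estimate_retention_days(pool_capacity_gb, ingest_gb_per_day):
    """Estimate how many days of indexed data fit in the pool.

    Assumes a steady ingest rate, which is a simplification.
    """
    if ingest_gb_per_day <= 0:
        return float("inf")  # nothing coming in, so nothing gets pushed out
    return pool_capacity_gb / ingest_gb_per_day


def should_notify(pool_capacity_gb, ingest_gb_per_day, threshold_days):
    """True when estimated retention falls below the configured threshold."""
    estimate = estimate_retention_days(pool_capacity_gb, ingest_gb_per_day)
    return estimate < threshold_days


# Hypothetical installation: 500 GB pool, 10 GB of indexed data per day.
print(estimate_retention_days(500, 10))          # 50.0
print(should_notify(500, 10, threshold_days=90))  # True
print(should_notify(500, 10, threshold_days=30))  # False
```

In this example the pool holds about 50 days of data, so a 90-day threshold would trigger a notification while a 30-day threshold would not.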
So the customer should set the Retention Notification Threshold to the HIPAA requirement for *immediately searchable* data. Log Insight will then tell their administrator whether they have enough storage to meet the requirement, or provide enough information to estimate how much storage needs to be added. Alternatively, if the customer needs to retain the data for compliance but doesn’t need it immediately searchable, we recommend a lower-cost archival storage array with an NFS server.
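As a back-of-the-envelope illustration of that sizing decision, the sketch below estimates how much storage would need to be added to meet a searchable-retention requirement, and how large an archive might grow. The function names and figures are hypothetical, a steady ingest rate is assumed, and the 1/10 ratio is only the approximation given earlier, so treat the results as rough planning numbers.

```python
def additional_storage_gb(pool_capacity_gb, ingest_gb_per_day, required_days):
    """Extra pool storage needed to keep `required_days` of data searchable."""
    required_gb = ingest_gb_per_day * required_days
    return max(0, required_gb - pool_capacity_gb)


def archive_size_gb(ingest_gb_per_day, archive_days, archive_divisor=10):
    """Approximate NFS archive size, assuming archives are ~1/10 of indexed size."""
    return ingest_gb_per_day * archive_days / archive_divisor


# Hypothetical installation: 500 GB pool, 10 GB of indexed data per day.
print(additional_storage_gb(500, 10, required_days=90))  # 400
print(additional_storage_gb(500, 10, required_days=30))  # 0
print(archive_size_gb(10, archive_days=365))             # 365.0
```

Here a 90-day searchable requirement would call for roughly 400 GB more pool storage, whereas keeping a year of data in the archive only would take on the order of 365 GB on the NFS server.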
Of course, your installation will be different, so consult your local VMware specialist to have your questions answered in context. Can’t wait? Post a question to the VMTN community for Log Insight.