Good morning folks.
Fresh off the press, here is the first edition of our new blog series The Inside Scoop. See the introductory post for this series for some additional background information.
This first edition focuses on the topic Common configuration pitfalls and issues our vCenter Server users are encountering.
In somewhat of a journalistic fashion, we met up with some of our front line Technical Support Engineers at our Support Center in Cork, Ireland and sat down with them so that we could pick their brains for a while. These Support Engineers' primary area of expertise is our vSphere product family, more specifically our vCenter Server product. They spend a large portion of their time helping customers troubleshoot the issues that they are encountering in their environments.
When we met up with these very helpful individuals, we had one question on our minds: what are the vCenter Server configuration issues that the Technical Support teams see most often?
Well, enough talking from us. Here are the most common configuration pitfalls and issues our vCenter Server users are encountering according to our Technical Support Engineers.
vCenter Server Statistics levels set too high
We often receive Support Requests from users reporting poor or slow performance in their vCenter Server environments, specifically after the vCenter Server Statistics levels have been increased. Other symptoms can include:
- The VMware VirtualCenter Server service takes longer to start or may even fail to start
- The vCenter Server database increases significantly in size
- The database transaction log increases significantly in size
- The performance data rollup jobs and stored procedures take a long time to run and may fail
- Database jobs may fail
- Performance graphs display gaps
The default Statistics level is "1", which is the lowest setting. Sometimes the level is raised within an environment for various reasons, more than likely troubleshooting, and is then forgotten rather than lowered again afterward. This places a higher load on the vCenter Server and increases the amount of data that the vCenter Server rollup jobs need to process. As a result, vCenter Server might experience performance effects such as slowness, database job delays, failures, and even service interruptions.
Realistically, the Statistics level should only be increased for troubleshooting purposes and only for a short period of time, no longer than 24 hours, and should be reduced again as soon as possible in order to avoid these kinds of performance issues.
Your best bet would be to always ensure that the Statistics levels remain at the default level and only increase when absolutely necessary. For additional information see the following VMware Knowledge Base articles:
- Slow vCenter Server performance after increasing the performance data statistics collection level to more than 2 (2003885)
- Determining if vCenter Server rollup jobs are processing performance data (2007388)
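As a rough illustration of the advice above, here is a minimal Python sketch that flags any statistics interval left above the default level. It works on plain objects that carry a name and a level; the sample data is made up for the example. If you script against vCenter Server with pyVmomi, we assume the real intervals could be read from content.perfManager.historicalInterval, but that connection is not shown here and should be checked against your own environment.

```python
from types import SimpleNamespace

DEFAULT_LEVEL = 1  # the default Statistics level described above


def intervals_above_default(intervals, default_level=DEFAULT_LEVEL):
    """Return (name, level) pairs for intervals whose level exceeds the default."""
    return [(i.name, i.level) for i in intervals if i.level > default_level]


# Stand-in sample data (hypothetical); real values would come from vCenter Server.
sample = [
    SimpleNamespace(name="Past day", level=3),
    SimpleNamespace(name="Past week", level=1),
    SimpleNamespace(name="Past month", level=1),
    SimpleNamespace(name="Past year", level=1),
]

for name, level in intervals_above_default(sample):
    print(f"{name}: level {level} is above the default of {DEFAULT_LEVEL}")
```

A check like this could be run on a schedule so that a level raised for troubleshooting does not go unnoticed beyond the 24 hour window.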
vCenter Server performance data (Tasks & Events) growing too large
Another common issue that is seen through various incoming Support Requests is when vCenter Server experiences performance issues due to the Tasks and Events data growing too large, consuming disk space and ultimately leading to vCenter Server failures.
Some of the symptoms which users report are as follows:
- vCenter Server service starts but eventually crashes
- vCenter Server performance slows down noticeably
- Database jobs are failing
- The database transaction logs are working correctly
- The Disk Usage by Tables report for the database shows the VPX_EVENT and VPX_EVENT_ARG tables using the most space
- You may notice excessive growth of the VPX_EVENT table, and this error message in the Microsoft SQL Event log:
Could not allocate space for object 'dbo.VPX_EVENT'.'VPXI_EVENT_USERNAME' in database 'VCDB' because the 'PRIMARY' filegroup is full. Create disk space by deleting unneeded files, dropping objects in the filegroup, adding additional files to the filegroup, or setting autogrowth on for existing files in the filegroup
vCenter Server stores performance data in the vCenter Server database. Over time, data collection results in growth of the database files. This issue is actually pretty easy and straightforward to resolve and can be prevented from recurring.
vCenter Server has a Database Retention Policy setting that allows you to specify when vCenter Server tasks and events should be deleted. VMware Knowledge Base article Purging old data from the database used by vCenter Server (1025914) discusses this topic and provides instructions concerning the Database Retention Policy. There is also an embedded KBTV video in that KB article which demonstrates purging old data from the vCenter Server database.
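To make the idea of a retention policy concrete, here is a small Python sketch of the underlying rule: anything older than a cutoff date is eligible for deletion. This is only an illustration of the concept. The function names, the 30 day value, and the sample records are all hypothetical, and the code does not touch a vCenter Server database; use the procedure in the KB article for the real thing.

```python
from datetime import datetime, timedelta


def retention_cutoff(now, retention_days):
    """Return the oldest timestamp that should still be kept."""
    if retention_days <= 0:
        raise ValueError("retention_days must be positive")
    return now - timedelta(days=retention_days)


def split_by_retention(records, now, retention_days):
    """Split (id, timestamp) records into those to keep and those to purge."""
    cutoff = retention_cutoff(now, retention_days)
    keep = [r for r in records if r[1] >= cutoff]
    purge = [r for r in records if r[1] < cutoff]
    return keep, purge


# Hypothetical sample events: (event id, creation time)
events = [
    (1, datetime(2013, 1, 15)),
    (2, datetime(2013, 5, 20)),
    (3, datetime(2013, 5, 31)),
]

keep, purge = split_by_retention(events, now=datetime(2013, 6, 1), retention_days=30)
print("keep:", [e[0] for e in keep])
print("purge:", [e[0] for e in purge])
```

The shorter the retention period, the less task and event data accumulates, so the value chosen is a trade-off between audit history and database size.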
vCenter Server SQL database transaction logs growing to excessive sizes
Third and last for today is the issue of the database transaction logs filling up. This is another very common issue reported by our vCenter Server users. Users often report symptoms such as:
- The transaction log of the vCenter Server database grows to an excessive size
- The vCenter Server service fails to start
- The Windows Event logs state - Faulty Application : vpxd.exe
- The vCenter Server vpx.log may show an error similar to the following when connected to a SQL database:
ODBC error: (42000) - [Microsoft][ODBC SQL Server Driver][SQL Server]The log file for database '<database>' is full. Back up the transaction log for the database to free up some log space
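If you want to spot this error quickly, a short script can scan log lines for it. The sketch below is a minimal example that assumes the message wording shown above; the function name and the sample lines are made up for illustration, and the database name VCDB is just a placeholder.

```python
import re

# Matches the SQL Server "log file ... is full" message and captures the database name.
LOG_FULL = re.compile(r"The log file for database '([^']+)' is full")


def databases_with_full_log(lines):
    """Return database names reported with a full transaction log, in first-seen order."""
    found = []
    for line in lines:
        match = LOG_FULL.search(line)
        if match and match.group(1) not in found:
            found.append(match.group(1))
    return found


# Hypothetical sample lines; in practice these would be read from vpx.log.
sample_lines = [
    "an unrelated informational line",
    "ODBC error: (42000) - [Microsoft][ODBC SQL Server Driver][SQL Server]"
    "The log file for database 'VCDB' is full. Back up the transaction log "
    "for the database to free up some log space",
]

print(databases_with_full_log(sample_lines))
```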
Many users opt to use Microsoft's SQL database for their vCenter Server installation. SQL Server offers administrators a choice of recovery models, which is the primary factor that determines transaction log disk space requirements. The default recovery model on the full version of SQL is Full Recovery. This recovery model has the potential to consume all available disk space if appropriate database maintenance is not performed. It is best practice to schedule regular backups of the database and transaction log to avoid unnecessary growth.
More often than not, the vCenter Server will crash once all of the disk space has been consumed by the transaction logs. Also, if the database does not have enough space to write to the transaction log, the vCenter Server service will fail to start. VMware Knowledge Base article Troubleshooting transaction logs on a Microsoft SQL database server (1003980) is the first port of call if you are encountering such symptoms. This article clearly describes what is happening and what you can do to resolve this issue.
Our Technical Support Engineers will often recommend using the Simple Recovery model unless your business requirements dictate otherwise. There are some limitations to using the Simple Recovery model and these are described within the Additional Information section of the above-mentioned Knowledge Base article, so be sure to read up on them and familiarize yourself first before committing to a particular recovery model. Other Knowledge Base articles which may be of interest here include:
- Investigating the health of a vCenter Server database (1003979)
- SQL Server Recovery Model Affects Transaction Log Disk Space Requirements (1001046)
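For readers who like to script their checks, here is a minimal Python sketch that builds the T-SQL statements commonly used to inspect the recovery model, check log space usage, and switch to the Simple Recovery model. The helper names are our own and the database name is a placeholder. The sketch only produces the statement text; actually running it needs a SQL Server connection (for example through a driver such as pyodbc), which is not shown. Read the KB articles above and confirm your backup requirements before changing the recovery model.

```python
import re


def _checked_name(database):
    """Allow only simple identifiers so the name is safe to place in a statement."""
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", database):
        raise ValueError(f"unexpected database name: {database!r}")
    return database


def recovery_model_query(database):
    """Statement that reports the current recovery model of one database."""
    name = _checked_name(database)
    return f"SELECT name, recovery_model_desc FROM sys.databases WHERE name = '{name}';"


def log_space_query():
    """Statement that reports transaction log size and percent used per database."""
    return "DBCC SQLPERF(LOGSPACE);"


def set_simple_recovery(database):
    """Statement that switches a database to the Simple Recovery model."""
    return f"ALTER DATABASE [{_checked_name(database)}] SET RECOVERY SIMPLE;"


# "VCDB" is a placeholder database name.
for statement in (recovery_model_query("VCDB"), log_space_query(), set_simple_recovery("VCDB")):
    print(statement)
```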
That concludes our Inside Scoop edition for this week, and we hope it will be of some help to you in keeping your vCenter Server environments running as smoothly as possible. Be sure to check back with us again in the coming weeks for the next edition of The Inside Scoop, which will focus on Top tips for troubleshooting SSO related issues.