Several advanced configuration options can be used to modify the operation of vSphere HA 5.0. The supported options are listed below. Because these options can significantly affect the operation of vSphere HA, and hence the availability protection it provides, make sure you fully understand each option and its ramifications before using it.

** UPDATE **

Knowledge base article 1006421 provides the most up-to-date information on these options.

HA Advanced Options

The following advanced options can be configured on a per-cluster basis through the HA Advanced Options section of the user interface. Some options take effect only after you reconfigure vSphere HA on all hosts (see the Reconfig? column).

| Name | Default | Valid values | Description | Reconfig? |
| --- | --- | --- | --- | --- |
| das.isolationAddressX | | | Sets the address to ping to determine if a host is isolated from the network. This address is pinged only when heartbeats are not received from any other host in the cluster. If not specified, the default gateway of the management network is used. This default gateway must be a reliable, available address so that the host can determine whether it is isolated from the network. You can specify multiple isolation addresses (up to 10) for the cluster: das.isolationaddressX, where X = 1-10. Typically, you should specify one per management network. Specifying too many addresses makes isolation detection take too long. | Y |
| das.allowNetworkX | | | Specifies the port group names of the management networks to use for HA communication. Replace X with 0-9. If not used, all appropriate networks are used. This option is recommended only for ESXi 3.5 hosts. To control the network selection for ESXi 4.0 and more recent hosts, use the UI to specify the port groups that are to be used for management. | Y |
| das.useDefaultIsolationAddress | | true/false | By default, vSphere HA uses the default gateway of the console network as an isolation address. This option specifies whether or not this default is used. | Y |
| das.isolationShutdownTimeout | 300 | | The time in seconds the system waits for a virtual machine to shut down before powering it off. This applies only if the host's isolation response is Shut down VM. | N |
| das.maxvmrestartcount | 5 | | Defines the maximum number of times an HA master agent tries to restart a VM after a failure before giving up and reporting that it was unable to restart the VM. | N |
| das.maxftvmrestartcount | 5 | | Defines the maximum number of times an HA master agent tries to start the secondary VM of a vSphere Fault Tolerance VM pair before giving up and reporting that it could not. | N |
| das.ignoreRedundantNetWarning | false | true/false | Suppresses the host configuration issue about the lack of redundant management networks on a host in an HA-enabled cluster. | N |
| das.vmMemoryMinMB | 0 | | Defines the default memory resource value assigned to a virtual machine if its memory reservation is not specified or is zero. This is used for the Host Failures Cluster Tolerates admission control policy. | N |
| das.vmCpuMinMHz | 256 | | Defines the default CPU resource value assigned to a virtual machine if its CPU reservation is not specified or is zero. This is used for the Host Failures Cluster Tolerates admission control policy. If no value is specified, the default is 256 MHz. | N |
| das.slotCpuInMHz | | | Defines the upper bound on the CPU slot size. If this option is used, the slot size is the smaller of this value and the maximum CPU reservation of any powered-on virtual machine in the cluster. | |
| das.slotMemInMB | | | Defines the upper bound on the memory slot size. If this option is used, the slot size is the smaller of this value and the maximum memory reservation plus memory overhead of any powered-on virtual machine in the cluster. | |
| das.includeFTcomplianceChecks | true | true/false | Controls whether vSphere Fault Tolerance compliance checks are run as part of the cluster compliance checks. Set this option to false to avoid cluster compliance failures when Fault Tolerance is not being used in a cluster. | N |
| das.maxFtVmsPerHost | 4 | 0 means no limit | Defines the maximum number of vSphere Fault Tolerance primary or secondary VMs that can be placed on a host during normal operation. When a value greater than zero is defined, attempts to power on more than the specified number of FT VMs on the same host will fail. Further, vSphere DRS, if enabled, won’t exceed this limit. However, vSphere DRS won’t correct any violations of the limit, and vSphere HA will ignore the limit when responding to a failure. | N |
| das.ignoreInsufficientHbDatastore | false | true/false | Suppresses the host configuration issue reported when the number of heartbeat datastores is less than das.heartbeatDsPerHost. | N |
| das.heartbeatDsPerHost | 2 | 2-5 | Defines the number of required heartbeat datastores per host. vCenter Server attempts to choose the specified number and, if it cannot, reports a configuration issue on the host. This issue can be suppressed using the das.ignoreInsufficientHbDatastore option. | Y |
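
The slot-size rules described for das.vmCpuMinMHz, das.vmMemoryMinMB, das.slotCpuInMHz, and das.slotMemInMB can be made concrete with a small calculation. The following Python sketch is illustrative only and is not vSphere HA's actual implementation: the `VM` class, the function names, the handling of an empty cluster, and the choice to add memory overhead on top of the defaulted reservation are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class VM:
    """Hypothetical record for one powered-on VM (not a vSphere API type)."""
    cpu_reservation_mhz: int = 0
    mem_reservation_mb: int = 0
    mem_overhead_mb: int = 0


def cpu_slot_mhz(vms: Iterable[VM],
                 vm_cpu_min_mhz: int = 256,              # das.vmCpuMinMHz
                 slot_cpu_in_mhz: Optional[int] = None   # das.slotCpuInMHz
                 ) -> int:
    # A reservation that is unset or zero falls back to the default value.
    largest = max((vm.cpu_reservation_mhz or vm_cpu_min_mhz for vm in vms),
                  default=vm_cpu_min_mhz)
    # If the upper bound is set, the slot size is the smaller of the two.
    return largest if slot_cpu_in_mhz is None else min(largest, slot_cpu_in_mhz)


def mem_slot_mb(vms: Iterable[VM],
                vm_mem_min_mb: int = 0,                  # das.vmMemoryMinMB
                slot_mem_in_mb: Optional[int] = None     # das.slotMemInMB
                ) -> int:
    # Assumption: overhead is added to the (possibly defaulted) reservation.
    largest = max(((vm.mem_reservation_mb or vm_mem_min_mb) + vm.mem_overhead_mb
                   for vm in vms),
                  default=vm_mem_min_mb)
    return largest if slot_mem_in_mb is None else min(largest, slot_mem_in_mb)


if __name__ == "__main__":
    cluster = [VM(0, 0, 50), VM(1000, 2048, 100), VM(500, 0, 20)]
    print(cpu_slot_mhz(cluster), mem_slot_mb(cluster))
    print(cpu_slot_mhz(cluster, slot_cpu_in_mhz=400))
```

In this sketch, a single VM with a large reservation drives the slot size for the whole cluster, which is the situation the das.slotCpuInMHz and das.slotMemInMB upper bounds are meant to limit.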

HA Agent (FDM) Configuration Options

The following options are set on a per-host basis by editing the fdm.cfg file on the host. Alternatively, they can be set on a per-cluster basis through the vSphere Client by prepending ‘das.config.’ to the option name. Any of these options requires a restart to take effect.

| Name | Default | Description |
| --- | --- | --- |
| **Cluster Manager** | | |
| fdm.deadIcmpPingInterval | 10 | ICMP pings are used to determine whether a slave host is network accessible when the FDM on that host is not connected to the master. This parameter controls the interval (expressed in seconds) between pings. |
| fdm.icmpPingTimeout | 5 | Defines the time to wait in seconds for an ICMP ping reply before assuming the host being pinged is not network accessible. |
| fdm.hostTimeout | 10 | Controls how long a master FDM waits in seconds for a slave FDM to respond to a heartbeat before declaring the slave host not connected and initiating the workflow to determine whether the host is dead, isolated, or partitioned. |
| fdm.stateLogInterval | 600 | Frequency in seconds at which the cluster state is logged. |
| fdm.nodeGoodness | 0 | When a master election is held, the FDMs exchange a goodness value, and the FDM with the largest goodness value is elected master. Ties are broken using the host IDs assigned by vCenter Server. This parameter can be used to override the computed goodness value for a given FDM. To force a specific host to be elected master each time an election is held and the host is active, set this option to a large positive value. This option should not be specified on a per-cluster basis. |
| **Inventory Manager** | | |
| fdm.ft.cleanupTimeout | 900 | When a vSphere Fault Tolerance VM is powered on by vCenter Server, vCenter Server informs the HA master agent that it is doing so. This option controls how many seconds the HA master agent waits for the power-on of the secondary VM to succeed. If the power-on takes longer than this time (most likely because vCenter Server has lost contact with the host or has failed), the master agent attempts to power on the secondary VM. |
| fdm.storageVmotionCleanupTimeout | 900 | When a Storage vMotion is done in an HA-enabled cluster using pre-5.0 hosts and the home datastore of the VM is being moved, HA may interpret the completion of the Storage vMotion as a failure and may attempt to restart the source VM. To avoid this issue, the HA master agent waits the specified number of seconds for a Storage vMotion to complete. When the Storage vMotion completes or the timer expires, the master assesses whether a failure occurred. |
| **Policy Manager** | | |
| fdm.policy.unknownStateMonitorPeriod | 10 | Defines the number of seconds the HA master agent waits after it detects that a VM has failed before it attempts to restart the VM. |
| **FDM Service** | | |
| fdm.event.maxMasterEvents | 1000 | Defines the maximum number of events cached by the master. |
| fdm.event.maxSlaveEvents | 600 | Defines the maximum number of events cached by a slave. |
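
As a small illustration of the per-cluster naming rule for these options, the following Python sketch derives the cluster-level option name from an fdm.cfg option name by prepending 'das.config.'. The helper name `cluster_option_name` and the `PER_HOST_ONLY` set are hypothetical; the sketch only manipulates strings and does not communicate with vCenter Server or a host.

```python
# Options that the table says should not be set on a per-cluster basis.
PER_HOST_ONLY = {"fdm.nodeGoodness"}

CLUSTER_PREFIX = "das.config."


def cluster_option_name(fdm_option: str) -> str:
    """Return the per-cluster name for an fdm.cfg option (hypothetical helper)."""
    if not fdm_option.startswith("fdm."):
        raise ValueError(f"not an FDM option name: {fdm_option!r}")
    if fdm_option in PER_HOST_ONLY:
        raise ValueError(f"{fdm_option} should only be set per host in fdm.cfg")
    return CLUSTER_PREFIX + fdm_option


if __name__ == "__main__":
    print(cluster_option_name("fdm.hostTimeout"))
```

For example, fdm.hostTimeout becomes das.config.fdm.hostTimeout when entered as a cluster advanced option, while fdm.nodeGoodness is rejected because forcing the same goodness value on every host would defeat its purpose.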

 

VPXD Configuration Options

The following options configure the behavior of vpxd. They are set by editing the vpxd.cfg file and affect all clusters in the vCenter Server inventory. All of these options require a restart of vpxd to take effect.

| Name | Default | Description |
| --- | --- | --- |
| vpxd.das.reportNoMasterSec | 120 | How long to wait in seconds before adding a cluster configuration issue to report that vCenter Server was unable to locate the HA master agent for the corresponding cluster. |
| vpxd.das.sendProtectListIntervalSec | 60 | Minimum time (in seconds) between consecutive calls by vCenter Server to the HA master agent to request that it protect a new VM. |
| vpxd.das.aamMemoryLimit | 100 | Memory limit in MB for the AAM resource pool (used for FDM). |
| vpxd.das.electionWaitTimeSec | 120 | When configuring HA on a host, how long to wait in seconds after sending the host list to a new host for the FDM to become configured (change to master or slave state). |
| vpxd.das.heartbeatPanicMaxTimeout | 60 | Defines the value (in seconds) HA uses when configuring the host Misc.HeartbeatPanicTimeout advanced option. |
| vpxd.das.slotMemMinMB | 0 | Default value in MB to use for the memory reservation if no user value is set on any VM. Used to compute the slot size for HA admission control. |
| vpxd.das.slotCpuMinMHz | 32 | Default value in MHz to use for the CPU reservation if no user value is set on any VM. Used to compute the slot size for HA admission control. |
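
To illustrate what an entry for one of these options might look like in vpxd.cfg, the following Python sketch turns a dotted option name into nested XML elements. It assumes that vpxd.cfg is an XML file with a `<config>` root in which each dot-separated part of the option name becomes a nested element; check this layout against your own vpxd.cfg before relying on it. The function name `option_to_xml` is hypothetical, and the value 240 is only an example.

```python
import xml.etree.ElementTree as ET


def option_to_xml(option_name: str, value) -> str:
    """Build a nested XML fragment for a dotted option name (assumed layout)."""
    root = ET.Element("config")          # assumed root element of vpxd.cfg
    node = root
    for part in option_name.split("."):  # e.g. vpxd -> das -> electionWaitTimeSec
        node = ET.SubElement(node, part)
    node.text = str(value)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(option_to_xml("vpxd.das.electionWaitTimeSec", 240))
```

The generated fragment is meant to show the nesting only; in practice the element would be merged into the existing file by hand, followed by the vpxd restart these options require.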