DEFAULT
access
required: false
scopable: true
candidates: rwo, roo, rwx, rox
default: rwo
The access mode of the volume.
- `rwo` is Read Write Once
- `roo` is Read Only Once
- `rwx` is Read Write Many
- `rox` is Read Only Many
The `rox` and `rwx` modes are served by flex volume services.
app
required: false
scopable: false
default: default
A user-defined code linking to:
- who is responsible for this service.
- who is billable.
This code thus provides a most useful object grouping and filtering key.
Short and simple codes, like ERP, are easier to work with.
children
required: false
scopable: false
convert: list-lowercase
The list of services or instances, expressed as `<path>[@<nodename>]`, that must be `down` or `stdby up` to allow this service to be stopped by the daemon.
The list is whitespace-separated.
comment
required: false
scopable: false
Comments help the users understand the role of the object and its resources.
comp_schedule
required: false
scopable: true
default: ~00:00-06:00
The instance compliance run schedule.
See `usr/share/doc/schedule` for the schedule syntax.
create_pg
required: false
scopable: true
default: true
convert: bool
Use process grouping when possible.
If turned on, the agent will create a container group for:
- the object
- each resource group (i.e. the subset:drivergroup tuple)
- each resource
A container group allows capping the memory, swap and cpu usage.
These caps can be defined using the `pg_*` keywords in the DEFAULT, the subset or the resource section.
disable
required: false
scopable: true
convert: bool
Disables the object instance, which has the following effects:
- The instance status and the status of all its resources are `n/a`.
- Stop and start actions have no effect and do not produce errors.
- Disabled resources are not enabled when DEFAULT.disable=false.
drpnodes
required: false
scopable: true
convert: other-nodes
Example:
drpnodes = n1 n2
A node selector expression specifying the list of cluster nodes hosting object instances when all primary nodes are unavailable, like in a DRP situation.
If not specified or left empty, the node evaluating the keyword is assumed to be the only instance hosting node.
Labels can be used to define a list of nodes by an arbitrary property. For example, `cn=fr cn=kr` would be evaluated as `n1 n2 n3` if `n1` and `n2` have the `cn=fr` label and `n3` has the `cn=kr` label.
The glob syntax can be used in the node selector expression. For example, `n1 n[23] n4*` would be expanded to `n1 n2 n3 n4` in a `n1 n2 n3 n4 n5` cluster.
The drpnodes can be data synchronization targets for `sync` resources.
env
required: false
scopable: false
default: The same as the node `env`.
A code, such as PRD or DEV, that the agent can use to enforce data protection policies:
- A non-PRD object instance cannot be started on a PRD node
- A PRD object instance can be started on a non-PRD node (typically in a DRP situation)
The default value is read from the node `env` keyword.
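The two policy rules above can be expressed as a single predicate. This is a minimal sketch for illustration only; the function name `env_allows_start` is hypothetical and is not part of the agent.

```python
def env_allows_start(instance_env, node_env):
    """Return True if the data protection policy allows the start.

    Hypothetical helper: only the two rules listed above are modelled.
    """
    # A non-PRD object instance cannot be started on a PRD node.
    if node_env == "PRD" and instance_env != "PRD":
        return False
    # Every other combination is allowed, including a PRD instance
    # on a non-PRD node (typically in a DRP situation).
    return True


print(env_allows_start("DEV", "PRD"))
print(env_allows_start("PRD", "DEV"))
```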
flex_max
required: false
scopable: false
depends: topology=flex
default: The number of elements in `nodes`.
convert: int
The maximum number of up instances of this object in the cluster. Above this number the aggregated object status is degraded to warn.
The `0` value is interpreted as unlimited.
flex_min
required: false
scopable: false
depends: topology=flex
default: 1
convert: int
The minimum number of up instances of this object in the cluster. Below this number the aggregated object status is degraded to warn.
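The effect of `flex_min` and `flex_max` on the aggregated status can be sketched as below. This assumes the up instance count is the only input; the function name `flex_count_is_warn` is hypothetical, and the real status aggregation also considers other conditions.

```python
def flex_count_is_warn(up_count, flex_min, flex_max):
    """Return True when the up instance count alone degrades the status to warn."""
    # Below flex_min, the aggregated status is degraded to warn.
    if up_count < flex_min:
        return True
    # flex_max = 0 is interpreted as unlimited, so no upper bound applies.
    if flex_max != 0 and up_count > flex_max:
        return True
    return False


print(flex_count_is_warn(up_count=0, flex_min=1, flex_max=3))
print(flex_count_is_warn(up_count=5, flex_min=1, flex_max=0))
```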
flex_primary
required: false
scopable: true
depends: topology=flex
default: The first node of `nodes`.
convert: list-lowercase
The node in charge of syncing the other nodes in a flex object.
flex_target
required: false
scopable: false
depends: topology=flex
default: The value of `flex_min`.
convert: int
The optimal number of up instances of the object in the cluster.
The value must be between `flex_min` and `flex_max`.
If `orchestrate=ha`, the daemon is free to take action to reach the `flex_target`.
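A range check for `flex_target` might look like the sketch below. It assumes the bounds are inclusive and that `flex_max = 0` (unlimited) removes the upper bound for the target too; both are assumptions, and the function name `flex_target_is_valid` is hypothetical.

```python
def flex_target_is_valid(flex_target, flex_min, flex_max):
    """Return True if flex_target lies in the [flex_min, flex_max] range."""
    # Assumption: flex_max = 0 means unlimited, so only flex_min applies.
    upper_ok = flex_max == 0 or flex_target <= flex_max
    return flex_min <= flex_target and upper_ok


print(flex_target_is_valid(2, flex_min=1, flex_max=3))
```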
hard_affinity
required: false
scopable: false
convert: list-lowercase
Example:
hard_affinity = svc1 svc2
A whitespace-separated list of object paths.
These objects must be started on the local node to allow the local monitor to start an instance of the service.
hard_anti_affinity
required: false
scopable: false
convert: list-lowercase
Example:
hard_anti_affinity = svc1 svc2
A whitespace-separated list of object paths.
These objects must not be started on the local node to allow the local monitor to start an instance of the object.
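The hard affinity and anti-affinity constraints combine as two set tests. The following is a minimal sketch, assuming the set of objects started on the local node is already known; `affinity_allows_start` is a hypothetical name, not an agent function.

```python
def affinity_allows_start(local_started, hard_affinity=(), hard_anti_affinity=()):
    """Return True if the hard constraints allow a start on the local node."""
    started = set(local_started)
    # Every hard_affinity object must be started on the local node.
    if not set(hard_affinity) <= started:
        return False
    # No hard_anti_affinity object may be started on the local node.
    if set(hard_anti_affinity) & started:
        return False
    return True


print(affinity_allows_start(["svc1", "svc2"], hard_affinity=["svc1", "svc2"]))
print(affinity_allows_start(["svc1"], hard_anti_affinity=["svc1", "svc2"]))
```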
id
required: false
scopable: false
default: A randomly generated UUID.
An RFC 4122 random UUID generated by the agent.
monitor_action
required: false
scopable: true
candidates: crash, freezestop, none, reboot, switch, no-op
default: none
convert: list
Example:
monitor_action = reboot
The action to trigger when a monitored resource is no longer in the "up" or "standby up" state, and all restart attempts for the resource have failed.
The `reboot` and `crash` monitor actions do not attempt to cleanly stop any processes. On Linux, they utilize system-level sysrq triggers.
This behavior is designed to ensure that the host stops writing to shared disks as quickly as possible, minimizing the risk of data corruption. This is critical because a failover node is likely preparing to write to the same shared disks.
You can append a fallback monitor action to this keyword. A common example is `freezestop reboot`. In this case, the reboot action will be executed if the stop fails or times out.
Other monitor_action values:
- `none`: the default value, which disables the monitor action (the `monitor` keyword must also be `false` or undefined).
- `freezestop`: freeze and subsequently stop the monitored instance.
- `switch`: try to stop the monitored instance to allow any other cluster node to take over the instance.
- `no-op`: the No Operation monitor action is called but does nothing. It may be used for demonstration. The final local expect after the call will be set to `evicted`.
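Splitting a `monitor_action` value into a primary action and an optional fallback can be sketched as follows. The candidate list is taken from this keyword's definition; the function name `parse_monitor_action` and the error handling are illustrative assumptions, not the agent's implementation.

```python
# Candidates as listed for the monitor_action keyword.
CANDIDATES = {"crash", "freezestop", "none", "reboot", "switch", "no-op"}


def parse_monitor_action(value):
    """Return a (primary, fallback) tuple; fallback is None when absent."""
    actions = value.split()
    unknown = [a for a in actions if a not in CANDIDATES]
    if unknown:
        raise ValueError(f"unknown monitor action(s): {unknown}")
    if not actions:
        # An empty value falls back to the documented default.
        return ("none", None)
    fallback = actions[1] if len(actions) > 1 else None
    return (actions[0], fallback)


print(parse_monitor_action("freezestop reboot"))
```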
monitor_schedule
required: false
scopable: true
default: @5m
The instance monitored resources status evaluation schedule.
See `usr/share/doc/schedule` for the schedule syntax.
nodes
required: false
scopable: true
default: The lowercased hostname of the evaluating node.
convert: nodes
Example:
nodes = n1 n*
A node selector expression specifying the list of cluster nodes hosting object instances.
If not specified or left empty, the node evaluating the keyword is assumed to be the only instance hosting node.
Labels can be used to define a list of nodes by an arbitrary property. For example, `cn=fr cn=kr` would be evaluated as `n1 n2 n3` if `n1` and `n2` have the `cn=fr` label and `n3` has the `cn=kr` label.
The glob syntax can be used in the node selector expression. For example, `n1 n[23] n4*` would be expanded to `n1 n2 n3 n4` in a `n1 n2 n3 n4 n5` cluster.
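The label and glob examples above can be reproduced with a short sketch. It uses Python's `fnmatch` module to approximate the glob syntax and a plain dictionary for node labels; the function name `expand_nodes` and the `labels` structure are hypothetical, and the agent's selector may support more syntax than shown here.

```python
from fnmatch import fnmatchcase


def expand_nodes(selector, cluster_nodes, labels=None):
    """Expand a whitespace-separated node selector into a list of node names.

    labels maps a node name to a dict of its labels, e.g. {"n1": {"cn": "fr"}}.
    """
    labels = labels or {}
    selected = []
    for element in selector.split():
        if "=" in element:
            # Label element: keep the nodes carrying this key=value label.
            key, _, val = element.partition("=")
            matches = [n for n in cluster_nodes if labels.get(n, {}).get(key) == val]
        else:
            # Plain name or glob pattern, matched against the cluster nodes.
            matches = [n for n in cluster_nodes if fnmatchcase(n, element)]
        for node in matches:
            if node not in selected:
                selected.append(node)
    return selected


cluster = ["n1", "n2", "n3", "n4", "n5"]
print(expand_nodes("n1 n[23] n4*", cluster))
print(expand_nodes("cn=fr cn=kr", cluster,
                   {"n1": {"cn": "fr"}, "n2": {"cn": "fr"}, "n3": {"cn": "kr"}}))
```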
parents
required: false
scopable: false
convert: list-lowercase
The list of services or instances, expressed as `<path>[@<nodename>]`, that must be `up` to allow this service to be started by the daemon.
The list is whitespace-separated.
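The `<path>[@<nodename>]` notation can be split into its two parts as sketched below. The function name `parse_relations` is hypothetical, and the lowercasing mirrors the `list-lowercase` converter declared for this keyword.

```python
def parse_relations(value):
    """Split a whitespace-separated relation list into (path, nodename) pairs.

    nodename is None when an entry has no @<nodename> suffix.
    """
    relations = []
    for entry in value.lower().split():
        path, sep, nodename = entry.partition("@")
        relations.append((path, nodename if sep else None))
    return relations


print(parse_relations("svc1 svc2@N1"))
```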
pg_blkio_weight
required: false
scopable: true
Example:
pg_blkio_weight = 50
Block IO relative weight. Value: between `10` and `1000`.
The kernel default is `1000`.
pg_cpu_quota
required: false
scopable: true
Example:
pg_cpu_quota = 50%@all
The kernel default value is used, which usually is 1024 shares.
In a CPU-bound situation, this setting ensures the service does not use more than its share of CPU resources. The actual percentage depends on the shares allotted to other services.
pg_cpu_shares
required: false
scopable: true
convert: size
Example:
pg_cpu_shares = 512
The kernel default value is used, which usually is 1024 shares.
In a CPU-bound situation, this setting ensures the service does not use more than its share of CPU resources. The actual percentage depends on the shares allotted to other services.
pg_cpus
required: false
scopable: true
depends: create_pg=true
Example:
pg_cpus = 0-2
Allow service processes to bind only to the specified CPUs.
CPUs are specified as a list or a range: `0,1,2` or `0-2`.
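Both notations describe the same CPU set, which a small parser makes explicit. This is a sketch only; `parse_cpu_list` is a hypothetical helper, and accepting mixed forms such as `0,2-3` is an assumption beyond what the text states.

```python
def parse_cpu_list(value):
    """Expand a list or range of CPU ids into a sorted list of integers."""
    cpus = set()
    for part in value.split(","):
        part = part.strip()
        if "-" in part:
            # Range form: both bounds are inclusive.
            first, last = part.split("-", 1)
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


print(parse_cpu_list("0,1,2") == parse_cpu_list("0-2"))
```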
pg_mem_limit
required: false
scopable: true
convert: size
Example:
pg_mem_limit = 512m
Ensures the service does not use more than specified memory (in bytes).
The Out-Of-Memory killer is triggered if the limit is exceeded.
pg_mem_oom_control
required: false
scopable: true
Example:
pg_mem_oom_control = 1
A flag (0 or 1) that enables or disables the Out of Memory killer for the processes of the group.
- If enabled (0), tasks that attempt to consume more memory than they are allowed are immediately killed by the OOM killer.
- If disabled (1), tasks are allowed to continue to try allocating memory, stressing the system.
The OOM killer is enabled by default in every cgroup using the memory controller.
pg_mem_swappiness
required: false
scopable: true
Example:
pg_mem_swappiness = 40
Set a swappiness percentage value for the process group.
pg_mems
required: false
scopable: true
Example:
pg_mems = 0-2
Allow service processes to bind only to the specified memory nodes.
Memory nodes are specified as a list or a range: `0,1,2` or `0-2`.
pg_vmem_limit
required: false
scopable: true
convert: size
Example:
pg_vmem_limit = 1g
Ensures the service does not use more than specified memory+swap (in bytes).
The Out-Of-Memory killer is triggered if the limit is exceeded.
The specified value must be greater than `pg_mem_limit`.
pool
required: false
scopable: true
The name of the pool this volume was allocated from.
pre_monitor_action
required: false
scopable: true
Example:
pre_monitor_action = /bin/true
A callout to execute before the `monitor_action`.
For example, if `monitor_action = freezestop`, a `pre_monitor_action` script may decide to crash the server if it detects a situation where `freezestop` cannot succeed (for example, a fs cannot be unmounted due to an unresponsive storage array).
provision_timeout
required: false
scopable: true
convert: duration
Example:
provision_timeout = 1m30s
Wait for `<duration>` before declaring the action a failure.
Takes precedence over `timeout`.
resinfo_schedule
required: false
scopable: true
default: @60m
The instance key-val table emit schedule.
See `usr/share/doc/schedule` for the schedule syntax.
rollback
required: false
scopable: true
default: true
convert: bool
If set to `false`, the default rollback on start action error behaviour is disabled, leaving the instance in its half-started state (avail `warn`).
The daemon then refuses to failover a service if any instance is in the `warn` availability state. It is highly recommended not to use `rollback=false` if `orchestrate=ha`.
run_schedule
required: false
scopable: true
The instance tasks run action default schedule.
See `usr/share/doc/schedule` for the schedule syntax.
shared
required: false
scopable: true
default: true
convert: bool
If `true`, the resource will be considered shared during provision and unprovision actions.
A shared resource driver can implement a different behaviour depending on whether it is run from the leader instance or not:
- When `--leader` is set, the driver creates and configures the system objects. For example, the disk.disk driver allocates a SAN disk and discovers its block devices.
- When `--leader` is not set, the driver does not redo the actions already done by the leader, but may do some. For example, the disk.disk driver skips the SAN disk allocation, but discovers the block devices.
The daemon takes care of setting the `--leader` flags on the commands it submits during deploy, purge, provision and unprovision orchestrations.
Warning: If admins want to submit `--local` provision or unprovision commands themselves, they have to set the `--leader` flag correctly.
Flex objects usually don't use shared resources. But if they do, only the flex primary gets `--leader` commands.
size
required: false
scopable: true
convert: size
The size used by this volume in its pool.
soft_affinity
required: false
scopable: false
convert: list-lowercase
Example:
soft_affinity = svc1 svc2
A whitespace-separated list of services that must be started on the node to allow the monitor to start this service.
If the local node is the only candidate, this constraint is ignored and the start is allowed.
soft_anti_affinity
required: false
scopable: false
convert: list-lowercase
Example:
soft_anti_affinity = svc1 svc2
A whitespace-separated list of services that must not be started on the node to allow the monitor to start this service.
If the local node is the only candidate, this constraint is ignored and the start is allowed.
start_timeout
required: false
scopable: true
convert: duration
Example:
start_timeout = 1m30s
Wait for `<duration>` before declaring the action a failure.
Takes precedence over `timeout`.
stat_timeout
required: false
scopable: true
convert: duration
The fs resources status evaluation includes a stat syscall test. This keyword defines the maximum wait time for those stat calls to respond.
When expired, the resource status is degraded to warn, which can trigger a monitor action (reboot or crash the node) if the resource is monitored.
status_schedule
required: false
scopable: true
default: @10m
The instance status evaluation schedule.
See `usr/share/doc/schedule` for the schedule syntax.
status_timeout
required: false
scopable: true
default: 1m
convert: duration
Example:
status_timeout = 10s
The maximum duration of the instance status evaluation.
For example, the total start action duration is constrained by different timeouts:
- the `start_timeout`, limiting the start action duration.
- the `stop_timeout`, limiting the duration of the start rollback triggered by start errors.
- the `status_timeout`, limiting the post-start instance status evaluation duration.
stonith
required: false
scopable: false
depends: topology=failover
default: false
convert: bool
Shoot The Other Node In The Head, aka fence, using a callout.
The callout is triggered after a quorum vote is won, when the surviving node is about to start a local instance of a service that was known to be started on an unreachable peer node.
The callout is meant to prevent the peer from writing to shared disks, remote databases, and from responding to clients.
The Fence Agents project is a well-known bundle of callouts used by many clustering tools.
stop_timeout
required: false
scopable: true
convert: duration
Example:
stop_timeout = 1m30s
Wait for `<duration>` before declaring the action a failure.
Takes precedence over `timeout`.
sync_schedule
required: false
scopable: true
default: 04:00-06:00
The instance sync default schedule.
See `usr/share/doc/schedule` for the schedule syntax.
sync_timeout
required: false
scopable: true
convert: duration
Example:
sync_timeout = 1m30s
Wait for `<duration>` before declaring the action a failure.
Takes precedence over `timeout`.
timeout
required: false
scopable: true
default: 1h
convert: duration
Example:
timeout = 2h
Wait for `<duration>` before declaring a state-changing action a failure.
A per-action `<action>_timeout` can override this value.
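The precedence between `timeout` and the per-action keywords can be sketched as a simple lookup. The configuration is modelled here as a plain dictionary of unparsed duration strings; the function name `effective_timeout` is hypothetical and the agent resolves these values through its own keyword evaluation.

```python
def effective_timeout(config, action):
    """Return the duration string that applies to the given action."""
    # A per-action <action>_timeout takes precedence when it is set.
    per_action = config.get(f"{action}_timeout")
    if per_action is not None:
        return per_action
    # Otherwise use timeout, whose documented default is 1h.
    return config.get("timeout", "1h")


config = {"timeout": "2h", "start_timeout": "1m30s"}
print(effective_timeout(config, "start"))
print(effective_timeout(config, "stop"))
```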
topology
required: false
scopable: false
candidates: failover, flex
default: failover
- `failover`: the service is allowed to be up on one node at a time.
- `flex`: the service can be up on `flex_target` nodes, where `flex_target` must be in the `[flex_min, flex_max]` range.
type
required: false
scopable: false
The resource driver name.
unprovision_timeout
required: false
scopable: true
convert: duration
Example:
unprovision_timeout = 1m30s
Wait for `<duration>` before declaring the action a failure.
Takes precedence over `timeout`.