DEFAULT

access

required:    false
scopable:    true
candidates:  rwo, roo, rwx, rox
default:     rwo

The access mode of the volume.

  • rwo is Read Write Once
  • roo is Read Only Once
  • rwx is Read Write Many
  • rox is Read Only Many

rox and rwx modes are served by flex volume services.

app

required:    false
scopable:    false
default:     default

A user-defined code linking to:

  • who is responsible for this service.
  • who is billable.

This code thus provides a most useful object grouping and filtering key.

Short and simple codes, like ERP, are easier to work with.

children

required:    false
scopable:    false
convert:     list-lowercase

The list of services or instances, expressed as <path>[@<nodename>], that must be down or standby up to allow this service to be stopped by the daemon.

The list is whitespace-separated.

comment

required:    false
scopable:    false

Comments help the users understand the role of the object and its resources.

comp_schedule

required:    false
scopable:    true
default:     ~00:00-06:00

The instance compliance run schedule.

See usr/share/doc/schedule for the schedule syntax.

create_pg

required:    false
scopable:    true
default:     true
convert:     bool

Use process grouping when possible.

If turned on, the agent will create a container group for:

  • the object
  • each resource group (i.e., the subset:drivergroup tuple)
  • each resource

A container group allows capping the memory, swap and cpu usage. These limits can be defined using the pg_* keywords in the DEFAULT, subset or resource sections.

disable

required:    false
scopable:    true
convert:     bool

Disables the object instance, which has the following effects:

  • The instance status and the status of all its resources are n/a.
  • Stop and start actions have no effect, and do not produce errors.
  • Disabled resources are not enabled when DEFAULT.disable=false.

drpnodes

required:    false
scopable:    true
convert:     other-nodes

Example:

drpnodes = n1 n2

A node selector expression specifying the list of cluster nodes hosting object instances when all primary nodes are unavailable, like in a DRP situation.

If not specified or left empty, the node evaluating the keyword is assumed to be the only instance hosting node.

Labels can be used to define a list of nodes by an arbitrary property. For example, cn=fr cn=kr would be evaluated as n1 n2 n3 if n1 and n2 have the cn=fr label and n3 has the cn=kr label.

The glob syntax can be used in the node selector expression. For example, n1 n[23] n4* would be expanded to n1 n2 n3 n4 in a cluster made of n1 n2 n3 n4 n5.

The drpnodes can be data synchronization targets for sync resources.

env

required:    false
scopable:    false
default:     The same as the node `env`.

A code, such as PRD or DEV, that the agent can use to enforce data protection policies:

  • A non-PRD object instance cannot be started on a PRD node.
  • A PRD object instance can be started on a non-PRD node (typically in a DRP situation).

The default value is read from the node env keyword.
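The two policy rules above can be sketched as a small predicate. This is only an illustration, not the agent's code: the function name can_start is hypothetical, and the exact, case-sensitive comparison against "PRD" is an assumption.

```python
def can_start(instance_env: str, node_env: str) -> bool:
    """Hypothetical helper: apply the env data protection rules.

    The only refused combination is a non-PRD instance on a PRD node.
    Comparing against the exact string "PRD" is an assumption.
    """
    return not (node_env == "PRD" and instance_env != "PRD")


# A PRD instance may start on a DEV node (DRP situation) ...
print(can_start("PRD", "DEV"))
# ... but a DEV instance is refused on a PRD node.
print(can_start("DEV", "PRD"))
```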

flex_max

required:    false
scopable:    false
depends:     topology=flex
default:     The number of elements in `nodes`.
convert:     int

The maximum number of up instances of this object in the cluster. Above this number the aggregated object status is degraded to warn.

The 0 value is interpreted as unlimited.

flex_min

required:    false
scopable:    false
depends:     topology=flex
default:     1
convert:     int

The minimum number of up instances of this object in the cluster. Below this number the aggregated object status is degraded to warn.

flex_primary

required:    false
scopable:    true
depends:     topology=flex
default:     The first node of `nodes`.
convert:     list-lowercase

The node in charge of syncing the other nodes in a flex object.

flex_target

required:    false
scopable:    false
depends:     topology=flex
default:     The value of `flex_min`.
convert:     int

The optimal number of up instances of the object in the cluster. The value must be between flex_min and flex_max.

If orchestrate=ha, the daemon is free to take action to reach the flex_target.
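The relation between flex_min, flex_max and flex_target can be sketched as follows. This is a minimal illustration of the rules stated for these three keywords, not the daemon's implementation; the function names are hypothetical.

```python
def flex_target_is_valid(flex_min: int, flex_max: int, flex_target: int) -> bool:
    """Hypothetical check: flex_target must lie between flex_min and flex_max.

    A flex_max of 0 is interpreted as unlimited, so no upper bound applies.
    """
    upper_ok = flex_max == 0 or flex_target <= flex_max
    return flex_min <= flex_target and upper_ok


def flex_is_degraded(up_count: int, flex_min: int, flex_max: int) -> bool:
    """Hypothetical check: True when the aggregated status would be warn."""
    if up_count < flex_min:
        return True
    return flex_max != 0 and up_count > flex_max
```

For instance, with flex_min=1 and flex_max=3, a flex_target of 2 is accepted and 4 is rejected, while a flex_max of 0 accepts any target that is at least flex_min.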

hard_affinity

required:    false
scopable:    false
convert:     list-lowercase

Example:

hard_affinity = svc1 svc2

A whitespace-separated list of object paths.

These objects must be started on the local node to allow the local monitor to start an instance of the service.

hard_anti_affinity

required:    false
scopable:    false
convert:     list-lowercase

Example:

hard_anti_affinity = svc1 svc2

A whitespace-separated list of object paths.

These objects must not be started on the local node to allow the local monitor to start an instance of the object.

id

required:    false
scopable:    false
default:     A randomly generated UUID.

An RFC 4122 random UUID generated by the agent.

monitor_action

required:    false
scopable:    true
candidates:  crash, freezestop, none, reboot, switch, no-op
default:     none
convert:     list

Example:

monitor_action = reboot

The action to trigger when a monitored resource is no longer in the "up" or "standby up" state, and all restart attempts for the resource have failed.

The reboot and crash monitor actions do not attempt to cleanly stop any processes. On Linux, they utilize system-level sysrq triggers.

This behavior is designed to ensure that the host stops writing to shared disks as quickly as possible, minimizing the risk of data corruption. This is critical because a failover node is likely preparing to write to the same shared disks.

You can append a fallback monitor action to this keyword. A common example is freezestop reboot. In this case, the reboot action will be executed if the stop fails or times out.

Other monitor_action values:

  • none: the default value. The monitor action is disabled (the monitor keyword must also be false or undefined).
  • freezestop: freeze and subsequently stop the monitored instance.
  • switch: try to stop the monitored instance to allow any other cluster node to take over the instance.
  • no-op: the No Operation monitor action is called but does nothing. It may be used for demonstration. After the call, the final local expect is set to evicted.

monitor_schedule

required:    false
scopable:    true
default:     @5m

The instance monitored resources status evaluation schedule.

See usr/share/doc/schedule for the schedule syntax.

nodes

required:    false
scopable:    true
default:     The lowercased hostname of the evaluating node.
convert:     nodes

Example:

nodes = n1 n*

A node selector expression specifying the list of cluster nodes hosting object instances.

If not specified or left empty, the node evaluating the keyword is assumed to be the only instance hosting node.

Labels can be used to define a list of nodes by an arbitrary property. For example, cn=fr cn=kr would be evaluated as n1 n2 n3 if n1 and n2 have the cn=fr label and n3 has the cn=kr label.

The glob syntax can be used in the node selector expression. For example, n1 n[23] n4* would be expanded to n1 n2 n3 n4 in a cluster made of n1 n2 n3 n4 n5.
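The label and glob examples above can be reproduced with a short sketch. It only illustrates the selector semantics described here and is not the agent's resolver: the function expand_nodes, its arguments and the labels mapping are hypothetical, and Python's fnmatch is used as a stand-in for the agent's glob matching.

```python
from fnmatch import fnmatchcase


def expand_nodes(selector, cluster_nodes, labels):
    """Hypothetical sketch: expand a node selector into a list of node names.

    selector      whitespace-separated tokens: node names, globs or key=value labels
    cluster_nodes ordered list of the cluster node names
    labels        mapping of node name to a {label key: label value} dict
    """
    selected = []
    for token in selector.split():
        for node in cluster_nodes:
            if "=" in token:
                # Label token: select the nodes carrying this exact label.
                key, _, value = token.partition("=")
                matched = labels.get(node, {}).get(key) == value
            else:
                # Name or glob token.
                matched = fnmatchcase(node, token)
            if matched and node not in selected:
                selected.append(node)
    return selected


cluster = ["n1", "n2", "n3", "n4", "n5"]
labels = {"n1": {"cn": "fr"}, "n2": {"cn": "fr"}, "n3": {"cn": "kr"}}
print(expand_nodes("n1 n[23] n4*", cluster, {}))    # ['n1', 'n2', 'n3', 'n4']
print(expand_nodes("cn=fr cn=kr", cluster, labels)) # ['n1', 'n2', 'n3']
```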

parents

required:    false
scopable:    false
convert:     list-lowercase

The list of services or instances expressed as <path>[@<nodename>] that must be up to allow this service to be started by the daemon.

The list is whitespace-separated.

pg_blkio_weight

required:    false
scopable:    true

Example:

pg_blkio_weight = 50

Block IO relative weight. Value: between 10 and 1000.

The kernel default is 1000.

pg_cpu_quota

required:    false
scopable:    true

Example:

pg_cpu_quota = 50%@all

The kernel default value is used, which usually is 1024 shares.

In a cpu-bound situation, this setting ensures the service does not use more than its share of cpu resources. The actual percentage depends on the shares allotted to other services.

pg_cpu_shares

required:    false
scopable:    true
convert:     size

Example:

pg_cpu_shares = 512

The kernel default value is used, which usually is 1024 shares.

In a cpu-bound situation, this setting ensures the service does not use more than its share of cpu resources. The actual percentage depends on the shares allotted to other services.

pg_cpus

required:    false
scopable:    true
depends:     create_pg=true

Example:

pg_cpus = 0-2

Allow the service processes to bind only to the specified cpus.

Cpus are specified as a list or a range: 0,1,2 or 0-2.
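As an illustration of the list and range notations, the sketch below expands a specification into explicit cpu numbers. The helper parse_cpu_list is hypothetical, and accepting a mix of both notations (such as 0-2,4) is an assumption not stated above.

```python
def parse_cpu_list(spec: str) -> list:
    """Hypothetical helper: expand '0,1,2' or '0-2' into a list of ints."""
    cpus = []
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            # Range notation: both bounds are inclusive.
            first, last = part.split("-", 1)
            cpus.extend(range(int(first), int(last) + 1))
        else:
            cpus.append(int(part))
    return cpus


print(parse_cpu_list("0,1,2"))  # [0, 1, 2]
print(parse_cpu_list("0-2"))    # [0, 1, 2]
```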

pg_mem_limit

required:    false
scopable:    true
convert:     size

Example:

pg_mem_limit = 512m

Ensures the service does not use more than specified memory (in bytes).

The Out-Of-Memory killer is triggered when the limit is exceeded.

pg_mem_oom_control

required:    false
scopable:    true

Example:

pg_mem_oom_control = 1

A flag (0 or 1) that enables or disables the Out of Memory killer for the processes of the group.

  • If enabled (0), tasks that attempt to consume more memory than they are allowed are immediately killed by the OOM killer.
  • If disabled (1), tasks are allowed to continue to try allocating memory, stressing the system.

The OOM killer is enabled by default in every cgroup using the memory controller.

pg_mem_swappiness

required:    false
scopable:    true

Example:

pg_mem_swappiness = 40

Set a swappiness percentage value for the process group.

pg_mems

required:    false
scopable:    true

Example:

pg_mems = 0-2

Allow the service processes to bind only to the specified memory nodes.

Memory nodes are specified as a list or a range: 0,1,2 or 0-2.

pg_vmem_limit

required:    false
scopable:    true
convert:     size

Example:

pg_vmem_limit = 1g

Ensures the service does not use more than specified memory+swap (in bytes).

The Out-Of-Memory killer is triggered when the limit is exceeded. The specified value must be greater than pg_mem_limit.

pool

required:    false
scopable:    true

The name of the pool this volume was allocated from.

pre_monitor_action

required:    false
scopable:    true

Example:

pre_monitor_action = /bin/true

A callout to execute before the monitor_action.

For example, if monitor_action = freezestop, a pre_monitor_action script may decide to crash the server if it detects a situation where freezestop cannot succeed (for example, a filesystem cannot be unmounted due to an unresponsive storage array).

provision_timeout

required:    false
scopable:    true
convert:     duration

Example:

provision_timeout = 1m30s

Wait for <duration> before declaring the action a failure.

Takes precedence over timeout.

resinfo_schedule

required:    false
scopable:    true
default:     @60m

The instance key-val table emit schedule.

See usr/share/doc/schedule for the schedule syntax.

rollback

required:    false
scopable:    true
default:     true
convert:     bool

If set to false, the default behaviour of rolling back after a start action error is disabled, leaving the instance in its half-started state (avail warn).

The daemon then refuses to fail over a service if any instance is in the warn availability state. It is highly recommended not to use rollback=false if orchestrate=ha.

run_schedule

required:    false
scopable:    true

The instance tasks run action default schedule.

See usr/share/doc/schedule for the schedule syntax.

shared

required:    false
scopable:    true
default:     true
convert:     bool

If true, the resource will be considered shared during provision and unprovision actions.

A shared resource driver can implement a different behaviour depending on whether it is run from the leader instance or not:

  • When --leader is set, the driver creates and configures the system objects. For example, the disk.disk driver allocates a SAN disk and discovers its block devices.

  • When --leader is not set, the driver does not redo the actions already done by the leader, but may do some. For example, the disk.disk driver skips the SAN disk allocation, but discovers the block devices.

The daemon takes care of setting the --leader flags on the commands it submits during deploy, purge, provision and unprovision orchestrations.

Warning: If admins want to submit --local provision or unprovision commands themselves, they have to set the --leader flag correctly.

Flex objects usually don't use shared resources. But if they do, only the flex primary gets --leader commands.

size

required:    false
scopable:    true
convert:     size

The size used by this volume in its pool.

soft_affinity

required:    false
scopable:    false
convert:     list-lowercase

Example:

soft_affinity = svc1 svc2

A whitespace-separated list of services that must be started on the node to allow the monitor to start this service.

If the local node is the only candidate, this constraint is ignored and the start is allowed.

soft_anti_affinity

required:    false
scopable:    false
convert:     list-lowercase

Example:

soft_anti_affinity = svc1 svc2

A whitespace-separated list of services that must not be started on the node to allow the monitor to start this service.

If the local node is the only candidate, this constraint is ignored and the start is allowed.

start_timeout

required:    false
scopable:    true
convert:     duration

Example:

start_timeout = 1m30s

Wait for <duration> before declaring the action a failure.

Takes precedence over timeout.

stat_timeout

required:    false
scopable:    true
convert:     duration

The fs resources status evaluation includes a stat syscall test. This keyword defines the maximum wait time for those stat calls to respond.

When expired, the resource status is degraded to warn, which can trigger a monitor action (reboot or crash the node) if the resource is monitored.

status_schedule

required:    false
scopable:    true
default:     @10m

The instance status evaluation schedule.

See usr/share/doc/schedule for the schedule syntax.

status_timeout

required:    false
scopable:    true
default:     1m
convert:     duration

Example:

status_timeout = 10s

The maximum duration of the instance status evaluation.

For example, the total start action duration is constrained by different timeouts:

  • the start_timeout, limiting the start action duration.

  • the stop_timeout, limiting the duration of the start rollback triggered by start errors.

  • the status_timeout, limiting the post-start instance status evaluation duration.

stonith

required:    false
scopable:    false
depends:     topology=failover
default:     false
convert:     bool

Shoot The Other Node In The Head, aka fence, using a callout.

The callout is triggered after a quorum vote is won, when the surviving node is about to start a local instance of a service that was known to be started on an unreachable peer node.

The callout is meant to prevent the peer from writing to shared disks, remote databases, and from responding to clients.

The Fence Agents project is a well-known bundle of callouts used by many clustering tools.

stop_timeout

required:    false
scopable:    true
convert:     duration

Example:

stop_timeout = 1m30s

Wait for <duration> before declaring the action a failure.

Takes precedence over timeout.

sync_schedule

required:    false
scopable:    true
default:     04:00-06:00

The instance sync default schedule.

See usr/share/doc/schedule for the schedule syntax.

sync_timeout

required:    false
scopable:    true
convert:     duration

Example:

sync_timeout = 1m30s

Wait for <duration> before declaring the action a failure.

Takes precedence over timeout.

timeout

required:    false
scopable:    true
default:     1h
convert:     duration

Example:

timeout = 2h

Wait for <duration> before declaring a state-changing action a failure.

A per-action <action>_timeout can override this value.
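The precedence between timeout and the per-action keywords can be sketched as a simple lookup. This is an illustration only: the function effective_timeout and the plain dictionary standing in for the evaluated configuration are hypothetical, and durations are kept as unparsed strings.

```python
def effective_timeout(config: dict, action: str) -> str:
    """Hypothetical lookup: <action>_timeout wins over timeout.

    Falls back to the documented timeout default (1h) when neither is set.
    """
    return config.get(f"{action}_timeout", config.get("timeout", "1h"))


config = {"timeout": "2h", "start_timeout": "1m30s"}
print(effective_timeout(config, "start"))  # 1m30s
print(effective_timeout(config, "stop"))   # 2h
```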

topology

required:    false
scopable:    false
candidates:  failover, flex
default:     failover

  • failover

    The service is allowed to be up on one node at a time.

  • flex

    The service can be up on flex_target nodes, where flex_target must be in the [flex_min, flex_max] range.

type

required:    false
scopable:    false

The resource driver name.

unprovision_timeout

required:    false
scopable:    true
convert:     duration

Example:

unprovision_timeout = 1m30s

Wait for <duration> before declaring the action a failure.

Takes precedence over timeout.