Cluster Settings

Warning:
CockroachDB v19.1 is no longer supported as of October 30, 2020. For more details, refer to the Release Support Policy.

Cluster settings apply to all nodes of a CockroachDB cluster. They control, for example, whether to share diagnostic details with Cockroach Labs, and they include advanced options for debugging and cluster tuning.

They can be updated anytime after a cluster has been started, but only by a member of the admin role, to which the root user belongs by default.

Note:

In contrast to cluster-wide settings, node-level settings apply to a single node. They are defined by flags passed to the cockroach start command when starting a node and cannot be changed without stopping and restarting the node. For more details, see Start a Node.

Settings

Warning:

Many cluster settings are intended for tuning CockroachDB internals. Before changing these settings, we strongly encourage you to discuss your goals with Cockroach Labs; otherwise, you use them at your own risk.

Setting | Type | Default | Description
changefeed.experimental_poll_interval | duration | 1s | polling interval for the prototype changefeed implementation (WARNING: may compromise cluster stability or correctness; do not edit without supervision)
changefeed.push.enabled | boolean | true | if set, changes are pushed instead of pulled. This requires the kv.rangefeed.enabled setting. See https://www.cockroachlabs.com/docs/v19.1/change-data-capture.html#enable-rangefeeds-to-reduce-latency
cloudstorage.gs.default.key | string |  | if set, JSON key to use during Google Cloud Storage operations
cloudstorage.http.custom_ca | string |  | custom root CA (appended to system's default CAs) for verifying certificates when interacting with HTTPS storage
cloudstorage.timeout | duration | 10m0s | the timeout for import/export storage operations
cluster.organization | string |  | organization name
cluster.preserve_downgrade_option | string |  | disable (automatic or manual) cluster version upgrade from the specified version until reset
compactor.enabled | boolean | true | when false, the system will reclaim space occupied by deleted data less aggressively
compactor.max_record_age | duration | 24h0m0s | discard suggestions not processed within this duration (WARNING: may compromise cluster stability or correctness; do not edit without supervision)
compactor.min_interval | duration | 15s | minimum time interval to wait before compacting (WARNING: may compromise cluster stability or correctness; do not edit without supervision)
compactor.threshold_available_fraction | float | 0.1 | consider suggestions for at least the given percentage of the available logical space (zero to disable) (WARNING: may compromise cluster stability or correctness; do not edit without supervision)
compactor.threshold_bytes | byte size | 256 MiB | minimum expected logical space reclamation required before considering an aggregated suggestion (WARNING: may compromise cluster stability or correctness; do not edit without supervision)
compactor.threshold_used_fraction | float | 0.1 | consider suggestions for at least the given percentage of the used logical space (zero to disable) (WARNING: may compromise cluster stability or correctness; do not edit without supervision)
debug.panic_on_failed_assertions | boolean | false | panic when an assertion fails rather than reporting
diagnostics.forced_stat_reset.interval | duration | 2h0m0s | interval after which pending diagnostics statistics should be discarded even if not reported
diagnostics.reporting.enabled | boolean | true | enable reporting diagnostic metrics to Cockroach Labs
diagnostics.reporting.interval | duration | 1h0m0s | interval at which diagnostics data should be reported (should be shorter than diagnostics.forced_stat_reset.interval)
diagnostics.reporting.send_crash_reports | boolean | true | send crash and panic reports
external.graphite.endpoint | string |  | if nonempty, push server metrics to the Graphite or Carbon server at the specified host:port
external.graphite.interval | duration | 10s | the interval at which metrics are pushed to Graphite (if enabled)
jobs.registry.leniency | duration | 1m0s | the amount of time to defer any attempts to reschedule a job
jobs.retention_time | duration | 336h0m0s | the amount of time to retain records for completed jobs before
kv.allocator.lease_rebalancing_aggressiveness | float | 1 | set greater than 1.0 to rebalance leases toward load more aggressively, or between 0 and 1.0 to be more conservative about rebalancing leases
kv.allocator.load_based_lease_rebalancing.enabled | boolean | true | set to enable rebalancing of range leases based on load and latency
kv.allocator.load_based_rebalancing | enumeration | 2 | whether to rebalance based on the distribution of QPS across stores [off = 0, leases = 1, leases and replicas = 2]
kv.allocator.qps_rebalance_threshold | float | 0.25 | minimum fraction away from the mean a store's QPS (queries per second) can be before it is considered overfull or underfull
kv.allocator.range_rebalance_threshold | float | 0.05 | minimum fraction away from the mean a store's range count can be before it is considered overfull or underfull
kv.bulk_io_write.addsstable_max_rate | float | 1.7976931348623157E+308 | maximum number of AddSSTable requests per second for a single store
kv.bulk_io_write.concurrent_addsstable_requests | integer | 1 | number of AddSSTable requests a store will handle concurrently before queuing
kv.bulk_io_write.concurrent_export_requests | integer | 3 | number of export requests a store will handle concurrently before queuing
kv.bulk_io_write.concurrent_import_requests | integer | 1 | number of import requests a store will handle concurrently before queuing
kv.bulk_io_write.max_rate | byte size | 1.0 TiB | the rate limit (bytes/sec) to use for writes to disk on behalf of bulk io ops
kv.bulk_sst.sync_size | byte size | 2.0 MiB | threshold after which non-Rocks SST writes must fsync (0 disables)
kv.closed_timestamp.close_fraction | float | 0.2 | fraction of closed timestamp target duration specifying how frequently the closed timestamp is advanced
kv.closed_timestamp.follower_reads_enabled | boolean | true | allow (all) replicas to serve consistent historical reads based on closed timestamp information
kv.closed_timestamp.target_duration | duration | 30s | if nonzero, attempt to provide closed timestamp notifications for timestamps trailing cluster time by approximately this duration
kv.follower_read.target_multiple | float | 3 | if above 1, encourages the distsender to perform a read against the closest replica if a request is older than kv.closed_timestamp.target_duration * (1 + kv.closed_timestamp.close_fraction * this) less a clock uncertainty interval. This value also is used to create follower_timestamp(). (WARNING: may compromise cluster stability or correctness; do not edit without supervision)
kv.import.batch_size | byte size | 32 MiB | the maximum size of the payload in an AddSSTable request (WARNING: may compromise cluster stability or correctness; do not edit without supervision)
kv.raft.command.max_size | byte size | 64 MiB | maximum size of a raft command
kv.raft_log.disable_synchronization_unsafe | boolean | false | set to true to disable synchronization on Raft log writes to persistent storage. Setting to true risks data loss or data corruption on server crashes. The setting is meant for internal testing only and SHOULD NOT be used in production.
kv.range.backpressure_range_size_multiplier | float | 2 | multiple of range_max_bytes that a range is allowed to grow to without splitting before writes to that range are blocked, or 0 to disable
kv.range_descriptor_cache.size | integer | 1000000 | maximum number of entries in the range descriptor and leaseholder caches
kv.range_merge.queue_enabled | boolean | true | whether the automatic merge queue is enabled
kv.range_merge.queue_interval | duration | 1s | how long the merge queue waits between processing replicas (WARNING: may compromise cluster stability or correctness; do not edit without supervision)
kv.range_split.by_load_enabled | boolean | true | allow automatic splits of ranges based on where load is concentrated
kv.range_split.load_qps_threshold | integer | 250 | the QPS over which the range becomes a candidate for load based splitting
kv.rangefeed.concurrent_catchup_iterators | integer | 64 | number of rangefeeds catchup iterators a store will allow concurrently before queueing
kv.rangefeed.enabled | boolean | false | if set, rangefeed registration is enabled
kv.snapshot_rebalance.max_rate | byte size | 8.0 MiB | the rate limit (bytes/sec) to use for rebalance and upreplication snapshots
kv.snapshot_recovery.max_rate | byte size | 8.0 MiB | the rate limit (bytes/sec) to use for recovery snapshots
kv.transaction.max_intents_bytes | integer | 262144 | maximum number of bytes used to track write intents in transactions
kv.transaction.max_refresh_spans_bytes | integer | 256000 | maximum number of bytes used to track refresh spans in serializable transactions
kv.transaction.write_pipelining_enabled | boolean | true | if enabled, transactional writes are pipelined through Raft consensus
kv.transaction.write_pipelining_max_batch_size | integer | 128 | if non-zero, defines the maximum size batch that will be pipelined through Raft consensus
kv.transaction.write_pipelining_max_outstanding_size | byte size | 256 KiB | maximum number of bytes used to track in-flight pipelined writes before disabling pipelining
rocksdb.ingest_backpressure.delay_l0_file | duration | 200ms | delay to add to SST ingestions per file in L0 over the configured limit
rocksdb.ingest_backpressure.l0_file_count_threshold | integer | 20 | number of L0 files after which to backpressure SST ingestions
rocksdb.ingest_backpressure.max_delay | duration | 5s | maximum amount of time to backpressure a single SST ingestion
rocksdb.ingest_backpressure.pending_compaction_threshold | byte size | 64 GiB | pending compaction estimate above which to backpressure SST ingestions
rocksdb.min_wal_sync_interval | duration | 0s | minimum duration between syncs of the RocksDB WAL
schemachanger.backfiller.buffer_size | byte size | 196 MiB | amount to buffer in memory during backfills
schemachanger.backfiller.max_sst_size | byte size | 16 MiB | target size for ingested files during backfills
schemachanger.bulk_index_backfill.batch_size | integer | 5000 | number of rows to process at a time during bulk index backfill
schemachanger.bulk_index_backfill.enabled | boolean | true | backfill indexes in bulk via addsstable
schemachanger.lease.duration | duration | 5m0s | the duration of a schema change lease
schemachanger.lease.renew_fraction | float | 0.5 | the fraction of schemachanger.lease.duration remaining to trigger a renew of the lease
server.clock.forward_jump_check_enabled | boolean | false | if enabled, forward clock jumps > max_offset/2 will cause a panic
server.clock.persist_upper_bound_interval | duration | 0s | the interval between persisting the wall time upper bound of the clock. The clock does not generate a wall time greater than the persisted timestamp and will panic if it sees a wall time greater than this value. When cockroach starts, it waits for the wall time to catch-up till this persisted timestamp. This guarantees monotonic wall time across server restarts. Not setting this or setting a value of 0 disables this feature.
server.consistency_check.interval | duration | 24h0m0s | the time between range consistency checks; set to 0 to disable consistency checking
server.declined_reservation_timeout | duration | 1s | the amount of time to consider the store throttled for up-replication after a reservation was declined
server.eventlog.ttl | duration | 2160h0m0s | if nonzero, event log entries older than this duration are deleted every 10m0s. Should not be lowered below 24 hours.
server.failed_reservation_timeout | duration | 5s | the amount of time to consider the store throttled for up-replication after a failed reservation call
server.goroutine_dump.num_goroutines_threshold | integer | 1000 | a threshold beyond which if number of goroutines increases, then goroutine dump can be triggered
server.goroutine_dump.total_dump_size_limit | byte size | 500 MiB | total size of goroutine dumps to be kept. Dumps are GC'ed in the order of creation time. The latest dump is always kept even if its size exceeds the limit.
server.heap_profile.max_profiles | integer | 5 | maximum number of profiles to be kept. Profiles with lower score are GC'ed, but latest profile is always kept.
server.heap_profile.system_memory_threshold_fraction | float | 0.85 | fraction of system memory beyond which if Rss increases, then heap profile is triggered
server.host_based_authentication.configuration | string |  | host-based authentication configuration to use during connection authentication
server.rangelog.ttl | duration | 720h0m0s | if nonzero, range log entries older than this duration are deleted every 10m0s. Should not be lowered below 24 hours.
server.remote_debugging.mode | string | local | set to enable remote debugging, localhost-only or disable (any, local, off)
server.shutdown.drain_wait | duration | 0s | the amount of time a server waits in an unready state before proceeding with the rest of the shutdown process
server.shutdown.lease_transfer_wait | duration | 5s | the amount of time a server waits to transfer range leases before proceeding with the rest of the shutdown process
server.shutdown.query_wait | duration | 10s | the server will wait for at least this amount of time for active queries to finish
server.time_until_store_dead | duration | 5m0s | the time after which if there is no new gossiped information about a store, it is considered dead
server.web_session_timeout | duration | 168h0m0s | the duration that a newly created web session will be valid
sql.defaults.default_int_size | integer | 8 | the size, in bytes, of an INT type
sql.defaults.distsql | enumeration | 1 | default distributed SQL execution mode [off = 0, auto = 1, on = 2]
sql.defaults.experimental_vectorize | enumeration | 0 | default experimental_vectorize mode [off = 0, on = 1, always = 2]
sql.defaults.optimizer | enumeration | 1 | default cost-based optimizer mode [off = 0, on = 1, local = 2]
sql.defaults.reorder_joins_limit | integer | 4 | default number of joins to reorder
sql.defaults.results_buffer.size | byte size | 16 KiB | default size of the buffer that accumulates results for a statement or a batch of statements before they are sent to the client. This can be overridden on an individual connection with the 'results_buffer_size' parameter. Note that auto-retries generally only happen while no results have been delivered to the client, so reducing this size can increase the number of retriable errors a client receives. On the other hand, increasing the buffer size can increase the delay until the client receives the first result row. Updating the setting only affects new connections. Setting to 0 disables any buffering.
sql.defaults.serial_normalization | enumeration | 0 | default handling of SERIAL in table definitions [rowid = 0, virtual_sequence = 1, sql_sequence = 2]
sql.distsql.distribute_index_joins | boolean | true | if set, for index joins we instantiate a join reader on every node that has a stream; if not set, we use a single join reader
sql.distsql.flow_stream_timeout | duration | 10s | amount of time incoming streams wait for a flow to be set up before erroring out
sql.distsql.interleaved_joins.enabled | boolean | true | if set, we plan interleaved table joins instead of merge joins when possible
sql.distsql.max_running_flows | integer | 500 | maximum number of concurrent flows that can be run on a node
sql.distsql.merge_joins.enabled | boolean | true | if set, we plan merge joins when possible
sql.distsql.temp_storage.joins | boolean | true | set to true to enable use of disk for distributed sql joins
sql.distsql.temp_storage.sorts | boolean | true | set to true to enable use of disk for distributed sql sorts
sql.distsql.temp_storage.workmem | byte size | 64 MiB | maximum amount of memory in bytes a processor can use before falling back to temp storage
sql.metrics.statement_details.dump_to_logs | boolean | false | dump collected statement statistics to node logs when periodically cleared
sql.metrics.statement_details.enabled | boolean | true | collect per-statement query statistics
sql.metrics.statement_details.plan_collection.enabled | boolean | true | periodically save a logical plan for each fingerprint
sql.metrics.statement_details.plan_collection.period | duration | 5m0s | the time until a new logical plan is collected
sql.metrics.statement_details.threshold | duration | 0s | minimum execution time to cause statistics to be collected
sql.parallel_scans.enabled | boolean | true | parallelizes scanning different ranges when the maximum result size can be deduced
sql.query_cache.enabled | boolean | true | enable the query cache
sql.stats.automatic_collection.enabled | boolean | true | automatic statistics collection mode
sql.stats.automatic_collection.fraction_stale_rows | float | 0.2 | target fraction of stale rows per table that will trigger a statistics refresh
sql.stats.automatic_collection.max_fraction_idle | float | 0.9 | maximum fraction of time that automatic statistics sampler processors are idle
sql.stats.automatic_collection.min_stale_rows | integer | 500 | target minimum number of stale rows per table that will trigger a statistics refresh
sql.stats.max_timestamp_age | duration | 5m0s | maximum age of timestamp during table statistics collection
sql.stats.post_events.enabled | boolean | false | if set, an event is shown for every CREATE STATISTICS job
sql.tablecache.lease.refresh_limit | integer | 50 | maximum number of tables to periodically refresh leases for
sql.trace.log_statement_execute | boolean | false | set to true to enable logging of executed statements
sql.trace.session_eventlog.enabled | boolean | false | set to true to enable session tracing
sql.trace.txn.enable_threshold | duration | 0s | duration beyond which all transactions are traced (set to 0 to disable)
timeseries.storage.enabled | boolean | true | if set, periodic timeseries data is stored within the cluster; disabling is not recommended unless you are storing the data elsewhere
timeseries.storage.resolution_10s.ttl | duration | 240h0m0s | the maximum age of time series data stored at the 10 second resolution. Data older than this is subject to rollup and deletion.
timeseries.storage.resolution_30m.ttl | duration | 2160h0m0s | the maximum age of time series data stored at the 30 minute resolution. Data older than this is subject to deletion.
trace.debug.enable | boolean | false | if set, traces for recent requests can be seen in the /debug page
trace.lightstep.token | string |  | if set, traces go to Lightstep using this token
trace.zipkin.collector | string |  | if set, traces go to the given Zipkin instance (example: '127.0.0.1:9411'); ignored if trace.lightstep.token is set
version | custom validation | 19.1 | set the active cluster version in the format '<major>.<minor>'

View current cluster settings

Use the SHOW CLUSTER SETTING statement.
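
As a brief illustration, the statements below read one setting and then list all of them. This is a minimal sketch that assumes a SQL session opened by a member of the admin role; the setting name is simply one taken from the table above.

```sql
-- Show the current value of a single cluster setting.
SHOW CLUSTER SETTING diagnostics.reporting.enabled;

-- List all cluster settings and their current values.
SHOW ALL CLUSTER SETTINGS;
```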

Change a cluster setting

Use the SET CLUSTER SETTING statement.

Before changing a cluster setting, please note the following:

  • Changing a cluster setting is not instantaneous, as the change must be propagated to other nodes in the cluster.

  • Do not change cluster settings while upgrading to a new version of CockroachDB. Wait until all nodes have been upgraded before you make the change.
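
The following sketch shows one way to change a setting, check the result, and return to the default. It assumes a SQL session opened by a member of the admin role, and it uses kv.rangefeed.enabled only as an example; pick the setting and value that match your goal.

```sql
-- Enable rangefeeds (the default listed in the table above is false).
SET CLUSTER SETTING kv.rangefeed.enabled = true;

-- Check the value on the node this session is connected to.
-- Other nodes may report the old value until the change has propagated.
SHOW CLUSTER SETTING kv.rangefeed.enabled;

-- Return the setting to its default value.
RESET CLUSTER SETTING kv.rangefeed.enabled;
```

Because propagation is not instantaneous, it can be worth re-running the SHOW statement before relying on the new value.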


