Concurrency primitives, safe memory reclamation mechanisms and non-blocking (including lock-free) data structures designed to aid in the research, design and implementation of high performance concurrent systems developed in C99+.

2425
316
C

Concurrency Kit

Modern concurrency primitives and building blocks for high performance applications.

Continuous Integration

GitHub Actions Cirrus
Build Status Build Status

Compilers tested in the past include gcc, clang, cygwin, icc, mingw32, mingw64 and suncc across all supported architectures. All new architectures are required to pass the integration test and under-go extensive code review.

Continuous integration is currently enabled for the following targets:

  • darwin/clang/arm64
  • freebsd/clang/x86-64
  • linux/gcc/arm64
  • linux/gcc/x86-64
  • linux/clang/x86-64

Compile and Build

  • Step 1.
    ./configure
    For additional options try ./configure --help

  • Step 2.
    In order to compile regressions (requires POSIX threads) use
    make regressions. In order to compile libck use make all or make.

  • Step 3.
    In order to install use make install
    To uninstall use make uninstall.

See http://concurrencykit.org/ for more information.

Supported Architectures

Concurrency Kit supports any architecture using compiler built-ins as a fallback. There is usually a performance degradation associated with this.

Concurrency Kit has specialized assembly for the following architectures:

  • aarch64
  • arm
  • ppc
  • ppc64
  • riscv64
  • s390x
  • sparcv9+
  • x86
  • x86_64

Features

Concurrency Primitives

ck_pr

Concurrency primitives as made available by the underlying architecture, includes support for all atomic operations (natively), transactional memory, pipeline control, read-for-ownership and more.

ck_backoff

A simple and efficient (minimal noise) backoff function.

ck_cc

Abstracted compiler builtins when writing efficient concurrent data structures.

Safe Memory Reclamation

ck_epoch

A scalable safe memory reclamation mechanism with support idle threads and various optimizations that make it better than or competitive with many state-of-the-art solutions.

ck_hp

Implements support for hazard pointers, a simple and efficient lock-free safe memory reclamation mechanism.

Data Structures

ck_array

A simple concurrently-readable pointer array structure.

ck_bitmap

An efficient multi-reader and multi-writer concurrent bitmap structure.

ck_ring

Efficient concurrent bounded FIFO data structures with various performance trade-off. This includes specialization for single-reader, many-reader, single-writer and many-writer.

ck_fifo

A reference implementation of the first published lock-free FIFO algorithm, with specialization for single-enqueuer-single-dequeuer and many-enqueuer-single-dequeuer and extensions to allow for node re-use.

ck_hp_fifo

A reference implementation of the above algorithm, implemented with safe memory reclamation using hazard pointers.

ck_hp_stack

A reference implementation of a Treiber stack with support for hazard pointers.

ck_stack

A reference implementation of an efficient lock-free stack, with specialized variants for a variety of memory management strategies and bounded concurrency.

ck_queue

A concurrently readable friendly derivative of the BSD-queue interface. Coupled with a safe memory reclamation mechanism, implement scalable read-side queues with a simple search and replace.

ck_hs

An extremely efficient single-writer-many-reader hash set, that satisfies lock-freedom with bounded concurrency without any usage of atomic operations and allows for recycling of unused or deleted slots. This data structure is recommended for use as a general hash-set if it is possible to compute values from keys. Learn more at https://engineering.backtrace.io/workload-specialization/ and http://concurrencykit.org/articles/ck_hs.html.

ck_ht

A specialization of the ck_hs algorithm allowing for disjunct key-value pairs.

ck_rhs

A variant of ck_hs that utilizes robin-hood hashing to allow for improved performance with higher load factors and high deletion rates.

Synchronization Primitives

ck_ec

An extremely efficient event counter implementation, a better alternative to condition variables.

ck_barrier

A plethora of execution barriers including: centralized barriers, combining barriers, dissemination barriers, MCS barriers, tournament barriers.

ck_brlock

A simple big-reader lock implementation, write-biased reader-writer lock with scalable read-side locking.

ck_bytelock

An implementation of bytelocks, for research purposes, allowing for (in theory), fast read-side acquisition without the use of atomic operations. In reality, memory barriers are required on the fast path.

ck_cohort

A generic lock cohorting interface, allows you to turn any lock into a NUMA-friendly scalable NUMA lock. There is a significant trade-off in fast path acquisition cost. Specialization is included for all relevant lock implementations in Concurrency Kit. Learn more by reading “Lock Cohorting: A General Technique for Designing NUMA Locks”.

ck_elide

A generic lock elision framework, allows you to turn any lock implementation into an elision-aware implementation. This requires support for restricted transactional memory by the underlying hardware.

ck_pflock

Phase-fair reader-writer mutex that provides strong fairness guarantees between readers and writers. Learn more by reading “Spin-Based Reader-Writer Synchronization for Multiprocessor Real-Time Systems”.

ck_rwcohort

A generic read-write lock cohorting interface, allows you to turn any read-write lock into a NUMA-friendly scalable NUMA lock. There is a significant trade-off in fast path acquisition cost. Specialization is included for all relevant lock implementations in Concurrency Kit. Learn more by reading “Lock Cohorting: A General Technique for Designing NUMA Locks”.

ck_rwlock

A simple centralized write-biased read-write lock.

ck_sequence

A sequence counter lock, popularized by the Linux kernel, allows for very fast read and write synchronization for simple data structures where deep copy is permitted.

ck_swlock

A single-writer specialized read-lock that is copy-safe, useful for data structures that must remain small, be copied and contain in-band mutexes.

ck_tflock

Task-fair locks are fair read-write locks, derived from “Scalable reader-writer synchronization for shared-memory multiprocessors”.

ck_spinlock

A basic but very fast spinlock implementation.

ck_spinlock_anderson

Scalable and fast anderson spinlocks. This is here for reference, one of the earliest scalable and fair lock implementations.

ck_spinlock_cas

A basic spinlock utilizing compare_and_swap.

ck_spinlock_dec

A basic spinlock, a C adaption of the older optimized Linux kernel spinlock for x86. Primarily here for reference.

ck_spinlock_fas

A basic spinlock utilizing atomic exchange.

ck_spinlock_clh

An efficient implementation of the scalable CLH lock, providing many of the same performance properties of MCS with a better fast-path.

ck_spinlock_hclh

A NUMA-friendly CLH lock.

ck_spinlock_mcs

An implementation of the seminal scalable and fair MCS lock.

ck_spinlock_ticket

An implementation of fair centralized locks.