Thread safety

From Wikipedia, the free encyclopedia

In multi-threaded computer programming, a function is thread-safe when it can be invoked or accessed concurrently by multiple threads without causing unexpected behavior, race conditions, or data corruption.[1][2] In a multi-threaded program, several threads execute simultaneously in a shared address space, and each thread has access to every other thread's memory; thread-safe functions must therefore ensure that all threads behave properly and fulfill their design specifications without unintended interaction.[3]

There are various strategies for making thread-safe data structures.[3]

Levels of thread safety

Different vendors use slightly different terminology for thread safety,[4] but the most commonly used thread-safety terms are:[2]

  • Not thread safe: Data structures should not be accessed simultaneously by different threads.
  • Thread safe, serialization: A single mutex is used for all resources, guaranteeing that access is free of race conditions when those resources are used by multiple threads simultaneously.
  • Thread safe, MT-safe: A separate mutex is used for every resource, guaranteeing that access is free of race conditions when those resources are used by multiple threads simultaneously. (The sketch after this list contrasts the last two levels.)
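
The difference between the last two levels can be sketched in a few lines of C++. The class and member names below are hypothetical, chosen only to contrast one mutex guarding every resource (serialization) with one mutex per resource (MT-safe); they are not taken from any particular vendor's library.

#include <mutex>

// "Thread safe, serialization": a single mutex guards every resource,
// so at most one thread can be inside any operation at a time.
class SerializedCounters {
public:
    void incrementA() { std::lock_guard<std::mutex> lock(mutex_); ++a_; }
    void incrementB() { std::lock_guard<std::mutex> lock(mutex_); ++b_; }

private:
    std::mutex mutex_;  // one lock shared by both counters
    int a_ = 0;
    int b_ = 0;
};

// "Thread safe, MT-safe": each resource has its own mutex, so threads
// working on different resources do not block each other.
class MtSafeCounters {
public:
    void incrementA() { std::lock_guard<std::mutex> lock(mutexA_); ++a_; }
    void incrementB() { std::lock_guard<std::mutex> lock(mutexB_); ++b_; }

private:
    std::mutex mutexA_;  // lock for a_ only
    std::mutex mutexB_;  // lock for b_ only
    int a_ = 0;
    int b_ = 0;
};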

Thread safety guarantees usually also include design steps to prevent or limit the risk of different forms of deadlock, as well as optimizations to maximize concurrent performance. However, deadlock-free guarantees cannot always be given, since deadlocks can be caused by callbacks and by violations of architectural layering that are independent of the library itself.
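
One common way such a deadlock arises is when a library invokes a user-supplied callback while still holding an internal lock. The following C++ sketch is hypothetical (the Registry class and its methods are invented for illustration): if the callback calls back into the same object, the thread tries to acquire a mutex it already holds, and with a non-recursive std::mutex the call never returns.

#include <functional>
#include <mutex>

class Registry {
public:
    void update(const std::function<void()>& callback) {
        std::lock_guard<std::mutex> lock(mutex_);
        ++version_;
        callback();  // the callback runs while mutex_ is still held
    }

    int version() {
        // deadlocks (strictly, undefined behavior) if called from inside the
        // callback above, because the same thread already holds mutex_
        std::lock_guard<std::mutex> lock(mutex_);
        return version_;
    }

private:
    std::mutex mutex_;
    int version_ = 0;
};

// A layering violation such as the following callback deadlocks:
//   registry.update([&] { registry.version(); });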

Software libraries can provide certain thread-safety guarantees.[5] For example, concurrent reads might be guaranteed to be thread-safe, but concurrent writes might not be. Whether a program using such a library is thread-safe depends on whether it uses the library in a manner consistent with those guarantees.
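
As a sketch of how a caller can honor such a guarantee, the following hypothetical C++ wrapper (the PhoneBook class is invented for illustration) assumes an underlying container whose documentation promises that concurrent reads are safe but concurrent writes are not. A readers-writer lock makes writes exclusive while still allowing reads to overlap.

#include <map>
#include <mutex>
#include <shared_mutex>
#include <string>

class PhoneBook {
public:
    void insert(const std::string& name, const std::string& number) {
        std::unique_lock<std::shared_mutex> lock(mutex_);  // exclusive: one writer at a time
        entries_[name] = number;
    }

    std::string lookup(const std::string& name) const {
        std::shared_lock<std::shared_mutex> lock(mutex_);  // shared: many readers may overlap
        auto it = entries_.find(name);
        return it == entries_.end() ? std::string() : it->second;
    }

private:
    mutable std::shared_mutex mutex_;
    std::map<std::string, std::string> entries_;
};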

Implementation approaches

There are two classes of approaches for avoiding race conditions and achieving thread safety.

The first class of approaches focuses on avoiding shared state and includes:

Re-entrancy[6]
Writing code in such a way that it can be partially executed by a thread, re-entered by the same thread, or simultaneously executed by another thread, and still correctly complete the original execution. This requires saving state information in variables local to each execution, usually on a stack, instead of in static or global variables or other non-local state. All non-local state must be accessed through atomic operations, and the data structures must also be reentrant.
Thread-local storage
Variables are localized so that each thread has its own private copy. These variables retain their values across subroutines and other code boundaries and are thread-safe since they are local to each thread, even though the code that accesses them might be executed simultaneously by another thread. (A sketch illustrating this and the next technique follows this list.)
Immutable objects
The state of an object cannot be changed after construction. This implies both that only read-only data is shared and that inherent thread safety is attained. Mutable (non-const) operations can then be implemented in such a way that they create new objects instead of modifying the existing ones. This approach is characteristic of functional programming and is also used by the string implementations in Java, C#, and Python. (See Immutable object.)
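
The following C++ sketch illustrates the thread-local storage and immutable object techniques under hypothetical names (callCount and ImmutablePoint are invented for illustration, not taken from any library).

// Thread-local storage: every thread gets its own copy of callCount, so
// concurrent calls from different threads never touch the same variable.
thread_local int callCount = 0;

int countThisCall() {
    return ++callCount;  // increments only the calling thread's private counter
}

// Immutable object: all state is fixed at construction, so instances can be
// shared between threads and read concurrently without synchronization.
class ImmutablePoint {
public:
    ImmutablePoint(double x, double y) : x_(x), y_(y) {}

    double x() const { return x_; }
    double y() const { return y_; }

    // "Mutating" operations return a new object instead of modifying this one.
    ImmutablePoint translated(double dx, double dy) const {
        return ImmutablePoint(x_ + dx, y_ + dy);
    }

private:
    const double x_;
    const double y_;
};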

The second class of approaches is synchronization-related and is used in situations where shared state cannot be avoided:

Mutual exclusion
Access to shared data is serialized using mechanisms that ensure only one thread reads or writes to the shared data at any time. Incorporation of mutual exclusion needs to be well thought out, since improper usage can lead to side-effects like deadlocks, livelocks, and resource starvation.
Atomic operations
Shared data is accessed by using atomic operations which cannot be interrupted by other threads. This usually requires using special machine language instructions, which might be available in a runtime library. Since the operations are atomic, the shared data is always kept in a valid state, no matter how other threads access it. Atomic operations form the basis of many thread locking mechanisms, and are used to implement mutual exclusion primitives; a minimal sketch of such a primitive follows this list.
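
The sketch below shows how an atomic test-and-set operation can be used to build a simple mutual-exclusion primitive, assuming C++11's std::atomic_flag. The SpinLock name is hypothetical, and real lock implementations add blocking, back-off, and fairness that are omitted here.

#include <atomic>

class SpinLock {
public:
    void lock() {
        // atomically set the flag and spin until the previous value was clear,
        // i.e. until this thread is the one that acquired the lock
        while (flag_.test_and_set(std::memory_order_acquire)) {
            // busy-wait
        }
    }

    void unlock() {
        flag_.clear(std::memory_order_release);
    }

private:
    std::atomic_flag flag_ = ATOMIC_FLAG_INIT;  // starts in the "clear" (unlocked) state
};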

Examples

In the following piece of Java code, the synchronized keyword makes the method thread-safe:

class Counter {
    private int i = 0;

    public synchronized void inc() {
        i++;
    }
}

In the C programming language, each thread has its own stack. However, a static variable is not kept on the stack; all threads share simultaneous access to it. If multiple threads overlap while running the same function, it is possible that a static variable might be changed by one thread while another is midway through checking it. This difficult-to-diagnose logic error, which may compile and run properly most of the time, is called a race condition. One common way to avoid this is to use another shared variable as a "lock" or "mutex" (from mutual exclusion).

In the following piece of C code, which uses the POSIX threads (pthreads) API, the function is thread-safe but not reentrant:

#include <pthread.h>

int incrementCounter() {
    static int counter = 0;
    static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;

    // only allow one thread to increment at a time
    pthread_mutex_lock(&mutex);

    ++counter;

    // store value before any other threads increment it further
    int result = counter;

    pthread_mutex_unlock(&mutex);

    return result;
}

In the above, incrementCounter can be called by different threads without any problem since a mutex is used to synchronize all access to the shared counter variable. But if the function is used in a reentrant interrupt handler and a second interrupt arises while the mutex is locked, the second routine will hang forever. As interrupt servicing can disable other interrupts, the whole system could suffer.

The same function can be implemented to be both thread-safe and reentrant using the lock-free atomics, which were introduced in C++11:

#include <atomic>

int incrementCounter() {
    static std::atomic<int> counter(0);

    // the increment is guaranteed to be performed atomically
    int result = ++counter;

    return result;
}


References

  1. Kerrisk, Michael (2010). The Linux Programming Interface. No Starch Press. p. 699. Chapter 31: "Threads: Thread Safety and Per-Thread Storage".
  2.
  3. "Multithreaded Programming Guide: Chapter 7 Safe and Unsafe Interfaces". Docs Oracle. Oracle. November 2020. Retrieved 2024-04-30. https://docs.oracle.com/cd/E37838_01/html/E61057/compat-14994.html#scrolltoc
  4.
  5.
  6. "Reentrancy and Thread-Safety | Qt 5.6". Qt Project. Retrieved 2016-04-20. https://doc.qt.io/qt-5/threads-reentrancy.html