User-Level Native Thread Primitives (GThread) Library White Paper (G06.27+, H06.03+, J06.03+)
User-Level Native Thread Primitives (GThread Library) 02/15/2012
540065-004
1. Introduction
1.1. Overview
The GThread library is a NonStop operating system facility that provides primitives supporting user-level
multithreading on TNS/R and TNS/E systems. It is not a comprehensive threading package, but rather
provides the primitives to save and restore thread context, and to create the context for new threads.
GThread replaces the direct manipulation of TNS registers typically used in threads packages for TNS
Guardian systems, allowing such packages to be converted to native mode with minimal source changes.
This document covers GThread external functionality in its entirety. Existing TNS or TNS/R GThread
clients are not required to make any changes. This document introduces new interfaces, which require
modifications to client code in the TNS/E environment.
To promote an early migration to the new interfaces, and provide a consistent interface to GThread clients
designed to run on both TNS/R and TNS/E platforms, the new interfaces are also available on TNS/R.
The client externals are the same on both platforms, but the internal implementation differs. Clients
running on TNS/R may choose whether to adopt the new interfaces:
1) Clients that migrate get a single set of interfaces that works on both TNS/R and TNS/E.
2) Clients that do NOT intend to migrate to the new interfaces early are not required to convert.
An additional migration possibility involves using the GThread primitives in a 64-bit TNS/E OSS process
environment. The existing 32-bit clients of the current GThread library need not change. The new or
migrated 64-bit clients are source-compatible, except for allocating stacks from the default heap. For
more information on allocating stacks, see Stack Allocation in LP64 Processes.
1.2. Background and Definitions
The GThread library permits a client to implement user-level "multithreading". Multithreading is defined
as the use of multiple "threads" within a single "process." Abstractly, a "process" is a set of resources for
running a program; it is primarily a set of memory resources and control structures. A "thread" is a set of
resources for the shared use of a processor; it includes processor state (registers) and a stack on
which to activate procedures. In some operating systems, "process" (or "task") and "thread" are distinct
concepts; the simplest execution of a program requires one of each. More commonly, however, a process
includes one thread; this is the case with the NonStop operating system in both the Guardian and OSS
application programming interfaces (APIs). Thus, a process is operationally defined as the entity created
by PROCESS_LAUNCH, PROCESS_SPAWN, fork, or one of their alternatives. It has both memory
resources and execution state, including the necessary stacks.
A multithreaded process has more than one execution thread. Each thread retains its processor state and
the history and local state of the procedures that have been entered in that thread and not yet exited. This
definition deliberately excludes mechanisms that do not retain procedure activation history for each
thread; while "state machines" and other mechanisms may enable one process to serve many clients and
interleave processing of their requests, they do not have "threads" in the sense of this document.
Each thread requires its own stack area in which to keep procedure activation records (stack frames).
Classically, there is one stack for each thread. However, for the Itanium® processor architecture in
TNS/E systems, each execution context has a pair of stacks. Within this document, the word “stack” often
refers to aggregate stack resources without regard to whether one or two stacks are involved. When
necessary, the two stack components are distinguished.
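
The definitions above can be illustrated with a minimal sketch. It does NOT use the GThread
interfaces, which are not shown in this section. Instead it assumes the POSIX ucontext routines
(getcontext, makecontext, swapcontext) as provided by glibc on a conventional single-stack
architecture. These routines play a comparable role: they create a context for a new thread and
save and restore thread context. The names worker, run_two_threads, trace, and STACK_BYTES are
hypothetical and exist only for this illustration. Each of the two threads has its own stack, and
the local variable step in each thread keeps its value across every switch. This is the retained
procedure activation history that distinguishes a thread from a state machine.

```c
#include <stdlib.h>
#include <ucontext.h>

#define STACK_BYTES (64 * 1024)  /* illustrative stack size (an assumption) */
#define STEPS 3                  /* steps each thread performs */

static ucontext_t main_ctx;       /* saved context of the scheduling code */
static ucontext_t worker_ctx[2];  /* one saved context per thread */
int trace[2 * STEPS];             /* records the order in which steps ran */
int trace_len;

/* Thread body. The variable step lives in this thread's own stack frame
   and must survive every switch away from the thread. */
static void worker(int id)
{
    for (int step = 0; step < STEPS; step++) {
        trace[trace_len++] = id * 10 + step;
        /* Save this thread's context and resume the scheduler. */
        swapcontext(&worker_ctx[id], &main_ctx);
    }
    /* Returning here resumes main_ctx through uc_link. */
}

/* Creates two threads, each on its own stack, and runs them round-robin.
   Returns 0 on success, -1 on failure. */
int run_two_threads(void)
{
    void *stacks[2] = { NULL, NULL };
    int rc = -1;

    trace_len = 0;
    for (int id = 0; id < 2; id++) {
        stacks[id] = malloc(STACK_BYTES);
        if (stacks[id] == NULL || getcontext(&worker_ctx[id]) != 0)
            goto done;
        worker_ctx[id].uc_stack.ss_sp = stacks[id];
        worker_ctx[id].uc_stack.ss_size = STACK_BYTES;
        worker_ctx[id].uc_link = &main_ctx;  /* resumed when worker returns */
        makecontext(&worker_ctx[id], (void (*)(void))worker, 1, id);
    }

    /* STEPS + 1 rounds: the final round lets each worker leave its loop
       and return, so neither stack is in use when it is freed. */
    for (int round = 0; round <= STEPS; round++) {
        for (int id = 0; id < 2; id++) {
            if (swapcontext(&main_ctx, &worker_ctx[id]) != 0)
                goto done;
        }
    }
    rc = 0;

done:
    free(stacks[0]);
    free(stacks[1]);
    return rc;
}
```

Calling run_two_threads() from a main program interleaves the two threads one step at a time.
Afterward, trace holds 0, 10, 1, 11, 2, 12, which shows that each thread resumed exactly where it
left off. On TNS/E each context would need a pair of stacks rather than the single stack
allocated here.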
Typically, each thread has a stack at a distinct, static address within the virtual memory space of the
process. An alternative approach called "stack swapping" employs only a single stack area timeshared by
all active threads; the frame images of waiting stacks are swapped to separate holding spaces. Static
stacks have the advantage of much faster context switches, but they are impractical when the architecture