IDs for Processes and Threads in Linux

In Linux, both processes and threads are assigned numeric identifiers, and both are addressable as directories of the form /proc/[pid] under the /proc pseudo-filesystem, where that number is often referred to as a “PID”. Only thread group leaders appear when you list /proc with ls. The directories of the other threads are hidden from the listing but can still be opened by path.

Here is the catch: the value that shows up in /proc/[pid] is not strictly a process identifier. Depending on context, it may refer to a thread or a process, because Linux historically built threads on top of processes.

You can observe this terminology overload in tools like htop. By default, htop lists both processes and threads without clearly separating them, so its “PID” column contains identifiers that can correspond to either.

This has a historical reason. Early Linux did not have a first-class notion of threads, only processes. Over time, Linux introduced “thread groups” (Linux 2.4, around 2001), which support the POSIX threads model: multiple threads that conceptually belong to one process. Internally, the shared “process ID” that user space expects is implemented as a thread group identifier (TGID). As described in the clone(2) manual, getpid(2) returns the TGID of the caller, not a per-thread identifier.

As a result, threads in the same process share the same TGID, while each thread also has its own unique thread ID (TID). Practically, getpid() returns the same value across all threads in the process, while gettid() returns a unique value per thread.
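
The difference between the two calls can be observed from Python as well. The following is a minimal sketch, assuming Linux and Python 3.8 or newer (threading.get_native_id() is used here as the equivalent of gettid()); the helper name collect_ids is made up for illustration.

```python
import os
import threading


def collect_ids(num_threads=3):
    """Return (pid, tid) pairs observed from several concurrently live threads."""
    results = []
    lock = threading.Lock()
    # The barrier keeps every worker alive until all of them have recorded
    # their IDs, so the kernel cannot reuse a TID during the experiment.
    barrier = threading.Barrier(num_threads)

    def worker():
        ids = (os.getpid(), threading.get_native_id())
        with lock:
            results.append(ids)
        barrier.wait()

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    for pid, tid in collect_ids():
        print(f"getpid()={pid}  gettid()={tid}")
```

Every row should print the same getpid() value and a different gettid() value.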

How to Distinguish Between a Thread and a “Real Process”

Conceptually, Linux processes and threads are similar because the kernel schedules both as runnable entities. The major difference is what they share: threads in the same process typically share an address space and other resources, while separate processes usually do not (unless explicitly arranged).

To distinguish “threads within a process” from a standalone process, the most reliable place to look is:

  • /proc/[pid]/task/[tid]

The task/ directory enumerates kernel-visible threads. The tid component is the kernel thread ID. This is distinct from user-level threading abstractions (for example, some managed runtimes can implement user-level “threads” that are not one-to-one kernel threads). Those user-level threads are not directly visible as separate kernel thread IDs.

Within a multithreaded process, all threads belong to the same thread group. The main thread has tid == tgid, and the other threads have distinct tids but the same tgid. You will also notice that /proc/[pid]/task/[tid] mirrors /proc/[pid]/ when pid == tid, because that path is effectively describing the same main thread and the same thread group leader.

So, when you inspect /proc/[pid] for a multithreaded process, you can interpret it as follows:

  • The directory name /proc/[pid] corresponds to the process’s TGID (and the main thread’s TID).
  • The /proc/[pid]/task/ subdirectories enumerate all kernel-scheduled threads in that thread group.
  • The “process” you intuitively think of is the thread group as a whole, identified by its leader (the main thread).
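
The interpretation above can be checked programmatically. Here is a rough Python sketch, assuming Linux with /proc mounted; read_status_field and thread_group are hypothetical helper names, not part of any library.

```python
import os


def read_status_field(path, field):
    """Return the integer value of `field` (e.g. 'Tgid') from a /proc status file."""
    with open(path) as f:
        for line in f:
            key, _, value = line.partition(":")
            if key == field:
                return int(value.split()[0])
    raise KeyError(field)


def thread_group(pid):
    """Return (tgid, sorted TIDs) for the thread group that contains `pid`."""
    tgid = read_status_field(f"/proc/{pid}/status", "Tgid")
    # Each entry under task/ is named after one thread ID in the group.
    tids = sorted(int(name) for name in os.listdir(f"/proc/{pid}/task"))
    return tgid, tids


if __name__ == "__main__":
    tgid, tids = thread_group(os.getpid())
    print(f"TGID {tgid} has {len(tids)} thread(s): {tids}")
```

The TGID should always appear in the returned list, since the main thread's TID equals the TGID.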

Last Note: Multiprocessing vs Multithreading

To avoid confusion with thread groups (TGID), it helps to also remember process groups, which use PGID. TGID is about threads within a process, while PGID is about groups of processes (used heavily for job control in shells).

In broad strokes, a new process is created via fork() (and friends), while a new thread is created via pthread_create() in C. Under the hood, Linux commonly uses the clone() syscall for both, with different flags controlling what is shared.

When a process spawns a subprocess, the spawning process is the parent, and the child either inherits the parent’s process group or is moved into a new one (for example, via setpgid()). The PGID of a group equals the PID of its group leader, so a process that starts a new group gets a PGID equal to its own PID. A child that inherits a group keeps the inherited PGID.

In contrast, the main thread and the threads it creates are best thought of as siblings under the same thread group: they share a TGID, and they also share the same PGID as the main thread. One practical implication is that threads are not visible as child “processes” to the parent of the thread group leader.
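
To make the PGID behavior concrete, here is a small Python sketch, assuming a POSIX system where os.fork() is available. The function name fork_and_report_pgid is invented for this example; the child reports its PGID back through a pipe.

```python
import os


def fork_and_report_pgid(new_group):
    """Fork a child and return (child_pid, child_pgid).

    If new_group is True, the child first calls setpgid(0, 0) and thereby
    becomes the leader of a new process group.
    """
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: report the PGID through the pipe, then exit immediately
        # without running any of the parent's cleanup handlers.
        status = 1
        try:
            os.close(read_fd)
            if new_group:
                os.setpgid(0, 0)
            os.write(write_fd, str(os.getpgid(0)).encode())
            status = 0
        finally:
            os._exit(status)
    # Parent: read the child's report and reap the child.
    os.close(write_fd)
    with os.fdopen(read_fd) as reader:
        child_pgid = int(reader.read())
    os.waitpid(pid, 0)
    return pid, child_pgid


if __name__ == "__main__":
    print("parent PGID:      ", os.getpgid(0))
    print("inherited (pid, pgid):", fork_and_report_pgid(new_group=False))
    print("new group (pid, pgid):", fork_and_report_pgid(new_group=True))
```

In the inherited case the child's PGID should match the parent's, and in the new-group case it should match the child's own PID.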

Example

A convenient way to generate a multithreaded workload on Linux is to use stress-ng. In stress-ng terminology, a “stressor” (or “hog”) is a process.

For example, the following command runs a memory contention stressor:

hy@node-0:~$ stress-ng --mcontend 1 -t 10h
stress-ng: info:  [56472] dispatching hogs: 1 mcontend

If you open htop, you can view the resulting process and its threads as a hierarchy. In that display, PGRP corresponds to the process group ID (PGID), while the PID column is overloaded and may represent either process IDs or thread IDs depending on the row.

[Figure: htop results]

In the example shown:

  • 56472 is the single-threaded parent process created by your shell when you ran the command.
  • 56473 is the multithreaded child process (and also the main thread of that child).
  • 56474 through 56477 are sibling threads created by the main thread 56473.

If you run pidof on the stressor, you will typically get the TGID (which matches the main thread’s ID):

hy@node-0:~$ pidof stress-ng-mcontend
56473

You can confirm that 56472 is the parent of 56473 by inspecting the parent’s children file:

hy@node-0:~$ cat /proc/56472/task/56472/children
56473

Next, if you list the child’s task/ directory, you will see all kernel threads in that thread group:

hy@node-0:~$ ll /proc/56473/task/
total 0
dr-xr-xr-x 7 hy hy 0 Dec 31 22:21 ./
dr-xr-xr-x 9 hy hy 0 Dec 31 22:21 ../
dr-xr-xr-x 7 hy hy 0 Dec 31 22:21 56473/
dr-xr-xr-x 7 hy hy 0 Dec 31 22:21 56474/
dr-xr-xr-x 7 hy hy 0 Dec 31 22:21 56475/
dr-xr-xr-x 7 hy hy 0 Dec 31 22:21 56476/
dr-xr-xr-x 7 hy hy 0 Dec 31 22:21 56477/

If you inspect the task/ directory for one of the sibling threads, you will still see the full set of thread IDs in the group (because you are still within the same thread group context):

hy@node-0:~$ ll /proc/56476/task/
total 0
dr-xr-xr-x 7 hy hy 0 Dec 31 22:21 ./
dr-xr-xr-x 9 hy hy 0 Dec 31 22:21 ../
dr-xr-xr-x 7 hy hy 0 Dec 31 22:21 56473/
dr-xr-xr-x 7 hy hy 0 Dec 31 22:21 56474/
dr-xr-xr-x 7 hy hy 0 Dec 31 22:21 56475/
dr-xr-xr-x 7 hy hy 0 Dec 31 22:21 56476/
dr-xr-xr-x 7 hy hy 0 Dec 31 22:21 56477/
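
One way to tell whether an arbitrary numeric ID is a thread group leader or a sibling thread is to compare the Pid and Tgid fields in /proc/[id]/status. The sketch below assumes Linux and Python 3.8 or newer; describe_id and describe_from_worker are hypothetical helpers written for this post.

```python
import os
import threading


def describe_id(task_id):
    """Classify a numeric /proc ID as a thread group leader or a non-leader thread."""
    fields = {}
    with open(f"/proc/{task_id}/status") as f:
        for line in f:
            key, _, value = line.partition(":")
            if key in ("Pid", "Tgid"):
                fields[key] = int(value.split()[0])
    # In the status file, "Pid" is the ID of this particular task (its TID),
    # while "Tgid" is the ID shared by the whole thread group.
    return {
        "tid": fields["Pid"],
        "tgid": fields["Tgid"],
        "is_leader": fields["Pid"] == fields["Tgid"],
    }


def describe_from_worker():
    """Run describe_id() on a worker thread's own TID while it is still alive."""
    result = {}

    def worker():
        result.update(describe_id(threading.get_native_id()))

    t = threading.Thread(target=worker)
    t.start()
    t.join()
    return result


if __name__ == "__main__":
    print("main thread:  ", describe_id(os.getpid()))
    print("worker thread:", describe_from_worker())
```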

Finally, notice that the single-threaded parent process has only itself under task/:

hy@node-0:~$ ll /proc/56472/task/
total 0
dr-xr-xr-x 7 hy hy 0 Dec 31 22:21 ./
dr-xr-xr-x 9 hy hy 0 Dec 31 22:21 ../
dr-xr-xr-x 7 hy hy 0 Dec 31 22:21 56472/

Resource Accounting

Once you start looking at threads, resource accounting becomes another place where Linux tooling can be surprising. Different tools choose to aggregate or split resource usage in different ways, and sometimes they report thread-level information only when explicitly asked.

For example, htop typically aggregates the CPU and memory usage of all threads into the main thread’s row by default. Similarly, ps will usually show aggregated usage for the thread group leader when you query it by PID:

hy@node-0:~$ ps -p 56473 -o %cpu,%mem,cmd
%CPU %MEM CMD
 473  0.0 stress-ng-mcontend

If you try the same query on a sibling thread ID, you may get nothing because ps is often oriented around process identifiers unless you request thread detail:

hy@node-0:~$ ps -p 56476 -o %cpu,%mem,cmd
%CPU %MEM CMD

To display threads, you can use ps -L with the main thread’s ID:

hy@node-0:~$ ps -L 56473 -o %cpu,%mem,cmd
%CPU %MEM CMD
97.3  0.0 stress-ng-mcontend
94.0  0.0 stress-ng-mcontend
94.0  0.0 stress-ng-mcontend
94.0  0.0 stress-ng-mcontend
94.0  0.0 stress-ng-mcontend

If you want a more detailed listing, ps -L ... -F includes fields such as LWP (the thread ID) and NLWP (the number of threads):

hy@node-0:~$ ps -L 56473 -F
UID          PID    PPID     LWP  C NLWP    SZ   RSS PSR STIME TTY      STAT   TIME CMD
hy         56473   56472   56473 97    5 22792  2604  13 08:10 pts/2    RLl+ 302:30 stress-ng-mcontend
hy         56473   56472   56474 94    5 22792  2604   7 08:10 pts/2    RLl+ 292:03 stress-ng-mcontend
hy         56473   56472   56475 94    5 22792  2604  31 08:10 pts/2    RLl+ 292:00 stress-ng-mcontend
hy         56473   56472   56476 94    5 22792  2604  15 08:10 pts/2    RLl+ 291:59 stress-ng-mcontend
hy         56473   56472   56477 94    5 22792  2604   0 08:10 pts/2    RLl+ 292:05 stress-ng-mcontend

Thread reporting in top has similar behavior. With -H, top shows per-thread CPU usage, while without it, top aggregates:

hy@node-0:/proc$ top -H -p 56476
....
    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
  56473 hy        20   0   91168   2708   2272 R  97.3   0.0 127:24.91 stress-ng-mcont
  56474 hy        20   0   91168   2708   2272 R  94.0   0.0 122:55.16 stress-ng-mcont
  56475 hy        20   0   91168   2708   2272 R  94.0   0.0 122:54.44 stress-ng-mcont
  56476 hy        20   0   91168   2708   2272 R  93.7   0.0 122:55.33 stress-ng-mcont
  56477 hy        20   0   91168   2708   2272 R  92.3   0.0 122:56.57 stress-ng-mcont

Without -H, top reports a single aggregated number and attributes it to the thread group leader:

hy@node-0:~$ top -p 56476
....
    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
  56473 hy        20   0   91168   2708   2272 R 476.3   0.0 621:39.36 stress-ng-mcont

If you are scripting, you can use batch mode to extract either aggregate or per-thread CPU values, depending on whether you include -H:

# Aggregated
hy@node-0:~$ top -b -n 2 -d 0.2 -p 56476 | tail -1 | awk '{print $9}'
465.0

# Per-thread with -H (tail -1 keeps only the last thread row that top prints)
hy@node-0:~$ top -b -H -n 2 -d 0.2 -p 56476 | tail -1 | awk '{print $9}'
75.0

If you need ground-truth per-thread accounting, /proc is again the most explicit source. /proc/[pid]/stat reports values aggregated across the whole thread group, even when [pid] is the ID of a sibling thread, whereas /proc/[pid]/task/[tid]/stat is the per-thread view. Fields 14 and 15 are the user and kernel CPU time, measured in clock ticks:

# Total CPU time (user and kernel) aggregated across the thread group.
hy@node-0:~$ cat /proc/56476/stat | awk '{print $14, $15}'
9460932 12361

# CPU time for only thread 56476.
hy@node-0:~$ cat /proc/56476/task/56476/stat | awk '{print $14, $15}'
1879429 3032
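
The awk one-liners above split on whitespace, which works here because the command name contains no spaces. A slightly more defensive Python sketch is shown below, assuming Linux; parse_cpu_ticks and cpu_seconds are hypothetical helper names. It also converts clock ticks to seconds using sysconf.

```python
import os


def parse_cpu_ticks(stat_line):
    """Return (utime, stime) in clock ticks from the contents of a stat file."""
    # Field 2 (comm) is wrapped in parentheses and may itself contain spaces
    # or parentheses, so split after the last ')' instead of on whitespace.
    rest = stat_line[stat_line.rindex(")") + 1:].split()
    # rest[0] is field 3 (state), so field N lives at rest[N - 3].
    return int(rest[14 - 3]), int(rest[15 - 3])


def cpu_seconds(pid, tid=None):
    """Return (user, system) CPU seconds for a thread group, or one thread if tid is given."""
    if tid is None:
        path = f"/proc/{pid}/stat"  # aggregated across the thread group
    else:
        path = f"/proc/{pid}/task/{tid}/stat"  # a single thread
    with open(path) as f:
        utime, stime = parse_cpu_ticks(f.read())
    ticks_per_second = os.sysconf("SC_CLK_TCK")
    return utime / ticks_per_second, stime / ticks_per_second


if __name__ == "__main__":
    me = os.getpid()
    print("thread group:", cpu_seconds(me))
    print("main thread: ", cpu_seconds(me, tid=me))
```

For the running example, cpu_seconds(56476) would correspond to the aggregated numbers and cpu_seconds(56473, tid=56476) to the per-thread numbers.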

You can also use psutil for a convenient scripting interface, but it is important to know what it aggregates. In many cases, psutil reports CPU and memory usage in a way that effectively attributes thread-group totals to whichever thread you query, because it is fundamentally process-centric:

>>> import psutil

# Great-grandparent process (e.g., tmux session).
>>> tmux_session = psutil.Process(54711)
>>> tmux_session.ppid()
1
>>> [(child.name(), child.pid) for child in tmux_session.children(recursive=True)]
[('bash', 54712), ('bash', 56236), ('python', 56613), ('stress-ng', 56472), ('stress-ng-mcontend', 56473)]

# Parent process.
>>> parent = psutil.Process(56472)
>>> parent.ppid()
54712
>>> parent.children(recursive=True)
[psutil.Process(pid=56473, name='stress-ng-mcontend', status='running', started='11:21:57')]
>>> parent.num_threads()
1

# Child process (main thread / thread group leader).
>>> child = psutil.Process(56473)
>>> child.num_threads()
5
>>> [thread.id for thread in child.threads()]
[56473, 56474, 56475, 56476, 56477]

# Sibling thread example.
>>> sibling = psutil.Process(56476)
>>> child.ppid()
56472
>>> sibling.ppid()
56472

# Accounting.
>>> parent.cpu_percent(interval=1)
0.0
>>> sibling.cpu_percent(interval=1)
471.5
>>> child.cpu_percent(interval=1)
472.4

>>> tmux_session.cpu_times()
pcputimes(user=7.46, system=3.19, children_user=102.18, children_system=153.15, iowait=0.0)
>>> parent.cpu_times()
pcputimes(user=0.0, system=0.0, children_user=0.0, children_system=0.0, iowait=0.0)
>>> child.cpu_times()
pcputimes(user=45250.11, system=57.79, children_user=0.0, children_system=0.0, iowait=0.0)
>>> sibling.cpu_times()
pcputimes(user=45255.42, system=57.79, children_user=0.0, children_system=0.0, iowait=0.0)

>>> parent.memory_full_info()
pfullmem(rss=6475776, vms=59777024, shared=6078464, text=1728512, lib=0, data=32018432, dirty=0, uss=3051520, pss=3749888, swap=0)
>>> child.memory_full_info()
pfullmem(rss=2772992, vms=93356032, shared=2326528, text=1728512, lib=0, data=65581056, dirty=0, uss=126976, pss=735232, swap=0)
>>> sibling.memory_full_info()
pfullmem(rss=2772992, vms=93356032, shared=2326528, text=1728512, lib=0, data=65581056, dirty=0, uss=126976, pss=735232, swap=0)

>>> tmux_session.memory_percent()
0.007239506814671662
>>> parent.memory_percent()
0.009602063988251591
>>> child.memory_percent()
0.004111699759675096
>>> sibling.memory_percent()
0.004111699759675096

The upshot is that “what is the PID” and “what resources belong to a thread” depend strongly on which abstraction and which tool you are using. Many tools default to aggregating at the thread-group level, even when they print thread IDs, and you typically need explicit flags (or direct /proc inspection) to get consistent per-thread views.

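To end the running example, sending an interrupt to one thread can terminate the entire stressor, depending on how the program handles signals. A signal sent with kill(2), which is what psutil’s send_signal() uses on POSIX systems, is process-directed (see signal(7)), so it targets the thread group as a whole even when you address it by a sibling thread’s ID: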

# Note: this returns True even if the process is a zombie.
>>> parent.is_running() == child.is_running() == sibling.is_running() == True
True

>>> import signal
>>> sibling.send_signal(signal.SIGINT)

>>> parent.is_running() == child.is_running() == sibling.is_running() == False
True

(It seems that interrupting one thread has a bottom-up cascading effect in stress-ng 🥴 )

Happy New Year 🎆 ~

