SLURM Reference Manual for HP XC System Software
SQUEUE Job State Codes
Most SQUEUE reports use short codes (abbreviations) to show the current state of each job
that SLURM manages. The SQUEUE job-state codes and their meanings are listed here in alphabetical
order by code. A separate section covers SINFO node state codes (page 60).
Note that these SQUEUE codes differ from those used by PSTAT to report the status that LCRM/DPCS
assigns to the batch jobs that it schedules (across machines "above" SLURM). See the Status Values for
Batch Jobs (URL: http://www.llnl.gov/LCdocs/dpcs/index.jsp?show=s4.1.2) section of the LCRM/DPCS
Reference Manual for a long explanatory list of those different job states.
CD (COMPLETED)
Job has successfully ended all of its processes on all nodes (LCRM/DPCS:
CMPLETED).
CG (COMPLETING)
Job is in the process of completing, so some processes on some nodes may still be
active.
F (FAILED)
Job has terminated with a nonzero exit code or other failure condition.
NF (NODE_FAIL)
Job has terminated because one or more nodes allocated to it have failed.
PD (PENDING)
Job is awaiting resource allocation (there are many corresponding LCRM/DPCS states
depending on just which resources are needed).
R (RUNNING)
Job is executing successfully now (LCRM/DPCS: RUN).
TO (TIMEOUT)
Job has terminated upon reaching its time limit (LCRM/DPCS: TERMINATED).
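The code list above can be captured as a small lookup table, which may help when a script has to interpret the state column of an SQUEUE report. The following is a minimal sketch in Python, not part of SLURM itself: the names JOB_STATES, LCRM_EQUIVALENTS, FINISHED, and describe_state are hypothetical, and treating CD, F, NF, and TO as "finished" is an assumption drawn only from the wording of the descriptions above.

```python
# Lookup table transcribed from the SQUEUE job-state list above.
# All names here (JOB_STATES, LCRM_EQUIVALENTS, FINISHED, describe_state)
# are hypothetical helpers for illustration; SLURM does not provide them.
JOB_STATES = {
    "CD": "COMPLETED",
    "CG": "COMPLETING",
    "F": "FAILED",
    "NF": "NODE_FAIL",
    "PD": "PENDING",
    "R": "RUNNING",
    "TO": "TIMEOUT",
}

# LCRM/DPCS equivalents, recorded only where the list above names one.
LCRM_EQUIVALENTS = {
    "CD": "CMPLETED",
    "R": "RUN",
    "TO": "TERMINATED",
}

# Assumption: codes whose description says the job has ended or terminated.
FINISHED = {"CD", "F", "NF", "TO"}


def describe_state(code):
    """Return (state name, LCRM/DPCS equivalent or None, finished flag)."""
    code = code.strip().upper()
    if code not in JOB_STATES:
        raise KeyError(f"unknown SQUEUE job-state code: {code!r}")
    return JOB_STATES[code], LCRM_EQUIVALENTS.get(code), code in FINISHED


if __name__ == "__main__":
    for code in ("PD", "R", "CG", "TO"):
        print(code, describe_state(code))
```

A script could call describe_state on each state field it reads and, for example, keep polling only while the finished flag is false. Codes not in the list above raise KeyError rather than being guessed at.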