Basic example on how to use the PBS cluster

Basic example

When you’re running hundreds or thousands of jobs, automation is a necessity. This is where hopla can help you.

This is a simple example of how to use hopla on a PBS cluster. Please check the user guide for a more in-depth presentation of all features.

Imports

import hopla
from pprint import pprint

Executor Context

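executor = hopla.Executor(
    cluster="pbs",
    folder="/tmp/hopla",  # working directory for generated scripts and logs
    queue="Nspin_short",
    image="/tmp/hopla/my-apptainer-img.simg",  # container image used to run each command
    walltime=1,  # in hours (rendered as walltime=1:00:00 in the batch script)
)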

Submit Jobs

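# Each call to submit() registers a delayed job; nothing is sent to PBS yet,
# which is why submission_id is None in the output below.
jobs = [
    executor.submit("sleep", k) for k in range(1, 11)
]
pprint(jobs)
print(jobs[0].delayed_submission)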
[DelayedPbsJob(
  job_id=1,
  submission_id=None,
),
 DelayedPbsJob(
  job_id=2,
  submission_id=None,
),
 DelayedPbsJob(
  job_id=3,
  submission_id=None,
),
 DelayedPbsJob(
  job_id=4,
  submission_id=None,
),
 DelayedPbsJob(
  job_id=5,
  submission_id=None,
),
 DelayedPbsJob(
  job_id=6,
  submission_id=None,
),
 DelayedPbsJob(
  job_id=7,
  submission_id=None,
),
 DelayedPbsJob(
  job_id=8,
  submission_id=None,
),
 DelayedPbsJob(
  job_id=9,
  submission_id=None,
),
 DelayedPbsJob(
  job_id=10,
  submission_id=None,
)]
DelayedSubmission(
  command=sleep 1,
  execution_parameters=,
)

Generate a batch

# Write the submission script for the first job and display it.
jobs[0].generate_batch()
print(jobs[0].paths)
batch = jobs[0].paths.submission_file
with open(batch) as batch_file:
    print(batch_file.read())
JobPaths(
  flux_dir=/tmp/hopla/logs/1_flux,
  job_id=1,
  joblib_file=/tmp/hopla/submissions/1_joblib_script.py,
  log_folder=/tmp/hopla/logs,
  oneshot_dir=/tmp/hopla/logs/1_oneshot,
  oneshot_file=/tmp/hopla/submissions/1_oneshot_script.sh,
  stderr=/tmp/hopla/logs/1_log.err,
  stdout=/tmp/hopla/logs/1_log.out,
  submission_file=/tmp/hopla/submissions/1_submission.sh,
  submission_folder=/tmp/hopla/submissions,
  task_file=/tmp/hopla/submissions/1_tasks.txt,
  worker_file=/tmp/hopla/submissions/worker.sh,
)
#!/bin/bash

# Parameters
#PBS -q Nspin_short
#PBS -l mem=2gb,ncpus=1,ngpus=0,walltime=1:00:00
#PBS -N hopla
#PBS -e /tmp/hopla/logs/1_log.err
#PBS -o /tmp/hopla/logs/1_log.out

# Environment
echo $PBS_JOBID
echo $HOSTNAME

# Command
apptainer run  /tmp/hopla/my-apptainer-img.simg sleep 1
echo "HOPLASAY-DONE"
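
To make the layout of the generated batch file easier to follow, here is a minimal sketch that rebuilds a similar script from a handful of parameters. It is not hopla's own template code: the function `render_pbs_script` and its argument names are hypothetical, and the defaults simply mirror the values visible in the output above.

```python
def render_pbs_script(command, queue, stdout, stderr, name="hopla",
                      memory_gb=2, ncpus=1, ngpus=0, walltime_h=1,
                      image=None):
    """Return the text of a PBS submission script (illustrative only)."""
    # Resource request, formatted like the '#PBS -l' line shown above.
    resources = (
        f"mem={memory_gb}gb,ncpus={ncpus},ngpus={ngpus},"
        f"walltime={walltime_h}:00:00"
    )
    # Wrap the command in the container runtime when an image is given.
    if image is not None:
        command = f"apptainer run {image} {command}"
    lines = [
        "#!/bin/bash",
        "",
        "# Parameters",
        f"#PBS -q {queue}",
        f"#PBS -l {resources}",
        f"#PBS -N {name}",
        f"#PBS -e {stderr}",
        f"#PBS -o {stdout}",
        "",
        "# Environment",
        "echo $PBS_JOBID",
        "echo $HOSTNAME",
        "",
        "# Command",
        command,
        'echo "HOPLASAY-DONE"',
    ]
    return "\n".join(lines) + "\n"


script = render_pbs_script(
    command="sleep 1",
    queue="Nspin_short",
    stdout="/tmp/hopla/logs/1_log.out",
    stderr="/tmp/hopla/logs/1_log.err",
    image="/tmp/hopla/my-apptainer-img.simg",
)
print(script)
```

Reading the script this way shows which Executor arguments end up in which `#PBS` directive: the queue goes to `-q`, the resources to `-l`, and the log paths to `-e` and `-o`.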

Start Jobs

The jobs cannot actually be run on the CI because no PBS infrastructure is available there, so this example uses the dry-run mode (dryrun=True).

from hopla.config import Config

with Config(dryrun=True, delay_s=3):
    executor(max_jobs=2)
    print(executor.report)
QSUB:   0%|          | 0/10 [00:00<?, ?it/s][command] qsub /tmp/hopla/submissions/1_submission.sh

QSUB:  10%|█         | 1/10 [00:00<00:00, 1718.98it/s][command] qsub /tmp/hopla/submissions/2_submission.sh

QSUB:  20%|██        | 2/10 [00:00<00:00, 1741.82it/s][command] qsub /tmp/hopla/submissions/3_submission.sh

QSUB:  30%|███       | 3/10 [00:03<00:07,  1.00s/it]
QSUB:  30%|███       | 3/10 [00:03<00:07,  1.00s/it][command] qsub /tmp/hopla/submissions/4_submission.sh

QSUB:  40%|████      | 4/10 [00:03<00:06,  1.00s/it][command] qsub /tmp/hopla/submissions/5_submission.sh

QSUB:  50%|█████     | 5/10 [00:06<00:06,  1.24s/it]
QSUB:  50%|█████     | 5/10 [00:06<00:06,  1.24s/it][command] qsub /tmp/hopla/submissions/6_submission.sh

QSUB:  60%|██████    | 6/10 [00:06<00:04,  1.24s/it][command] qsub /tmp/hopla/submissions/7_submission.sh

QSUB:  70%|███████   | 7/10 [00:09<00:04,  1.35s/it]
QSUB:  70%|███████   | 7/10 [00:09<00:04,  1.35s/it][command] qsub /tmp/hopla/submissions/8_submission.sh

QSUB:  80%|████████  | 8/10 [00:09<00:02,  1.35s/it][command] qsub /tmp/hopla/submissions/9_submission.sh

QSUB:  90%|█████████ | 9/10 [00:12<00:01,  1.41s/it]
QSUB:  90%|█████████ | 9/10 [00:12<00:01,  1.41s/it][command] qsub /tmp/hopla/submissions/10_submission.sh

QSUB: 100%|██████████| 10/10 [00:12<00:00,  1.41s/it]
QSUB: 100%|██████████| 10/10 [00:15<00:00,  1.50s/it]
----------------------------------------
DelayedPbsJob<job_id=1>exitcode: failure
DelayedPbsJob<job_id=1>submission: /tmp/hopla/submissions/1_submission.sh
DelayedPbsJob<job_id=1>stdout: none
DelayedPbsJob<job_id=1>stderr: none
----------------------------------------
DelayedPbsJob<job_id=2>exitcode: failure
DelayedPbsJob<job_id=2>submission: /tmp/hopla/submissions/2_submission.sh
DelayedPbsJob<job_id=2>stdout: none
DelayedPbsJob<job_id=2>stderr: none
----------------------------------------
DelayedPbsJob<job_id=3>exitcode: failure
DelayedPbsJob<job_id=3>submission: /tmp/hopla/submissions/3_submission.sh
DelayedPbsJob<job_id=3>stdout: none
DelayedPbsJob<job_id=3>stderr: none
----------------------------------------
DelayedPbsJob<job_id=4>exitcode: failure
DelayedPbsJob<job_id=4>submission: /tmp/hopla/submissions/4_submission.sh
DelayedPbsJob<job_id=4>stdout: none
DelayedPbsJob<job_id=4>stderr: none
----------------------------------------
DelayedPbsJob<job_id=5>exitcode: failure
DelayedPbsJob<job_id=5>submission: /tmp/hopla/submissions/5_submission.sh
DelayedPbsJob<job_id=5>stdout: none
DelayedPbsJob<job_id=5>stderr: none
----------------------------------------
DelayedPbsJob<job_id=6>exitcode: failure
DelayedPbsJob<job_id=6>submission: /tmp/hopla/submissions/6_submission.sh
DelayedPbsJob<job_id=6>stdout: none
DelayedPbsJob<job_id=6>stderr: none
----------------------------------------
DelayedPbsJob<job_id=7>exitcode: failure
DelayedPbsJob<job_id=7>submission: /tmp/hopla/submissions/7_submission.sh
DelayedPbsJob<job_id=7>stdout: none
DelayedPbsJob<job_id=7>stderr: none
----------------------------------------
DelayedPbsJob<job_id=8>exitcode: failure
DelayedPbsJob<job_id=8>submission: /tmp/hopla/submissions/8_submission.sh
DelayedPbsJob<job_id=8>stdout: none
DelayedPbsJob<job_id=8>stderr: none
----------------------------------------
DelayedPbsJob<job_id=9>exitcode: failure
DelayedPbsJob<job_id=9>submission: /tmp/hopla/submissions/9_submission.sh
DelayedPbsJob<job_id=9>stdout: none
DelayedPbsJob<job_id=9>stderr: none
----------------------------------------
DelayedPbsJob<job_id=10>exitcode: failure
DelayedPbsJob<job_id=10>submission: /tmp/hopla/submissions/10_submission.sh
DelayedPbsJob<job_id=10>stdout: none
DelayedPbsJob<job_id=10>stderr: none
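
With many jobs, the report above gets long. The following sketch summarizes it by counting exit codes and listing the failed job IDs. It works only on the printed text format shown above and assumes the report is available as a plain string (for instance via `str(executor.report)`, which is an assumption). The helper `summarize_report` is hypothetical and not part of hopla. In the dry run above every job is reported as a failure, presumably because nothing was actually executed.

```python
import re
from collections import Counter

# Matches lines such as "DelayedPbsJob<job_id=3>exitcode: failure".
EXITCODE_LINE = re.compile(r"^\w+<job_id=(\d+)>exitcode: (\w+)$")


def summarize_report(report):
    """Count exit codes and list the IDs of jobs reported as 'failure'."""
    statuses = {}
    for line in report.splitlines():
        match = EXITCODE_LINE.match(line.strip())
        if match:
            statuses[int(match.group(1))] = match.group(2)
    counts = Counter(statuses.values())
    failed = sorted(
        job_id for job_id, status in statuses.items() if status == "failure"
    )
    return counts, failed


# Two entries copied from the report printed above.
sample_report = """\
----------------------------------------
DelayedPbsJob<job_id=1>exitcode: failure
DelayedPbsJob<job_id=1>submission: /tmp/hopla/submissions/1_submission.sh
DelayedPbsJob<job_id=1>stdout: none
DelayedPbsJob<job_id=1>stderr: none
----------------------------------------
DelayedPbsJob<job_id=2>exitcode: failure
DelayedPbsJob<job_id=2>submission: /tmp/hopla/submissions/2_submission.sh
DelayedPbsJob<job_id=2>stdout: none
DelayedPbsJob<job_id=2>stderr: none
"""

counts, failed = summarize_report(sample_report)
print(dict(counts))  # {'failure': 2}
print(failed)        # [1, 2]
```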

