Intro to the HTCondor Python API on a laptop cluster

Matthew West


Abstract

The philosophy of HTCondor is to allow researchers to easily automate and scale their workflows for greater overall throughput with minimal changes to the analysis code itself. The objective is to run jobs as efficiently as possible wherever there are available resources. CHTC's HTCondor software suite provides not just the batch system but a toolset that includes workflow pipeline automation, performance evaluation, and containerized environments. This demo will cover:

  • Running a cluster within a Docker container on Windows
  • Using the Python API to construct and submit a multi-layer workflow
  • Parsing log files for performance information
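
As a rough illustration of the second item, a two-layer workflow might be put together with the htcondor.dags module roughly as follows. This is a minimal sketch, not the demo's actual code: the script names (analyze.sh, combine.sh), the chunk macro, the resource requests, and the helper functions are all hypothetical, and the scripts are assumed to already exist in the DAG directory. The htcondor import is kept inside the submitting function so the description-building helpers can be tried without a running pool.

```python
from pathlib import Path


def chunk_vars(n_chunks):
    """One dict of macro values per job in the first layer (hypothetical 'chunk' macro)."""
    return [{"chunk": str(i)} for i in range(n_chunks)]


def layer_descriptions():
    """Return plain dicts of submit commands for the two layers.

    All file names and resource requests are illustrative assumptions.
    """
    common = {
        "log": "workflow.log",
        "request_cpus": "1",
        "request_memory": "512MB",
        "request_disk": "512MB",
    }
    analyze = dict(
        common,
        executable="analyze.sh",          # hypothetical script
        arguments="$(chunk)",             # filled in per job from chunk_vars()
        output="analyze-$(chunk).out",
        error="analyze-$(chunk).err",
    )
    combine = dict(
        common,
        executable="combine.sh",          # hypothetical script
        output="combine.out",
        error="combine.err",
    )
    return analyze, combine


def build_and_submit(dag_dir, n_chunks=4):
    """Write the DAG files into dag_dir and submit the workflow to the local schedd."""
    import os

    import htcondor                       # requires the HTCondor Python bindings
    from htcondor import dags

    analyze, combine = layer_descriptions()

    dag = dags.DAG()
    first = dag.layer(
        name="analyze",
        submit_description=htcondor.Submit(analyze),
        vars=chunk_vars(n_chunks),        # one job per chunk
    )
    # The child layer runs only after every job in the parent layer has finished.
    first.child_layer(name="combine", submit_description=htcondor.Submit(combine))

    dag_dir = Path(dag_dir).absolute()
    dag_file = dags.write_dag(dag, dag_dir)

    # Submit from inside the DAG directory so relative paths resolve there.
    os.chdir(dag_dir)
    dag_submit = htcondor.Submit.from_dag(str(dag_file), {"force": 1})
    result = htcondor.Schedd().submit(dag_submit)
    return result.cluster()
```

Calling build_and_submit("demo-dag") inside the container would return the cluster ID of the DAGMan job. The per-job events are then written to the log file named in the submit descriptions, which is the kind of file the third item refers to.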

Prior knowledge

  • Some experience with a cluster batch scheduling system
  • Familiarity with the Python programming language

References

  • GitHub repo for scripts
  • DockerHub link for HTCondor-Scipy container
  • HTCondor Python API documentation page