<!--lint ignore-->
<h1 id="awesome-hpc-awesome">Awesome HPC <a href="https://awesome.re"><img src="https://awesome.re/badge-flat.svg" alt="Awesome" /></a></h1>
<p>High Performance Computing tools and resources for engineers and administrators.</p>
<p><a href="https://en.wikipedia.org/wiki/Supercomputer">High Performance Computing (HPC)</a> most generally refers to the practice of aggregating computing power in a way that delivers much higher performance than one could get out of a typical desktop computer or workstation, in order to solve large problems in science, engineering, or business.</p>
<h2 id="contents">Contents</h2>
<details>
<summary>
<b>(click to expand)</b>
</summary>
<ul>
<li><a href="#provisioning">Provisioning</a></li>
<li><a href="#workload-managers">Workload Managers</a></li>
<li><a href="#pipelines">Pipelines</a></li>
<li><a href="#applications">Applications</a></li>
<li><a href="#compilers">Compilers</a></li>
<li><a href="#mpi">MPI</a></li>
<li><a href="#parallel-computing">Parallel Computing</a></li>
<li><a href="#benchmarking">Benchmarking</a></li>
<li><a href="#miscellaneous">Miscellaneous</a></li>
<li><a href="#performance">Performance</a></li>
<li><a href="#parallel-shells">Parallel Shells</a></li>
<li><a href="#containers">Containers</a></li>
<li><a href="#environment-management">Environment Management</a></li>
<li><a href="#visualization">Visualization</a></li>
<li><a href="#parallel-filesystems">Parallel Filesystems</a></li>
<li><a href="#programming-languages">Programming Languages</a></li>
<li><a href="#monitoring">Monitoring</a></li>
<li><a href="#journals">Journals</a></li>
<li><a href="#podcasts">Podcasts</a></li>
<li><a href="#blogs">Blogs</a></li>
<li><a href="#conferences">Conferences</a></li>
<li><a href="#websites">Websites</a></li>
<li><a href="#user-groups">User Groups</a></li>
</ul>
</details>
<h2 id="provisioning">Provisioning</h2>
<ul>
<li><a href="https://grendel.readthedocs.io/">Grendel</a> - Bare-metal provisioning system for HPC Linux clusters (<a href="https://github.com/ubccr/grendel">Source Code</a>) <code>GPL-3</code>.</li>
<li><a href="https://xcat.org/">xCAT</a> - Toolkit for deployment and administration of clusters of all sizes (<a href="https://github.com/xcat2/xcat-core">Source Code</a>) <code>EPL-1.0</code>.</li>
<li><a href="https://warewulf.hpcng.org/">Warewulf</a> - Stateless and diskless container operating system provisioning system for large clusters of bare-metal and/or virtual systems (<a href="https://github.com/hpcng/warewulf">Source Code</a>) <code>BSD-3</code>.</li>
<li><a href="http://www.rocksclusters.org/">Rocks</a> - A Linux distribution for building Linux clusters <code>other</code>.</li>
<li><a href="https://cobbler.github.io/">Cobbler</a> - Linux installation server that allows for rapid setup of network installation environments (<a href="https://github.com/cobbler/cobbler">Source Code</a>) <code>GPL-2.0</code>.</li>
<li><a href="https://docs.nvidia.com/base-command-manager/index.html">Base Command Manager</a> - Allows administrators to quickly build and manage heterogeneous clusters <code>Proprietary</code>.</li>
<li><a href="https://www.penguinsolutions.com/computing/products/software/scyld-clusterware/">Scyld ClusterWare</a> - Developed from the continuing evolution of Beowulf clusters, first created at NASA in the 1990s <code>Proprietary</code>.</li>
<li><a href="https://bluebanquise.com">BlueBanquise</a> - Open source cluster deployment and management stack built on Python and Ansible (<a href="https://github.com/bluebanquise/bluebanquise">Source Code</a>) <code>MIT</code>.</li>
</ul>
<h2 id="workload-managers">Workload Managers</h2>
<ul>
<li><a href="https://slurm.schedmd.com/documentation.html">Slurm</a> - A free and open source job scheduler; a minimal submission sketch follows this list (<a href="https://github.com/SchedMD/slurm">Source Code</a>) <code>OSS</code>.</li>
<li><a href="https://www.ibm.com/products/hpc-workload-management">LSF</a> - A job scheduler and workload management software developed by IBM <code>Proprietary</code>.</li>
<li><a href="https://adaptivecomputing.com/moab-hpc-suite/">Moab</a> - Workload manager and job scheduler <code>other</code>.</li>
<li><a href="https://en.wikipedia.org/wiki/TORQUE">Torque</a> - Workload manager and job scheduler <code>other</code>.</li>
<li><a href="https://en.wikipedia.org/wiki/OpenLava">OpenLava</a> - Workload manager and job scheduler <code>other</code>.</li>
<li><a href="https://en.wikipedia.org/wiki/Univa_Grid_Engine">UGE/SGE</a> - Univa Grid Engine is a workload management engine for HPC <code>Proprietary</code>.</li>
<li><a href="https://volcano.sh/">Volcano</a> - A batch system built on Kubernetes <code>Apache-2.0</code>.</li>
<li><a href="https://www.mhpcc.hpc.mil/">Maui</a> - Workload manager and job scheduler <code>other</code>.</li>
<li><a href="https://github.com/kubernetes-sigs/kube-batch">Kube Batch</a> - A batch scheduler for Kubernetes aimed at high-performance workloads such as AI/ML, big data, and HPC <code>Apache-2.0</code>.</li>
<li><a href="https://www.openpbs.org/">OpenPBS</a> - OpenPBS® software optimizes job scheduling and workload management in high-performance computing (HPC) environments (<a href="https://github.com/openpbs/openpbs">Source Code</a>) <code>other</code>.</li>
</ul>
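<p>Most of the schedulers above are driven from the command line. As a quick illustration of how a tool might drive Slurm programmatically, here is a minimal Python sketch that shells out to <code>sbatch</code>; it assumes Slurm’s CLI tools are on <code>PATH</code>, and <code>job.sh</code> is a placeholder for your own batch script.</p>
<pre><code>"""Minimal sketch: submit a job to Slurm by shelling out to sbatch.
Assumes Slurm's CLI is on PATH; job.sh is a placeholder batch script."""
import subprocess

def submit(script="job.sh"):
    # on success, sbatch prints "Submitted batch job NNN"
    result = subprocess.run(["sbatch", script],
                            capture_output=True, text=True, check=True)
    # the last whitespace-separated token is the job ID
    return result.stdout.strip().split()[-1]

if __name__ == "__main__":
    print("submitted job", submit())
</code></pre>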
<h2 id="pipelines">Pipelines</h2>
<ul>
<li><a href="https://nextflow.io">Nextflow</a> - Data-driven computational pipelines <code>Apache-2.0</code>.</li>
<li><a href="https://cromwell.readthedocs.io/en/stable/">Cromwell</a> - Scientific workflow engine designed for simplicity and scalability (<a href="https://github.com/broadinstitute/cromwell">Source Code</a>) <code>BSD-3</code>.</li>
<li><a href="https://pegasus.isi.edu/">Pegasus</a> - A configurable system for mapping and executing scientific workflows over a wide range of computational infrastructure (<a href="https://github.com/pegasus-isi/pegasus">Source Code</a>) <code>Apache-2.0</code>.</li>
</ul>
<h2 id="applications">Applications</h2>
<ul>
<li><a href="https://spack.io">Spack</a> - A flexible package manager that supports multiple versions, configurations, platforms, and compilers (<a href="https://github.com/spack/spack">Source Code</a>) <code>other</code>.</li>
<li><a href="https://easybuild.io/">EasyBuild</a> - Building software with ease (<a href="https://github.com/easybuilders/easybuild">Source Code</a>) <code>GPL-2</code>.</li>
</ul>
<h2 id="compilers">Compilers</h2>
<ul>
<li><a href="https://developer.nvidia.com/hpc-compilers">Nvidia</a> - NVIDIA HPC compiler suite for Fortran and C/C++ with OpenACC support <code>Proprietary</code>.</li>
<li><a href="https://www.pgroup.com/index.htm">Portland Group</a> - The Portland Group (PGI) Fortran and C/C++ compilers, now integrated into the NVIDIA HPC SDK <code>Proprietary</code>.</li>
<li><a href="https://software.intel.com/content/www/us/en/develop/tools/oneapi/all-toolkits.html#hpc-kit">Intel</a> - The Intel compiler suite offers many language compilers for use in the HPC space <code>Proprietary</code>.</li>
<li><a href="https://bluewaters.ncsa.illinois.edu/cray-compiler">Cray</a> - A suite of compilers designed and optimized to target the AMD Interlagos instruction set <code>Proprietary</code>.</li>
<li><a href="https://gcc.gnu.org/">GNU</a> - The GNU Compiler Collection is a suite of compilers targeting many languages (<a href="https://gcc.gnu.org/git.html">Source Code</a>) <code>GPL-3</code>.</li>
<li><a href="https://llvm.org/">LLVM</a> - The LLVM project is a collection of modular compilers and toolchains (<a href="https://github.com/llvm/llvm-project">Source Code</a>) <code>OSS</code>.</li>
</ul>
<h2 id="mpi">MPI</h2>
<ul>
<li><a href="https://www.open-mpi.org/">Open MPI</a> - An open source implementation of the MPI-3.1 standard; a minimal example of the MPI model follows this list (<a href="https://github.com/open-mpi/ompi">Source Code</a>) <code>BSD</code>.</li>
<li><a href="https://www.mpich.org/">MPICH</a> - A high-performance and widely portable implementation of the MPI-3.1 standard (<a href="https://github.com/pmodels/mpich">Source Code</a>) <code>other</code>.</li>
<li><a href="https://mvapich.cse.ohio-state.edu/">MVAPICH</a> - An open source implementation of the MPI-3.1 standard developed by Ohio State University <code>BSD</code>.</li>
<li><a href="https://www.intel.com/content/www/us/en/developer/tools/oneapi/mpi-library.html">Intel-MPI</a> - Intel’s MPI-3.1 implementation, included in their compiler suite <code>other</code>.</li>
</ul>
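<p>All of these libraries implement the same MPI standard, so the programming model is identical across them. Here is a minimal sketch of that rank/size model using the <code>mpi4py</code> Python bindings (a separate package, not listed above) on top of any of these implementations:</p>
<pre><code>"""Minimal MPI sketch using mpi4py, a Python binding to any MPI library.
Launch with, e.g.: mpirun -n 4 python hello.py"""
from mpi4py import MPI

comm = MPI.COMM_WORLD      # default communicator spanning all ranks
rank = comm.Get_rank()     # this process's ID within the communicator
size = comm.Get_size()     # total number of ranks in the job
print(f"hello from rank {rank} of {size}")
</code></pre>
<p>Launched with <code>mpirun -n 4</code> (or <code>srun</code> under Slurm), each rank prints its own ID.</p>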
<h2 id="parallel-computing">Parallel Computing</h2>
<ul>
<li><a href="https://arrayfire.org/docs/index.htm">ArrayFire</a> - A general-purpose tensor library that simplifies software development for parallel architectures <code>other</code>.</li>
<li><a href="https://www.openmp.org/">OpenMP</a> - An application programming interface that supports multi-platform shared-memory multiprocessing programming; a rough Python analogue follows this list <code>other</code>.</li>
</ul>
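<p>OpenMP itself targets C, C++, and Fortran rather than Python, but the fork-join, parallel-for pattern it popularized can be sketched with Python’s standard <code>multiprocessing</code> module. This is only an analogue for illustration, not OpenMP:</p>
<pre><code>"""Rough Python analogue of OpenMP's parallel-for pattern, using the
standard-library multiprocessing module (not OpenMP itself)."""
from multiprocessing import Pool

def body(x):
    return x * x   # the loop body to execute in parallel

if __name__ == "__main__":
    with Pool(processes=4) as pool:        # fork a "team" of 4 workers
        squares = pool.map(body, range(10))  # parallel "for" over the range
    print(squares)
</code></pre>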
<h2 id="benchmarking">Benchmarking</h2>
<ul>
<li><a href="https://mvapich.cse.ohio-state.edu/benchmarks/">OSU Benchmarks</a> - A collection of benchmarking tools for MPI developed by Ohio State University <code>other</code>.</li>
<li><a href="https://software.intel.com/content/www/us/en/develop/articles/intel-mpi-benchmarks.html">Intel MPI Benchmarks</a> - A set of benchmarks developed by Intel for use with their Intel MPI <code>other</code>.</li>
<li><a href="https://hpccsystems.com/">HPCC Systems</a> - HPCC Systems (High Performance Computing Cluster) is an open source, massively parallel computing platform for big data processing and analytics (<a href="https://github.com/hpcc-systems/HPCC-Platform">Source Code</a>) <code>other</code>.</li>
<li><a href="https://www.netlib.org/linpack/">LINPACK</a> - A collection of efficient Fortran subroutines for solving linear systems; the associated benchmarks are widely used in HPC <code>other</code>.</li>
<li><a href="https://www.iozone.org/">IOzone</a> - A filesystem benchmark tool <code>OSS</code>.</li>
<li><a href="https://www.vi4io.org/tools/benchmarks/ior">IOR</a> - Interleaved or Random is a benchmarking tool for testing parallel filesystems <code>other</code>.</li>
<li><a href="https://www.vi4io.org/tools/benchmarks/mdtest">MDtest</a> - An MPI-based application for evaluating the metadata performance of a file system <code>other</code>.</li>
<li><a href="https://fio.readthedocs.io/en/latest/fio_doc.html">FIO</a> - Flexible I/O is an advanced disk benchmark that depends upon the kernel’s AIO access library (<a href="https://git.kernel.dk/cgit/fio/">Source Code</a>) <code>GPL-2</code>.</li>
<li><a href="https://github.com/breuner/elbencho">elbencho</a> - A distributed storage benchmark for files, objects, and blocks with support for GPUs <code>GPL-3</code>.</li>
</ul>
<h2 id="miscellaneous">Miscellaneous</h2>
<ul>
<li><a href="https://openondemand.org/">OpenOnDemand</a> - Open OnDemand helps computational researchers and students efficiently utilize remote computing resources by making them easy to access from any device (<a href="https://github.com/OSC/openondemand.org">Source Code</a>) <code>MIT</code>.</li>
<li><a href="https://open.xdmod.org">Open XDMoD</a> - An open source tool to facilitate the management of high performance computing resources (<a href="https://github.com/ubccr/xdmod/">Source Code</a>) <code>LGPL-3</code>.</li>
<li><a href="https://coldfront.readthedocs.io/en/latest/">ColdFront</a> - An open source resource allocation system designed to provide a central portal for administration, reporting, and measuring scientific impact of HPC resources (<a href="https://github.com/ubccr/coldfront">Source Code</a>) <code>GPL-3</code>.</li>
<li><a href="https://pavilion2.readthedocs.io/">Pavilion2</a> - A Python 3 (3.6+) based framework for running and analyzing tests targeting HPC systems (<a href="https://github.com/hpc/pavilion2">Source Code</a>) <code>other</code>.</li>
<li><a href="https://reframe-hpc.readthedocs.io/en/stable/">ReFrame</a> - A powerful Python framework for writing and running portable regression tests and benchmarks for HPC systems; a minimal test sketch follows this list (<a href="https://github.com/reframe-hpc/reframe">Source Code</a>) <code>BSD-3</code>.</li>
<li><a href="https://olcf.github.io/olcf-test-harness/">OLCF Test Harness</a> - The OLCF Test Harness (OTH) helps automate the testing of applications, tools, and other system software (<a href="https://github.com/olcf/olcf-test-harness">Source Code</a>) <code>other</code>.</li>
<li><a href="https://github.com/CLIP-HPC/goslmailer">GoSlmailer</a> - A drop-in notification delivery solution for Slurm that can deliver to Slack, Mattermost, Teams, and more.</li>
</ul>
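<p>To give a feel for ReFrame, here is a minimal test in the style of its tutorial examples; treat the class and attribute names as indicative and check the ReFrame documentation for the current API:</p>
<pre><code>"""Minimal regression test in the style of ReFrame's tutorial examples
(indicative sketch): compile hello.c and check the program's output."""
import reframe as rfm
import reframe.utility.sanity as sn

@rfm.simple_test
class HelloTest(rfm.RegressionTest):
    valid_systems = ['*']          # run on any configured system
    valid_prog_environs = ['*']    # ...with any programming environment
    sourcepath = 'hello.c'         # ReFrame builds this source for us

    @sanity_function
    def assert_hello(self):
        # the run passes only if "Hello" appears on stdout
        return sn.assert_found(r'Hello', self.stdout)
</code></pre>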
<h2 id="performance">Performance</h2>
<ul>
<li><a href="https://totalview.io/products/totalview">TotalView</a> - A debugging tool for HPC applications <code>Proprietary</code>.</li>
<li><a href="https://www.cs.uoregon.edu/research/tau/home.php">TAU</a> - TAU Performance System® is a portable profiling and tracing toolkit for performance analysis of parallel programs written in Fortran, C, C++, UPC, Java, and Python <code>other</code>.</li>
<li><a href="https://www.valgrind.org/">Valgrind</a> - An instrumentation framework for profiling programs and detecting memory leaks (<a href="https://sourceware.org/git/?p=valgrind.git">Source Code</a>) <code>GPL-2</code>.</li>
<li><a href="https://tools.bsc.es/paraver">Paraver</a> - A very flexible performance-data browser that is part of the CEPBA-Tools toolkit <code>other</code>.</li>
<li><a href="http://icl.cs.utk.edu/papi">PAPI</a> - The Performance Application Programming Interface (PAPI) is a performance analysis tool (<a href="https://bitbucket.org/icl/papi/src/master/">Source Code</a>) <code>other</code>.</li>
</ul>
<h2 id="parallel-shells">Parallel Shells</h2>
<ul>
<li><a href="https://linux.die.net/man/1/pdsh">pdsh</a> - Runs terminal commands across multiple hosts in parallel (<a href="https://github.com/chaos/pdsh">Source Code</a>) <code>GPL-2</code>.</li>
<li><a href="https://clustershell.readthedocs.io/en/latest/intro.html">ClusterShell</a> - Scalable cluster administration Python framework; a short API sketch follows this list (<a href="https://github.com/cea-hpc/clustershell">Source Code</a>) <code>LGPL-2.1</code>.</li>
</ul>
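<p>Beyond its <code>clush</code> command-line tool, ClusterShell is also usable as a library. A short sketch following its documented Python API, where <code>node[01-04]</code> is a placeholder node range:</p>
<pre><code>"""Sketch of ClusterShell's Python API (per its documentation); the
node range node[01-04] is a placeholder for your own hosts."""
from ClusterShell.NodeSet import NodeSet
from ClusterShell.Task import task_self

task = task_self()
task.run("uname -r", nodes="node[01-04]")   # run over ssh in parallel
# identical outputs are grouped: one line per unique result
for output, nodes in task.iter_buffers():
    print(NodeSet.fromlist(nodes), output)
</code></pre>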
<h2 id="containers">Containers</h2>
<ul>
<li><a href="https://apptainer.org">Apptainer</a> - An open source container system (<a href="https://github.com/apptainer/apptainer">Source Code</a>) <code>BSD</code>.</li>
<li><a href="https://hpc.github.io/charliecloud/">Charliecloud</a> - Provides user-defined software stacks (UDSS) for high-performance computing (HPC) centers (<a href="https://github.com/hpc/charliecloud">Source Code</a>) <code>Apache-2.0</code>.</li>
<li><a href="https://www.docker.com/">Docker</a> - A set of platform-as-a-service products that use OS-level virtualization to deliver software in packages called containers <code>other</code>.</li>
<li><a href="https://indigo-dc.github.io/udocker/">udocker</a> - A basic user tool to execute simple Docker containers in batch or interactive systems without root privileges (<a href="https://github.com/indigo-dc/udocker">Source Code</a>) <code>Apache-2.0</code>.</li>
<li><a href="https://www.nersc.gov/research-and-development/user-defined-images/">Shifter</a> - Linux containers for HPC (<a href="https://github.com/NERSC/shifter">Source Code</a>) <code>other</code>.</li>
<li><a href="https://github.com/NVIDIA/hpc-container-maker">HPC Container Maker</a> - An open source tool to make it easier to generate container specification files <code>Apache-2.0</code>.</li>
<li><a href="https://github.com/eth-cscs/sarus">Sarus</a> - An OCI-compatible container engine for HPC <code>BSD</code>.</li>
<li><a href="https://singularity-hpc.readthedocs.io">Singularity HPC</a> - Singularity Registry HPC (shpc) allows you to install containers as modules (<a href="https://github.com/singularityhub/singularity-hpc">Source Code</a>) <code>MPL-2.0</code>.</li>
</ul>
<h2 id="environment-management">Environment Management</h2>
<ul>
<li><a href="https://lmod.readthedocs.io/en/latest/">Lmod</a> - A Lua-based environment module system that reads TCL modules and supports a software hierarchy (<a href="https://github.com/TACC/Lmod">Source Code</a>) <code>other</code>.</li>
<li><a href="https://modules.readthedocs.io/en/latest/">Environment Modules</a> - Provides dynamic modification of a user’s environment (<a href="https://github.com/cea-hpc/modules">Source Code</a>) <code>GPL-2</code>.</li>
<li><a href="https://www.anaconda.com/">Anaconda</a> - A Python and R distribution for use in computational science <code>other</code>.</li>
<li><a href="https://mamba.readthedocs.io/en/latest/">Mamba</a> - A reimplementation of the conda package manager in C++ (<a href="https://github.com/mamba-org/mamba">Source Code</a>) <code>BSD</code>.</li>
</ul>
<h2 id="visualization">Visualization</h2>
<ul>
<li><a href="https://visit-dav.github.io/visit-website/">VisIt</a> - Visualization and data analysis for mesh-based scientific data (<a href="https://github.com/visit-dav/visit">Source Code</a>) <code>BSD-3</code>.</li>
<li><a href="https://www.paraview.org/">ParaView</a> - An open-source, multi-platform data analysis and visualization application based on the Visualization Toolkit (VTK) (<a href="https://github.com/Kitware/ParaView">Source Code</a>) <code>BSD-3</code>.</li>
</ul>
<h2 id="parallel-filesystems">Parallel Filesystems</h2>
<ul>
<li><a href="https://www.ibm.com/docs/en/gpfs/4.1.0.4?topic=guide-introducing-general-parallel-file-system">GPFS</a> - A high-performance clustered file system developed by IBM <code>Proprietary</code>.</li>
<li><a href="https://www.quobyte.com/storage-for/high-performance-computing-hpc">Quobyte</a> - A high performance filesystem <code>Proprietary</code>.</li>
<li><a href="https://ceph.io/en/">Ceph</a> - A distributed object, block, and file storage platform (<a href="https://github.com/ceph/ceph">Source Code</a>) <code>other</code>.</li>
<li><a href="https://www.weka.io/">Weka</a> - A file system designed for HPC <code>Proprietary</code>.</li>
<li><a href="https://www.lustre.org/">Lustre/Exascaler</a> - An open-source, distributed parallel file system software platform designed for scalability, high performance, and high availability (<a href="https://git.whamcloud.com/fs/lustre-release.git">Source Code</a>) <code>other</code>.</li>
<li><a href="https://www.beegfs.io/c/">BeeGFS</a> - A hardware-independent POSIX parallel file system developed with a strong focus on performance and designed for ease of use, simple installation, and management <code>Proprietary</code>.</li>
<li><a href="http://www.orangefs.org/">OrangeFS</a> - A next-generation parallel file system for Linux clusters (<a href="https://github.com/waltligon/orangefs">Source Code</a>) <code>other</code>.</li>
<li><a href="https://moosefs.com/">MooseFS</a> - An open-source, POSIX-compliant distributed file system developed by Core Technology (<a href="https://github.com/moosefs/moosefs">Source Code</a>) <code>GPL-2.0</code>.</li>
</ul>
<h2 id="programming-languages">Programming Languages</h2>
<ul>
<li><a href="https://julialang.org/">Julia</a> - A high-level, high-performance dynamic language for technical computing <code>MIT</code>.</li>
<li><a href="https://futhark-lang.org/">Futhark</a> - A purely functional data-parallel programming language in the ML family <code>ISC</code>.</li>
<li><a href="https://chapel-lang.org/">Chapel</a> - A programming language designed for productive parallel computing at scale <code>Apache-2.0</code>.</li>
</ul>
<h2 id="monitoring">Monitoring</h2>
<h3 id="prometheus-based">Prometheus Based</h3>
<ul>
<li><a href="https://github.com/treydock/prometheus-slurm-exporter">Slurm Exporter</a> - Prometheus exporter for performance metrics from Slurm; a minimal exporter sketch follows this list <code>GPL-3.0</code>.</li>
<li><a href="https://github.com/ubccr/slurm-exporter">Slurm Exporter</a> - Slurm exporter for Prometheus using the REST API <code>GPL-3.0</code>.</li>
<li><a href="https://github.com/treydock/infiniband_exporter">InfiniBand Exporter</a> - Collects counters from InfiniBand switches and HCAs <code>Apache-2.0</code>.</li>
<li><a href="https://github.com/treydock/cgroup_exporter">Cgroup Exporter</a> - Produces metrics from cgroups <code>Apache-2.0</code>.</li>
<li><a href="https://github.com/phpHavok/cgroups_exporter">Cgroups Exporter</a> - A Prometheus exporter for cgroup-level metrics <code>unknown</code>.</li>
<li><a href="https://github.com/treydock/gpfs_exporter">GPFS Exporter</a> - Collects metrics from the GPFS filesystem <code>Apache-2.0</code>.</li>
<li><a href="https://github.com/GSI-HPC/lustre_exporter">Lustre Exporter</a> - Prometheus exporter for use with the Lustre parallel filesystem <code>GPL-3.0</code>.</li>
<li><a href="https://github.com/NVIDIA/dcgm-exporter">DCGM Exporter</a> - NVIDIA GPU metrics exporter for Prometheus leveraging DCGM <code>Apache-2.0</code>.</li>
</ul>
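<p>These exporters all follow the same pattern: a small HTTP endpoint publishes metrics for the Prometheus server to scrape. A minimal sketch of that pattern using the official <code>prometheus_client</code> Python library; the metric and its <code>/proc/loadavg</code> source are illustrative, not taken from any exporter above:</p>
<pre><code>"""Minimal sketch of what a Prometheus exporter does, using the official
prometheus_client library. The metric shown is illustrative only."""
import time
from prometheus_client import Gauge, start_http_server

# a gauge that Prometheus will scrape from http://host:9100/metrics
load1 = Gauge("node_load1", "1-minute load average")

if __name__ == "__main__":
    start_http_server(9100)      # serve /metrics on port 9100
    while True:
        with open("/proc/loadavg") as f:
            load1.set(float(f.read().split()[0]))
        time.sleep(15)           # refresh between scrapes
</code></pre>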
<h2 id="journals">Journals</h2>
<ul>
<li><a href="https://www.springer.com/journal/11227">The Journal of Supercomputing</a> - An international journal of high-performance computer design, analysis, and use.</li>
</ul>
<h2 id="podcasts">Podcasts</h2>
<ul>
<li><a href="https://www.intersect360.com/media/podcasts/">This Week in HPC</a> - Each week, Intersect360 Research CEO Addison Snell and HPCwire editor Tiffany Trader dissect the week’s top HPC stories.</li>
<li><a href="https://www.exascaleproject.org/podcast/">Exascale Project</a> - ECP’s Let’s Talk Exascale podcast goes behind the scenes to chat with some of the people who are bringing a capable and sustainable exascale computing ecosystem to fruition.</li>
<li><a href="https://insidehpc.com/category/resources/hpc-podcast/">@HPCpodcast</a> - Join Shahin Khan and Doug Black as they discuss supercomputing technologies and the applications, markets, and policies that shape them.</li>
</ul>
<h2 id="blogs">Blogs</h2>
<ul>
<li><a href="https://www.hpcwire.com/">HPCwire</a> - Covering the fastest computers in the world and the people who run them since 1987.</li>
<li><a href="https://insidehpc.com/">insideHPC</a> - A global publication recognized for its comprehensive and insightful coverage of the HPC-AI community, linking vendors, end users, and HPC strategists.</li>
<li><a href="https://www.nextplatform.com/category/hpc/">The Next Platform</a> - Offers in-depth coverage of high-end computing at large enterprises, supercomputing centers, hyperscale data centers, and public clouds.</li>
<li><a href="http://www.theregister.co.uk/data_centre/hpc/">The Register HPC</a> - A leading and trusted global online enterprise technology news publication, reaching roughly 40 million readers worldwide.</li>
<li><a href="http://hpcatdell.com">HPC at Dell</a> - High-performance computing knowledge base articles from Dell.</li>
</ul>
<h2 id="conferences">Conferences</h2>
<ul>
<li><a href="https://pearc.acm.org/">PEARC</a> - Practice & Experience in Advanced Research Computing.</li>
<li><a href="https://supercomputing.org/">Supercomputing (SC)</a> - The International Conference for High Performance Computing, Networking, Storage, and Analysis.</li>
<li><a href="https://www.isc-hpc.com/">ISC High Performance (ISC)</a> - International conference for high performance computing, networking, and storage.</li>
<li><a href="https://dl.acm.org/conference/ccgrid">CCGrid</a> - IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing.</li>
<li><a href="https://ieee-hpec.org/">IEEE-HPEC</a> - IEEE High Performance Extreme Computing Conference.</li>
<li><a href="https://hotchips.org">Hot Chips</a> - The semiconductor industry’s leading conference on high-performance microprocessors and related circuits.</li>
<li><a href="https://hoti.org">Hot Interconnects</a> - IEEE conference on software architectures and implementations for interconnection networks of all scales.</li>
<li><a href="https://sites.google.com/view/essa-2024/">ESSA</a> - Workshop on Extreme-Scale Storage and Analysis.</li>
<li><a href="https://www.ipdps.org/">IEEE-IPDPS</a> - IEEE International Parallel & Distributed Processing Symposium.</li>
<li><a href="http://nowlab.cse.ohio-state.edu/espm2/">ESPM2 Workshop</a> - International Workshop on Extreme Scale Programming Models and Middleware.</li>
<li><a href="https://linuxclustersinstitute.org/workshops/">LCI Workshops</a> - The Linux Clusters Institute (LCI) provides education and advanced technical training for the deployment and use of computing clusters to the high performance computing community worldwide.</li>
<li><a href="https://www.hpc-carpentry.org/">HPC Carpentry</a> - Teaching basic skills for high-performance computing.</li>
</ul>
<h2 id="websites">Websites</h2>
<ul>
<li><a href="https://top500.org">Top500</a> - The TOP500 project ranks and details the 500 most powerful non-distributed computer systems in the world.</li>
</ul>
<h2 id="user-groups">User Groups</h2>
<ul>
<li><a href="https://mug.mvapich.cse.ohio-state.edu/">MVAPICH User Group (MUG)</a> - The MUG conference provides an open forum for all attendees (users, system administrators, researchers, engineers, and students) to discuss and share their knowledge on using MVAPICH libraries.</li>
<li><a href="https://slurm.schedmd.com/slurm_ug_agenda.html">Slurm</a> - The annual Slurm User Group meeting.</li>
</ul>
<h2 id="contributing">Contributing</h2>
<p>Contributing guidelines can be found in <a href="contributing.md">contributing.md</a>.</p>