Scientific Applications

A list of applications installed on the cluster, with a short description of what each one does.

Bioinformatics

Abawaca (1.07)– abawaca (A Binning Algorithm Without A Cool Acronym) is a binning program that can take advantage of different types of information such as differential coverage and DNA signature

Abyss (2.1.0, 2.15)– ABySS is a parallel, paired-end sequence assembler that is designed for short reads. The single-processor version is useful for assembling genomes up to 100 Mbases in size. The parallel version is implemented using MPI and is capable of assembling larger genomes.

AFNI (17.2.16)– AFNI is a set of C programs for processing, analyzing, and displaying functional MRI (FMRI) data

angsd (0.932)– ANGSD is software for analyzing next generation sequencing data. The software can handle a number of different input types from mapped reads to imputed genotype probabilities. Most methods take genotype uncertainty into account instead of basing the analysis on called genotypes. This is especially useful for low and medium depth data. The software is written in C++ and has been used on large sample sizes.

Ants (2.2)– Advanced Normalization Tools (ANTs) extracts information from complex datasets that include imaging.

baypass (3.36)– The package BayPass is a population genomics software which is primarily aimed at identifying genetic markers subjected to selection and/or associated with population-specific covariates (e.g., environmental variables, quantitative or categorical phenotypic characteristics).

BBMAP (37.93, 38.16)– BBMap is a splice-aware global aligner for DNA and RNA sequencing reads. It can align reads from all major platforms: Illumina, 454, Sanger, Ion Torrent, PacBio, and Nanopore. BBMap has a large array of options, described in its shell script. It can output many different statistics files, such as an empirical read quality histogram, insert-size distribution, and genome coverage, with or without generating a SAM file.

bcl2fastq2 (2.20)– Demultiplexes sequencing data and converts base call (BCL) files into FASTQ files.

Bedtools (2.27.1)– The bedtools utilities are a Swiss-army knife of tools for a wide range of genomics analysis tasks. The most widely used tools enable genome arithmetic: that is, set theory on the genome. For example, bedtools allows one to intersect, merge, count, complement, and shuffle genomic intervals from multiple files in widely used genomic file formats such as BAM, BED, GFF/GTF, and VCF.
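To make "set theory on the genome" concrete, here is a minimal Python sketch of merging and intersecting intervals. It is an illustration only, not how bedtools itself is implemented. The function names and the (chromosome, start, end) tuple layout with half-open coordinates are assumptions chosen for this example.

```python
def merge_intervals(intervals):
    """Merge overlapping or touching intervals on the same chromosome.

    Each interval is a hypothetical (chrom, start, end) tuple with a
    half-open coordinate range [start, end).
    """
    merged = []
    for chrom, start, end in sorted(intervals):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            # Extend the previous interval instead of starting a new one.
            last = merged[-1]
            merged[-1] = (chrom, last[1], max(last[2], end))
        else:
            merged.append((chrom, start, end))
    return merged


def intersect_intervals(a, b):
    """Return the overlapping portion of every pair of intervals from a and b."""
    overlaps = []
    for chrom_a, start_a, end_a in a:
        for chrom_b, start_b, end_b in b:
            if chrom_a != chrom_b:
                continue
            start, end = max(start_a, start_b), min(end_a, end_b)
            if start < end:  # a non-empty overlap
                overlaps.append((chrom_a, start, end))
    return sorted(overlaps)


genes = [("chr1", 100, 200), ("chr1", 150, 300), ("chr2", 10, 20)]
peaks = [("chr1", 250, 400)]
print(merge_intervals(genes))
print(intersect_intervals(genes, peaks))
```

The pairwise loop is quadratic and only suitable for small inputs; it is meant to show the set operations, not to replace the tool.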

blast (2.6.0, 2.10.0)– Basic Local Alignment Search Tool is a sequence comparison algorithm optimized for speed used to search sequence databases for optimal local alignments to a query.

blat (35)– BLAT on DNA is designed to quickly find sequences of 95% and greater similarity of length 25 bases or more. It may miss more divergent or shorter sequence alignments. It will find perfect sequence matches of 20 bases. BLAT on proteins finds sequences of 80% and greater similarity of length 20 amino acids or more.

bowtie2 (1.2.1.1, 2.3.2)– Bowtie is an ultrafast, memory-efficient short read aligner. It aligns short DNA sequences (reads) to the human genome at a rate of over 25 million 35-bp reads per hour. Bowtie indexes the genome with a Burrows-Wheeler index to keep its memory footprint small: typically about 2.2 GB for the human genome (2.9 GB for paired-end).

bwa (3.16)– BWA is a program for aligning sequencing reads against a large reference genome (e.g. the human genome). It has two major components, one for reads shorter than 150 bp and the other for longer reads.

cat (5.0.3)– CAT/BAT: tool for taxonomic classification of contigs and metagenome-assembled genomes (MAGs)

cellranger (3.0.2, 3.1.0, 4.0.0)– Cell Ranger is a set of analysis pipelines that process Chromium single-cell RNA-seq output to align reads, generate feature-barcode matrices and perform clustering and gene expression analysis.

checkm (1.0.9)– Assess the quality of microbial genomes recovered from isolates, single cells, and metagenomes

concoct (0.4.0)– A program for unsupervised binning of metagenomic contigs by using nucleotide composition, coverage data in multiple samples and linkage data from paired end reads.

cufflinks (2.2.1)– Cufflinks assembles transcripts, estimates their abundances, and tests for differential expression and regulation in RNA-Seq samples. It accepts aligned RNA-Seq reads and assembles the alignments into a parsimonious set of transcripts. Cufflinks then estimates the relative abundances of these transcripts based on how many reads support each one, taking into account biases in library preparation protocols.

DAS_Tool (1.1)– DAS Tool is an automated method that integrates the results of a flexible number of binning algorithms to calculate an optimized, non-redundant set of bins from a single assembly.

deeparg (0.7.9.58)– DeepARG is a machine learning solution that uses deep learning to characterize and annotate antibiotic resistance genes in metagenomes. It is composed of two models for two types of input: short sequence reads and gene-like sequences.

diamond (0.9.24)– DIAMOND is a sequence aligner for protein and translated DNA searches, designed for high performance analysis of big sequence data.

emboss (6.6.0)– EMBOSS is "The European Molecular Biology Open Software Suite". EMBOSS is a free Open Source software analysis package specially developed for the needs of the molecular biology (e.g. EMBnet) user community.

epacts (3.2.6)– EPACTS (Efficient and Parallelizable Association Container Toolbox) is a versatile software pipeline to perform various statistical tests for identifying genome-wide association from sequence data through a user-friendly interface, both to scientific analysts and to method developers.

evigene (may2018)– EvidentialGene is a genome informatics project for "Evidence Directed Gene Construction for Eukaryotes", for constructing high quality, accurate gene sets for animals and plants (any eukaryotes).

FastQC (0.11.5)– FastQC aims to provide a simple way to do some quality control checks on raw sequence data coming from high throughput sequencing pipelines

freesurfer (2017, 2020 beta)– FreeSurfer is a set of tools for analysis and visualization of structural and functional brain imaging data

fsl (5.0.10)– FSL is a comprehensive library of analysis tools for FMRI, MRI and DTI brain imaging data.

GATK (3.80, 4.0.5.1, 4.1.3.0)– The Genome Analysis Toolkit or GATK is a software package developed at the Broad Institute to analyze high-throughput sequencing data

kaiju (1.72)– Kaiju is a program for sensitive taxonomic classification of high-throughput sequencing reads from metagenomic whole genome sequencing or metatranscriptomics experiments.

kallisto (0.44.0)– kallisto is a program for quantifying abundances of transcripts from RNA-Seq data, or more generally of target sequences using high-throughput sequencing reads. It is based on the novel idea of pseudoalignment for rapidly determining the compatibility of reads with targets, without the need for alignment.

lammps (mar 2017, aug 2019)– LAMMPS is a classical molecular dynamics code, and an acronym for Large-scale Atomic/Molecular Massively Parallel Simulator. LAMMPS has potentials for soft materials (biomolecules, polymers) and solid-state materials (metals, semiconductors) and coarse-grained or mesoscopic systems. It can be used to model atoms or, more generically, as a parallel particle simulator at the atomic, meso, or continuum scale.

longranger (2.2.2)– Long Ranger is a set of analysis pipelines that processes Chromium sequencing output to align reads and call and phase SNPs, indels, and structural variants.

lumpy (0.2.13)– Lumpy is a general probabilistic framework for structural variant discovery.

maxbin (2.2.4)– MaxBin is a software for binning assembled metagenomic sequences based on an Expectation-Maximization algorithm.

megahit (1.1.3, 1.2.7)– An ultra-fast single-node solution for large and complex metagenomics assembly via succinct de Bruijn graph

metabat (2.12.1)– A robust statistical framework for reconstructing genomes from metagenomic data.

meta-marc (0)– Metagenomic Markov Models for Antimicrobial Resistance Characterization.

miRA (1.2.0)– MIRA is a whole genome shotgun and EST sequence assembler

miRExpress (2.1.4)– A database-supported, efficient and flexible tool for detecting miRNA expression profiles.

MiRge (2018)– A fast, smart small RNA-seq solution to process samples in a highly multiplexed fashion. miRge employs a Bayesian alignment approach, whereby reads are sequentially aligned against customized mature miRNA, hairpin miRNA, noncoding RNA and mRNA sequence libraries.

MMTSB (2019)– Multiscale Modeling Tools for Structural Biology toolset.

mrbayes (3.2.7a)– MrBayes is a program for Bayesian inference and model choice across a wide range of phylogenetic and evolutionary models. MrBayes uses Markov chain Monte Carlo (MCMC) methods to estimate the posterior distribution of model parameters.

MRtrix (3)– MRtrix3 provides a set of tools to perform various types of diffusion MRI analyses, from various forms of tractography through to next-generation group-level analyses. It is designed with consistency, performance, and stability in mind, and is freely available under an open-source license. It is developed and maintained by a team of experts in the field, fostering an active community of users from diverse backgrounds.

MUMmer (3.23)– MUMmer is a system for rapidly aligning entire genomes, whether in complete or draft form.

namd (2020-03-20)– NAMD is a parallel molecular dynamics program for UNIX platforms designed for high-performance simulations in structural biology.

ngsTools (1.02)– NGS (Next-Generation Sequencing) technologies have revolutionized population genetic research by enabling unparalleled data collection from the genomes or subsets of genomes from many individuals.

oases (0.2.09)– Oases is a de novo transcriptome assembler designed to produce transcripts from short read sequencing technologies, such as Illumina, SOLiD, or 454 in the absence of any genomic assembly.

plink (6.2)– PLINK is a whole genome association analysis toolset, designed to perform a range of basic, large-scale analyses in a computationally efficient manner

picard (2017, 2018)– A set of Java command line tools for manipulating high-throughput sequencing (HTS) data and formats.

Pplacer (1.1)– Pplacer places query sequences on a fixed reference phylogenetic tree to maximize phylogenetic likelihood or posterior probability according to a reference alignment.

prank (v 170427)– PRANK is a probabilistic multiple alignment program for DNA, codon and amino-acid sequences. It’s based on a novel algorithm that treats insertions correctly and avoids over-estimation of the number of deletion events.

prodigal (2.6.3)– Prodigal (Prokaryotic Dynamic Programming Genefinding Algorithm) is a microbial (bacterial and archaeal) gene finding program

rsem (1.3.0)– Accurate quantification of gene and isoform expression from RNA-Seq data

samblaster (0.1.26)– samblaster is a tool to mark duplicates and extract discordant and split reads from sam files.

simka (1.5.1)– Simka is a de novo comparative metagenomics tool. Simka represents each dataset as a k-mer spectrum and computes several classical ecological distances between them.
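As a rough illustration of the k-mer spectrum idea, the Python sketch below counts k-mers in two toy read sets and compares them with the Bray-Curtis dissimilarity. Bray-Curtis is used here only as one example of a classical ecological distance. The function names are hypothetical, and this is not Simka's implementation, which is built to handle far larger data.

```python
from collections import Counter


def kmer_spectrum(reads, k):
    """Count every k-mer (substring of length k) across a list of reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def bray_curtis(spec_a, spec_b):
    """Bray-Curtis dissimilarity between two k-mer spectra (0 = identical)."""
    shared = sum(min(spec_a[kmer], spec_b[kmer])
                 for kmer in spec_a.keys() & spec_b.keys())
    total = sum(spec_a.values()) + sum(spec_b.values())
    return 1.0 - 2.0 * shared / total if total else 0.0


sample_a = kmer_spectrum(["ACGT"], k=2)  # AC, CG, GT
sample_b = kmer_spectrum(["ACGA"], k=2)  # AC, CG, GA
print(bray_curtis(sample_a, sample_b))
```

Two of the three k-mers are shared between the toy samples, so the dissimilarity is 1 - 4/6, about 0.33.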

samtools (1.5)– SAM (Sequence Alignment/Map) format is a generic format for storing large nucleotide sequence alignments

snpEff (12_2017)– Genetic variant annotation and effect prediction toolbox

SNPiR (12_2017)– Identifies single-nucleotide polymorphisms (SNPs) in RNA-seq data. SNPiR consists of (1) a modified RNA-seq read-mapping procedure that allows alignment of reads to the reference in a splice-aware manner, (2) variant calling using the Genome Analysis Toolkit (GATK) and (3) vigorous filtering of false-positive calls.

sortmerna (2.1b)– SortMeRNA is a program tool for filtering, mapping and OTU-picking NGS reads in metatranscriptomic and metagenomic data. The core algorithm is based on approximate seeds and allows for fast and sensitive analyses of nucleotide sequences. The main application of SortMeRNA is filtering ribosomal RNA from metatranscriptomic data.

SOAPdenovo (r240)– SOAPdenovo is a novel short-read assembly method that can build a de novo draft assembly for the human-sized genomes. The program is specially designed to assemble Illumina GA short reads. It creates new opportunities for building reference sequences and carrying out accurate analyses of unexplored genomes in a cost effective way.

stacks (2.4.1)– Stacks is a software pipeline for building loci from short-read sequences, such as those generated on the Illumina platform. Stacks was developed to work with restriction enzyme-based data, such as RAD-seq, for the purpose of building genetic maps and conducting population genomics and phylogeography.

spades (3.11, 3.12)– SPAdes – St. Petersburg genome assembler – is intended for both standard isolates and single-cell MDA bacteria assemblies.

star (2.5, 2.7.2b, star-fusion)– STAR (Spliced Transcripts Alignment to a Reference) is an RNA-seq aligner.

stringtie (1.3.3)– StringTie is a fast and highly efficient assembler of RNA-Seq alignments into potential transcripts. It uses a novel network flow algorithm as well as an optional de novo assembly step to assemble and quantitate full-length transcripts representing multiple splice variants for each gene locus.

subread (1.5.3)– The Subread software package is a tool kit for processing next-gen sequencing data. It includes the Subread aligner, the Subjunc exon-exon junction detector and the featureCounts read summarization program.

survivor (1.0.7)– SURVIVOR is a tool set for simulating/evaluating SVs, merging and comparing SVs within and among samples, and includes various methods to reformat or summarize SVs.

tabix (0.2.6)– C library for high-throughput sequencing data formats.

tophat (2.1.1)– TopHat is a fast splice junction mapper for RNA-Seq reads. It aligns RNA-Seq reads to mammalian-sized genomes using the ultra high-throughput short read aligner Bowtie, and then analyzes the mapping results to identify splice junctions between exons.

transabyss (2.0.1)– de novo assembly of RNA-Seq data using ABySS

trimmomatic (0.36)– a flexible read trimming tool for Illumina NGS data.

Trinity (2.6.6, 2.8.4)– Trinity, developed at the Broad Institute and the Hebrew University of Jerusalem, represents a novel method for the efficient and robust de novo reconstruction of transcriptomes from RNA-seq data. Trinity combines three independent software modules: Inchworm, Chrysalis, and Butterfly, applied sequentially to process large volumes of RNA-seq reads. Trinity partitions the sequence data into many individual de Bruijn graphs, each representing the transcriptional complexity at a given gene or locus, and then processes each graph independently to extract full-length splicing isoforms and to tease apart transcripts derived from paralogous genes.

Virmap (2020)– Maximal viral information recovery from sequence data

vcf2maf (2017)– Convert a VCF into a MAF, where each variant is annotated to only one of all possible gene isoforms

vmd (1.9.3)– VMD is a molecular visualization program for displaying, animating, and analyzing large biomolecular systems using 3-D graphics and built-in scripting.

Chemistry/Material Sciences

Abaqus (V6R2017, V6R2018, V6R2020)– Abaqus is a finite element analysis software used for engineering simulations.

abinit (8.10.2)– ABINIT is a software suite to calculate the optical, mechanical, vibrational, and other observable properties of materials. Starting from the quantum equations of density functional theory, you can build up to advanced applications with perturbation theories based on DFT, and many-body Green's functions (GW and DMFT).

atat (3.36)– Alloy Theoretic Automated Toolkit (ATAT) refers to a collection of alloy theory tools developed by Axel van de Walle.

Amber (16)– “Amber” refers to two things: a set of molecular mechanical force fields for the simulation of biomolecules (which are in the public domain, and are used in a variety of simulation programs); and a package of molecular simulation programs which includes source code and demos.

converge (2.4.27)– Computational fluid dynamics (CFD) simulator for physical processes.

critic2– Critic2 is a program for the manipulation and analysis of structures and chemical information in molecules and periodic solids. Critic2 can be used to read and transform between file formats, and to perform operations on molecular and crystal structures.

DFTB+ (18.2, 19.1, 20.2)– DFTB+ is a fast and efficient versatile quantum mechanical simulation software package. Using DFTB+ you can carry out quantum mechanical simulations similar to density functional theory but in an approximate way, typically gaining around two orders of magnitude in speed.

firefly (8.2.0)– Firefly is a freely available ab initio and DFT computational chemistry program developed to offer high performance on Intel-compatible x86, AMD64, and EM64T processors.

gamess (apr2017)– The General Atomic and Molecular Electronic Structure System (GAMESS) is a general ab initio quantum chemistry package.

gwyddion (2.5.0)– Gwyddion is a modular program for SPM (scanning probe microscopy) data visualization and analysis. Primarily it is intended for the analysis of height fields obtained by scanning probe microscopy techniques (AFM, MFM, STM, SNOM/NSOM) and it supports many SPM data formats. However, it can be used for general height field and (greyscale) image processing, for instance for the analysis of profilometry data or thickness maps from imaging spectrophotometry.

lammps (mar 2017)– LAMMPS is a classical molecular dynamics code, and an acronym for Large-scale Atomic/Molecular Massively Parallel Simulator. LAMMPS has potentials for soft materials (biomolecules, polymers) and solid-state materials (metals, semiconductors) and coarse-grained or mesoscopic systems. It can be used to model atoms or, more generically, as a parallel particle simulator at the atomic, meso, or continuum scale.

mmc-mcx (2017)– A GPU-accelerated photon transport simulator

nwchem (6.8)– NWChem aims to provide its users with computational chemistry tools that are scalable both in their ability to treat large scientific computational chemistry problems efficiently, and in their use of available parallel computing resources from high-performance parallel supercomputers to conventional workstation clusters.

qe (gcc (6.2.0, 6.3.0-developer), intel (6.4.1, 6.5))– Quantum ESPRESSO is a software suite for ab initio quantum chemistry methods of electronic-structure calculation and materials modeling, distributed for free under the GNU General Public License. It is based on Density Functional Theory, plane wave basis sets, and pseudopotentials.

orca (4.2.1)– ORCA is an ab initio quantum chemistry program package that contains modern electronic structure methods including density functional theory, many-body perturbation, coupled cluster, multireference methods, and semi-empirical quantum chemistry methods.

SAPT (2016.1)– SAPT is a collection of computer codes designed to implement the many-body (body = electron) version of Symmetry-Adapted Perturbation Theory for intermolecular interactions.

ShengBTE (1.1.1)– ShengBTE is a software package for solving the Boltzmann Transport Equation for phonons. Its main purpose is to compute the lattice contribution to the thermal conductivity of bulk crystalline solids, but nanowires can also be treated under a hypothesis of diffusive boundary conditions.

vasp (5.4.4)– intel - The Vienna Ab initio Simulation Package, better known as VASP, is a package for performing ab initio quantum mechanical molecular dynamics using either Vanderbilt pseudopotentials, or the projector augmented wave method, and a plane wave basis set.

VESTA (gtk3)– VESTA is a 3D visualization program for structural models, volumetric data such as electron/nuclear densities, and crystal morphologies.

wannier (2.1.0)– Wannier90 generates maximally-localized Wannier functions and uses them to compute advanced electronic properties of materials with high efficiency and accuracy.

Mathematics

ACML (5.3.1)– ACML provides a free set of thoroughly optimized and threaded math routines for HPC, scientific, engineering and related compute-intensive applications. ACML is ideal for weather modeling, computational fluid dynamics, financial analysis, oil and gas applications and more.

blacs (1.1)– The BLACS (Basic Linear Algebra Communication Subprograms) project is an ongoing investigation whose purpose is to create a linear algebra oriented message passing interface that may be implemented efficiently and uniformly across a large range of distributed memory platforms.

blas (3.6.0)– The BLAS (Basic Linear Algebra Subprograms) are routines that provide standard building blocks for performing basic vector and matrix operations. The Level 1 BLAS perform scalar, vector and vector-vector operations, the Level 2 BLAS perform matrix-vector operations, and the Level 3 BLAS perform matrix-matrix operations. Because the BLAS are efficient, portable, and widely available, they are commonly used in the development of high quality linear algebra software.
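The three BLAS levels can be illustrated with plain Python lists. This is a sketch of the mathematical operations only. The names axpy, gemv and gemm echo the corresponding BLAS routine families, but the simplified signatures below are assumptions for this example: the real routines also take scaling factors, strides, and transposition flags, and are far faster.

```python
def axpy(alpha, x, y):
    """Level 1 (vector-vector): return alpha * x + y."""
    return [alpha * xi + yi for xi, yi in zip(x, y)]


def gemv(a, x):
    """Level 2 (matrix-vector): return the product A x."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def gemm(a, b):
    """Level 3 (matrix-matrix): return the product A B."""
    columns = list(zip(*b))  # columns of B
    return [[sum(aik * bkj for aik, bkj in zip(row, col)) for col in columns]
            for row in a]


A = [[1, 2], [3, 4]]
print(axpy(2.0, [1, 2], [10, 20]))
print(gemv(A, [1, 1]))
print(gemm(A, [[0, 1], [1, 0]]))
```

For n-by-n inputs the work grows from roughly n operations at Level 1 to n squared at Level 2 and n cubed at Level 3, which is why optimized Level 3 routines matter most for performance.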

Cmdstan (2.17)– CmdStan is the command line interface to Stan, a state-of-the-art platform for statistical modeling and high-performance statistical computation.

Cmgui (7.3)– Cmgui is an advanced 3D visualisation software package with modelling capabilities. Cmgui is part of CMISS, a mathematical modelling environment initially developed by the University of Auckland Bioengineering Institute. CMISS stands for Continuum Mechanics, Image analysis, Signal processing and System Identification.

fftw2 (2.1.5), fftw3 (3.3.4, 3.3.8)– FFTW is a C subroutine library for computing the discrete Fourier transform (DFT) in one or more dimensions, of arbitrary input size, and of both real and complex data (as well as of even/odd data, i.e. the discrete cosine/sine transforms or DCT/DST).
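For reference, the one-dimensional DFT can be written directly from its definition. The Python sketch below is a naive O(n²) version that shows what is being computed; it does not use FFTW or its fast algorithms. The function name dft is an assumption for this example. The output is unnormalized and uses a negative exponent for the forward transform.

```python
import cmath


def dft(x):
    """Naive forward DFT: X[k] = sum over t of x[t] * exp(-2*pi*i*k*t/n)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


# A constant signal has all of its energy in the zero-frequency bin.
print([round(abs(v), 6) for v in dft([1, 1, 1, 1])])
```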

gsl (2.4)– The GNU Scientific Library (GSL) is a numerical library for C and C++ programmers. The library provides a wide range of mathematical routines such as random number generators, special functions and least-squares fitting. There are over 1000 functions in total with an extensive test suite.

JAGS (4.3.0)– JAGS is Just Another Gibbs Sampler. It is a program for analysis of Bayesian hierarchical models using Markov Chain Monte Carlo (MCMC) simulation. JAGS was written with three aims in mind: To have a cross-platform engine for the BUGS language, to be extensible, allowing users to write their own functions, distributions and samplers, and to be a platform for experimentation with ideas in Bayesian modelling

lapack (3.6.0)– LAPACK provides routines for solving systems of simultaneous linear equations, least-squares solutions of linear systems of equations, eigenvalue problems, and singular value problems.

matlab (R2017b, R2019a, R2019b, R2020a)– MATLAB® is a high-level language and interactive environment for numerical computation, visualization, and programming. Using MATLAB, you can analyze data, develop algorithms, and create models and applications.

metis (5.1.0)– Serial Graph Partitioning and Fill-reducing Matrix Ordering

octave (5.2.0)– GNU Octave is a high-level language primarily intended for numerical computations. It is typically used for such problems as solving linear and nonlinear equations, numerical linear algebra, statistical analysis, and for performing other numerical experiments. It may also be used as a batch-oriented language for automated data processing.

Openblas (0.2.18)– OpenBLAS is an optimized implementation of BLAS, a specification that prescribes a set of low-level routines for performing common linear algebra operations such as vector addition, scalar multiplication, dot products, linear combinations, and matrix multiplication.

R (3.1.0, 3.3.1, 3.4.1, 3.5.0, 3.5.1, 3.5.2, 3.6.0, 3.6.1, 3.6.2, 4.0.2)– R is a system for statistical computation and graphics. It consists of a language plus a run-time environment with graphics, a debugger, access to certain system functions, and the ability to run programs stored in script files.

scalapack (2.0.2)– ScaLAPACK is a library of high-performance linear algebra routines for parallel distributed memory machines. ScaLAPACK solves dense and banded linear systems, least squares problems, eigenvalue problems, and singular value problems.

stan (2.17)– Stan® is a state-of-the-art platform for statistical modeling and high-performance statistical computation. Thousands of users rely on Stan for statistical modeling, data analysis, and prediction in the social, biological, and physical sciences, engineering, and business.

stata (14)– Stata is a complete, integrated software package that provides all of your data science needs—data manipulation, visualization, statistics, and reproducible reporting.

vtk (9.0)– The Visualization Toolkit (VTK) is open source software for manipulating and displaying scientific data. It comes with state-of-the-art tools for 3D rendering, a suite of widgets for 3D interaction, and extensive 2D plotting capability.

Developers

anaconda (python 2.7 (2019.10), python 3 (2019.03, 2019.07, 2019.10, 2020.02, ai-lab))– Anaconda is a Python distribution for data science and machine learning. Its package manager, conda, installs, manages, and deploys the many packages it includes.

Bamtools (2.4.1)– BamTools provides both a programmer's API and an end-user's toolkit for handling BAM files.

bazel (2.0.0)– A build and test tool for Java, C++, Android, iOS, and Go, among others. Uses advanced local and distributed caching, optimized dependency analysis and parallel execution.

BCFtools (1.5)– BCFtools is a set of utilities that manipulate variant calls in the Variant Call Format (VCF) and its binary counterpart BCF.

beagle (1.10)– BEAGLE is a high-performance library that can perform the core calculations at the heart of most Bayesian and Maximum Likelihood phylogenetics packages. It can make use of highly-parallel processors such as those in graphics cards (GPUs) found in many PCs.

Bonnie++ (1.97.1)– Bonnie++ is a free file system benchmarking tool for Unix-like operating systems that is aimed at performing a number of simple tests of hard drive and file system performance.

caffe (1.0)– Caffe is a deep learning framework

chromium (2020)– Open source web browser developed by Google.

Cmake (3.9.6, 3.12, 3.13.5, 3.18.1)– CMake is an open-source, cross-platform family of tools designed to build, test and package software. CMake is used to control the software compilation process using simple platform and compiler independent configuration files, and generate native makefiles and workspaces that can be used in the compiler environment of your choice.

cuda (7.5, 8.0, 9.0, 9.1, 9.2, 10.0, 10.1, 11.1)– CUDA (aka Compute Unified Device Architecture) is a parallel computing platform and programming model created by NVIDIA and implemented by the graphics processing units (GPUs) that they produce

dos2unix (7.4)– unix2dos is a tool to convert line breaks in a text file from Unix format to DOS format and vice versa. When invoked as unix2dos the program will convert a Unix text file to DOS format, when invoked as dos2unix it will convert a DOS text file to UNIX format
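The conversion itself amounts to rewriting line endings: DOS text files end lines with CR LF (\r\n), Unix text files with LF (\n). The Python sketch below shows the idea on byte strings. The function names are hypothetical and this is not the dos2unix source, which also handles details not covered here.

```python
def dos_to_unix(data: bytes) -> bytes:
    """Convert DOS line breaks (CR LF) to Unix line breaks (LF)."""
    return data.replace(b"\r\n", b"\n")


def unix_to_dos(data: bytes) -> bytes:
    """Convert Unix line breaks (LF) to DOS line breaks (CR LF).

    Existing CR LF pairs are normalized first so they are not doubled.
    """
    return data.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")


print(dos_to_unix(b"alpha\r\nbeta\r\n"))
print(unix_to_dos(b"alpha\nbeta\n"))
```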

fastp (0.20.0)– A tool designed to provide fast all-in-one preprocessing for FastQ files. This tool is developed in C++ with multithreading supported to afford high performance.

fastx (0.0.13)– The FASTX-Toolkit is a collection of command line tools for Short-Reads FASTA/FASTQ files preprocessing.

ffmpeg (2020_08)– Audio and video recorder and converter.

gcc (4.9.1)– The GNU Compiler Collection includes front ends for C, C++, Objective-C, Fortran, Java, Ada, and Go, as well as libraries for these languages (libstdc++, libgcj,…).

Gdb (7.11)– GDB, the GNU Project debugger, allows you to see what is going on `inside' another program while it executes -- or what another program was doing at the moment it crashed. GDB can do four main kinds of things (plus other things in support of these) to help you catch bugs in the act: Start your program, specifying anything that might affect its behavior, make your program stop on specified conditions, examine what has happened when your program has stopped, and change things in your program, so you can experiment with correcting the effects of one bug and go on to learn about another.

globalarrays (5.4)– Global Arrays (GA) is a Partitioned Global Address Space (PGAS) programming model. It provides primitives for one-sided communication (Get, Put, Accumulate) and Atomic Operations (read increment). It supports blocking and non-blocking primitives, and supports location consistency.

hdf5 (1.6.10, 1.8.17)– HDF5 is a data model, library, and file format for storing and managing data. It supports an unlimited variety of datatypes, and is designed for flexible and efficient I/O and for high volume and complex data.

hwloc (1.11.3)– The Portable Hardware Locality (hwloc) software package provides a portable abstraction (across OS, versions, architectures, ...) of the hierarchical topology of modern architectures, including NUMA memory nodes, sockets, shared caches, cores and simultaneous multithreading. It also gathers various system attributes such as cache and memory information as well as the locality of I/O devices such as network interfaces, InfiniBand HCAs or GPUs.

Intel Compiler– The Intel® Composer XE suites are available in several configurations that combine industry leading C, C++ and Fortran compilers, programming models including Intel® Cilk™ Plus and OpenMP*, performance libraries including Intel® Math Kernel Library (Intel® MKL), Intel® Integrated Performance Primitives (Intel® IPP) and Intel® Threading Building Blocks (Intel® TBB) for leadership application performance on systems using Intel® Core™ and Xeon® processors, Intel® Xeon Phi™ coprocessors and compatible processors.

Iozone3 (434)– Iozone is a filesystem benchmark tool. The benchmark generates and measures a variety of file operations. Iozone has been ported to many machines and runs under many operating systems. Iozone is useful for performing a broad filesystem analysis of a vendor’s computer platform.

JAVA (1.8.0_151, 1.8.0_162)– For Java Developers. Includes a complete JRE plus tools for developing, debugging, and monitoring Java applications.

mpich (3.2rc2)– MPICH2 is an implementation of the Message-Passing Interface (MPI). The goals of MPICH2 are to provide an MPI implementation for important platforms, including clusters, SMPs, and massively parallel processors. It also provides a vehicle for MPI implementation research and for developing new and better parallel programming environments.

mvapich2 (2.2rc1)– MVAPICH2 (MPI-3 over InfiniBand) is an MPI-3 implementation based on MPICH ADI3 layer.

netpbm (10.73.30)– Netpbm is a toolkit for manipulation of graphic images, including conversion of images between a variety of different formats. There are over 300 separate tools in the package including converters for about 100 graphics formats

netcdf (4.4.0, 4.6.2-c-fortran)– NetCDF is a set of software libraries and self-describing, machine-independent data formats that support the creation, access, and sharing of array-oriented scientific data.

Netperf (2.7.0)– Netperf is a benchmark that can be used to measure the performance of many different types of networking. It provides tests for both unidirectional throughput, and end-to-end latency.

open64 (4.5.2.1)– Open64 has been well-recognized as an industrial-strength production compiler. It is the final result of research contributions from a number of compiler groups around the world. Formerly known as Pro64, Open64 was initially created by SGI from SGI’s MIPSPro compiler, and licensed under the GNU Public License (GPL v2).

opencv3 (3.1.0, 3.30, 3.4.9), opencv4 (4.0.1)– OpenCV (Open Source Computer Vision Library) is an open source computer vision and machine learning software library.

openmpi (1.2.9, 1.10.1, 3.1.4, 4.0.1, 4.0.3)– The Open MPI Project is an open source MPI-2 implementation that is developed and maintained by a consortium of academic, research, and industry partners. Open MPI is therefore able to combine the expertise, technologies, and resources from all across the High Performance Computing community in order to build the best MPI library available. Open MPI offers advantages for system and software vendors, application developers, and computer science researchers.

paraview (5.8, 5.8.1, 5.9.0)– ParaView is an open-source, multi-platform data analysis and visualization application. ParaView users can quickly build visualizations to analyze their data using qualitative and quantitative techniques. The data exploration can be done interactively in 3D or programmatically using ParaView’s batch processing capabilities.

PCL (1.10)– The Point Cloud Library (PCL) is a standalone, large scale, open project for 2D/3D image and point cloud processing.

pcre2 (10.35)– The PCRE (Perl Compatible Regular Expressions) library is a set of functions that implement regular expression pattern matching using the same syntax and semantics as Perl 5. PCRE has its own native API, as well as a set of wrapper functions that correspond to the POSIX regular expression API.

pear (0.9.11)– PEAR is an ultrafast, memory-efficient, and highly accurate paired-end read merger. It is fully parallelized and can run with as little as a few kilobytes of memory.

Pgi (17.3)– PGI® Workstation™ is PGI’s single-user suite of scientific and engineering compilers and tools.

pycharm (2017.2.3)– PyCharm is an integrated development environment used in computer programming, specifically for the Python language.

python (2.7, 3)– Python is a remarkably powerful dynamic programming language that is used in a wide variety of application domains. Python is often compared to Tcl, Perl, Ruby, Scheme or Java.

qt (4.8.7, 5.9.0, 5.15.0)– Qt is a widget toolkit used to develop GUIs for operating systems and embedded systems.

readline (2020_08)– The GNU Readline library provides a set of functions for use by applications that allow users to edit command lines as they are typed in. Both Emacs and vi editing modes are available. The Readline library includes additional functions to maintain a list of previously-entered command lines, to recall and perhaps reedit those lines, and perform csh-like history expansion on previous commands.

rstudio (1.3)– RStudio is an integrated development environment (IDE) for R.

sbt (1.0.1)– sbt is a build tool for Scala, Java, and more.

sge (2011.11p1)– The Sun Grid Engine queuing system is useful when you have a lot of tasks to execute and want to distribute the tasks over a cluster of machines.

Singularity (2.6.0, 3.5.2)– Singularity is a container platform focused on supporting "Mobility of Compute". Mobility of Compute encapsulates the development-to-compute model, in which developers work in an environment of their choosing and creation, and when they need additional compute resources, that environment can easily be copied and executed on other platforms. Additionally, because the primary use case for Singularity is computational portability, many of the barriers to entry of other container solutions do not apply to Singularity, making it an ideal solution for users (both computational and non-computational) and HPC centers.

slurm (16.05.8)– The Slurm Workload Manager, or Slurm, is a free and open-source job scheduler for Linux and Unix-like kernels, used by many of the world's supercomputers and computer clusters.

tensorflow (1.3)– TensorFlow is an open-source machine learning library for research and production.

torch (7)– Torch is the main package in Torch7 where data structures for multi-dimensional tensors and mathematical operations over these are defined. Additionally, it provides many utilities for accessing files, serializing objects of arbitrary types and other useful utilities.

Torque (6.0.2)– The Terascale Open-source Resource and QUEue Manager (TORQUE) is a distributed resource manager providing control over batch jobs and distributed compute nodes.

TSN (2017)– Temporal Segment Networks (TSN) is a framework for video-based action recognition built on the idea of long-range temporal structure modeling. It combines a sparse temporal sampling strategy and video-level supervision to enable efficient and effective learning using the whole action video.

valgrind (3.14)– Valgrind is an instrumentation framework for building dynamic analysis tools. There are Valgrind tools that can automatically detect many memory management and threading bugs, and profile your programs in detail. You can also use Valgrind to build new tools.

xerces (3.2.2)– Xerces-C++ is a validating XML parser written in a portable subset of C++. Xerces-C++ makes it easy to give your application the ability to read and write XML data.

General Sciences

cloudcompare (2.6.3.1)– Open source 3D point cloud and mesh processing software.

cloudy (c17)– Cloudy is a spectral synthesis code designed to simulate interstellar matter under a broad range of conditions.

gflow (1.7)– GFLOW is a highly efficient stepwise groundwater flow modeling system developed by Haitjema Software. It models steady state flow in a single heterogeneous aquifer using the Dupuit-Forchheimer assumption.

moose (3.11.4)– MOOSE (Multiphysics Object-Oriented Simulation Environment) is a parallel finite element framework.

SNANA (10_60d)– SNANA contains a light curve fitter and simulation that can be applied to any supernova (SN) model and to any data set.

sumo (1.5.0)– "Simulation of Urban MObility" (SUMO) is an open source, highly portable, microscopic and continuous traffic simulation package designed to handle large networks. It allows for intermodal simulation including pedestrians and comes with a large set of tools for scenario creation.

tracmass (6.0.0)– TRACMASS is a Lagrangian trajectory code for ocean and atmospheric general circulation models. The code makes it possible to estimate water paths, Lagrangian stream functions (barotropic and overturning), exchange times, particle sedimentation, etc. TRACMASS has been used in studies of the global ocean circulation and of sea circulation in the Baltic, the Mediterranean, and coastal regions.

wrf (3.9.1)– WRF is a state-of-the-art atmospheric modeling system designed for both meteorological research and numerical weather prediction. It offers a host of options for atmospheric processes and can run on a variety of computing platforms.