Overview

This MediaWiki instance is a primary source for ACCRE user support documentation.

ACCRE is the premier resource for the high-performance computing needs of research throughout Vanderbilt University. With over 600 multi-core systems in a 4,000 square foot facility, the ACCRE cluster is used for research in a wide variety of fields, including genetics research, particle physics, and astronomy.

What is a cluster?

To put it simply, a cluster is a group of computers that are networked together to perform intensive computing jobs. Users of the cluster write programs that would normally take a long time to run, then schedule a job (or group of jobs) to run the program on the ACCRE cluster. This way, the program can run faster and have access to more memory, and users can get more done in less time.

Where is the ACCRE cluster?

The cluster and ACCRE staff offices are located in Hill Center on Peabody Campus next to the Commons.

You don’t need to be physically at the cluster to operate it. In fact, only ACCRE and certain VUIT staff can physically access the cluster, and we only go there to do occasional maintenance. To work on the cluster, you use a program called a secure shell client, or SSH client, which lets you connect from anywhere.


Being in front of the SSH client is like being in front of the cluster. As you type, the keystrokes get sent to the cluster, and any results get sent back in real time as well. Because of the way it is designed, many ACCRE users can log in at once.

There are other tools that can be used to interact with the ACCRE cluster. You can transfer files back and forth using a Secure Copy (SCP) client, and run graphical programs using an X Window server (also called an X11 server). In theory, you could run a web browser on ACCRE; it would appear to be running on your desktop, but it would actually be running on the gateway. (We don’t recommend doing this for everyday use, by the way. It’s very slow!)
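As a rough illustration of the tools described above, the following Python sketch builds the command lines that an OpenSSH-style client would run for an interactive login, a login with X11 forwarding, and a file upload. The hostname and username are placeholders invented for this example, not real ACCRE values; substitute the ones you were given with your account.

```python
import shlex

# Placeholder values (assumptions for this sketch, not real ACCRE settings).
HOST = "login.example.edu"
USER = "your_username"


def ssh_command(user, host, x11=False):
    """Return the argument list for an interactive SSH session."""
    argv = ["ssh"]
    if x11:
        # -X is the OpenSSH option that enables X11 forwarding, so that
        # graphical programs on the gateway display on your desktop.
        argv.append("-X")
    argv.append(f"{user}@{host}")
    return argv


def scp_upload_command(user, host, local_path, remote_path):
    """Return the argument list that copies a local file to the cluster."""
    return ["scp", local_path, f"{user}@{host}:{remote_path}"]


# Print the commands as you would type them in a terminal.
print(shlex.join(ssh_command(USER, HOST)))
print(shlex.join(ssh_command(USER, HOST, x11=True)))
print(shlex.join(scp_upload_command(USER, HOST, "data.csv", "data.csv")))
```

The script only prints the commands; it does not open a connection. Graphical SSH and SCP clients do the same work behind a window.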

How does the ACCRE cluster work?

The ACCRE cluster has many different types of computers, all of which work together:

  • The compute nodes perform the actual work. Jobs (i.e. instances of running programs) are executed on compute nodes. ACCRE consists of over 600 compute nodes.
  • The gateways are what you see when you log in to ACCRE. These gateways are accessed interactively from a remote “shell” session using a Linux tool called ssh. Gateways allow users to submit jobs, edit files, and get the results from jobs. Some gateways are specific to a certain research group; these are called custom gateways.
  • The job scheduler server takes jobs that have been submitted and assigns them to a particular compute node. The job scheduler software that ACCRE uses is called SLURM. SLURM tracks and manages compute and memory resources on the compute nodes, and decides when and where to run your job. SLURM will email you updates about your job.
  • The filesystem stores all the code and data so they can be used by all the different nodes and gateways. ACCRE uses a tool called Panasas to store users’ data. It’s available on each gateway and compute node, eliminating the need to copy data between the various compute resources on the cluster.
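To make the scheduler's role more concrete, here is a minimal Python sketch that writes out the text of a SLURM batch script. The `#SBATCH` options shown (`--job-name`, `--nodes`, `--ntasks`, `--mem`, `--time`, `--mail-type`, `--mail-user`) are standard SLURM options, but the function name, the default values, and the example command are assumptions made for illustration; appropriate resource requests depend on your job and on ACCRE's policies.

```python
def render_batch_script(command, *, job_name="example", nodes=1, ntasks=1,
                        mem="4G", time_limit="01:00:00", mail_user=None):
    """Return the text of a SLURM batch script that runs `command`.

    All defaults are illustrative placeholders, not ACCRE recommendations.
    """
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --nodes={nodes}",        # number of compute nodes
        f"#SBATCH --ntasks={ntasks}",      # number of tasks (processes)
        f"#SBATCH --mem={mem}",            # memory requested per node
        f"#SBATCH --time={time_limit}",    # wall-clock limit, HH:MM:SS
    ]
    if mail_user:
        # Ask SLURM to send email when the job ends or fails.
        lines.append("#SBATCH --mail-type=END,FAIL")
        lines.append(f"#SBATCH --mail-user={mail_user}")
    lines.append("")
    lines.append(command)
    return "\n".join(lines) + "\n"


# Hypothetical job: run a Python program with email notification.
print(render_batch_script("python analysis.py", mail_user="me@example.edu"))
```

Saved to a file on a gateway, a script like this would typically be submitted with `sbatch`; SLURM then decides when and on which compute node the command runs.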

Some of the nodes contain NVIDIA graphics processing units, or GPUs. Traditionally, GPUs were designed to perform the calculations behind video game graphics quickly. Because of the nature of their design, GPUs are being used more and more for non-graphics applications as well (e.g. deep learning, molecular dynamics, image processing, and much more).

When was the ACCRE cluster set up?

ACCRE has existed in its current form since 2003, while predecessors to ACCRE have existed since as early as 1994. You can read more about our history here.

Who runs ACCRE?

At ACCRE, our governance structure plays a crucial role in ensuring that we provide top-tier computational resources and support. Our Steering Committee, composed of distinguished faculty and leaders from various schools within Vanderbilt University, guides our strategic direction, policy, and operational oversight. Read more about ACCRE governance here.