General Topics

MARCC is a resource for JHU and University of Maryland researchers who require access to advanced computing resources and the expertise of research collaborators in the analysis and management of big data sets.
After your PI is granted an allocation, users can request accounts, which must be approved by the PI via email:
Please fill out the Request an Account form.
HOME directory: 20 GB (/home/userid)
Lustre scratch: 1 TB per group (/scratch/users/userid and /scratch/groups/groupid)
Data: 1 TB per group
Storage can be increased up to 10 TB per request from the PI.
To see which groups you belong to, type the command “groups”.
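For example, on any Linux login node the command below lists every group the current account belongs to; your allocation (PI) group should appear among them:

```shell
# Print the Unix groups the current account belongs to.
# On MARCC, the PI/allocation group appears in this list.
groups
```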
HOME directories and “data”
ssh -l userid
A two-factor authentication code and a robust password are needed.

“This research project (or part of this research project) was conducted using computational resources (and/or scientific computing services) at the Maryland Advanced Research Computing Center (MARCC).”

Please feel free to edit it and/or include more details.


Any faculty member at Johns Hopkins University can request an allocation. All requests will be approved by the Dean’s office.
Please fill out an Allocation Request form.

The Deans of the schools will make decisions on how to allocate resources.

To find out the group utilization, the PI should use the command “sbalance”. It reports utilization by all members of the group.
Research groups can add resources to the main Bluecrab cluster by purchasing “condos”. A condo is composed of compute nodes (GPU, standard, or high-memory) that are paid for by the research group but are shared with the rest of the cluster. The research group will get an additional allocation in wall-hours equivalent to the number of cores added by the condo. Research groups will have higher priority (QOS) on these additional allocations. For more information please send an email to


For a summary of available partitions and their status, use the command sinfo -s
Use the “interact” command and “interact -usage” for options.

Users who need to run jobs immediately may be able to find out if resources are available. Use the command:
sinfo -s

A/I/O/T = Allocated/Idle/Other/Total
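As a sketch, the summarized listing looks like the following (the partition names, node counts, and time limits shown in the comments are illustrative, not actual MARCC values):

```shell
# Summarized partition listing; the NODES(A/I/O/T) column shows
# allocated/idle/other/total node counts per partition.
sinfo -s
#
# Illustrative output (numbers are placeholders):
# PARTITION  AVAIL  TIMELIMIT   NODES(A/I/O/T)  NODELIST
# parallel*     up  3-00:00:00  600/50/2/652    compute[001-652]
# gpuk80        up  3-00:00:00  40/8/0/48       gpu[001-048]
```

A nonzero Idle count means jobs fitting on those nodes can usually start right away.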

The parallel and gpuk80 partitions (queues) have three different types of Intel processors. The original Haswell nodes have 120GB of RAM and 24 cores. The Broadwell nodes have 120GB of RAM and 28 cores. The newest nodes have Skylake processors with 90GB of RAM and 24 cores. The SLURM batch utility is set up so that parallel jobs stay within a single architecture. This is done using a keyword (--constraint). The default is set to use an exclusive span of Haswell, Broadwell, or Skylake processors. If you want to use the Skylake processors, add this keyword to your script:

#SBATCH -C skylake

Also make sure the total memory requested is no higher than 90GB when using the Skylake nodes. The constraint keywords are: haswell, broadwell, and skylake.
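A minimal job-script sketch using the constraint (the job name, partition, task counts, and program name below are illustrative placeholders, not values taken from MARCC documentation):

```shell
#!/bin/bash
#SBATCH --job-name=skylake_test    # illustrative job name
#SBATCH --partition=parallel
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=24       # Skylake nodes have 24 cores
#SBATCH --mem=90G                  # stay within the 90GB on Skylake nodes
#SBATCH -C skylake                 # constrain the job to Skylake nodes

./my_program                       # placeholder executable
```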


An email will be sent out to all users. Also, please check the website home page.
Use these flags: “-axCORE-AVX512,AVX2,AVX,SSE4.2”
The executable will run on Haswell, Broadwell, Ivy Bridge, and Skylake processors. Note that performance may not be the same as when compiling for a particular processor.
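For example, the multi-dispatch flag is added at compile time with the Intel compiler (the module name, source file, and output name below are illustrative placeholders):

```shell
# Build one executable carrying code paths for several instruction
# sets; file names and the module name are placeholders.
module load intel
icc -O2 -axCORE-AVX512,AVX2,AVX,SSE4.2 -o mycode mycode.c
```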
Windows 10 users can take advantage of a Unix-style shell by installing the Windows Subsystem for Linux (“Bash on Windows”). You can follow these instructions.

Data Transfer

DTNs (Data Transfer Nodes) are a set of dedicated servers for file transfer. These servers are Globus Connect endpoints and should be used to transfer large amounts of data.

  1. ssh (with your username and password)
  2. scp largefile.ext userid@your-destination
    Note that the speed is limited by the connectivity at your destination
  3. From your machine to MARCC: scp largefile.ext
  4. Use the Globus connect end point
    1. Request a GlobusConnect account
    2. Log in to your Globus Connect account
    3. Select the end points (MARCC)
    4. Authenticate to your end points
    5. Select the file(s) to transfer
    6. Start the file transfer
  5. If you need to transfer many (thousands of) small files:
    1. Compress the files into a tar archive of at least 100GB in size. This gives better performance and will not ‘break’ the data transfer node. For example: “tar -zcvf junk.tgz JUNK” compresses all the files in directory JUNK into the compressed file junk.tgz
    2. Follow the same process as above
    3. Please note that if you have terabytes of data to move, the DTN will give better performance if you split them into several chunks instead of one big file
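The bundling-and-chunking advice can be sketched as follows (the directory and file names are illustrative, and the tiny chunk size is only so the example runs quickly; use something like -b 100G for real transfers):

```shell
set -e
# Create a small demo directory standing in for thousands of files.
mkdir -p JUNK
echo "sample data one" > JUNK/file1.txt
echo "sample data two" > JUNK/file2.txt

tar -zcf junk.tgz JUNK            # bundle the directory into one archive
split -b 512 junk.tgz junk.part_  # split into fixed-size chunks (-b 100G for real data)

# Receiving side: reassemble the chunks and verify the archive.
cat junk.part_* > rejoined.tgz
cmp junk.tgz rejoined.tgz && echo "chunks reassemble cleanly"
```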
  1. ssh (with your username, TFA and password)  [make sure you connect to]
  2. module load aspera
  3. ascp-marcc is an alias for “ascp -T -l8G -i /software/apps/aspera/”: -T disables encryption, -l8G sets the target bandwidth. You can change these parameters, but use the ascp command
  4. To download a file from NCBI: ascp-marcc /scratch/users/userid
  1. Download Filezilla  (Web search)
  2. Install Filezilla
  3. Launch Filezilla. Your local machine files and folders should be visible on the left side
  4. Click on the top left “icon” or click File-> Site Manager. A new window pops up
  5. Click on New site and name it “MARCC”
  6. Click on “General”
    1. Host: the MARCC hostname (type)   Port: 22 (type)
    2. Protocol: SFTP – SSH File Transfer Protocol (select)
    3. Logon Type: Interactive (select)
    4. User: Your MARCC userid (type)
    5. Password: Leave blank (recommended)
  7. Click on “Transfer Settings”
    1. Select “Limit number of simultaneous connections” and set the limit to “1”
  8. Click on “Connect”
  9. You should be connected. MARCC files and folders should be visible on the right side
  10. Click and Drag files/folders

That is it.


HIPAA and Data Subject to Restrictions

Yes. The “MARCC Secure Environment” (MSE) is a HIPAA-compliant system that can be used to analyze PHI data. PIs will need to provide some additional information regarding the study, and PIs need to include MARCC’s involvement as a research collaborator explicitly in the IRB protocol and in any consent signed by subjects. Please note: the MSE is a separate, more secure system than the Bluecrab cluster. The latter was designed to handle open, non-restricted data. Please do not upload PHI data to the Bluecrab cluster.
You should submit a change in protocol to your IRB to include MARCC’s involvement as a research collaborator on the protocol.
No. MARCC is a research resource and will collaborate with you on your research needs. MARCC is not a “covered entity,” as that term is defined in HIPAA and will not perform any “covered functions.” Research activities are generally not considered “covered functions” under HIPAA, so MARCC would not be functioning as a “business associate” under HIPAA.
While MARCC is not a “covered entity” or “business associate” under HIPAA, it (the MSE) does maintain security standards and safeguards consistent with the HIPAA Security Rule. Specifically,…[to be included once security standards have been established].
If you have confirmation that your dataset is de-identified as required by HIPAA or your data is not otherwise subject to HIPAA, you do not need to name MARCC’S MSE in your research protocol. An example would be dbGaP data.