FAQ

This page answers a few questions that we receive often. We may update it as more questions come in.

How do I get an account?

To request an account, follow the instructions and answer the questions on our Requesting an Account Page. We will reach out to you once your account is created, or if we have any questions for you.

Can I add a new ssh key?

If you have a new computer, or want to add keys for additional computers that you use, you can add your own key on our web portal. Log in with your credentials (for MIT and other educational institutions this is the middle option when you go to https://txe1-portal.mit.edu) and then click on the “sshkeys” link. Scroll to the bottom and paste your key in the box. See our Web Portal page for more information, including how to log in.

How much storage do I have for my account?

We do not impose storage limits. However, we recommend that you not use your account as primary storage. We also do not back up the storage on the system, so we strongly recommend transferring your code, data, and any other important files to another machine for backup.

How do I set/change my password?

You most likely do not need to set a password. If you have an active MIT Kerberos account or a login from another university, you can most likely log in using your institution's credentials. On the Supercloud Web Portal Login page, select the middle option "MIT Touchstone/InCommon Federation". You may have to select your institution from the dropdown list, which should take you to your institution's login page. After you log in, you should see the Portal main page. If you have trouble logging in this way, please contact us and we can help.

If you cannot log in using "MIT Touchstone/InCommon Federation", we may set you up with a password. If you have not yet reset your password, or remember your previous password, then follow the instructions on the Web Portal page. If you have previously set your password and cannot remember it, contact us and we will help you reset your password.

Are there any resource limits?

By default, all users have a limit of 320 processors/CPUs/cores/slots and 16 GPUs, the equivalent of 8 full nodes. Every slot or core counts toward this limit, including multiple slots or cores used by a single process. For example, if you are using 2 slots or cores per process, you can run 160 processes at a time. If you have a deadline and need additional resources, you can request more by contacting us. Please state the number of additional processors you need and the length of time for which you need them. Remember this is a shared system, so during busy times we may not be able to grant your request.
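The arithmetic behind the 160-process example can be sketched in a few lines of Python. This is only an illustration: the function name and constant are made up for this example and are not part of any scheduler interface.

```python
# Default per-user core limit stated in the answer above.
DEFAULT_CORE_LIMIT = 320


def max_concurrent_processes(slots_per_process, core_limit=DEFAULT_CORE_LIMIT):
    """Return how many processes fit under the core limit at once.

    Hypothetical helper for illustration only; every slot a process
    requests counts against the limit.
    """
    if slots_per_process < 1:
        raise ValueError("slots_per_process must be at least 1")
    # Integer division: a partially funded process cannot run.
    return core_limit // slots_per_process


print(max_concurrent_processes(2))  # 160
```

If your limit has been raised, pass the new value as core_limit to see how many processes you can run.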

While there is no enforced limit on memory use, it is important to keep in mind what your fair share of memory is for each process and request additional resources if needed. For example, if there are 40 cores and 384GB of RAM on the machine you are using, each processor's fair share would be about 9GB. If you think your processes will go over this, request additional slots as needed. This ensures you have sufficient memory without killing your job or someone else's.
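As a rough way to estimate how many slots to request, the sketch below divides a node's memory by its core count to get the fair share per slot, then rounds up the number of slots a process needs. It assumes the 40-core, 384GB node from the example above (384 / 40 works out to 9.6GB, which the text rounds to about 9GB). The function names are hypothetical, and your node type may differ.

```python
import math


def fair_share_gb(node_memory_gb, node_cores):
    """Memory per core if a node's RAM is split evenly across its cores."""
    return node_memory_gb / node_cores


def slots_needed(process_memory_gb, node_memory_gb=384, node_cores=40):
    """Estimate slots to request so a process stays within its fair share.

    Hypothetical helper: the defaults are the example node from the text,
    not a guarantee about any particular machine.
    """
    share = fair_share_gb(node_memory_gb, node_cores)
    # Round up, and always request at least one slot.
    return max(1, math.ceil(process_memory_gb / share))


print(slots_needed(20))  # 3
```

Here a process that needs about 20GB would request 3 slots, since 2 slots only cover 19.2GB of fair share.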

What do I do if my job won't be deleted?

Occasionally this will happen if the node where your job is running goes down, or your job does not exit gracefully. If this happens, contact us with the Job ID, and we'll delete the job and reboot the node if needed.

Why do I get an error when I try to install a package?

There are two common reasons you might get an error when you try to install a package. If you get a "Permission Denied" or similar error, it is because you are trying to install the package system-wide, rather than in your own home directory. See the Software and Package Management page for more information on how to install packages.

If you get a "Network Error" or similar, it is because the compute nodes have no internet or network connection. This includes Jupyter and any interactive jobs. You will have to install the package on one of the login nodes.

Jobs I ran previously are no longer working. What has changed?

The cluster has recently gone through some changes. First, read through the Transition Guide. If you are still running into trouble after trying the suggestions in that guide, send an email to supercloud@mit.edu for more help.