Interacting with AWS from a Mac/Linux Bash terminal is done using the AWS Command Line Interface (awscli). You’ll need to download and install it onto whatever system you are using (in this case, Rocket). Note: the credentials you store will let anyone with access to them do anything that your AWS user is able to do, so make sure that you trust the system that you are using before doing this.

Installing the awscli on Rocket

  1. Install the awscli
    • Download and unpack the AWS command line interface with:
      mkdir -p /nobackup/$USER/bin/ /nobackup/$USER/bin/AWS
      cd /nobackup/$USER/bin/AWS
      curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
      unzip awscliv2.zip
      

      For more info on what just happened, go to https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html

    • Now, you can access the awscli with /nobackup/$USER/bin/AWS/aws/dist/aws. Typing that full path every time is a bit rubbish though, and just using aws as a command would be better. You can create a launcher for it using the instructions in this post (one possible version is sketched just after this list).
  2. If you already have the awscli installed elsewhere, you can copy your configuration files across.
    • Log into a machine where you have the awscli installed and working. For Mac/Linux, go to ~/.aws. There are two files there, config and credentials, that you will need to scp across into Rocket’s ~/.aws/ folder. That target folder is unlikely to exist, so create it with mkdir if necessary (see the example after this list). Alternatively, you can run aws configure on Rocket to set them up manually for the first time.
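  A minimal sketch of such a launcher, assuming you use Bash and are happy to keep launchers in /nobackup/$USER/bin (adjust the paths if not):
      # Make the unpacked binary reachable as plain "aws" via a symlink
      ln -s /nobackup/$USER/bin/AWS/aws/dist/aws /nobackup/$USER/bin/aws
      # Make sure that folder is on your PATH (add this line to ~/.bashrc to persist it)
      export PATH="/nobackup/$USER/bin:$PATH"
      # Check that it worked: this should print a version string, not "command not found"
      aws --version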
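  If you go the copy-across route instead, the transfer itself might look like this, run from the machine that already has a working awscli (yourUsername and the rocket.hpc hostname are placeholders):
      # Create the target folder on Rocket, then copy both files into it
      ssh yourUsername@rocket.hpc 'mkdir -p ~/.aws'
      scp ~/.aws/config ~/.aws/credentials yourUsername@rocket.hpc:~/.aws/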

Logging in with an MFA device

  • If you are logging into an account which requires MFA, you will need to start a session which stores temporary credentials (a sketch of the underlying commands is at the end of this section).
  • More info here.
  • If you are a member of the Sarra Ryan Lab, you will have access to the UserSessionStart.sh script, which does it all for you.
    • Grab the version of that which you have access to, and make sure that you have the MFA Device ID configured on line 3, according to the information in the documentation which comes with the script.
    • scp that across to Rocket.
    • For some reason, Rocket doesn’t have jq, the command-line JSON parser, installed.
      • Go to a folder that is in your $PATH. Use echo $PATH to find an existing one, and go to this article for info on adding another one to your $PATH.
      • Run the following commands to download and install jq:
        # Ask the GitHub API for the latest jq release, pull out the download URL
        # for the Linux x86_64 binary, and save it as a file called jq
        curl -s https://api.github.com/repos/jqlang/jq/releases/latest | grep 'browser_download_url.*jq-linux-amd64' \
            | sed -E 's|.*https(.*)"|https\1|' \
            | xargs wget -O jq
        # Make it executable
        chmod 700 jq
        
      • Test the result by running jq. If you get help information, and not a command not found error, you were successful!
    • Now, you can use the UserSessionStart.sh script to login.
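  A rough sketch of what a session-start script like that does under the hood (the account ID, device name, token code, and duration below are placeholders):
      # Ask STS for temporary credentials, using a one-time code from your MFA device
      CREDS=$(aws sts get-session-token \
          --serial-number arn:aws:iam::123456789012:mfa/yourDevice \
          --token-code 123456 \
          --duration-seconds 43200)
      # jq pulls the three values out of the JSON that STS returns,
      # which is why jq needs to be installed
      export AWS_ACCESS_KEY_ID=$(echo "$CREDS" | jq -r '.Credentials.AccessKeyId')
      export AWS_SECRET_ACCESS_KEY=$(echo "$CREDS" | jq -r '.Credentials.SecretAccessKey')
      export AWS_SESSION_TOKEN=$(echo "$CREDS" | jq -r '.Credentials.SessionToken')

  Note that exports like these only take effect in your current shell if the script is sourced rather than executed.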

Copying data from Rocket to AWS

  • The documentation for this is actually quite good.
  • Listing files in S3 works much like the Bash ls command.
    • Use aws s3 ls to list the buckets that you have access to.
    • Use aws s3 ls bucketName to list the contents of the bucket.
    • Use aws s3 ls bucketName/folder/ to check a folder’s contents. NB: The folder path ends in a forward slash!
  • The aws s3 cp command can be used to copy individual files to and from an S3 bucket, and the documentation is here. Without the --recursive flag it only copies one file at a time, so you will need to run it once per file.
  • The aws s3 sync command can be used to copy whole folders, similarly to the rsync command. Documentation is here.
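  As a quick illustration of both (bucketName and the paths are placeholders):
      aws s3 cp results.txt s3://bucketName/project/             # upload one file
      aws s3 cp s3://bucketName/project/results.txt .            # download it again
      aws s3 sync /path/to/outputs/ s3://bucketName/outputs/     # mirror a whole folder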

Other notes to bear in mind:

  • Locations/files that are on the bucket must be preceded by s3://, eg: aws s3 cp /path/to/local/file.txt s3://bucketName/folderLocation/subFolder/
  • Remote locations should end with a forward slash.
  • If you have multiple files in a folder, but don’t want to copy all of them, and they all have something in common (eg: they’re all BAM/BAI files), you can use the following:
    for f in *.bam *.bai
    do
        # Quote "$f" so that file names containing spaces still work
        aws s3 cp "$f" s3://bucketName/target/location/
    done
    

    This is good for messy folders full of lots of different file types.

  • Otherwise, you can make a folder, link all the files that you want to copy into it, then sync that folder.
    mkdir stagingFolder
    # ln makes hard links, so no data is duplicated; this only works if
    # stagingFolder is on the same filesystem as the original files
    ln *.bam *.bai specificFile1.bla specificFile2.foo stagingFolder
    aws s3 sync stagingFolder s3://bucketName/target/location/