This is a regularly updated post that collects the small tips and tricks that don’t warrant their own post. If a group of related tips reaches critical mass, it will be spun out into its own post, and a placeholder will remain here.

Where to get other cheatsheets

For base R: base R cheatsheet
For string matching, substitutions and regexes: stringr cheatsheet
For data visualisation: ggplot2 cheatsheet
For data wrangling and transformations, in an intuitive way: dplyr/tidyr cheatsheet

Inputting data

Command line arguments, etc.

# Keep only the arguments that follow the script name
args <- commandArgs(trailingOnly = TRUE)

# Access an argument by position, e.g. the second one
args[2]
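Positional arguments are easy to mix up, so it can help to check them up front. The following is a minimal sketch, assuming a script that expects an input path and an output path; `parse_args` and its usage message are hypothetical names made up for this example, not part of any package.

```r
# Hypothetical helper: validate and name the positional arguments.
# `args` defaults to the real command line but can be passed in directly.
parse_args <- function(args = commandArgs(trailingOnly = TRUE)) {
  if (length(args) < 2) {
    stop("Usage: Rscript script.R <input> <output>", call. = FALSE)
  }
  list(input = args[1], output = args[2])
}

# Simulate `Rscript script.R reads.fa summary.csv`
opts <- parse_args(c("reads.fa", "summary.csv"))
opts$output
```

Passing the argument vector in as a parameter keeps the function usable from an interactive session, where `commandArgs()` would not contain the script's arguments.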
  • Getting the location of the current script, regardless of whether it is running in RStudio or not.
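    library(tidyverse)

    # Return the directory of the running script, whether it was launched
    # with Rscript (--file=...) or run from the RStudio editor.
    getCurrentFileLocation <- function()
    {
        this_file <- commandArgs() %>%
            tibble::enframe(name = NULL) %>%
            # extra = "merge" keeps paths that themselves contain "=" intact
            tidyr::separate(col = value, into = c("key", "value"), sep = "=",
                            fill = "right", extra = "merge") %>%
            dplyr::filter(key == "--file") %>%
            dplyr::pull(value)
        # No --file argument: assume the script is open in RStudio
        if (length(this_file) == 0)
        {
            this_file <- rstudioapi::getSourceEditorContext()$path
        }
        return(dirname(this_file))
    }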
    

    (from https://stackoverflow.com/questions/47044068/get-the-path-of-current-script)
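If pulling in the tidyverse just for this feels heavy, the same `--file=` lookup can be done in base R. This is a rough sketch rather than a drop-in replacement: `get_script_dir` is a hypothetical name, and unlike the function above it does not fall back to RStudio but returns `NA` when no `--file` argument is found.

```r
# Hypothetical base R variant: returns NA when no --file argument is present
# (for example in an interactive session) instead of asking RStudio.
get_script_dir <- function(args = commandArgs(trailingOnly = FALSE)) {
  file_arg <- grep("^--file=", args, value = TRUE)
  if (length(file_arg) == 0) {
    return(NA_character_)
  }
  # Drop the "--file=" prefix and keep only the directory part
  dirname(sub("^--file=", "", file_arg[1]))
}

# Simulated argument vector, similar to what Rscript provides
get_script_dir(c("R", "--no-echo", "--file=/home/user/project/run.R"))
```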

Loading a FASTA file and converting it into a df

library(Biostrings)

# Read the FASTA file into a DNAStringSet
temp <- readDNAStringSet('file.fa')
# Flatten the set into a df with one row per sequence
dss2df <- function(dss) data.frame(width=width(dss), seq=as.character(dss), names=names(dss))
tempdf <- dss2df(temp)
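Once the sequences are in a df, ordinary subsetting applies. The sketch below uses a small made-up df with the same three columns (`width`, `seq`, `names`) in place of a real FASTA file, so it runs without Biostrings; the sequences and the length cut-off are invented for illustration.

```r
# Made-up stand-in for the output of dss2df(), with the same three columns
tempdf <- data.frame(
  width = c(4L, 10L, 7L),
  seq   = c("ACGT", "ACGTACGTAC", "GGGTTTA"),
  names = c("seq1", "seq2", "seq3"),
  stringsAsFactors = FALSE
)

# Keep sequences of at least 7 bases, longest first
long_seqs <- tempdf[tempdf$width >= 7, ]
long_seqs <- long_seqs[order(-long_seqs$width), ]
long_seqs$names
```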

Optional: removing SAM formatted tags from the df and putting them into new columns
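One possible way to do this in base R is sketched below. It assumes the tags sit in the `names` column as whitespace-separated `TAG:TYPE:VALUE` fields (for example `NM:i:2`); the example headers and the helper `extract_sam_tag` are hypothetical, so adjust the patterns to whatever your headers actually contain.

```r
# Hypothetical data: headers carrying SAM-style TAG:TYPE:VALUE fields
df <- data.frame(
  names = c("read1 NM:i:2 AS:i:50", "read2 NM:i:0"),
  stringsAsFactors = FALSE
)

# Extract the value of one tag from each header; NA when the tag is absent
extract_sam_tag <- function(x, tag) {
  pattern <- paste0("(^|\\s)", tag, ":[A-Za-z]:(\\S+)")
  hits <- regmatches(x, regexec(pattern, x, perl = TRUE))
  vapply(hits, function(hit) {
    if (length(hit) == 0) NA_character_ else hit[3]
  }, character(1))
}

# One new column per tag of interest
df$NM <- extract_sam_tag(df$names, "NM")
df$AS <- extract_sam_tag(df$names, "AS")

# Strip every tag from the original column, leaving only the read name
df$names <- trimws(gsub("\\s*[A-Za-z][A-Za-z0-9]:[A-Za-z]:\\S+", "", df$names, perl = TRUE))
```

The extracted values are kept as character strings; convert them with `as.integer()` or `as.numeric()` where the tag type calls for it.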

Loading multiple files into a single df and subsetting them

Previous post
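As a quick reminder of the general idea, here is a minimal base R sketch. It is not necessarily the approach used in the linked post: the temporary directory, file names and columns are all made up so that the example is self-contained.

```r
# Hypothetical setup: write two small CSV files to a temporary directory
data_dir <- file.path(tempdir(), "multi_file_demo")
dir.create(data_dir, showWarnings = FALSE)
write.csv(data.frame(id = 1:2, score = c(10, 20)),
          file.path(data_dir, "sampleA.csv"), row.names = FALSE)
write.csv(data.frame(id = 3:4, score = c(30, 40)),
          file.path(data_dir, "sampleB.csv"), row.names = FALSE)

# Read every CSV and record which file each row came from
files <- list.files(data_dir, pattern = "\\.csv$", full.names = TRUE)
combined <- do.call(rbind, lapply(files, function(f) {
  d <- read.csv(f)
  d$source <- basename(f)
  d
}))

# Subset on the file of origin
only_b <- combined[combined$source == "sampleB.csv", ]
```

Keeping the file name in a `source` column is what makes the later subsetting possible once everything is in a single df.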

Outputting data

Functions