R cheatsheet
This is a regularly updated post that collects the small tips and snippets that don’t warrant their own post. If a group of related tips reaches critical mass, it will be spun out into its own post, and a placeholder will remain here.
Where to get other cheatsheets
| A base R cheatsheet | base R cheatsheet |
| For string matching, substitutions and Regexes | stringr cheatsheet |
| For data visualisation | ggplot2 cheatsheet |
| For data wrangling and transformations, in an intuitive way | dplyr/tidyr cheatsheet |
Inputting data
Command line arguments, etc.
args = commandArgs(trailingOnly=TRUE)
# Access individual arguments by position (1-based), e.g. the second one:
args[2]
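Indexing past the end of `args` silently returns `NA`, so it can help to check the argument count up front. A minimal sketch, assuming the script takes an input path and an output path; `parse_args()` and the `input`/`output` names are hypothetical and not part of the snippet above:

```r
# Hypothetical helper: validate the number of arguments and name them.
parse_args <- function(args = commandArgs(trailingOnly = TRUE), n_expected = 2) {
  if (length(args) < n_expected) {
    stop(sprintf("Expected at least %d arguments, got %d",
                 n_expected, length(args)))
  }
  # Assumed layout: first argument is the input, second is the output
  list(input = args[1], output = args[2])
}
```

With a call such as `Rscript script.R in.csv out.csv`, `parse_args()$output` would be `"out.csv"`.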
- Getting the location of the current script, regardless of whether it is running in RStudio or from the command line.
library(tidyverse)
getCurrentFileLocation <- function() {
  # When run via Rscript, the script path is passed as --file=<path>
  this_file <- commandArgs() %>%
    tibble::enframe(name = NULL) %>%
    tidyr::separate(col = value, into = c("key", "value"), sep = "=", fill = "right") %>%
    dplyr::filter(key == "--file") %>%
    dplyr::pull(value)
  # Otherwise fall back to the file open in the RStudio editor
  if (length(this_file) == 0) {
    this_file <- rstudioapi::getSourceEditorContext()$path
  }
  return(dirname(this_file))
}
(from https://stackoverflow.com/questions/47044068/get-the-path-of-current-script)
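If loading the tidyverse just for this lookup feels heavy, the same idea can be sketched in base R. This is an alternative, not the approach from the linked answer; `get_script_dir()` is a hypothetical name, and the RStudio fallback assumes the `rstudioapi` package is installed:

```r
# Hypothetical base-R variant: find the --file=<path> argument that
# Rscript passes, and return the directory of that path.
get_script_dir <- function(cmd_args = commandArgs()) {
  file_arg <- grep("^--file=", cmd_args, value = TRUE)
  if (length(file_arg) > 0) {
    return(dirname(sub("^--file=", "", file_arg[1])))
  }
  # Fall back to the file open in the RStudio editor, if available
  if (requireNamespace("rstudioapi", quietly = TRUE) && rstudioapi::isAvailable()) {
    return(dirname(rstudioapi::getSourceEditorContext()$path))
  }
  # Neither source is available (e.g. an interactive terminal session)
  NA_character_
}
```

Passing `cmd_args` explicitly makes the function easy to try out without actually launching it through `Rscript`.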
Loading a FASTA file and converting it into a df
library(Biostrings)
temp <- readDNAStringSet('file.fa')
dss2df <- function(dss) data.frame(width=width(dss), seq=as.character(dss), names=names(dss))
tempdf <- dss2df(temp)
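Where Biostrings is not installed, a rough base-R equivalent can produce a data frame with the same three columns. This is only a sketch for simple FASTA files (header lines starting with `>`, sequences possibly wrapped over several lines); `read_fasta_df()` is a hypothetical helper and does none of the validation that `readDNAStringSet()` performs:

```r
# Hypothetical base-R reader: returns width, seq and names columns,
# mirroring the output of dss2df() above.
read_fasta_df <- function(path) {
  lines <- readLines(path)
  lines <- lines[nzchar(lines)]              # drop blank lines
  is_header <- startsWith(lines, ">")
  record_id <- cumsum(is_header)             # which record each line belongs to
  # Group sequence lines by record, keeping records with no sequence lines
  groups <- factor(record_id[!is_header], levels = seq_len(sum(is_header)))
  seq_lines <- split(lines[!is_header], groups)
  seqs <- unname(vapply(seq_lines, paste, character(1), collapse = ""))
  data.frame(
    width = nchar(seqs),
    seq = seqs,
    names = sub("^>", "", lines[is_header]),
    stringsAsFactors = FALSE
  )
}
```

For anything beyond quick inspection, the Biostrings route above is the safer choice, since it also checks that the sequences use a valid DNA alphabet.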
Optional: removing SAM-formatted tags from the df and putting them into new columns