This is a regularly updated post that collects the small tips and snippets that don't warrant a post of their own. If a group of related tips reaches critical mass, it will be spun out into its own post, and a placeholder will remain here.

Inputting into a script

Command line arguments

Command line arguments are stored in a list called sys.argv. sys.argv[0] is the name of the script itself, so the following prints only the first argument, sys.argv[1].

#!/usr/bin/python3
import sys
print(sys.argv[1])

Run it from the terminal like so:

> ./script.py argument1
argument1
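
If the script is run with no argument at all, sys.argv[1] raises an IndexError. Here is a minimal sketch of a guard against that. The helper name first_argument and the fallback text are made up for illustration.

```python
import sys

def first_argument(argv, default=None):
    """Return the first command line argument, or default if none was given.

    argv[0] is the script name, so the first real argument is argv[1].
    """
    if len(argv) > 1:
        return argv[1]
    return default

if __name__ == "__main__":
    print(first_argument(sys.argv, default="no argument given"))
```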

Reading files in

#   Assign the file to a handle
f = open("/path/to/file.txt", "r")
#   Read the whole file and print it
print(f.read())
#   Rewind to the start; after a full read the position is at the end of the file
f.seek(0)
#   Read only 5 characters
print(f.read(5))
#   Read the rest of the current line
print(f.readline())
#   Process the remaining lines one by one
for x in f:
  print(x)
#   Close the file
f.close()
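
A common alternative to calling close() by hand is a with block, which closes the file for you even if something goes wrong halfway through. The sketch below writes a throwaway temp file first so that it runs on its own. The file name and contents are invented for the example.

```python
import os
import tempfile

# Create a small throwaway file so the example is self-contained
# (the path and contents are made up for illustration)
tmp_path = os.path.join(tempfile.mkdtemp(), "example.txt")
with open(tmp_path, "w") as out:
    out.write("first line\nsecond line\n")

# The with block closes the file automatically when it ends
lines = []
with open(tmp_path, "r") as f:
    for line in f:
        # Each line keeps its trailing newline, so strip it off
        lines.append(line.rstrip("\n"))

print(lines)
```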

Logging

Logging runtime for benchmarking

#!/usr/bin/python3
import time

#   Save the start time
startTime = time.time()
######################
#   Do some stuff here
######################
#   Logging
print("Duration:", time.time() - startTime)

If you want to benchmark something that actually takes a while to run, you may want to have times converted to hh:mm:ss format.

from datetime import timedelta
fullTime = timedelta(seconds=(time.time() - startTime))
print('Duration:', fullTime)
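
Putting the two snippets together: the sketch below uses time.perf_counter(), which is intended for measuring short intervals, in place of time.time(). The helper name format_duration and the stand-in workload are assumptions for the example. Note that timedelta prints as h:mm:ss, and adds a "1 day, " style prefix once the duration passes 24 hours.

```python
import time
from datetime import timedelta

def format_duration(seconds):
    """Format a number of seconds as h:mm:ss, dropping fractions of a second."""
    return str(timedelta(seconds=int(seconds)))

start = time.perf_counter()
total = sum(range(100000))  # stand-in for the work being timed
elapsed = time.perf_counter() - start

print("Duration:", format_duration(elapsed))
```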

Writing outputs

#   Append to a file
f = open("outFile.txt", "a")
f.write("Now the file has more content!")
f.close()

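#   Open for overwriting (this empties the file if it already exists)
f = open("outFile.txt", "w")
f.write("Now the file has only this content!")
f.close()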

#   Print with no newlines
print("This won't end with an enter.", end = '')
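
To see the difference between the "w" and "a" modes side by side, here is a small sketch that writes to a temp file and reads the result back. The temp directory is only there so the example doesn't touch a real outFile.txt.

```python
import os
import tempfile

out_path = os.path.join(tempfile.mkdtemp(), "outFile.txt")

# "w" creates the file, or empties it if it already exists
with open(out_path, "w") as f:
    f.write("first line\n")

# "a" keeps the existing content and adds to the end
with open(out_path, "a") as f:
    f.write("second line\n")

# Read everything back to check what ended up in the file
with open(out_path, "r") as f:
    content = f.read()

print(content, end="")
```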

Data structures

Lists

Define a list of zeroes of length n

listofzeros = [0] * n
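
One thing to watch out for: the same trick does not carry over safely to a list of lists. Multiplying a list repeats references to the same inner list, so changing one row changes them all. A short sketch of the problem and the usual workaround, with made-up variable names:

```python
n = 3

# Repeating an immutable value such as 0 is safe
zeros = [0] * n

# Repeating a list copies the reference, so every row is the same object
shared_rows = [[0] * n] * n
shared_rows[0][0] = 1

# A list comprehension builds a separate inner list for each row
independent_rows = [[0] * n for _ in range(n)]
independent_rows[0][0] = 1

print(shared_rows)
print(independent_rows)
```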

Print the elements of a list separated by spaces

print('List contents:', (' '.join(str(x) for x in nameOfList)))

Dicts

Define a dict

outDict = {
            'bla': 'value for bla',
            'year': 1955
}

Get values from dict

print(outDict.get('bla'))

Print values in a dict of lists

print('List contents:', (', '.join(str(x) for x in dictName.get('listName'))))

String operations

Search and replace with Regex

By default, re.sub does a global replace: every non-overlapping match is changed at once (count=0 means no limit). Pass count=1 to replace only the first match.

import re
email = "xyz@gmail.com"
#   re.sub(pattern, replacement, string, count=0, flags=0)
print(re.sub(r"[a-z]*@", "abc@", email))
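
The count parameter shown in the signature above controls how many matches get replaced. A quick sketch comparing the default with count=1, using an invented string containing two addresses:

```python
import re

text = "aaa@x.org, bbb@y.org"

# Default (count=0): every match is replaced
replace_all = re.sub(r"[a-z]+@", "abc@", text)

# count=1: only the first match is replaced
replace_first = re.sub(r"[a-z]+@", "abc@", text, count=1)

print(replace_all)
print(replace_first)
```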

Substrings

Get a substring

#   nameOfString[start:end:step]
print(strName[4:20:1])

Loop over a string

for i in range(0, len(bigSeq)):
    print(bigSeq[i:i+1])
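
The same slicing pattern extends to windows wider than one character, which is handy for things like pulling k-mers out of a sequence. This is only a sketch. The function name windows and the example sequence are made up.

```python
def windows(seq, k):
    """Return every substring of length k, moving one character at a time."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]

bigSeq = "GATTACA"
print(windows(bigSeq, 3))
```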

Loops and things

#   Basic for loop with range
for x in range(5, 60):
  print(x)

#   Changing step
for x in range(5, 60, 3):
  print(x)

#   Loop through string
for x in "Flipping heck":
  print(x)

Conditionals

if b > a:
  print("b is greater than a")
elif a == b:
  print("a and b are equal")
else:
  print("a is greater than b")

if a != b:
    print('a and b are not equal')

Check whether a variable exists locally or globally

if 'myVar' in locals():
  print('myVar exists locally')

if 'myVar' in globals():
  print('myVar exists globally')

Dicts and conditionals

if 'A' not in outDict:
    print('The key "A" is not in outDict')
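
A word of caution on testing for keys: a truthiness check on get() cannot tell a missing key apart from a key whose value happens to be falsy (0, an empty string, None, an empty list). The in operator checks for the key itself. A small sketch, with an invented 'count' key:

```python
outDict = {'bla': 'value for bla', 'count': 0}

# 'count' is present, but its value 0 is falsy
present_by_membership = 'count' in outDict
present_by_truthiness = bool(outDict.get('count'))

# get() accepts a fallback that is returned only when the key is missing
missing_value = outDict.get('A', 'fallback')

print(present_by_membership, present_by_truthiness, missing_value)
```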

Multiple conditions

if b > a and b > c:
  print("multiple conditions are true")

Functions

def functionName(paramOne, paramTwo):
    #   Do something
    print("First parameter:", paramOne, "Second parameter:", paramTwo)

#   Call it
functionName("Some text for param 1", "Some text for param 2")

Bioinformatics-specific things

Hamming distance between two strings

#   Hamming distance function
def hamming_distance(s1, s2):
    if len(s1) != len(s2):
        raise ValueError("Strand lengths are not equal!")
    return sum(ch1 != ch2 for ch1, ch2 in zip(s1, s2))
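
A quick usage sketch for the function above, repeated here so the snippet runs on its own. The two sequences are made up for the example.

```python
def hamming_distance(s1, s2):
    """Count the positions at which two equal-length strings differ."""
    if len(s1) != len(s2):
        raise ValueError("Strand lengths are not equal!")
    return sum(ch1 != ch2 for ch1, ch2 in zip(s1, s2))

# The strands differ at the third and sixth positions
distance = hamming_distance("GATTACA", "GACTATA")
print(distance)

# Strands of different lengths are rejected
try:
    hamming_distance("GATTACA", "GAT")
except ValueError as err:
    print("Error:", err)
```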