Python cheatsheet
This is a regularly updated post that collects the small tips that don’t warrant their own post. If a group of related tips reaches critical mass, it will be spun out into its own post, and a placeholder will remain here.
Inputting into a script
Command line arguments
The command line arguments are put into a list called sys.argv.
sys.argv[0] is the script name, so the following prints only the first argument.
#!/usr/bin/python3
import sys
print(sys.argv[1])
Run it from the terminal like so:
> ./script.py argument1
argument1
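Indexing sys.argv[1] raises an IndexError when the script is run without arguments. A minimal sketch of a guard, assuming a hypothetical helper called first_argument that takes the argument list so it can be tried without a terminal:

```python
import sys

def first_argument(argv):
    """Return the first user-supplied argument, or None if there is none.

    argv[0] is the script name, so user arguments start at index 1.
    """
    if len(argv) > 1:
        return argv[1]
    return None

if __name__ == "__main__":
    arg = first_argument(sys.argv)
    if arg is None:
        print("Usage: ./script.py argument1")
    else:
        print(arg)
```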
Reading files in
# Assign the file to a handle
f = open("/path/to/file.txt", "r")
# Read the whole file and print it
print(f.read())
# Each read continues from the current position, so rewind before reading again
f.seek(0)
# Read only 5 characters
print(f.read(5))
# Read the rest of the current line
print(f.readline())
# Process the remaining lines one by one
for x in f:
    print(x)
# Close the file
f.close()
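A common alternative to calling close() by hand is the with statement, which closes the file when the block ends. The sketch below is only an illustration: it first writes a throwaway file in a temporary directory (the file name and contents are placeholders) so that it runs anywhere.

```python
import os
import tempfile

# Create a small throwaway file so the example runs anywhere.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "file.txt")
with open(path, "w") as f:
    f.write("first line\nsecond line\n")

# The with block closes the file automatically, even if an error occurs.
lines = []
with open(path, "r") as f:
    for line in f:
        # Each line keeps its trailing newline, so strip it before use.
        lines.append(line.rstrip("\n"))

print(lines)
```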
Logging
Logging runtime for benchmarking
#!/usr/bin/python3
import time
# Save the start time
startTime = time.time()
######################
# Do some stuff here
######################
# Logging
print("Duration:", time.time() - startTime)
If you want to benchmark something that actually takes a while to run, you may want to have times converted to hh:mm:ss format.
from datetime import timedelta
fullTime = timedelta(seconds=(time.time() - startTime))
print('Duration:', fullTime)
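To see what the converted output looks like without waiting for a long run, here is a small sketch that feeds fixed durations to timedelta. The values 3725 and 3725.5 are arbitrary examples.

```python
from datetime import timedelta

# 3725 seconds is 1 hour, 2 minutes and 5 seconds.
whole = str(timedelta(seconds=3725))
print("Duration:", whole)       # Duration: 1:02:05

# Fractional seconds are shown as microseconds.
fractional = str(timedelta(seconds=3725.5))
print("Duration:", fractional)  # Duration: 1:02:05.500000
```

Note that the hours field is not zero-padded.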
Writing outputs
# Append to a file
f = open("outFile.txt", "a")
f.write("Now the file has more content!")
f.close()
# Open for overwriting (this empties any existing content)
f = open("outFile.txt", "w")
f.write("This replaces the previous content.")
f.close()
# Print with no newlines
print("This won't end with an enter.", end = '')
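The difference between the "w" and "a" modes can be sketched as follows. The path is a placeholder inside a temporary directory, and the with statement is used here so each file is closed automatically.

```python
import os
import tempfile

# Placeholder path inside a temporary directory.
out_path = os.path.join(tempfile.mkdtemp(), "outFile.txt")

# "w" creates the file, or empties it if it already exists.
with open(out_path, "w") as f:
    f.write("first\n")

# "a" keeps the existing content and adds to the end.
with open(out_path, "a") as f:
    f.write("second\n")

with open(out_path, "r") as f:
    content = f.read()

print(content, end='')
```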
Data structures
Lists
Define a list of zeroes of length n
listofzeros = [0] * n
Print elements from a list with a space (' ') separator
print('List contents:', (' '.join(str(x) for x in nameOfList)))
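A worked example of both list tips, with made-up values for n and nameOfList:

```python
n = 5  # example length
listofzeros = [0] * n
print(listofzeros)  # [0, 0, 0, 0, 0]

nameOfList = [1, 2.5, 'three']  # example list with mixed types
# join() only accepts strings, hence the str(x) conversion.
joined = ' '.join(str(x) for x in nameOfList)
print('List contents:', joined)  # List contents: 1 2.5 three
```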
Dicts
Define a dict
outDict = {
    'bla': 'value for bla',
    'year': 1955
}
Get values from dict
print(outDict.get('bla'))
Print values in a dict of lists
print('List contents:', (', '.join(str(x) for x in dictName.get('listName'))))
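A short sketch of how get() behaves with missing keys, plus the dict-of-lists print with concrete data. The keys and values below are made up for illustration.

```python
# Example data; the keys and values are placeholders.
dictName = {
    'listName': [1955, 1985, 2015],
    'title': 'some title'
}

# get() returns None for a missing key instead of raising KeyError...
missing = dictName.get('nope')
# ...or a fallback value if one is given as the second argument.
fallback = dictName.get('nope', 'default value')

joined = ', '.join(str(x) for x in dictName.get('listName'))
print('List contents:', joined)  # List contents: 1955, 1985, 2015
```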
String operations
Search and replace with Regex
re.sub replaces every non-overlapping match by default (count=0). Pass count=1 to replace only the first occurrence.
import re
# Avoid naming a variable "str", which would shadow the built-in type
email = "xyz@gmail.com"
# re.sub(pattern, replacement, string, count=0, flags=0)
print(re.sub("[a-z]*@", "abc@", email))
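A short sketch of the count parameter in action, using a made-up input string that contains two addresses:

```python
import re

text = "one@example.com, two@example.com"  # example input

# Default (count=0): every match is replaced.
all_replaced = re.sub("[a-z]*@", "abc@", text)
print(all_replaced)    # abc@example.com, abc@example.com

# count=1: only the first match is replaced.
first_replaced = re.sub("[a-z]*@", "abc@", text, count=1)
print(first_replaced)  # abc@example.com, two@example.com
```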
Substrings
Get a substring
# nameOfString[start:end:step]
print(strName[4:20:1])
Loop over a string
for i in range(0, len(bigSeq)):
    # Slice out one character at a time
    print(bigSeq[i:i+1])
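A few slices on a concrete string make the start, end and step rules easier to see. The string below is an arbitrary example.

```python
strName = "abcdefghij"  # example string

# Indices start at 0; the end index is not included.
middle = strName[2:6]
print(middle)         # cdef
# A step of 2 takes every second character in the range.
every_second = strName[0:10:2]
print(every_second)   # acegi
# Omitted start or end defaults to the string's boundaries.
print(strName[:3])    # abc
print(strName[7:])    # hij

# Iterating directly yields the same characters as slicing one at a time.
chars = [ch for ch in strName]
sliced = [strName[i:i+1] for i in range(0, len(strName))]
```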
Loops and things
# Basic for loop with range
for x in range(5, 60):
    print(x)
# Changing step
for x in range(5, 60, 3):
    print(x)
# Loop through string
for x in "Flipping heck":
    print(x)
Conditionals
if b > a:
    print("b is greater than a")
elif a == b:
    print("a and b are equal")
else:
    print("a is greater than b")
if a != b:
    print('a and b are not equal')
Check whether a variable exists locally or globally
if 'myVar' in locals():
    print('myVar exists locally')
if 'myVar' in globals():
    print('myVar exists globally')
Dicts and conditionals
if 'A' not in outDict:
    print('The key "A" is not in outDict')
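Testing a key through the truthiness of get() and testing it with the in operator give different answers when the stored value is falsy (0, an empty string, None). A minimal sketch with a made-up dict:

```python
# Example dict; key 'A' exists but holds a falsy value.
outDict = {'A': 0}

# Truthiness of get() cannot tell "missing" apart from 0, '' or None.
looks_missing = not outDict.get('A')
print(looks_missing)   # True, although the key exists

# The "in" operator tests for the key itself.
really_missing = 'A' not in outDict
print(really_missing)  # False
```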
Multiple conditions
if b > a and b > c:
    print("multiple conditions are true")
Functions
def functionName(paramOne, paramTwo):
    # Do something
    print("First parameter:", paramOne, "Second parameter:", paramTwo)
# Call it
functionName("Some text for param 1", "Some text for param 2")
Bioinformatics-specific things
Hamming distance between two strings
# Hamming distance function
def hamming_distance(s1, s2):
    if len(s1) != len(s2):
        raise ValueError("Strand lengths are not equal!")
    return sum(ch1 != ch2 for ch1, ch2 in zip(s1, s2))
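A usage sketch for the function, repeated here so the example runs on its own. The two strands are arbitrary example sequences of equal length.

```python
def hamming_distance(s1, s2):
    # Same function as above, repeated so the example is self-contained.
    if len(s1) != len(s2):
        raise ValueError("Strand lengths are not equal!")
    return sum(ch1 != ch2 for ch1, ch2 in zip(s1, s2))

# Two example DNA strands of 17 bases each.
distance = hamming_distance("GAGCCTACTAACGGGAT", "CATCGTAATGACGGCCT")
print(distance)  # 7

# Strands of different lengths raise ValueError.
try:
    hamming_distance("ACGT", "ACG")
except ValueError as err:
    print("Error:", err)
```

The sum works because Python treats True as 1 and False as 0, so each mismatching position adds one to the total.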