It is possible to write a Python script that reads from standard input. This is very useful for quick-and-dirty bioinformatics pipelines, because the script can sit at the end of a chain of shell commands.
Here's an example: Pretend I have a tab-separated genotype file with chromosomes in the first column and genotypes in the second column. I would like to find the complement of the genotypes (perhaps to get the data in the opposite orientation). Furthermore, say I am only interested in data for chromosome 1.
cat genotypefile.txt | awk '{if ($1=="chr1"){print $0}}' | python complement.py
This would do it.
The Python script (complement.py) would look something like this:
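import sys

def reverseBase(base):
    # Return the complementary base; anything that is not A, T, G or C becomes "N".
    if base == "A":
        return "T"
    elif base == "T":
        return "A"
    elif base == "G":
        return "C"
    elif base == "C":
        return "G"
    else:
        return "N"

def complement(genotype):
    # Build the complement of the genotype one base at a time.
    newgeno = ''
    for i in genotype:
        newgeno = newgeno + reverseBase(i)
    return newgeno

# Each line piped in is "chromosome<TAB>genotype"; print the complement of column 2.
for i in sys.stdin.readlines():
    line = i.strip('\n').split('\t')
    print(complement(line[1]))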
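The function names could be better in the example script ("reverseBase" returns the complement of a base, it does not reverse anything), but the key is that we loop through "sys.stdin.readlines()". "sys.stdin" is the file object for standard input, so the script reads whatever the previous command in the pipe writes out. Note that "readlines()" loads all of the input into memory before the loop starts.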
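For larger files, a variation on the same idea is to iterate over "sys.stdin" directly, which reads one line at a time instead of loading everything first. The version below is only a sketch under a few assumptions: the input is tab-separated as in the example above, the names "complement_lines" and "chrom" are ones I made up for illustration, and the optional chromosome filter is just an alternative to the awk step, not a replacement you need.

```python
import sys

# Lookup table for complementary bases; anything else maps to "N",
# matching the behaviour of the original script.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(genotype):
    """Return the complement of a genotype string, base by base."""
    return "".join(COMPLEMENT.get(base, "N") for base in genotype)


def complement_lines(lines, chrom=None):
    """Yield the complemented genotype (column 2) for each input line.

    `lines` is any iterable of tab-separated strings, such as sys.stdin.
    `chrom` is a hypothetical optional filter on column 1.
    """
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        if chrom is not None and fields[0] != chrom:
            continue  # not the chromosome we asked for
        yield complement(fields[1])


if __name__ == "__main__":
    # Iterating over sys.stdin processes the input one line at a time.
    for result in complement_lines(sys.stdin):
        print(result)
```

It is used the same way as before, at the end of a pipe. Keeping the line handling in a function that accepts any iterable also makes it easy to try out on a plain list of strings without piping anything in.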


