Differential+Gene+Expression+Python+Pipeline

1. Map the raw reads. If your mapping program does not produce a sam file as its output transform the output to the sam format. code format="bash" bwa samse database.fasta aln_sa.sai short_read.fastq > aln.sam
 * 1) convert single-end bwa output to sam file

bwa sampe database.fasta aln_sa1.sai aln_sa2.sai read1.fq read2.fq > aln.sam
 * 1) convert paired-end bwa output to sam file

samtools view -h -o out.sam in.bam code 2. If you are mapping paired-end reads, sort the sam file. code format="bash" sort -s -k 1,1 my_file.sam > my_file.sorted.sam code 3. Extract count data from the sam file. You can use either of the python scripts below or an existing program such as [|htseq-count]. [|counts.py]: counts mapped reads. [|counts-paired.py]: counts only reads for which both pairs map to the same scaffold and both pairs sequence successfully (neither pair is composed of all N's).
 * 1) convert bam file to sam file

The out put of both python scripts above has the following format.

File A: File B: To use this output in DESeq you will have to merge the count files for all datasets used for the differential expression analysis. You can do this in excel and save the merged tables as a .csv file or use the bash commands below. Either way your final file should have the following format: code format="bash" ( head -1 ${in:=my_counts.csv} ; tail -n +2 $in | sort -s -t, -k1,1 ) > my_counts_sorted.csv
 * > contig ||> File A ||
 * > gene1 ||> 0 ||
 * > gene2 ||> 100 ||
 * > contig ||> File B ||
 * > gene1 ||> 100 ||
 * > gene2 ||> 0 ||
 * > contig ||> File A ||> File B ||
 * > gene1 ||> 100 ||> 0 ||
 * > gene2 ||> 0 ||> 100 ||
 * 1) Sort the mapped read count files.

awk 'BEGIN{FS=","} FILENAME=="my_counts_sorted_1.csv" { dat[$1]=","$2 } ; FILENAME=="my_counts_sorted_2.csv" { print dat[$1]","$2 }' my_counts_sorted_1.csv my_counts_sorted_2.csv code 4.
 * 1) Combine the mapped read count files. Example with 3 files.
 * paste my_counts_sorted_3.csv - > all_sorted.csv

TO FINISH