DESeq is an R package for quantifying differential gene expression using count data from mapping experiments. To learn more about DESeq you can read the DESeq paper in Genome Biology. Below is an outline of a possible work flow, using DESeq, for a differential expression analysis. For more details you can see the package vignette.

1. Map your raw reads to an assembled or existing genome or transcriptome. If your mapping program does not produce a sam file as its output transform the output to the sam format.
#convert single-end bwa output to sam file
bwa samse database.fasta aln_sa.sai short_read.fastq > aln.sam
 
#convert paired-end bwa output to sam file
bwa sampe database.fasta aln_sa1.sai aln_sa2.sai read1.fq read2.fq > aln.sam
 
#convert bam file to sam file
samtools view -h -o out.sam in.bam
2. If you are mapping paired-end reads, sort the sam file.
sort -s -k 1,1 my_file.sam > my_file_sorted.sam
3. Extract count data from the sam file. You can use either of the python scripts below or an existing program such as htseq-count.
counts.py: counts mapped reads.
counts-paired.py: counts only reads for which both pairs map to the same scaffold and both pairs sequence successfully (neither pair is composed of all N's).

The out put of both python scripts above has the following format.

File A:
contig
File A
gene1
0
gene2
100
File B:
contig
File B
gene1
100
gene2
0
To use this output in DESeq you will have to merge the count files for all datasets used for the differential expression analysis. You can do this in excel and save the merged tables as a .csv file or use the bash commands below. Either way your final file should have the following format:
contig
File A
File B
gene1
100
0
gene2
0
100
#Sort the mapped read count files.
( head -1 ${in:=my_counts.csv} ; tail -n +2 $in | sort -s -t , -k1,1 ) > my_counts_sorted.csv
 
#Combine the mapped read count files. Example with 3 files.
awk 'BEGIN{FS=","} FILENAME=="my_counts_sorted_1.csv" { dat[$1]=","$2 } ;
FILENAME=="my_counts_sorted_2.csv" { print dat[$1]","$2 } 'my_counts_sorted_1.csv my_counts_sorted_2.csv
| paste my_counts_sorted_3.csv - > all_sorted.csv
4. Start an R session and import DESeq. Make sure you are running R version 2.14.0 or greater.
source("http://www.bioconductor.org/biocLite.R")
biocLite("DESeq")
library( "DESeq" )
5. Read in the combined read counts file.
#header=TRUE indicates that the first line contains column names and row.names=1 means that the
#first column should be used as row names.
countTable <- read.csv( "path/to/counts.csv", header=TRUE, row.names=1)
#Print the top 7 lines of the file.
head(countTable)
 
               A      B     C
gene1       6374   2758  1553
gene2       6908  42544    17
gene3      16225  20364   173
gene4          0      1     0
gene5        690    201   281
gene6        191    258     0
You might want to filter out genes which have very low cumulative read mapping counts (low row sum, 10 in the example below)
rs <- rowSums(countTable)
use <- (rs > 10)
countTableFilt <- countTable[ use, ]
 
6. Create a factor which describes the condition for each sample. For example if A is a weevil larva library, B is a weevil larve library and C is a weevil adult library then the conditions would be: larva, larva, adult. In the case where multiple libraries have the same condition, they are treated as biological replicates.
#replicates
conds <- factor( c( "larva", "larva", "adult", "adult" ) )
 
#partial replicates
conds < factor( c( "larva", "larva", "adult" ) )
 
#no replicates
conds <- factor( c( "larva", "adult" ) )
7. Create a CountDataSet, the central data structure in the DESeq package.
cds <- newCountDataSet( countTable, conds )
8. Estimate the effective library size for each sample. The following description is given for this step in the DESeq vignette: "This information is called the “size factors” vector, as the package only needs to know the relative library sizes. So, if the counts of non-differentially expressed genes in one sample are, on average, twice as high as in another, the size factor for the first sample should be twice as large as the one for the other sample...If we divide each column of the count table by the size factor for this column, the count values are brought to a common scale, making them comparable. When called with normalized=TRUE, the counts accessor function [counts( cds, normalized=TRUE )] performs this calculation. This is useful, e.g., for visualization."
cds <- estimateSizeFactors( cds )
sizeFactors( cds )
 
        A         B         C
1.2395577 0.7021389 1.1811623
9. Estimate the dispersion/variance of the data.

From the DESeq vignette:
"The inference in DESeq relies on an estimation of the typical relationship between the data’s variance and their mean, or, equivalently, between the data’s dispersion and their mean. The dispersion can be understood as the square of the coefficient of biological variation. So, if a gene’s expression typically differs from replicate to replicate sample by 20%, this gene’s dispersion is 0.22 = .04... Without any replicates at all, the estimateDispersions function will refuse to proceed unless we instruct it to ignore the condition labels and estimate the variance by treating all samples as if they were replicates of the same condition."
#With biological replicates or with partial replicates.
#When you have partial replicates only the conditions with replicates will be used to estimate the dispersion.
cds <- estimateDispersions( cds )
 
#without biological replicates
cds <- estimateDispersions( cds, method="blind", sharingMode="fit-only" )
 
#Correcting a warning message
cds <- estimateDispersions(cds)
Warning message:
In parametricDispersionFit(means, disps) : Dispersion fit did not converge.
#fitType="local" uses a numerical integration to fit the dispersion rather than the default parametric fit.
cds <- estimateDispersions(cds, fitType="local")
 
10. Test for differential expression.
res <- nbinomTest( cds, "larva", "adult" )
head( res )
 
     id   baseMean  baseMeanA  baseMeanB  foldChange  log2FoldChange      pval    padj
1 gene1    3008.14    8828.14      98.14      0.0111           -6.49  3.19e-08  0.0007
2 gene2    5760.50   16657.55     311.97      0.0187           -5.73  1.12e-07  0.0021
3 gene3   12969.76   37190.68     859.31      0.0231           -5.43  1.40e-07  0.0022
4 gene4   11404.17   32631.80     790.36      0.0242           -5.36  1.76e-07  0.0024
5 gene5   11896.77   34154.92     767.70      0.0224           -5.47  2.48e-07  0.0026
6 gene6    7586.69   21461.68     649.20      0.0302           -5.04  1.61e-06  0.0078
Note: The pval column contains the p-value which corresponds to the significance of the gene's differential expression, the padj column contains adjusted p-values (often called q-values) controlling false discovery rate. You can read more about q-values and false discovery rates here.

11. Plot the log2 fold changes against the base means (sometimes call the MA-plot), coloring in red those genes that are significant at 10% false discovery rate.
#Create a plotting function which takes your nbinomTest() object and the desired percent false discovery rate as arguments.
plotDE <- function( res, sig )
    plot(
        res$baseMean,
        res$log2FoldChange,
        log="x", pch=20, cex=.3,
        col = ifelse( res$padj < sig, "red", "black" ), xlab="Base Mean", ylab=expression(Log[2]~Fold~Change) )
#Plot the MA-plot using a 10% false discovery rate
plotDE(res, .1)
temp.png
Example MA-plot. Significant differentially expressed genes at a 10% flase discovery rate in red.

12. Optional: Plot the distribution of p-values.
hist(res$pval, breaks=100, main="", xaxt = "p-value")
 
13. At this point you may just want to export your table of genes and associates p-values and order and extract the information you need in excel.
write.csv(res, "DESeq_genes.csv")
 
Or you could order and filter your data in R and export each of these tables using the command above.

Filter for significant genes, according to some chosen threshold for the false discovery rate:
#Filter significant genes with a false discovery rate threshold less than 10%.
resSig <- res[ res$padj < 0.1, ]
Print the top 10 differentially expressed genes with a 10% false discovery rate:
resSig[order(resSig$pval)[1:10, length(resSig$id))],]
Print the top 10 down regulated genes with a 10% false discovery rate:
resSig[order(resSig$foldChange, -resSig$baseMean)[1:min(10, length(resSig$id))],]
Print the top 10 up regulated genes with a 10% false discovery rate:
resSig[order(-resSig$foldChange, -resSig$baseMean)[1:min(10, length(resSig$id))],]
14. Optional: Visualizing differential expression using heat maps.
1. Transform the count data.
cdsBlind <- estimateDispersions( cds, method="blind" )
vsd <- getVarianceStabilizedData( cdsBlind )
2. Order the genes by their significance values and display the top 100 genes.
select <- order(res$pval)[1:100]
3A. Set up the color pallet and plot the heat map.
colors <- colorRampPalette(c("white","darkblue"))(100)
heatmap( vsd[select,], col = colors, scale = "none")
 
3B. Plot the heat map using gplots.
#install gplots if it is not already installed.
install.packages("gplots")
library("gplots")
#produce a heatmap using a reversed heat color palette.
heatmap.2( vsd[select,], col=rev(heat.colors(25)),trace="none")
Note: you can find more information about R color palettes here.
heat3.png
A. Heatmap generated using commands in 3A. B. Heatmap generated using commands in 3B.
You can also use a heat map to summarize the euclidean distances between your samples.
1. Transform the count data.
cdsBlind <- estimateDispersions( cds, method="blind" )
vsd <- getVarianceStabilizedData( cdsBlind )
2. Create a distance matrix.
dists <- dist( t(vsd))
dists
          C        B
B  470.0339
A  611.5110 570.5806
3. Plot the distance matrix.
heatmap( as.matrix( dists ), symm=TRUE, scale="none", margins=c(10,10),
col = colorRampPalette(c("black","blue","lightblue"))(20))

dist.png
Example euclidian distance heat map. In this example the overall gene expression in C and B is more simmilar than in A.