The Most Bang for Your Buck

=Making efficient use of the resources you rent from Amazon=

At Amazon, and at any other cloud computing service for that matter, you pay for server time per unit of time (usually by the hour). So while you're renting a machine, you want to use all the resources you are paying for, and only for as long as you really need them. Here are a few tips for speeding up your work and reducing the amount of time you rent a server.

A lot of tasks can be parallelized easily with GNU parallel. It is already installed on our AMI (check the README in the home directory to see which programs are installed). To install it elsewhere, download it from the GNU parallel project page, untar the bz2 archive, and issue the following commands.

code format="bash"
# do ./configure in the directory that was created
# when you untarred the bz2 file
./configure
make
make install
code

GNU parallel handles the parallelization of tasks for you by keeping as many jobs running at once as the machine can handle. Let's say you have 4 cores on a machine and you want to run the same script on 8 different datasets. You can write a simple batch text file, here for quality trimming sequences in fastq files, that looks as follows.

code format="bash"
qtrimScript someFile1.fastq
qtrimScript someFile2.fastq
qtrimScript someFile3.fastq
qtrimScript someFile4.fastq
qtrimScript someFile5.fastq
qtrimScript someFile6.fastq
qtrimScript someFile7.fastq
qtrimScript someFile8.fastq
code

You could launch all of these commands at once, but that would actually slow the server down, since you'd be trying to run 8 instances of the script at the same time on only 4 cores. Running them one after the other wastes your time and server resources, since a single-threaded script leaves the other cores idle. GNU parallel can take care of this for you.

code format="bash"
# This runs as many instances of the commands in batch.txt
# in parallel as you have cores (-j+0) and schedules the
# jobs in an efficient manner
parallel -j+0 < batch.txt
code

Instead of waiting for the jobs you just started to finish, you can have the server unmount the volume you mounted and shut itself down automatically when the jobs are done. Pick one of the following three variants.

code format="bash"
# umount and poweroff are only executed if parallel exits without error
parallel < batch.txt && umount /path/to/volume && poweroff &

# unmounts volume regardless of exit status of parallel but doesn't
# shut down server unless volume unmounts clean
# (the parentheses make the trailing & apply to the whole sequence,
# not just to the part after the semicolon)
(parallel < batch.txt ; umount /path/to/volume && poweroff) &

# the poweroff command should unmount your volume automatically
# use at your own risk; adding umount is probably safer
parallel < batch.txt && poweroff &
code

Following the above steps, you should be able to get your results quicker while spending less time and money. There's no reason to pay for an idle server! Note that GNU parallel doesn't optimize RAM usage, so assembly and mapping won't benefit from it. The shutdown commands above will still save you money, for example by shutting down your instance as soon as your assembly is done.
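The batch file described above does not have to be typed by hand. As a rough sketch, a small shell function can write one command per fastq file found in a directory. The function name `make_batch` is made up for this illustration, `qtrimScript` and `batch.txt` are the names used in the text, and the sketch assumes file names without spaces.

```shell
#!/bin/sh
# Hypothetical helper (not part of GNU parallel): print one qtrimScript
# command for every .fastq file in the directory given as first argument.
make_batch() {
    dir=$1
    for f in "$dir"/*.fastq; do
        # If nothing matches, the pattern stays unexpanded; skip it
        [ -e "$f" ] || continue
        printf 'qtrimScript %s\n' "$f"
    done
}
```

You would then redirect the output into the batch file, for instance `make_batch /path/to/volume > batch.txt`, and hand that file to parallel as shown earlier.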

You can also use mailx to send yourself an email before the machine shuts down, letting you know that your data is ready to be picked up. I may get to an example of this at some point.
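Until then, a minimal sketch of the idea might look like the following. It assumes that mailx is installed and that the instance is actually able to deliver mail, which is not guaranteed on a fresh server. The function name `notify`, the `MAILER` variable, and the address used below are placeholders invented for this example.

```shell
#!/bin/sh
# Hypothetical helper: send a short message with mailx.
# Arguments: subject, message body, recipient address.
# Setting MAILER swaps in a different mail command (an assumption of
# this sketch); by default mailx is used, and -s sets the subject line.
notify() {
    printf '%s\n' "$2" | ${MAILER:-mailx} -s "$1" "$3"
}
```

To send the email before the shutdown, put the call between the jobs and the unmount step, for example `(parallel < batch.txt ; notify "Jobs finished" "Your data is ready." you@example.com ; umount /path/to/volume && poweroff) &`. Mail that is still queued when the machine powers off may never be sent, so it is worth testing this once by hand.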