Benchmark tests of NCBI Blast+ on Amazon Elastic Compute Cloud (EC2)
Amazon Elastic Compute Cloud (EC2) is a well-known cloud service.  If NCBI Blast+ runs fast enough on EC2 instances, we can rent an EC2 instance only when we need it and do not have to invest in expensive computers such as a Mac Pro.
We performed the benchmark tests of NCBI Blast+ 2.2.25 on several EC2 instances (4/5/2012).
We performed the benchmark tests on Rackspace Open Cloud Servers and added the results (8/18/2012).
Benchmark results for the large databases were added (12/28/2012).
 
Up-to-date as of 8/29/2013
 
 
1. Test Conditions
EC2 instances tested
Standard Small: 1.7 GB Memory, 1 virtual core
Standard Medium: 3.75 GB Memory, 1 virtual core
Standard Large: 7.5 GB Memory, 2 virtual cores
Standard Extra Large: 15 GB Memory, 4 virtual cores
High-CPU Extra Large: 7 GB Memory, 8 virtual cores
M3 Extra Large: 15 GB Memory, 13 EC2 Compute Units
M3 Double Extra Large: 30 GB Memory, 26 EC2 Compute Units
Cluster Compute Quadruple Extra Large: 23 GB Memory, 33.5 EC2 Compute Units
Cluster Compute Eight Extra Large: 60.5 GB Memory, 88 EC2 Compute Units
OS: 64-bit Ubuntu 10.04 for Results 2.1; 64-bit Ubuntu 11.10 for Results 2.2
Query sequences: AF287139 (606 letters) for blastn; ACL81455 (301 letters) for blastp
Databases
env_nt: 19,650,356 sequences; 8,273,484,763 total letters, 6.0 GB
est_mouse: 4,853,570 sequences; 2,249,971,710 total letters, 1.7 GB
env_nr: 6,050,066 sequences; 1,213,377,474 total letters, 2.4 GB
nt: 16,953,301 sequences; 43,701,263,139 total letters
nr: 22,321,465 sequences; 7,672,129,783 total letters
BlastStation-Workgroup 1.14 for Linux was used for 64-bit NCBI Blast+ searches.
The NCBI Blast+ version is 2.2.25. Blast+ searches were performed for 10 query sequences, and the numbers in the tables below are the total elapsed time divided by 10.
Numbers in the table below are in seconds.
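The averaging described above (total elapsed time divided by the number of searches) can be sketched as follows. This is only an illustration: the original runs were driven by BlastStation-Workgroup, not by this script, and the function names, query file names, thread count, and output path are hypothetical. The sketch assumes the blastn executable from NCBI Blast+ is on the PATH.

```python
import subprocess
import time
from typing import Callable, Sequence


def mean_elapsed(run_search: Callable[[str], None], queries: Sequence[str]) -> float:
    """Run every query in turn and return total wall-clock seconds / number of queries."""
    if not queries:
        raise ValueError("at least one query is required")
    start = time.perf_counter()
    for query in queries:
        run_search(query)
    return (time.perf_counter() - start) / len(queries)


def run_blastn(query_path: str, db: str = "env_nt", threads: int = 1) -> None:
    """Run one blastn search (assumed invocation; db, threads and output name are illustrative)."""
    subprocess.run(
        [
            "blastn",
            "-query", query_path,
            "-db", db,
            "-num_threads", str(threads),
            "-out", query_path + ".out",
        ],
        check=True,  # raise if blastn exits with an error
    )
```

With ten query files at hand, a call such as `mean_elapsed(run_blastn, ["query01.fasta", "query02.fasta"])` (extended to all ten hypothetical file names) would return a per-search figure comparable in kind to the numbers reported here.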
2.1. Results for smaller databases
Smaller number means faster instance
EC2 instance                              Price/hour 1)  blastn against env_nt  blastn against est_mouse  blastp against env_nr
Standard Small                            $0.08          85.5                   26.9                      86.0
Standard Medium                           $0.16          34.8                   12.2                      42.0
Standard Large                            $0.32          20.4                   8.6                       25.4
Standard Extra Large                      $0.64          11.0                   5.2                       16.9
High-CPU Extra Large                      $0.66          7.8                    4.3                       11.7
Cluster Compute Quadruple Extra Large 2)  $1.30          4.5                    2.3                       5.1
Rackspace 8GB/4vCPUs 3)                   $0.48          15.7                   5.9                       18.5
Rackspace 15GB/6vCPUs 3)                  $0.90          10.6                   5.6                       13.7
Regular PC 4)                             -              29.7                   10.0                      33.3
1) US East (Virginia) Region as of 4/5/2012
2) 64-bit CentOS 5.4
3) Instances created in DFW.  64-bit Ubuntu 12.04. Unlike Amazon EC2, Rackspace takes 8-13 minutes to create an instance. Downloading nt.00.tar.gz from ftp.ncbi.nlm.nih.gov took 453 seconds on Rackspace (around 1.8 Mbytes/sec), compared with 89 seconds on an EC2 instance (around 9.1 Mbytes/sec).
4) Core 2 Duo@3.06 GHz, 3.75 GB Memory, 1 core, 64-bit Ubuntu 10.04
2.2. Results for large databases
Smaller number means faster instance
EC2 instance                              Price/hour 1)  blastn against nt  blastp against nr
Standard Extra Large 2)                   $0.52          21.1               78.8
M3 Extra Large 2)                         $0.58          18.7               67.4
M3 Double Extra Large 2)                  $1.16          11.7               35.4
Cluster Compute Quadruple Extra Large 2)  $1.30          6.8                20.2
Cluster Compute Eight Extra Large 2)      $2.40          5.7                12.3
NCBI Website 3)                           -              5 4)               16 4)
1) US East (Virginia) Region as of 12/28/2012
2) 64-bit Ubuntu 11.10
3) NCBI Blast+ 2.2.27
4) nt has 16,967,778 sequences and 43,713,081,416 total letters. nr has 22,325,752 sequences and 7,673,265,207 total letters.
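One rough way to read the large-database results is to turn each row into an estimated cost per search (price per hour times seconds per search, divided by 3600). The sketch below does that with the prices and times reported for Results 2.2. It assumes the hourly price can be prorated by the second, which is a simplification: it ignores how instances are actually billed, instance startup, and database download time. The names `RESULTS_LARGE`, `cost_per_search`, and `cost_table` are illustrative.

```python
# (price per hour in USD, blastn vs nt in seconds, blastp vs nr in seconds),
# copied from Results 2.2.
RESULTS_LARGE = {
    "Standard Extra Large": (0.52, 21.1, 78.8),
    "M3 Extra Large": (0.58, 18.7, 67.4),
    "M3 Double Extra Large": (1.16, 11.7, 35.4),
    "Cluster Compute Quadruple Extra Large": (1.30, 6.8, 20.2),
    "Cluster Compute Eight Extra Large": (2.40, 5.7, 12.3),
}


def cost_per_search(price_per_hour: float, seconds: float) -> float:
    """Estimated USD per search, assuming the hourly price is prorated by the second."""
    return price_per_hour * seconds / 3600.0


def cost_table(results):
    """Map instance name -> (estimated cost for blastn/nt, estimated cost for blastp/nr)."""
    return {
        name: (cost_per_search(price, nt), cost_per_search(price, nr))
        for name, (price, nt, nr) in results.items()
    }


for name, (nt_cost, nr_cost) in cost_table(RESULTS_LARGE).items():
    print(f"{name:40s} nt ${nt_cost:.4f}   nr ${nr_cost:.4f}")
```

Under that prorating assumption, Cluster Compute Quadruple Extra Large comes out with the lowest estimated cost per search for both nt and nr among the five instances in this table.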
3. Conclusions
  1. Standard Small does not have enough memory for env_nt and env_nr, so only its est_mouse result is meaningful.  In the est_mouse results, Standard Small and Standard Medium are both slower than the regular PC.  There is no reason to use these instances instead of your PC.
  2. Standard Extra Large is 1.5 to 1.9 times as fast as Standard Large, depending on the database.  Since Standard Extra Large costs twice as much as Standard Large, this result is reasonable.
  3. Cluster Compute Quadruple Extra Large is the fastest instance for all three smaller databases.  Its price is about twice that of Standard Extra Large, but it is more than twice as fast, so Cluster Compute Quadruple Extra Large is the best instance from a cost/performance viewpoint.
  4. For large databases such as env_nt and env_nr, High-CPU Extra Large is about 1.4 times as fast as Standard Extra Large.  Since the two instances cost almost the same, High-CPU Extra Large is the better instance.
  5. Since High-CPU Extra Large has only 7 GB of memory, Standard Extra Large (15 GB) or Cluster Compute Quadruple Extra Large (23 GB) should be used for huge databases such as nt and nr.
  6. The performance of Rackspace 8GB/4vCPUs is almost the same as that of the EC2 instances.  However, Rackspace 15GB/6vCPUs is not recommended from a cost/performance viewpoint: it costs more than High-CPU Extra Large yet is slower on all three databases.
  7. Cluster Compute Eight Extra Large is almost as fast as the NCBI Website.  This instance is suitable for large Blast searches that the NCBI Website would abort with the error "Error: CPU usage limit was exceeded, resulting in SIGXCPU (24)."
  8. Since Cluster Compute Eight Extra Large has 60.5 GB of memory, it can be used for databases created from NGS data.
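The price and speed comparisons behind conclusions 2 to 4 can be checked with a few lines of arithmetic. The sketch below uses the prices and times reported in Results 2.1; the names `RESULTS_SMALL` and `compare` are illustrative and not part of any tool used in the benchmark.

```python
# (price per hour in USD, env_nt, est_mouse, env_nr), times in seconds,
# copied from Results 2.1.
RESULTS_SMALL = {
    "Standard Large": (0.32, 20.4, 8.6, 25.4),
    "Standard Extra Large": (0.64, 11.0, 5.2, 16.9),
    "High-CPU Extra Large": (0.66, 7.8, 4.3, 11.7),
    "Cluster Compute Quadruple Extra Large": (1.30, 4.5, 2.3, 5.1),
}


def compare(faster: str, baseline: str, results=RESULTS_SMALL):
    """Return (price ratio, speedups per database) of `faster` relative to `baseline`."""
    f_price, *f_times = results[faster]
    b_price, *b_times = results[baseline]
    # Speedup = baseline time / faster time, in the order env_nt, est_mouse, env_nr.
    speedups = [b / f for b, f in zip(b_times, f_times)]
    return f_price / b_price, speedups


PAIRS = [
    ("Standard Extra Large", "Standard Large"),
    ("Cluster Compute Quadruple Extra Large", "Standard Extra Large"),
    ("High-CPU Extra Large", "Standard Extra Large"),
]

for faster, baseline in PAIRS:
    price_ratio, speedups = compare(faster, baseline)
    formatted = ", ".join(f"{s:.2f}x" for s in speedups)
    print(f"{faster} vs {baseline}: price {price_ratio:.2f}x, speed {formatted}")
```

Each printed line lists the price ratio followed by the speedups for env_nt, est_mouse, and env_nr, so the two can be compared side by side.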