The 1000 Genomes Project is using Amazon’s S3 storage service to distribute data on more than 1,700 human genomes to genetics researchers.
That’s what Amazon and the National Institutes of Health (NIH) have done with the 1000 Genomes Project, using Amazon’s S3 storage service to offer over 1,700 human genomes to genetics researchers across the globe. “This is what allows us to drive more complex maps of how genes interact with each other and their environment and zoom in on areas that may have a role to play in human health and disease,” says Matt Wood, who oversees Amazon’s side of the project and holds a PhD in bioinformatics. “This is the seed to create a tree of data.”
Amazon and the NIH made a big splash last month when they announced that anyone with an S3 account could now access this data. But the move is only part of a much larger effort to reinvent genetics using the proverbial cloud. Researchers are tapping into public services from the likes of Amazon, Google, and Microsoft, and they are also building their own cloud services using tools such as Hadoop, the open source platform for crunching large amounts of data across a sea of ordinary servers.