How much energy does it take to store a terabyte of data?
August 9, 2013 9:21 AM
On average, or as an educated guess, how much energy does it take to store a terabyte of data? I don't mean how much storage costs. I want to know the environmental impact of a terabyte of data.
What I'm looking for is the energy impact of a little symbol people attach to their emails. The file has a size and gets replicated across mail servers, which means that little file becomes very big very fast, especially as it is adopted by multiple people and included in every email and response.
Unless it is in a volatile (i.e. required to be powered to retain data) storage medium, storing data doesn't require energy. Energy is used to write the data to storage, and to read the data from storage, and to make sure it is being backed up properly and related maintenance but you can write to a hard drive, take it out of the computer, put it on a shelf, and data is stored and taking no energy.
Are you asking how much energy it takes to transmit data by email from point A to point B, and to write it to storage?
posted by griphus at 9:29 AM on August 9, 2013 [1 favorite]
I think the OP is thinking of things like Facebook and Google's giant data centers that have their own weather. They are generally discussed in terms of having x terabytes of storage capacity. And the assumption is that emails and the like need to be kept in "live" storage, instantly accessible.
So, if you knew y, the carbon impact of, say, 1 data center, and you knew x, the number of terabytes of data that a data center could store, and you could estimate z, the number of different data centers that a given email lives on, you could come up with a rough approximation of how much carbon your 5MB .pptx attachment is responsible for.
Obviously the amount for each email would be vanishingly small, but maybe the aggregate effect of the 15GB that each Gmail user gets could be interesting.
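The estimate described above can be sketched in a few lines of Python. Only the formula (carbon per terabyte, times attachment size, times number of copies) comes from the comment. The function name and every number in the example call are made-up placeholders, not measurements, and the result covers whatever time period the figure for y covers.

```python
def attachment_carbon_kg(datacenter_carbon_kg, datacenter_capacity_tb,
                         attachment_mb, copies):
    """Share of one data center's carbon attributable to an attachment.

    y = datacenter_carbon_kg   (carbon impact of one data center)
    x = datacenter_capacity_tb (terabytes that data center can store)
    z = copies                 (data centers holding a copy of the email)
    """
    carbon_per_tb = datacenter_carbon_kg / datacenter_capacity_tb
    attachment_tb = attachment_mb / 1_000_000  # decimal units: 1 TB = 1,000,000 MB
    return carbon_per_tb * attachment_tb * copies


# Hypothetical inputs, chosen only to show the arithmetic.
estimate = attachment_carbon_kg(
    datacenter_carbon_kg=1_000_000,
    datacenter_capacity_tb=100_000,
    attachment_mb=5,
    copies=3,
)
print(f"{estimate:.6f} kg of carbon")
```

This treats every stored terabyte as equally responsible for the data center's total, which ignores compute, networking, and unused capacity, so it is at best a rough upper-level approximation.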
I think it would be a nice rejoinder to those stupid "please consider the trees before printing this email" signatures.
posted by sparklemotion at 9:48 AM on August 9, 2013 [1 favorite]
The energy required to store a terabyte of data is the same as the amount of energy required to manufacture a 1TB hard drive. That is somewhere under $62 worth of energy, since I can buy a 1TB hard drive for that price on Amazon, and that's with Amazon's and the manufacturer's markup. Presumably the manufacturer builds the thing for $20 all-inclusive, including their energy costs.
The energy required to move and replicate and access 1tb of data is something else entirely.
posted by tylerkaraszewski at 9:49 AM on August 9, 2013
Here's a back-o'-the-envelope calculation based on a WD Red 2TB drive:
WD Red 2 TB Hard Drive for NAS (WD20EFRX)
Transfer Rates: Buffer To Host (Serial ATA) 6 Gb/s (Max)
Power Dissipation: Read/Write 4.40 Watts
So: 1 Terabyte = 1E12 bytes = 8E12 bits.
At max transfer rate, time to transfer 1 terabyte = 8E12 / 6E9 = 1333 s
Energy used = 1333 × 4.4 = 5865 J = about 1.4 food calories (kcal) = about half a peanut.
By thinking about this, all the readers of this AskMe will likely have expended more than that …
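For anyone who wants to rerun the arithmetic with a different drive, here is a minimal Python sketch of the same calculation. The drive figures are the ones quoted above; the 4184 J per food calorie conversion is the standard value. Note that 6 Gb/s is the SATA interface maximum, and a drive's sustained rate is lower than that ceiling, so treat the result as a lower bound on time and energy.

```python
TERABYTE_BITS = 8e12          # 1 TB = 1E12 bytes = 8E12 bits
INTERFACE_BITS_PER_S = 6e9    # SATA "Buffer To Host" maximum, 6 Gb/s
READ_WRITE_WATTS = 4.4        # quoted read/write power dissipation
JOULES_PER_KCAL = 4184        # one food calorie (kilocalorie)

seconds = TERABYTE_BITS / INTERFACE_BITS_PER_S   # time to move 1 TB
joules = seconds * READ_WRITE_WATTS              # energy = power x time
kcal = joules / JOULES_PER_KCAL

print(f"{seconds:.0f} s, {joules:.0f} J, {kcal:.1f} kcal")
```

Carrying the unrounded time through gives about 5867 J rather than 5865 J; the difference is only rounding.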
posted by scruss at 9:53 AM on August 9, 2013 [7 favorites]
You need to specify a few things:
- type of data storage - RAM? flash? HDD? CD?
- do you mean the electricity cost of writing 1TB to storage? Or just of keeping it once it's sitting there?
- once you get into electricity cost, you are looking at the cost to produce the electricity itself (coal, nuclear, hydro)
- do you care about environmental impact of making the chip (CD, HDD) that stores 1TB?
- writing 1TB of data over a network or just on your computer?
If you mean the environmental impact of the entire thing, well, it's huge. If you narrow it just to the chip (say Flash) it is still huge, because the chemicals and metals required to make devices on silicon wafers are pretty hazardous. If you mean just the electricity itself, it is VERY small. Think of how many nerds are torrenting files that add up to 1,000GB, and you don't see the police raiding their houses like they're grow-ops or something. Just think of how much it costs to keep your computer on for an hour. Not much.
posted by St. Peepsburg at 9:56 AM on August 9, 2013
Much more energy is used in keeping a computer up and running than in transferring files between various storage devices. Transferring to a remote host introduces complications because you'd have to account for the changes in uptime energy costs at all of the rendezvous points traversed en route by the packets. Even if you weren't transferring 1 TB of data between remote locations, data centers would still have their servers running, LEDs a'blinkin', fans a'spinnin', HVAC a'pumpin', TCP a'listenin', etc. That's the real energy suck that people focus on. That much larger amount of energy, expended by computers that are turned on and not doing much besides not being turned off and unplugged, is probably what people are thinking about when getting snarky in this thread.
If you want to see how things differ on a home network, you could transfer 1 TB of data between devices and use a few Kill A Watt meters and a team of scientists (people holding clipboards and stopwatches) to record changes at each vantage point over a time series. You could put a Kill A Watt on computer A, computer B, and the WiFi router, and see which one produces the biggest spikes in power consumption. Then you could compare the spikes to baseline power consumption and add them all up to get an estimate of net energy use across your network, which devices use more energy to transfer files, etc.
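The bookkeeping for that experiment could look something like the Python sketch below. The readings, baselines, and 60-second sampling interval are all invented for illustration, and `net_energy_wh` is a hypothetical helper, not part of any meter's software.

```python
def net_energy_wh(samples, baseline_watts, interval_s):
    """Energy above baseline, in watt-hours, from evenly spaced readings."""
    # Readings below baseline are clamped to zero so dips don't cancel spikes.
    extra_joules = sum(max(w - baseline_watts, 0.0) for w in samples) * interval_s
    return extra_joules / 3600


# Hypothetical readings (watts), one per 60 s, paired with a made-up baseline.
readings = {
    "computer A": ([62.0, 71.0, 70.0, 63.0], 60.0),
    "computer B": ([45.0, 52.0, 53.0, 45.0], 45.0),
    "WiFi router": ([6.0, 7.5, 7.5, 6.0], 6.0),
}

per_device = {
    name: net_energy_wh(samples, baseline, interval_s=60)
    for name, (samples, baseline) in readings.items()
}
total_wh = sum(per_device.values())
biggest = max(per_device, key=per_device.get)
print(per_device, total_wh, biggest)
```

Each reading is treated as holding for the whole interval, which is a crude approximation; shorter intervals would give a better estimate.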
posted by oceanjesse at 11:02 AM on August 9, 2013
If you really want to calculate it, let's say all you are asking is the cost of writing 1TB of data to RAM (i.e. the memory your email application uses; it could be SRAM, but similar ideas apply). Everything else is already on (the computer, the servers, etc.) and you just want the differential cost of the 1TB showing up in your RAM and changing bits around.
Googling around...
"2gb ddr3 module running at 1333mhz uses around 9.6w when idle and just 10.5w at load"
so (10.5-9.6W) = 0.9W used
At 1333MHz it will take ~750s or 12.5 minutes to write 1TB of data (assuming one byte per clock cycle).
0.9W @ 12.5 mins = 0.19 watt-hours
In California as of June 2013 the cost was 20.3 cents per kWh. (13.7 cents is the US average).
So you are looking at about 0.004 cents of ADDITIONAL power for 1TB (vs. your computer idle), since 0.19 watt-hours is 0.00019 kWh.
These are rough calculations but you get the idea.
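Here is the same estimate as a short Python sketch, so the unit conversions are explicit. The wattages and electricity price are the figures quoted above; one byte per clock cycle is an assumption needed to arrive at the ~750 s figure. Note the division by 1000 when converting watt-hours to kilowatt-hours, which puts the result at a few thousandths of a cent.

```python
IDLE_WATTS = 9.6
LOAD_WATTS = 10.5
BYTES_TO_WRITE = 1e12         # 1 TB
BYTES_PER_SECOND = 1333e6     # assumes one byte per clock cycle at 1333 MHz
CENTS_PER_KWH = 20.3          # California, June 2013, as quoted

extra_watts = LOAD_WATTS - IDLE_WATTS            # power above idle
seconds = BYTES_TO_WRITE / BYTES_PER_SECOND      # time to write 1 TB
watt_hours = extra_watts * seconds / 3600        # joules -> watt-hours
cents = watt_hours / 1000 * CENTS_PER_KWH        # Wh -> kWh, then price

print(f"{seconds:.0f} s, {watt_hours:.2f} Wh, {cents:.4f} cents")
```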
posted by St. Peepsburg at 11:20 AM on August 9, 2013 [1 favorite]
Response by poster: sparklemotion has added the clarity my original question lacked, and scruss started a good back-of-the-envelope calculation on the cost to transfer the data, but it still leaves out the cost of keeping the server constantly available.
The discussion started with a 1 meg signature attachment that asked me to think green. Then I thought about how much space and how many resources the signature requires, and the strain that puts on the environment. As St. Peepsburg's comment points out, a lot of resources are required to store a seemingly innocuous note at the end of an email. Server storage needs to be planned, acceptable access times for the rest of the network need to be planned out, the time spent transferring the file over the fiber... all of it is just straight data pollution. I want to get a rough cost of that.
posted by Nanukthedog at 11:21 AM on August 9, 2013
Greenpeace also wrote up a "How Green is your Cloud" report on dataservers. Summary is here.
posted by St. Peepsburg at 11:32 AM on August 9, 2013
Don't forget about the efficiencies that a place as sophisticated as Google or Microsoft (I'd say offhand that at least 80% of the world's personal email is stored on their servers) can achieve at large scale. For starters, everything is compressed within an inch of its life until it's needed. I also have to think that those guys aggregate duplicate files (by looking at checksums) and just reference those in individual emails. What I'm basically trying to say is that I'm fairly sure that 1MB attachment doesn't get stored n times everywhere.
posted by mkultra at 6:28 PM on August 9, 2013
I work on a very large scale datacenter-hosted service (though not in a datacenter itself), and the Greenpeace report that St. Peepsburg linked is probably indicative of the numbers you're likely to find. Unfortunately, as the report itself notes, datacenter operators aren't generally interested in sharing their power consumption numbers, and you'd still need the number of terabytes of storage in that datacenter, which is usually also a pretty hard number to find.
Since energy usage is the primary cost of running a datacenter, there is, however, plenty of incentive to reduce that cost. This is usually a good thing, as it means that companies are looking at using their server capacity more efficiently (e.g. hosting more users with less hardware) and at ways to reduce power usage through energy-efficient cooling systems (a lot of energy has traditionally gone into big AC units that keep the environment of the datacenter cool and dry). A number of companies have also begun looking at ways to put their datacenters in locations with easier access to renewable energy, typically hydroelectric at this point, which has the added advantage of being very cheap. There are several datacenters being built along the Columbia River for this reason.
mkultra: due to the failure rate of server equipment at scale, I guarantee that there are at least 2 and probably around 3-4 copies of any given unique bit of data in order to failover gracefully. Your point about compression and de-duplication across messages is valid, though.
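To see how de-duplication and replication pull in opposite directions, here is a toy Python sketch. The message count, replica count, and `stored_megabytes` helper are all hypothetical, and compression is ignored entirely.

```python
def stored_megabytes(attachment_mb, messages, replicas, deduplicated):
    """Megabytes on disk for one attachment repeated across many messages."""
    # With de-duplication the attachment is kept once and merely referenced.
    unique_copies = 1 if deduplicated else messages
    return attachment_mb * unique_copies * replicas


# Hypothetical: a 1 MB signature image on 10,000 messages, 3 replicas each.
naive = stored_megabytes(1, 10_000, replicas=3, deduplicated=False)
deduped = stored_megabytes(1, 10_000, replicas=3, deduplicated=True)
print(naive, deduped)
```

Under these made-up numbers the replication factor matters far less than whether duplicates are collapsed, though the small per-message reference is not counted here.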
posted by Aleyn at 10:29 PM on August 9, 2013
Environmental impact depends on a huge variety of factors:
* Deduplication. A clever storage system, when asked to store multiple copies of a thing, can store it once and use a relatively small reference to the actual storage location. Conversely, high availability requires redundant drives and backups, adding to the cost of a terabyte.
* Media Latency. If you don't have to recall it very often, Amazon Glacier (aka magnetic tape) is nearly free. SSDs are also lower power. There are also caching effects, though caching is typically done to save time rather than energy.
* Power Use Efficiency. Facebook, for example, is moving their datacenters to high voltage lines (277V) to avoid some transformer loss. In places where the weather runs cool, you'll have less energy wasted on cooling (just don't accidentally turn your cloud into a cloud).
* Proximity to you vs the power generation source. Transmission loss is huge, so you'd want to cluster heavy users of power near the source, at the cost of minor amounts of latency.
* Impact of power sources. Because most datacenters can locate anywhere in the US and still be acceptably low latency, you can also move your datacenter near hydroelectric, which makes for a cheap, low impact method of powering servers. Oregon and Washington have several DCs in the area. On the other hand, if power is unreliable, they may have to resort to diesel, which is pretty bad.
* Manufacturing costs to mine the raw materials, ship them, shape them, construct the chips, solder them, and ship the finished product to a datacenter.
Ultimately, I think the price of printer ink makes a better admonishment than saving the environment -- paper makes a surprisingly good carbon sink.
posted by pwnguin at 1:45 AM on August 10, 2013
posted by Jairus at 9:24 AM on August 9, 2013