Thursday, August 18, 2011 · Posted by eran at 20:40 PM
Many data centers sit on a lot of "cold storage": servers holding terabytes of user data that must be retained but is rarely accessed because users no longer need it. Although these servers are considered cold because they are rarely utilized, their hard drives usually spin at full speed even when they are not serving data. The drives must keep rotating in case a user request does require retrieving data from disk, since spinning up a disk from sleep can take up to 30 seconds. In RAID configurations this can take even longer if the HDDs in the RAID volume are spun up in a staggered sequence to protect the power supply. Such latencies would translate into unacceptable wait times for a user who wants to view a standard-resolution photo or a spreadsheet.
Reducing HDD RPM by half would save roughly 3-5 W per HDD. Data centers today can have tens or even hundreds of thousands of cold drives, so the power savings at the data center level can be quite significant: on the order of hundreds of kilowatts, maybe even a megawatt. The reduced HDD bandwidth due to lower RPM would likely still be more than sufficient for most cold use cases, as a data rate of several (perhaps several dozen) MB/s should still be possible. In most cases a user requests less than a few MB of data, so they will likely not notice the added service time caused by the reduced-speed HDDs. What is critical is that the HDD's response latency does not exceed 100 ms, so that the user experience is not degraded.
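To make the arithmetic above concrete, here is a rough back-of-the-envelope sketch. The function names and the fleet sizes are illustrative assumptions, not measurements; only the 3-5 W per-drive figure and the 7200/3600 RPM speeds come from the post. The latency function models average rotational latency only (half a revolution) and ignores seek and queuing time, so it is a lower bound on real response time.

```python
def fleet_savings_kw(num_drives: int, watts_saved_per_drive: float) -> float:
    """Total power saved across a fleet of drives, in kilowatts."""
    return num_drives * watts_saved_per_drive / 1000.0


def avg_rotational_latency_ms(rpm: int) -> float:
    """Average rotational latency in ms: the time for half a revolution."""
    ms_per_revolution = 60_000.0 / rpm
    return 0.5 * ms_per_revolution


if __name__ == "__main__":
    # Hypothetical fleet sizes spanning "tens to hundreds of thousands" of drives.
    for drives in (10_000, 100_000, 300_000):
        low = fleet_savings_kw(drives, 3.0)
        high = fleet_savings_kw(drives, 5.0)
        print(f"{drives:>7} drives: {low:.0f}-{high:.0f} kW saved")

    # Rotational latency doubles at half speed but stays in the single-digit ms range.
    for rpm in (7200, 3600):
        latency = avg_rotational_latency_ms(rpm)
        print(f"{rpm} RPM: {latency:.2f} ms average rotational latency")
```

Under these assumptions, halving the spindle speed adds only about 4 ms of average rotational latency, which leaves plenty of headroom under a 100 ms budget even once seek time is added.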
HDDs, however, aren’t "born" cold; they progress into that state. In the early stages, when user data is constantly being uploaded, or when data is being recovered onto the system after a failure on a different machine, high disk bandwidth is a valuable asset. Once a system's data turns cold, though, the high bandwidth has no value. Copying the data over to a low-bandwidth system requires too much overhead and would be slow (since the target is low bandwidth), so it isn’t a standard mode of operation for most providers. Therefore, HDDs that can operate at either full speed (7200 RPM) or reduced speed (say, 3600 RPM), with a toggle to control the setting, could be useful. The transition between these states can be long (on the order of 15 seconds), as it would likely be a one-time event, triggered by an entity capable of determining that the box is no longer hot.
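As a minimal sketch of what such a controlling entity might look like, the following decides a target spindle speed from observed throughput. Everything here is hypothetical: the `SpeedPolicy` class, the 5 MB/s threshold, and the three-window rule are assumptions chosen for illustration, and the code only computes a target RPM. It does not talk to any drive, since no such toggle interface is specified in the post.

```python
from dataclasses import dataclass

FULL_RPM = 7200      # normal speed from the proposal
REDUCED_RPM = 3600   # reduced speed from the proposal


@dataclass
class SpeedPolicy:
    """Hypothetical policy: slow down after several consecutive quiet windows."""
    cold_mb_per_s: float = 5.0   # assumed threshold below which a window is "quiet"
    cold_windows: int = 3        # assumed number of quiet windows before slowing down
    rpm: int = FULL_RPM
    quiet_streak: int = 0

    def observe(self, avg_mb_per_s: float) -> int:
        """Feed one monitoring window's average throughput; return the target RPM."""
        if avg_mb_per_s < self.cold_mb_per_s:
            self.quiet_streak += 1
        else:
            # Demand is back (for example, a recovery onto this box): go full speed.
            self.quiet_streak = 0
            self.rpm = FULL_RPM
        if self.quiet_streak >= self.cold_windows:
            self.rpm = REDUCED_RPM
        return self.rpm


if __name__ == "__main__":
    policy = SpeedPolicy()
    for throughput in (120.0, 2.0, 1.0, 0.5, 0.2, 150.0):
        print(f"{throughput:6.1f} MB/s -> {policy.observe(throughput)} RPM")
```

Requiring several quiet windows in a row keeps the drive from toggling on a brief lull, which matters because each transition may take many seconds.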
The following table summarizes the proposed specification for a 3TB or larger SATA enterprise 7200 RPM HDD at normal and reduced speeds.
What are your thoughts? Does your storage infrastructure experience similar behavior and would it be capable of realizing serious power savings provided the HDDs had this feature?