MANAGING DISTRIBUTED UPS ENERGY FOR EFFECTIVE POWER CAPPING IN DATA CENTERS
Vasileios Kontorinis, L. Zhang, B. Aksanli, J. Sampson, H. Homayoun, E. Pettis*, D. Tullsen, T. Rosing
*Google
ISCA 2012
UCSD
Datacenter market is growing
The world is becoming more IT dependent.
Internet users increased from 16% to 30% of the world population in 5 years [Internet World Stats]
Smartphones are projected to jump from 500M in 2011 to 2B in 2015 [Inter. Telecom. Union]
The Internet depends heavily on datacenters
Datacenter power will double in 5 years
Expected worldwide datacenter investment in 2012: $35B (equivalent to the GDP of Lithuania) [DataCenterDynamics]
It is important to build cost-effective datacenters
Power Oversubscription - Opportunity
[Figure: Total Cost of Ownership (TCO) per server, with and without oversubscription. Cost components: one-time capital expenses (servers, supporting equipment) and recurring costs. With oversubscription the datacenter holds more servers.]
Same infrastructure + power oversubscription = more cost-effective data centers
Power Oversubscription - Opportunity
TCO breakdown [Barroso et al. + APC TCO calc]:
Utility Peak: 5.5%
Facility Space: 4.5%
Utility Energy: 11.7%
Power Infrastructure: 7.9%
Cooling Infrastructure: 3.3%
PUE overhead: 2.6%
Server Opex: 2.0%
Rest: 11.9%
DC opex: 9.9%
Server Depreciation: 40.6%
UPS LA: 0.2%
Assumptions:
Server cost: $1500
28000 servers (10MW)
Energy: 4.7c/kWh
Power: 12$/kW
Amort. time: DC 10y, servers 4y
Distributed LA-based UPS
Available at:
http://cseweb.ucsd.edu/~tullsen/DCmodeling.html
Power Oversubscription using Stored Energy
[Figure: diurnal power profile (power vs. time, M through Su) shaped into a pulse model with a peak power pulse and a low power pulse; UPS stored energy provides the peak power reduction]
Leverage diurnal patterns of web services
Discharge UPS batteries during high activity (once per day)
Recharge during low activity (once per day)
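A minimal sketch of the discharge/recharge idea above, assuming an hourly power profile and a fixed power cap. The function name `shave_profile`, the 850 kW cap and the pulse profile values are hypothetical and not taken from the slides; battery losses and recharge-rate limits are ignored.

```python
def shave_profile(power_kw, cap_kw, step_h=1.0):
    """Cap a power profile using stored energy.

    Returns (discharged_kwh, recharge_headroom_kwh, capped_profile):
    - discharged_kwh: energy the batteries must supply above the cap
    - recharge_headroom_kwh: energy available below the cap for recharging
    - capped_profile: utility-side power once the peaks are shaved
    """
    discharged = 0.0
    headroom = 0.0
    capped = []
    for p in power_kw:
        if p > cap_kw:
            # High activity: batteries cover the part above the cap.
            discharged += (p - cap_kw) * step_h
            capped.append(cap_kw)
        else:
            # Low activity: spare power under the cap can recharge batteries.
            headroom += (cap_kw - p) * step_h
            capped.append(p)
    return discharged, headroom, capped


# Hypothetical pulse model: 16 low hours at 600 kW, 8 peak hours at 1000 kW.
profile = [600.0] * 16 + [1000.0] * 8
discharged, headroom, capped = shave_profile(profile, cap_kw=850.0)
print(discharged, headroom, max(capped))
```

A daily cycle is only sustainable in this simplified view if the recharge headroom is at least as large as the discharged energy.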
Centralized UPS
Used in most small / medium data centers
Scales poorly
High losses in AC-DC-AC conversion (5-10%)
Centralized single point of failure, requires redundancy
Increasingly cost-inefficient for large data centers
Distributed UPS
Used in large data centers
Scales with data center size
Avoids AC-DC-AC conversion
Distributed points of failure
Cheaper UPS solution
[Figure labels: Facebook, Google]
Related work and our proposal
[Figure: power hierarchy with Utility, Diesel Generator, UPS, PDUs and Racks]
Centralized UPSs for power capping [Govindan, ISCA 2011]
Distributed UPSs for rare power emergencies [Govindan, ASPLOS 2012]
Our proposal:
Provision distributed UPS for peak power capping
Different battery technology
Shave power on daily basis
Place more servers under same power infrastructure
Better amortize capex costs
Outline
Introduction
Choosing the right battery for power shaving
Datacenter workload and power modeling
Policies and results
Conclusions
Competing Battery Technologies
Lead Acid (LA)
Lithium Cobalt Oxide (LCO)
Lithium Iron Phosphate (LFP)
Electric
Metrics
Backup UPS batteries: rarely used (3-4 times per year)
Proper metrics:
Cost: Wh / $
Size: volumetric density (Wh / liter)
Backup + peak shaving UPS batteries: used on daily basis
Proper metrics:
Charge cycles and cost: Wh * cycles / $
Size: volumetric density (Wh / liter)
Recharge speed: % charge / hour
Battery Technology Comparison
Backup: Lead Acid (cheaper)
Backup + Peak Shaving: Lithium Iron Phosphate (cost effective)
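A rough sketch of how the two cost metrics can rank the technologies differently. The per-Ah costs (LA 2$/Ah, LFP 5$/Ah) are the ones listed on the Assumptions slide; the 12 V voltage and the cycle counts are placeholder values chosen only for illustration and are not data from the slides.

```python
def wh_per_dollar(voltage_v, cost_per_ah):
    """Backup metric: stored energy (Wh) bought per dollar."""
    return voltage_v / cost_per_ah


def wh_cycles_per_dollar(voltage_v, cost_per_ah, cycles):
    """Peak-shaving metric: energy delivered over the battery's life per dollar."""
    return wh_per_dollar(voltage_v, cost_per_ah) * cycles


VOLTAGE_V = 12.0                                   # assumption, not from the slides
CYCLES_PLACEHOLDER = {"LA": 500, "LFP": 3000}      # hypothetical cycle lives
COST_PER_AH = {"LA": 2.0, "LFP": 5.0}              # from the Assumptions slide

for tech in ("LA", "LFP"):
    backup = wh_per_dollar(VOLTAGE_V, COST_PER_AH[tech])
    shaving = wh_cycles_per_dollar(VOLTAGE_V, COST_PER_AH[tech], CYCLES_PLACEHOLDER[tech])
    print(tech, backup, shaving)
```

With these placeholder cycle counts, LA scores higher on Wh / $ while LFP scores higher on Wh * cycles / $; the actual ranking depends on real cycle-life figures.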
Battery Capacity-Cost Estimation
[Figure: power vs. time; E shaved is the part of the peak with height Peak Reduction and width Peak Duration. Battery labels: LFP, Lead Acid]
Assumptions
Number of servers: 28K
Server type: Custom Sun Fire X4270, Intel Xeon (8-core), 8 GB Mem., idle power 175W, max power 350W
PSU efficiency: 80%
Workload: Pulse Model, utilization 50%
Batteries: LFP (5$/Ah), LA (2$/Ah)
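A small sketch of how these assumptions translate into power numbers. The linear interpolation between idle and max power is an assumption made here for illustration; the slides only give the two endpoints. The function name is hypothetical.

```python
def server_power_w(utilization, idle_w=175.0, max_w=350.0):
    """Server power under an assumed linear model between idle and max power."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be in [0, 1]")
    return idle_w + (max_w - idle_w) * utilization


N_SERVERS = 28_000

# Power of one server at the 50% utilization of the pulse model.
half_load_w = server_power_w(0.5)

# Peak power of the whole fleet in MW, if every server runs at max power.
fleet_peak_mw = N_SERVERS * server_power_w(1.0) / 1e6

print(half_load_w, fleet_peak_mw)
```

The fleet peak under this model comes out close to the 10MW figure quoted for the 28000-server datacenter.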
TCO savings with peak duration
[Figure: TCO savings with peak duration for LFP and LA, with the LFP and LA size constraints marked]
The more we shave, the more we gain!
LFP is more space- and energy-efficient than LA, so it can shave more!
TCO savings with battery depth of discharge (DoD)
[Figure: batteries at low DoD and high DoD when shaving the same energy; TCO savings panels (a) LA and (b) LFP]
Sweet DoD spot for TCO savings (LA: 40%, LFP: 60%)
Key points for battery selection
When using batteries for peak power shaving:
Shave as much power as possible (reasonably sized battery)
There is a DoD sweet spot that maximizes TCO savings
LFP is the better technology because of:
lots of recharges
more efficient discharge
higher energy density
cheaper in the future
What if:
- Servers with unbalanced load?
- Day-to-day variation in demand?
Workload Modeling
Whole-year traffic data from Google Transparency Report
Apply weights according to web presence: Search 29.2%, Social Networking 55.8%, Map Reduce 15%
Present results for the 3 worst consecutive days (11/17/2010-11/19/2010)
Workload Modeling (cont.)
Model a 1000-machine cluster with 5 PDUs, 10 racks per PDU, 20 servers (2U) per rack.
We simulate load based on M/M/8 queues and scale inter-arrival time according to workload traffic.
[Figure: jobs arrive with a given inter-arrival time at a scheduler (Round Robin or Load-aware), which dispatches them to servers with 8 cores (consumers) each; every job has a service time]
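A minimal sketch of an M/M/8 server model, assuming FIFO service and exponential inter-arrival and service times. The function name, rates, job count and seed are illustrative choices, not the authors' simulator; scaling the arrival rate stands in for scaling inter-arrival time with workload traffic.

```python
import heapq
import random


def simulate_mmc(arrival_rate, service_rate, cores=8, n_jobs=20000, seed=0):
    """Simulate an M/M/c FIFO queue and return the average core utilization."""
    rng = random.Random(seed)
    free_at = [0.0] * cores        # min-heap: time at which each core becomes free
    heapq.heapify(free_at)
    now = 0.0
    busy_time = 0.0
    last_finish = 0.0
    for _ in range(n_jobs):
        now += rng.expovariate(arrival_rate)      # exponential inter-arrival time
        service = rng.expovariate(service_rate)   # exponential service time
        core_free = heapq.heappop(free_at)        # earliest available core
        start = max(now, core_free)               # job waits if all cores are busy
        finish = start + service
        heapq.heappush(free_at, finish)
        busy_time += service
        last_finish = max(last_finish, finish)
    return busy_time / (cores * last_finish)


# Shorter inter-arrival times (a higher arrival rate) raise utilization.
low_traffic = simulate_mmc(arrival_rate=2.0, service_rate=1.0)
high_traffic = simulate_mmc(arrival_rate=6.0, service_rate=1.0)
print(low_traffic, high_traffic)
```

The measured utilization should approach arrival_rate / (cores * service_rate) for long runs, and it could then be mapped to server power with a power model.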
Policy goals
Guarantee power budget at a specific level of the power hierarchy
Discharge only during high activity, charge only during low activity
Effective irrespective of job scheduling
Use batteries uniformly
Uncoordinated Policy
Applied at the server level
Easy to implement
Runs independently per server
DoD goal set to 60% of battery capacity (LFP)
[Figure: battery state machine. States: Available, In Use, Not Available, Recharge. Transition labels: Power over Threshold; Power below Threshold; Reached DoD Goal; (Power + Bat. Recharge Power) below Threshold; Recharge Complete]
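One possible reading of the per-server state machine, as a sketch. The states and transition labels come from the slide, but the wiring between them (which label connects which pair of states) is an assumption, as are the function and parameter names.

```python
from enum import Enum


class BatteryState(Enum):
    AVAILABLE = "Available"
    IN_USE = "In Use"
    NOT_AVAILABLE = "Not Available"
    RECHARGE = "Recharge"


def next_state(state, power_w, threshold_w, recharge_power_w,
               dod, dod_goal=0.60, recharge_complete=False):
    """Advance one server's battery state (assumed transition wiring)."""
    if state is BatteryState.AVAILABLE:
        # "Power over Threshold": start discharging.
        return BatteryState.IN_USE if power_w > threshold_w else state
    if state is BatteryState.IN_USE:
        if dod >= dod_goal:
            # "Reached DoD Goal": stop discharging until recharged.
            return BatteryState.NOT_AVAILABLE
        if power_w < threshold_w:
            # "Power below Threshold": stop discharging, stay usable.
            return BatteryState.AVAILABLE
        return state
    if state is BatteryState.NOT_AVAILABLE:
        # "(Power + Bat. Recharge Power) below Threshold": safe to recharge.
        if power_w + recharge_power_w < threshold_w:
            return BatteryState.RECHARGE
        return state
    # RECHARGE: wait for "Recharge Complete".
    return BatteryState.AVAILABLE if recharge_complete else state


# Hypothetical walk through one day for a single server (values in watts).
s = BatteryState.AVAILABLE
s = next_state(s, power_w=320, threshold_w=280, recharge_power_w=50, dod=0.0)
s = next_state(s, power_w=320, threshold_w=280, recharge_power_w=50, dod=0.6)
s = next_state(s, power_w=200, threshold_w=280, recharge_power_w=50, dod=0.6)
s = next_state(s, power_w=200, threshold_w=280, recharge_power_w=50, dod=0.0,
               recharge_complete=True)
print(s)
```

Because each server evaluates this logic on its own power only, nothing prevents many batteries from discharging or recharging at the same time.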
Uncoordinated Policy Results
Round Robin Scheduling
Batteries discharge when not required
Batteries recharge during peak
Fails to guarantee budget
Budget violation
Uncoordinated Policy Results (cont.)
Load-aware Scheduling
Batteries discharge all together (wasteful)
Recharge all together (violates budget)
Fails to guarantee budget
Coordination is required!!
Budget violation
Coordinated Control
Applied at higher levels (PDU, Cluster)
Requires remote battery enable/disable, initiate recharge
Number of batteries enabled proportional to peak magnitude
Batteries used spatially distributed
[Figure: overall power over Day1, Day2 and Day3, with the batteries enabled (0, 100, 200 or 300 server equivalent) following the peak magnitude and spread across racks (rack1, rack2)]
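A hedged sketch of the two ideas on this slide: enable a number of batteries proportional to how far power exceeds the budget, and spread the enabled batteries across racks. The function names, the round-robin placement and all numeric values are hypothetical; it is assumed here that an enabled battery takes one whole server off utility power.

```python
import math


def batteries_to_enable(total_power_w, budget_w, per_server_shave_w, n_servers):
    """Number of batteries needed to bring power back under the budget."""
    excess = max(0.0, total_power_w - budget_w)
    return min(n_servers, math.ceil(excess / per_server_shave_w))


def pick_batteries(count, racks, servers_per_rack, offset=0):
    """Choose (rack, slot) pairs round-robin so usage is spread over racks.

    Changing `offset` between peaks rotates which batteries are used.
    """
    picked = []
    for i in range(count):
        k = offset + i
        rack = k % racks
        slot = (k // racks) % servers_per_rack
        picked.append((rack, slot))
    return picked


# Hypothetical PDU: 10 racks x 20 servers, 57 kW demand against a 50 kW budget.
n = batteries_to_enable(57_000.0, 50_000.0, per_server_shave_w=350.0, n_servers=200)
print(n, pick_batteries(n, racks=10, servers_per_rack=20))
```

Rotating the offset from one peak to the next is one simple way to work toward uniform battery usage.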
Coordinated Policies
[Figure panels: PDU-level, Cluster-level]
Power cap close to average power (ideal) of 250W
Peak power reduction of 19%
23% more servers
6.2% TCO/server reduction
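The link between the 19% peak reduction and the 23% server increase can be checked with simple arithmetic, assuming identical servers and a power infrastructure sized for per-server peak power. The helper name is hypothetical.

```python
def extra_server_fraction(peak_reduction):
    """Extra servers that fit under the same power budget.

    If each server's provisioned peak drops by `peak_reduction`, the same
    budget hosts 1 / (1 - peak_reduction) times as many servers.
    """
    if not 0.0 <= peak_reduction < 1.0:
        raise ValueError("peak_reduction must be in [0, 1)")
    return 1.0 / (1.0 - peak_reduction) - 1.0


print(round(extra_server_fraction(0.19) * 100))
```

A 19% reduction gives 1 / 0.81, or roughly 1.23 times as many servers, which matches the 23% on the slide.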
Discussion: Energy proportionality
Sharper, thinner peaks
We can shave more power with the same stored energy
[Figure: overall power over Day1, Day2 and Day3 for modern servers and energy proportional servers]
Peak power reduction of up to 37.5% with the 40Ah LFP battery
Concluding remarks
Battery provisioning of distributed UPS topologies to cap power and oversubscribe the data center is beneficial
Critical to reconsider battery properties (technology, capacity, DoD)
Coordination of charges and discharges is required
We cap peak power by 19%, allow 23% more servers and better amortize capex costs
Achieve 6.2% reduction in TCO/server ($15M for a 28k-server DC)
BACKUP SLIDES
TCO savings with battery cost
LA is a stable technology
LFP advancements expected, due to electric vehicles
TCO savings increase over time with LFP!
When things go wrong?
Scenario 1: Unexpected daily traffic
We use the additional 35% capacity in our batteries (DoD optimized for TCO savings at 60%)
Scenario 2: Batteries are not replaced immediately
With 50% of batteries dead we can still reduce peak by 15%
Grouping battery maintenance/replacement for cost savings is possible
Exploration of Dead Batteries
Discussion: DVFS
To DVFS or not DVFS?
Datacenter SLA violations likely during peak load
DVFS is bad during high demand, great during low demand
Creates higher margins for aggressive battery capping
[Figure: overall power over Day1, Day2 and Day3, with and without DVFS; annotations: potential SLA violation, SLA violation unlikely]
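Battery Capacity-Cost Estimation
[Figure: power vs. time; E shaved is the part of the peak with height Peak Reduction and width Peak Duration. Battery labels: LFP, Lead Acid (~twice volume)]
\[ E_{Datacenter,shaved} = PeakReduction \times PeakDuration \]
\[ E_{server,shaved} = \frac{E_{Datacenter,shaved} \times PSUEff}{\#servers} \]
\[ C_{battery} = \frac{E_{server,shaved}}{V} \times I^{PE-1} \times \frac{1}{DoD} \times \frac{1}{0.8} \]
\[ UPS_{depreciation} = \frac{C_{battery} \times CostperAh \times \#servers}{\min(servicelife,\ DoD(cycles) / 30)} \]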
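A sketch of the estimation chain on this slide, under one reading of the formulas: C_battery = (E_server,shaved / V) * I^(PE-1) / DoD / 0.8, and UPS depreciation = C_battery * CostperAh * #servers / min(service life, cycles / 30). The function names, the units (Wh, Ah, months) and every numeric input except the 80% PSU efficiency, 60% DoD, 5$/Ah and 28000 servers are assumptions made here for illustration.

```python
def datacenter_shaved_energy_wh(peak_reduction_w, peak_duration_h):
    """E_Datacenter,shaved = PeakReduction * PeakDuration."""
    return peak_reduction_w * peak_duration_h


def server_shaved_energy_wh(dc_shaved_wh, psu_eff, n_servers):
    """E_server,shaved = E_Datacenter,shaved * PSUEff / #servers."""
    return dc_shaved_wh * psu_eff / n_servers


def battery_capacity_ah(server_shaved_wh, voltage_v, current_a, peukert, dod):
    """C_battery with the I^(PE-1) term, DoD derating and the 0.8 factor."""
    return (server_shaved_wh / voltage_v) * current_a ** (peukert - 1.0) / dod / 0.8


def ups_depreciation(capacity_ah, cost_per_ah, n_servers,
                     service_life_months, cycles):
    """Battery cost spread over its lifetime in months (one cycle per day)."""
    lifetime_months = min(service_life_months, cycles / 30.0)
    return capacity_ah * cost_per_ah * n_servers / lifetime_months


# Hypothetical inputs: 1 MW shaved for 2 h, 12 V LFP battery, Peukert exponent 1.05.
n_servers = 28_000
e_dc = datacenter_shaved_energy_wh(1_000_000.0, 2.0)
e_srv = server_shaved_energy_wh(e_dc, psu_eff=0.8, n_servers=n_servers)
current_a = (e_srv / 2.0) / 12.0    # average discharge current over the peak
cap = battery_capacity_ah(e_srv, 12.0, current_a, peukert=1.05, dod=0.6)
cost = ups_depreciation(cap, 5.0, n_servers, service_life_months=120, cycles=3000)
print(cap, cost)
```

The min() in the last step captures that a battery is retired either at the end of its service life or when daily cycling at the chosen DoD exhausts its cycle life, whichever comes first.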
Battery Related Assumptions
Workload partitioning
Distributed Algorithm