ABSTRACT
Due to imbalances in technology scaling, the energy consumption of data storage and communication by far exceeds the energy consumption of actual data production, i.e., computation. As a consequence, recomputing data can become more energy efficient than storing and retrieving precomputed data. At the same time, recomputation can relax the pressure on the memory hierarchy and the communication bandwidth. This study hence assesses the energy efficiency prospects of trading computation for communication. We introduce an illustrative proof-of-concept design, identify practical limitations, and provide design guidelines.
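The trade-off described above can be sketched as a simple break-even check: recompute a value when the energy of re-executing its producer instructions is lower than the energy of loading the stored copy. The sketch below is illustrative only and is not the paper's design. The names `EnergyModel`, `recompute_energy`, and `should_recompute` are hypothetical, and every energy value is a placeholder in arbitrary units rather than a measurement.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class EnergyModel:
    """Hypothetical per-event energy costs in arbitrary units (placeholders, not measurements)."""
    op_energy: float               # energy of one producer (compute) instruction
    load_energy: Dict[str, float]  # energy of one load, keyed by the level that services it


def recompute_energy(model: EnergyModel, num_ops: int,
                     num_leaf_loads: int = 0, leaf_level: str = "L1") -> float:
    """Energy to regenerate a value: its producer instructions plus any
    input operands that still have to be loaded from memory."""
    return num_ops * model.op_energy + num_leaf_loads * model.load_energy[leaf_level]


def should_recompute(model: EnergyModel, num_ops: int, load_level: str,
                     num_leaf_loads: int = 0, leaf_level: str = "L1") -> bool:
    """Return True when recomputation is estimated to cost less energy
    than loading the stored value from `load_level`."""
    cost = recompute_energy(model, num_ops, num_leaf_loads, leaf_level)
    return cost < model.load_energy[load_level]


# Assumed placeholder costs: a load from a level further from the core costs more.
model = EnergyModel(op_energy=1.0, load_energy={"L1": 2.0, "L2": 10.0, "DRAM": 100.0})
recompute_from_dram = should_recompute(model, num_ops=20, load_level="DRAM")
recompute_from_l2 = should_recompute(model, num_ops=20, load_level="L2")
```

Under these assumed costs, a 20-instruction producer chain is worth recomputing when the stored value would come from DRAM, but not when it would hit in L2. A real design would also have to account for the energy of any bookkeeping the recomputation itself requires.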
Recommendations
AMNESIAC: Amnesic Automatic Computer
ASPLOS '17