DOI: 10.1145/3352460.3358286

SMASH: Co-designing Software Compression and Hardware-Accelerated Indexing for Efficient Sparse Matrix Operations

Published: 12 October 2019

ABSTRACT

Important workloads, such as machine learning and graph analytics applications, heavily involve sparse linear algebra operations. These operations use sparse matrix compression as an effective means to avoid storing zeros and performing unnecessary computation on zero elements. However, compression techniques like Compressed Sparse Row (CSR) that are widely used today introduce significant instruction overhead and expensive pointer-chasing operations to discover the positions of the non-zero elements. In this paper, we identify the discovery of the positions (i.e., indexing) of non-zero elements as a key bottleneck in sparse matrix-based workloads, which greatly reduces the benefits of compression.
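To make the indexing overhead concrete, below is a minimal sketch of a CSR sparse matrix-vector multiplication kernel (y = A*x); the function and array names are illustrative, not taken from the paper. Each non-zero element costs extra instructions and dependent loads through the row_ptr and col_idx arrays before the single useful multiply-accumulate can execute, which is the bottleneck the abstract identifies.

#include <stddef.h>

/* Minimal sketch of sparse matrix-vector multiplication (y = A*x)
 * over the CSR format described in the abstract. Names are
 * illustrative, not taken from the paper. */
void spmv_csr(size_t n_rows,
              const size_t *row_ptr,  /* n_rows + 1 row boundaries     */
              const size_t *col_idx,  /* one column index per non-zero */
              const double *values,   /* one value per non-zero        */
              const double *x, double *y)
{
    for (size_t i = 0; i < n_rows; i++) {
        double sum = 0.0;
        /* Indexing overhead: every non-zero requires dependent loads
         * through row_ptr and col_idx before the useful
         * multiply-accumulate is issued. */
        for (size_t k = row_ptr[i]; k < row_ptr[i + 1]; k++)
            sum += values[k] * x[col_idx[k]];
        y[i] = sum;
    }
}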

We propose SMASH, a hardware-software cooperative mechanism that enables highly-efficient indexing and storage of sparse matrices. The key idea of SMASH is to explicitly enable the hardware to recognize and exploit sparsity in data. To this end, we devise a novel software encoding based on a hierarchy of bitmaps. This encoding can be used to efficiently compress any sparse matrix, regardless of the extent and structure of sparsity. At the same time, the bitmap encoding can be directly interpreted by the hardware. We design a lightweight hardware unit, the Bitmap Management Unit (BMU), that buffers and scans the bitmap hierarchy to perform highly-efficient indexing of sparse matrices. SMASH exposes an expressive and rich ISA to communicate with the BMU, which enables its use in accelerating any sparse matrix computation.
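The sketch below illustrates the general idea of a hierarchical bitmap encoding, under stated assumptions that are not from the paper: exactly two levels and a 64:1 compression ratio per level (SMASH's level count and ratios are design parameters, and its exact layout may differ). The level-0 bitmap marks which elements are non-zero, each level-1 bit summarizes one 64-bit word of level 0, and the non-zero values are compacted in scan order.

#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Sketch of a two-level bitmap encoding in the spirit of SMASH's
 * bitmap hierarchy. Assumptions (not from the paper): two levels,
 * 64:1 ratio per level. The caller pre-allocates level0 with
 * ceil(n_elems/64) words, level1 with ceil(n_words/64) words, and
 * values with capacity for all non-zeros. */
typedef struct {
    uint64_t *level0;   /* 1 bit per matrix element                 */
    uint64_t *level1;   /* 1 bit per 64-bit word of level0          */
    double   *values;   /* non-zero values, compacted in scan order */
    size_t    n_values;
} bitmap_matrix;

void encode_bitmaps(const double *dense, size_t n_elems, bitmap_matrix *m)
{
    size_t n_words = (n_elems + 63) / 64;
    memset(m->level1, 0, ((n_words + 63) / 64) * sizeof(uint64_t));
    m->n_values = 0;
    for (size_t w = 0; w < n_words; w++) {
        uint64_t bits = 0;
        for (size_t b = 0; b < 64 && w * 64 + b < n_elems; b++) {
            double v = dense[w * 64 + b];
            if (v != 0.0) {
                bits |= (uint64_t)1 << b;       /* level-0: mark non-zero */
                m->values[m->n_values++] = v;   /* compact the value      */
            }
        }
        m->level0[w] = bits;
        if (bits)  /* level-1: mark non-empty level-0 word so a scanner
                      (the BMU, in hardware) can skip empty regions */
            m->level1[w / 64] |= (uint64_t)1 << (w % 64);
    }
}

With a layout like this, locating non-zeros reduces to scanning and population-count operations over bitmap words rather than pointer chasing, which is the kind of work the abstract assigns to the hardware BMU.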

We demonstrate the benefits of SMASH on four use cases that include sparse matrix kernels and graph analytics applications. Our evaluations show that SMASH provides average performance improvements of 38% for Sparse Matrix Vector Multiplication and 44% for Sparse Matrix Matrix Multiplication, over a state-of-the-art CSR implementation, on a wide variety of matrices with different characteristics. SMASH incurs a very modest hardware area overhead of up to 0.076% of an out-of-order CPU core.


Published in

    MICRO '52: Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture
    October 2019
    1104 pages
    ISBN: 978-1-4503-6938-1
    DOI: 10.1145/3352460

    Copyright © 2019 ACM

    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    • Published: 12 October 2019


    Qualifiers

    • research-article
    • Research
    • Refereed limited

    Acceptance Rates

    Overall Acceptance Rate: 484 of 2,242 submissions, 22%

