- Abadi, M., Agarwal, A., Barham, P., Brevdo, E., Chen, Z., Citro, C., Corrado, G.S., Davis, A., Dean, J., Devin, M., Ghemawat, S., Goodfellow, I., Harp, A., Irving, G., Isard, M., Jia, Y., Jozefowicz, R., Kaiser, L., Kudlur, M., Levenberg, J., Mané, D., Monga, R., Moore, S., Murray, D., Olah, C., Schuster, M., Shlens, J., Steiner, B., Sutskever, I., Talwar, K., Tucker, P., Vanhoucke, V., Vasudevan, V., Viégas, F., Vinyals, O., Warden, P., Wattenberg, M., Wicke, M., Yu, Y., Zheng, X. TensorFlow: Large-scale machine learning on heterogeneous systems, 2015. Software available from tensorflow.org.
- Atkey, R., Steuwer, M., Lindley, S., Dubach, C. Strategy preserving compilation for parallel functional code. CoRR, abs/1710.08332, 2017.
- Barham, P., Isard, M. Machine learning systems are stuck in a rut. In HotOS. ACM, 2019, 177--183.
- Chen, T., Moreau, T., Jiang, Z., Zheng, L., Yan, E.Q., Shen, H., Cowan, M., Wang, L., Hu, Y., Ceze, L., Guestrin, C., Krishnamurthy, A. TVM: An automated end-to-end optimizing compiler for deep learning. In 13th USENIX Symposium on Operating Systems Design and Implementation, OSDI 2018 (Carlsbad, CA, USA, October 8--10, 2018), 2018, 578--594.
- Hagedorn, B., Lenfers, J., Koehler, T., Qin, X., Gorlatch, S., Steuwer, M. Achieving high-performance the functional way: A functional pearl on expressing high-performance optimizations as rewrite strategies. Proc. ACM Program. Lang. 4, ICFP (2020).
- Hagedorn, B., Stoltzfus, L., Steuwer, M., Gorlatch, S., Dubach, C. High performance stencil code generation with Lift. In Proceedings of the 2018 International Symposium on Code Generation and Optimization, CGO 2018 (Vösendorf/Vienna, Austria, February 24--28, 2018), 2018, 100--112.
- Hennessy, J.L., Patterson, D.A. A new golden age for computer architecture. Commun. ACM 62, 2 (2019), 48--60.
- Kirchner, H. Rewriting strategies and strategic rewrite programs. In Logic, Rewriting, and Concurrency - Essays dedicated to José Meseguer on the Occasion of His 65th Birthday, 2015, 380--403.
- Luttik, S.P., Visser, E., et al. Specification of rewriting strategies. Universiteit van Amsterdam. Programming Research Group, 1997.
- Paszke, A., Gross, S., Chintala, S., Chanan, G., Yang, E., DeVito, Z., Lin, Z., Desmaison, A., Antiga, L., Lerer, A. Automatic differentiation in PyTorch. 2017.
- Ragan-Kelley, J., Adams, A., Sharlet, D., Barnes, C., Paris, S., Levoy, M., Amarasinghe, S.P., Durand, F. Halide: Decoupling algorithms from schedules for high-performance image processing. Commun. ACM 61, 1 (2018), 106--115.
- Steuwer, M., Fensch, C., Lindley, S., Dubach, C. Generating performance portable code using rewrite rules: From high-level functional expressions to high-performance OpenCL code. In ICFP. ACM, 2015, 205--217.
- Steuwer, M., Remmelg, T., Dubach, C. Matrix multiplication beyond auto-tuning: Rewrite-based GPU code generation. In CASES. ACM, 2016, 15:1--15:10.
- Steuwer, M., Remmelg, T., Dubach, C. Lift: A functional data-parallel IR for high-performance GPU code generation. In Proceedings of the 2017 International Symposium on Code Generation and Optimization, CGO 2017 (Austin, TX, USA, February 4--8, 2017), 2017, 74--85.
- TVM. How to optimize GEMM on CPU, 2020.
- Visser, E. Stratego: A language for program transformation based on rewriting strategies. In Rewriting Techniques and Applications, 12th International Conference, RTA 2001, Utrecht, The Netherlands, May 22--24, 2001, Proceedings, 2001, 357--362.
- Visser, E. Program transformation with Stratego/XT. In Domain-specific program generation. Springer, 2004, 216--238.
- Visser, E., Benaissa, Z., Tolmach, A.P. Building program optimizers with rewriting strategies. In Proceedings of the third ACM SIGPLAN International Conference on Functional Programming (ICFP '98) (Baltimore, Maryland, USA, September 27--29, 1998), 1998, 13--26.