Abstract
In this paper we develop semantics for key aspects of the ARMv8 multiprocessor architecture: the concurrency model and much of the 64-bit application-level instruction set (ISA). Our goal is to clarify what the range of architecturally allowable behaviour is, and thereby to support future work on formal verification, analysis, and testing of concurrent ARM software and hardware. Establishing such models with high confidence is intrinsically difficult: it involves capturing the vendor's architectural intent, aspects of which (especially for concurrency) have not previously been precisely defined. We therefore first develop a concurrency model with a microarchitectural flavour, abstracting from many hardware implementation concerns but still close to hardware-designer intuition. This means it can be discussed in detail with ARM architects. We then develop a more abstract model, better suited for use as an architectural specification, which we prove sound w.r.t.~the first. The instruction semantics involves further difficulties, handling the mass of detail and the subtle intensional information required to interface to the concurrency model. We have a novel ISA description language, with a lightweight dependent type system, letting us do both with a rather direct representation of the ARM reference manual instruction descriptions. We build a tool from the combined semantics that lets one explore, either interactively or exhaustively, the full range of architecturally allowed behaviour, for litmus tests and (small) ELF executables. We prove correctness of some optimisations needed for tool performance. We validate the models by discussion with ARM staff, and by comparison against ARM hardware behaviour, for ISA single- instruction tests and concurrent litmus tests.
- J. Alglave and L. Maranget. The diy tool. http://diy.inria.fr/.Google Scholar
- J. Alglave, L. Maranget, S. Sarkar, and P. Sewell. Fences in weak memory models. In Proc. CAV, 2010. Google ScholarDigital Library
- J. Alglave, L. Maranget, S. Sarkar, and P. Sewell. Litmus: running tests against hardware. In Proceedings of TACAS 2011: the 17th international conference on Tools and Algorithms for the Construction and Analysis of Systems, pages 41–44, Berlin, Heidelberg, 2011. Springer-Verlag. ISBN 978-3-642-19834-2. URL http://dl.acm.org/citation.cfm?id=1987389.1987395. Google ScholarDigital Library
- J. Alglave, L. Maranget, and M. Tautschnig. Herding Cats: Modelling, Simulation, Testing, and Data Mining for Weak Memory. ACM TOPLAS, 36(2):7:1–7:74, July 2014. ISSN 0164-0925.. Google ScholarDigital Library
- R. M. Amadio, N. Ayache, F. Bobot, J. Boender, B. Campbell, I. Garnier, A. Madet, J. McKinna, D. P. Mulligan, M. Piccolo, R. Pollack, Y. Régis-Gianas, C. S. Coen, I. Stark, and P. Tranquilli. Certified complexity (cerco). In Foundational and Practical Aspects of Resource Analysis - Third International Workshop, FOPARA 2013, Bertinoro, Italy, August 29-31, 2013, Revised Selected Papers, pages 1–18, 2013.Google Scholar
- . URL http://dx.doi.org/10.1007/978-3-319-12466-7_1.Google Scholar
- ARM. Cortex-a57 processor. www.arm.com/products/processors/ cortex-a/cortex-a57-processor.php, Accessed 2015/07/06.Google Scholar
- ARM Architecture Reference Manual (ARMv8, for ARMv8-A architecture profile). ARM Ltd., 2014.Google Scholar
- W. W. Collier. Reasoning about parallel architectures. Prentice Hall, Englewood Cliffs, 1992. Google ScholarDigital Library
- ISBN 0-13-767187-3. URL http://opac.inria.fr/record=b1105256.Google Scholar
- J. a. Dias and N. Ramsey. Automatically generating instruction selectors using declarative machine descriptions. In Proceedings of the 37th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL ’10, pages 403–416, New York, NY, USA, 2010. ACM. ISBN 978-1-60558-479-9.. URL http://doi.acm.org/10.1145/1706299.1706346. Google ScholarDigital Library
- A. C. J. Fox. Directions in ISA specification. In Interactive Theorem Proving – Third International Conference, ITP 2012, Princeton, NJ, USA, August 13-15, 2012. Proceedings, pages 338–344, 2012.. URL http://dx.doi.org/10.1007/978-3-642-32347-8_23.Google Scholar
- S. Goel, W. A. Hunt, M. Kaufmann, and S. Ghosh. Simulation and formal verification of x86 machine-code programs that make system calls. In Proceedings of the 14th Conference on Formal Methods in Computer-Aided Design, FMCAD ’14, pages 18:91–18:98, Austin, TX, 2014. FMCAD Inc. ISBN 978-0-9835678-4-4. URL http://dl.acm.org/citation.cfm?id=2682923.2682944. Google ScholarDigital Library
- K. E. Gray, G. Kerneis, D. Mulligan, C. Pulte, S. Sarkar, and P. Sewell. An integrated concurrency and core-ISA architectural envelope definition, and test oracle, for IBM POWER multiprocessors. In Proc. MICRO-48, the 48th Annual IEEE/ACM International Symposium on Microarchitecture, Dec. 2015. Google ScholarDigital Library
- A. Kennedy, N. Benton, J. B. Jensen, and P.-E. Dagand. Coq: The world’s best macro assembler? In Proceedings of the 15th Symposium on Principles and Practice of Declarative Programming, PPDP ’13, pages 13–24, New York, NY, USA, 2013. ACM. ISBN 978-1-4503- 2154-9.. URL http://doi.acm.org/10.1145/2505879.2505897. Google ScholarDigital Library
- R. Kumar, M. O. Myreen, M. Norrish, and S. Owens. Cakeml: A verified implementation of ml. In Proceedings of the 41st ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL ’14, pages 179–191, New York, NY, USA, 2014. ACM. ISBN 978-1-4503-2544-8.. URL http://doi.acm.org/10.1145/2535838.2535841. Google ScholarDigital Library
- X. Leroy. A formally verified compiler back-end. Journal of Automated Reasoning, 43(4):363–446, 2009. URL http://gallium.inria.fr/~xleroy/publi/compcert-backend.pdf. Google ScholarDigital Library
- L. Maranget, S. Sarkar, and P. Sewell. A tutorial introduction to the ARM and POWER relaxed memory models. Draft available from http://www.cl.cam.ac.uk/~pes20/ppc-supplemental/test7.pdf, 2012.Google Scholar
- P. McKenney. Validating memory barriers and atomic instructions. Linux Weekly News article, Dec. 2011.Google Scholar
- https://lwn.net/Articles/470681/.Google Scholar
- G. Morrisett, G. Tan, J. Tassarotti, J. Tristan, and E. Gan. Rocksalt: better, faster, stronger SFI for the x86. In ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI ’12, Beijing, China - June 11 - 16, 2012, pages 395–404, 2012.. URL http://doi.acm.org/10.1145/2254064.2254111. Google ScholarDigital Library
- D. P. Mulligan, S. Owens, K. E. Gray, T. Ridge, and P. Sewell. Lem: reusable engineering of real-world semantics. In Proceedings of ICFP 2014: the 19th ACM SIGPLAN International Conference on Functional Programming, pages 175–188, 2014.. URL http://doi.acm.org/10.1145/2628136.2628143. Google ScholarDigital Library
- S. Owens, S. Sarkar, and P. Sewell. A better x86 memory model: x86-TSO. In Proceedings of TPHOLs 2009: Theorem Proving in Higher Order Logics, LNCS 5674, pages 391–407, 2009. Google ScholarDigital Library
- S. Sarkar, P. Sewell, F. Zappa Nardelli, S. Owens, T. Ridge, T. Braibant, M. Myreen, and J. Alglave. The semantics of x86-CC multiprocessor machine code. In Proceedings of POPL 2009: the 36th annual ACM SIGPLAN-SIGACT symposium on Principles of Programming Languages, pages 379–391, Jan. 2009.. URL http://doi.acm.org/10.1145/1594834.1480929. Google ScholarDigital Library
- S. Sarkar, P. Sewell, J. Alglave, L. Maranget, and D. Williams. Understanding POWER multiprocessors. In Proceedings of PLDI 2011: the 32nd ACM SIGPLAN conference on Programming Language Design and Implementation, pages 175–186, 2011.. URL http://doi.acm.org/10.1145/1993498.1993520. Google ScholarDigital Library
- S. Sarkar, K. Memarian, S. Owens, M. Batty, P. Sewell, L. Maranget, J. Alglave, and D. Williams. Synchronising C/C++ and POWER. In Proceedings of PLDI 2012, the 33rd ACM SIGPLAN conference on Programming Language Design and Implementation (Beijing), pages 311–322, 2012. Google ScholarDigital Library
- . URL http://doi.acm.org/10.1145/2254064.2254102.Google Scholar
- P. Sewell, S. Sarkar, S. Owens, F. Zappa Nardelli, and M. O. Myreen. x86-TSO: A rigorous and usable programmer’s model for x86 multiprocessors. Communications of the ACM, 53(7):89–97, July 2010. Google ScholarDigital Library
- (Research Highlights).Google Scholar
- X. Shi, J.-F. Monin, F. Tuong, and F. Blanqui. First steps towards the certification of an ARM simulator using Compcert. In J.-P. Jouannaud and Z. Shao, editors, Certified Programs and Proofs, volume 7086 of Lecture Notes in Computer Science, pages 346–361. Springer Berlin Heidelberg, 2011. ISBN 978-3-642-25378-2.. URL http://dx.doi.org/10.1007/978-3-642-25379-9_25. Google ScholarDigital Library
- J. Ševˇcík, V. Vafeiadis, F. Zappa Nardelli, S. Jagannathan, and P. Sewell. CompCertTSO: A verified compiler for relaxed-memory concurrency. J. ACM, 60(3):22:1–22:50, June 2013. ISSN 0004-5411. Google ScholarDigital Library
Index Terms
- Modelling the ARMv8 architecture, operationally: concurrency and ISA
Recommendations
Simplifying ARM concurrency: multicopy-atomic axiomatic and operational models for ARMv8
ARM has a relaxed memory model, previously specified in informal prose for ARMv7 and ARMv8. Over time, and partly due to work building formal semantics for ARM concurrency, it has become clear that some of the complexity of the model is not justified by ...
Modelling the ARMv8 architecture, operationally: concurrency and ISA
POPL '16: Proceedings of the 43rd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming LanguagesIn this paper we develop semantics for key aspects of the ARMv8 multiprocessor architecture: the concurrency model and much of the 64-bit application-level instruction set (ISA). Our goal is to clarify what the range of architecturally allowable ...
Cross-ISA execution of SIMD regions for improved performance
SYSTOR '19: Proceedings of the 12th ACM International Conference on Systems and StorageWe investigate the effectiveness of executing SIMD workloads on multiprocessors with heterogeneous Instruction Set Architecture (ISA) cores. Heterogeneous ISAs offer an intriguing clock speed/parallelism tradeoff for workloads with frequent usage of ...
Comments