research-article

Open Access

Flexible Device Sharing in PCIe Clusters using Device Lending

Authors:
- Jonas Markussen, Simula Research Laboratory, Oslo, Norway and University of Oslo
- Lars Bjørlykke Kristiansen, Dolphin Interconnect Solution AS, Oslo, Norway
- Håkon Kvale Stensland, Simula Research Laboratory, Oslo, Norway and University of Oslo
- Friedrich Seifert, Dolphin Interconnect Solution AS, Oslo, Norway
- Carsten Griwodz, University of Oslo, Oslo, Norway and Simula Research Laboratory
- Pål Halvorsen, Simula Research Laboratory, Oslo, Norway and University of Oslo

ICPP Workshops '18: Workshop Proceedings of the 47th International Conference on Parallel Processing, August 2018, Article No.: 48, Pages 1–10. https://doi.org/10.1145/3229710.3229759

Published: 13 August 2018

ABSTRACT

Processing workloads may have very high I/O demands, exceeding the capabilities provided by resource virtualization and requiring direct access to the physical hardware. For computers that are interconnected in PCI Express (PCIe) networks, we have previously proposed Device Lending as a solution for assigning devices to remote hosts. In this paper, we explain how we have extended our implementation with support for the Linux Kernel-based Virtual Machine (KVM) hypervisor. With our extended Device Lending, physical remote devices can be dynamically "passed through" to VM guests while retaining the flexibility of virtualization, something that previously required extensive paravirtualization support in both the hypervisor and device drivers.

We have also improved our original implementation with support for interoperability between remote devices. We show that it is possible to use multiple devices residing in different hosts, while still achieving the same bandwidth and latency as native PCIe, and without requiring any additional support in device drivers.
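
On Linux, passing a PCIe device through to a KVM guest is typically done with VFIO, which assigns devices at the granularity of IOMMU groups. The following is a minimal sketch, not taken from the paper and not part of the authors' Device Lending implementation, that groups PCI devices by IOMMU group using the standard sysfs layout. The function name `iommu_groups` and the `sysfs_root` parameter are illustrative assumptions; the parameter exists only so the function can be pointed at a test directory instead of `/sys`.

```python
import os
from collections import defaultdict


def iommu_groups(sysfs_root="/sys"):
    """Return {iommu_group_name: [pci_address, ...]} for devices under sysfs_root.

    Assumes the usual Linux sysfs layout, where each entry in
    bus/pci/devices/<address>/ has an `iommu_group` symlink when the
    IOMMU is enabled. Devices without that symlink are skipped.
    """
    devices_dir = os.path.join(sysfs_root, "bus", "pci", "devices")
    if not os.path.isdir(devices_dir):
        return {}

    groups = defaultdict(list)
    for address in sorted(os.listdir(devices_dir)):
        link = os.path.join(devices_dir, address, "iommu_group")
        if not os.path.islink(link):
            continue  # IOMMU disabled, or device not placed in a group
        # The symlink target's final path component is the group name.
        group = os.path.basename(os.path.realpath(link))
        groups[group].append(address)
    return dict(groups)


if __name__ == "__main__":
    for group, addresses in sorted(iommu_groups().items()):
        print(f"group {group}: {', '.join(addresses)}")
```

Devices that share a group generally have to be assigned to a guest together, so a listing like this is a common first check before attempting pass-through on a local host.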

Index Terms

Flexible Device Sharing in PCIe Clusters using Device Lending
1. Computer systems organization
  1. Architectures
    1. Distributed architectures
      1. Cloud computing
    2. Parallel architectures
      1. Interconnection architectures


Published in

ICPP Workshops '18: Workshop Proceedings of the 47th International Conference on Parallel Processing
August 2018
409 pages
ISBN: 9781450365239
DOI: 10.1145/3229710

Copyright © 2018 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
IOMMU
PCIe
Resource sharing
data access
networked resources
non-transparent bridging
resource allocation
virtualization
Qualifiers
- research-article
- Research
- Refereed limited
Acceptance Rates
Overall Acceptance Rate: 91 of 313 submissions, 29%
Article Metrics
- Total Citations: 10
- Total Downloads: 712
- Downloads (Last 12 months): 140
- Downloads (Last 6 weeks): 22