ABSTRACT
Programmable data planes promise unprecedented flexibility and innovation. However, significant management issues arise when these programmable data planes, and the in-network compute functionality they enable, are deployed in production networks. In this paper, we present an overview of these management challenges and explore the limitations of existing management techniques. Finally, we propose a system, Harmony, that encapsulates new abstractions and primitives to address these problems.
Index Terms
- In-Network Compute: Considered Armed and Dangerous