ABSTRACT
Despite major advances in the engineering of maintainable and robust software over the years, upgrading software remains a primitive and error-prone activity. In this paper, we argue that several problems with upgrading software are caused by a poor integration between upgrade deployment, user-machine testing, and problem reporting. To support this argument, we present a characterization of software upgrades resulting from a survey we conducted of 50 system administrators. Motivated by the survey results, we present Mirage, a distributed framework for integrating upgrade deployment, user-machine testing, and problem reporting into the overall upgrade development process. Our evaluation focuses on the most novel aspect of Mirage, namely its staged upgrade deployment based on the clustering of user machines according to their environments and configurations. Our results suggest that Mirage's staged deployment is effective for real upgrade problems.
Staged deployment in mirage, an integrated software upgrade testing and distribution system
SOSP '07