ABSTRACT
Network intrusions have become a significant threat to network servers and their availability. A single intrusion can suspend an organization's network services and lead to a financial disaster. In this paper, we propose a framework called TimeVM to mitigate, or even eliminate, the infection caused by a network intrusion online and as quickly as possible. The framework is based on virtual machine technology and traffic-replay-based recovery. TimeVM gives the illusion of a "time machine": it logs only the network traffic to a server and replays the logged traffic to multiple "shadow" virtual machines (Shadow VMs) after different time delays (time lags). Consequently, each Shadow VM represents the server at a different point in its history. When an attack or infection is detected, TimeVM enables navigating through the traffic history (logs), picking an uninfected Shadow VM, removing the attack traffic, and then fast-replaying the entire traffic history to this Shadow VM. As a result, an up-to-date, uninfected version of the original system can be constructed.
The paper presents the implementation details of TimeVM. It also addresses many practical challenges in configuring and deploying TimeVM in a system so as to minimize the recovery time. We present an analytical framework and an extensive evaluation to validate our approach in different environments.
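The multi-time-lag replay idea described in the abstract can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the TimeVM implementation: the names `Packet`, `ShadowVM`, `recover`, and `catch_up_time` are hypothetical, a VM's state is modeled simply as the list of payloads it has processed, and the catch-up estimate assumes replay runs at a constant multiple of real time while new traffic keeps arriving.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Packet:
    timestamp: float  # arrival time at the live server, in seconds
    payload: str


@dataclass
class ShadowVM:
    """Hypothetical stand-in for a Shadow VM that trails the server by `lag` seconds."""
    lag: float
    cursor: int = 0  # index of the next log entry to replay
    state: List[str] = field(default_factory=list)  # toy model of VM state

    def replay(
        self,
        log: List[Packet],
        now: float,
        skip: Callable[[Packet], bool] = lambda p: False,
        ignore_lag: bool = False,
    ) -> None:
        """Replay logged packets up to `now - lag` (or up to `now` when fast-replaying)."""
        horizon = now if ignore_lag else now - self.lag
        while self.cursor < len(log) and log[self.cursor].timestamp <= horizon:
            pkt = log[self.cursor]
            self.cursor += 1
            if not skip(pkt):
                self.state.append(pkt.payload)


def recover(
    shadows: List[ShadowVM],
    log: List[Packet],
    now: float,
    is_attack: Callable[[Packet], bool],
) -> Optional[ShadowVM]:
    """Pick the freshest shadow that has not replayed attack traffic, then
    fast-replay the rest of the log to it with the attack packets removed."""
    first_attack = min(
        (p.timestamp for p in log if is_attack(p)), default=float("inf")
    )
    clean = [
        vm for vm in shadows
        if vm.cursor == 0 or log[vm.cursor - 1].timestamp < first_attack
    ]
    if not clean:
        return None  # every shadow has already consumed the attack traffic
    best = min(clean, key=lambda vm: vm.lag)  # smallest lag = least to replay
    best.replay(log, now, skip=is_attack, ignore_lag=True)
    return best


def catch_up_time(lag: float, speedup: float) -> float:
    """Seconds needed to catch up, assuming replay runs `speedup` times faster
    than real time: speedup * T = lag + T, hence T = lag / (speedup - 1)."""
    if speedup <= 1:
        raise ValueError("replay must be faster than real time to catch up")
    return lag / (speedup - 1)
```

In this sketch, shadows with small lags need less fast-replay but are more likely to have consumed the attack before it is detected, while shadows with large lags stay clean longer at the cost of a longer catch-up. That trade-off is one way to read the configuration problem the abstract mentions.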
Index Terms
- TimeVM: a framework for online intrusion mitigation and fast recovery using multi-time-lag traffic replay