Abstract
An I/O-intensive application, parallel full-text retrieval based on a signature file method, is studied. The text retrieval system is implemented on a cluster of DEC5000 workstations connected by Ethernet. Experiments are performed to evaluate the benefits and costs of running such an application on a workstation cluster. Results show that a substantial improvement in speed can be obtained through parallelism in disk accesses, despite the high communication and synchronization overhead incurred. Several factors that affect the performance of a parallel I/O application in this type of computing environment are discussed. The advantages of a workstation cluster are its large combined I/O buffer capacity and the possibility of concurrent accesses to the disks local to each workstation. Our study demonstrates that these advantages, when exploited properly, can lead to effective performance improvement without the need for additional hardware.
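As a rough illustration of the kind of workload the abstract describes, the following minimal sketch shows a superimposed-coding signature file partitioned across workers. It is not the paper's implementation: the names `word_bits`, `signature`, `build_partitions`, `search_partition`, and `parallel_search`, the signature width, and the hash choice are all assumptions made for this example, and threads on one machine stand in for workstations scanning their local disks.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

SIG_BITS = 256      # signature width in bits (illustrative value)
BITS_PER_WORD = 3   # bit positions set per word (illustrative value)


def word_bits(word):
    """Return a bit mask with BITS_PER_WORD positions chosen by hashing the word."""
    mask = 0
    for i in range(BITS_PER_WORD):
        digest = hashlib.sha256(f"{i}:{word.lower()}".encode()).digest()
        mask |= 1 << (int.from_bytes(digest[:4], "big") % SIG_BITS)
    return mask


def signature(words):
    """Superimpose (bitwise OR) the masks of all words into one signature."""
    sig = 0
    for w in words:
        sig |= word_bits(w)
    return sig


def build_partitions(docs, n):
    """Distribute (doc_id, signature, text) records round-robin over n partitions."""
    partitions = [[] for _ in range(n)]
    for i, (doc_id, text) in enumerate(sorted(docs.items())):
        partitions[i % n].append((doc_id, signature(text.split()), text))
    return partitions


def search_partition(partition, query_words):
    """Scan one partition; each partition stands in for one workstation's local disk."""
    qsig = signature(query_words)
    hits = []
    for doc_id, sig, text in partition:
        if sig & qsig == qsig:  # signature match: candidate document
            words = set(text.lower().split())
            # Check the text itself to discard false drops.
            if all(w.lower() in words for w in query_words):
                hits.append(doc_id)
    return hits


def parallel_search(partitions, query_words):
    """Search all partitions concurrently and merge the per-partition results."""
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        results = pool.map(lambda p: search_partition(p, query_words), partitions)
        return sorted(doc_id for part in results for doc_id in part)
```

In this sketch each worker touches only its own partition, so the only coordination is the final merge. On a real cluster that merge would involve network communication, which is where the overhead mentioned in the abstract would arise.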
Index Terms
- Parallelizing I/O intensive applications for a workstation cluster: a case study