ABSTRACT
A 0--1 matrix has a banded structure if both rows and columns can be permuted so that the non-zero entries exhibit a staircase pattern of overlapping rows. The concept of banded matrices has its origins in numerical analysis, where entries can be viewed as descriptions between the problem variables; the bandedness corresponds to variables that are coupled over short distances. Banded data occurs also in other applications, for example in the physical mapping problem of the human genome, in paleontological data, in network data and in the discovery of overlapping communities without cycles.
We study in this paper the banded structure of binary matrices, give a formal definition of the concept and discuss its theoretical properties. We consider the algorithmic problems of computing how far a matrix is from being banded, and of finding a good submatrix of the original data that exhibits approximate bandedness. Finally, we show by experiments on real data from ecology and other applications the usefulness of the concept. Our results reveal that bands exist in real datasets and that the final obtained ordering of rows and columns have natural interpretations.
- R. Agrawal, T. Imielinski, and A. Swami. Mining association rules between sets of items in large databases. In SIGMOD '93, pages 207--216, 1993. Google ScholarDigital Library
- F. Alizadeh, R. M. Karp, L. A. Newberg, and D. K. Weisser. Physical mapping of chromosomes: A combinatorial problem in molecular biology. Algorithmica, 13(1/2):52--76, 1995.Google ScholarCross Ref
- J. Atkins, E. Boman, and B. Hendrickson. A spectral algorithm for seriation and the consecutive ones problem. SIAM J. Comput., 28(1):297--310, 1999. Google ScholarDigital Library
- C. Aykanat, A. Pinar, and U. Çatalyürek. Permuting sparse rectangular matrices into block-diagonal form. SIAM J. Sci. Comput., 25(6):1860--1879, 2004. Google ScholarDigital Library
- R. Baeza-Yates and B. Ribeiro-Neto. Modern Information Retrieval. Addison Wesley, 1999. Google ScholarDigital Library
- A. Banerjee, C. Krumpelman, J. Ghosh, S. Basu, and R. Mooney. Model-based overlapping clustering. In KDD '05, pages 532--537, 2005. Google ScholarDigital Library
- K. S. Booth. PQ-tree algorithms. PhD thesis, 1975. Google ScholarDigital Library
- P. Burzyn, F. Bonomo, and G. Durán. NP-completeness results for edge modification problems. Disc. Appl. Math., 154(13):1824--1844, 2006. Google ScholarDigital Library
- T. Cormen, C. Leiserson, and R. Rivest. Introduction to algorithms. MIT Press and McGraw-Hill, 1990. Google ScholarDigital Library
- E. Cuthill and J. McKee. Reducing the bandwidth of sparse symmetric matrices. In Proceedings of the 1969 24th national conference, pages 157--172, 1969. Google ScholarDigital Library
- M. Fortelius. Neogene of the old world database of fossil mammals (NOW). http://www.helsinki.fi/science/now/, 2008.Google Scholar
- K. K. Puolam¨aki, M. Fortelius, and H. Mannila. Seriation in paleontological data using Markov chain Monte Carlo methods. PLoS Comput Biol, 2, 2006.Google Scholar
- I.-J. Lin and D. B. West. Interval digraphs that are indifference digraphs. In Graph theory, Combinatorics, and Algorithms, pages 751--765, 1992.Google Scholar
- H. Mannila and E. Terzi. Nestedness and segmented nestedness. In KDD '07, pages 480--489, 2007. Google ScholarDigital Library
- R. M. McConnell. A certifying algorithm for the consecutive-ones property. In SODA '04, pages 768--777, 2004. Google ScholarDigital Library
- S. Myllykangas, J. Himberg, T. Böhling, B. Nagy, J. Hollm´en, and S. Knuutila. Dna copy number amplification profiling of human neoplasms. Oncogene, 25:7324--7332, 2006.Google ScholarCross Ref
- C. H. Papadimitriou. The NP-completeness of the bandwidth minimization problem. Computing, 16:263--270, 1976.Google ScholarCross Ref
- F. S. Roberts. Indifference graphs. In Proof Techniques in Graph Theory, pages 139--146, 1969.Google Scholar
- R. Rosen. Matrix bandwidth minimization. In ACM national conference, pages 585--595, 1968. Google ScholarDigital Library
- M. Sen and B. K. Sanyal. Indifference digraphs: A generalization of indifference graphs and semiorders. SIAM J. Discret. Math., 7(2):157--165, 1994. Google ScholarDigital Library
- A. Tucker. A structure theorem for the consecutive 1's property. Journal of Combinatorial Theory, Series B, 12(2):153--162, 1972.Google ScholarCross Ref
Index Terms
- Banded structure in binary matrices
Recommendations
Banded structure in binary matrices
A binary matrix has a banded structure if both rows and columns can be permuted so that the non-zero entries exhibit a staircase pattern of overlapping rows. The concept of banded matrices has its origins in numerical analysis, where entries can be ...
The Snap-Back Pivoting Method for Symmetric Banded Indefinite Matrices
The four existing stable factorization methods for symmetric indefinite matrices suffer serious defects when applied to banded matrices. Partial pivoting (row or column exchanges) maintains a band structure in the reduced matrix and the factors, but ...
A fast algorithm for solving banded Toeplitz systems
A fast algorithm for solving systems of linear equations with banded Toeplitz matrices is presented. This new approach is based on extending the given matrix with several rows on the top and several columns on the right and to assign zeros and some ...
Comments