Graph based multi-modality learning
Source: International Multimedia Conference
Proceedings of the 13th annual ACM international conference on Multimedia
Hilton, Singapore
SESSION: Content 6: multimodal processing
Pages: 862-871
Year of Publication: 2005
ISBN: 1-59593-044-2
Authors
Hanghang Tong, Tsinghua University, Beijing, China
Jingrui He, Tsinghua University, Beijing, China
Mingjing Li, Microsoft Research Asia, Beijing, China
Changshui Zhang, Tsinghua University, Beijing, China
Wei-Ying Ma, Microsoft Research Asia, Beijing, China
Sponsors
ACM: Association for Computing Machinery
SIGGRAPH: ACM Special Interest Group on Computer Graphics and Interactive Techniques
SIGMULTIMEDIA: ACM Special Interest Group on Multimedia
Publisher
ACM, New York, NY, USA
DOI: http://doi.acm.org/10.1145/1101149.1101337

ABSTRACT

To better understand multimedia content, considerable research effort has gone into learning from multi-modal features. This paper studies the problem from a graph point of view: the features from each modality are represented as one independent graph, and the learning task is formulated as inference from the constraints in every graph together with supervision information (if available). For semi-supervised learning, two fusion schemes are proposed: a linear form and a sequential form. Each scheme is derived from an optimization point of view and further justified from two sides: similarity propagation and Bayesian interpretation. These three views reveal, respectively, the regularized optimization nature, the transductive learning nature, and the prior fusion nature of the proposed schemes. Moreover, the proposed method extends easily to unsupervised learning, including clustering and embedding. Systematic experimental results validate the effectiveness of the proposed method.
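
The abstract does not give the objective functions, so the following is only a minimal sketch of how linear and sequential fusion over per-modality graphs might look, not the paper's formulation. It assumes a Gaussian affinity graph per modality, symmetric normalization, and the familiar label-propagation iteration f <- alpha * S f + (1 - alpha) * y. The function names, the fusion weights, sigma, alpha, and the toy data are all hypothetical.

```python
import math


def normalized_affinity(points, sigma=1.0):
    """Build one modality's graph: Gaussian affinities with a zero
    diagonal, normalized as D^(-1/2) W D^(-1/2)."""
    n = len(points)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
                w[i][j] = math.exp(-d2 / (2.0 * sigma ** 2))
    deg = [sum(row) for row in w]
    return [[w[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def propagate(s, y, alpha=0.9, iters=200):
    """Assumed propagation rule: f <- alpha * S f + (1 - alpha) * y."""
    n = len(y)
    f = list(y)
    for _ in range(iters):
        f = [alpha * sum(s[i][j] * f[j] for j in range(n)) + (1 - alpha) * y[i]
             for i in range(n)]
    return f


def linear_fusion(graphs, weights, y, alpha=0.9):
    """Linear form (sketch): propagate over a weighted sum of the graphs."""
    n = len(y)
    s = [[sum(wk * g[i][j] for wk, g in zip(weights, graphs))
          for j in range(n)] for i in range(n)]
    return propagate(s, y, alpha)


def sequential_fusion(graphs, y, alpha=0.9):
    """Sequential form (sketch): the scores from one graph become the
    initial labels for propagation over the next graph."""
    f = list(y)
    for g in graphs:
        f = propagate(g, f, alpha)
    return f


# Hypothetical toy data: six items described by two modalities.
modality_a = [(0.0,), (0.2,), (0.4,), (5.0,), (5.2,), (5.4,)]
modality_b = [(0.0, 1.0), (0.3, 0.8), (0.1, 1.2),
              (4.0, 4.1), (4.2, 3.9), (3.9, 4.3)]
labels = [1.0, 0.0, 0.0, 0.0, 0.0, -1.0]  # one labeled item per class

graph_a = normalized_affinity(modality_a)
graph_b = normalized_affinity(modality_b)

f_lin = linear_fusion([graph_a, graph_b], [0.5, 0.5], labels)
f_seq = sequential_fusion([graph_a, graph_b], labels)
print([round(v, 4) for v in f_lin])
print([round(v, 4) for v in f_seq])
```

The sign of each score gives the predicted class for the unlabeled items. In this sketch the linear form mixes the graphs before any propagation, while the sequential form lets each modality refine the previous modality's scores, so the order of the graphs matters there.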



