ABSTRACT
The booming fashion sector and its commercial potential have recently attracted considerable attention from many research communities. In particular, increasing research effort has been devoted to complementary clothing matching, since putting together a suitable outfit is a daily headache for many people, especially those without a strong sense of aesthetics. Thanks to the remarkable success of neural networks in applications such as image classification and speech recognition, researchers can now adopt data-driven learning methods to analyze fashion items. Nevertheless, existing studies overlook the rich and valuable knowledge (rules) accumulated in the fashion domain, especially the rules regarding clothing matching. To this end, in this work we study complementary clothing matching by integrating advanced deep neural networks with rich fashion domain knowledge. Considering that the rules can be fuzzy and that different rules may have different confidence levels for different samples, we present a neural compatibility modeling scheme with attentive knowledge distillation, built on the teacher-student network scheme. Extensive experiments on a real-world dataset show the superiority of our model over several state-of-the-art methods. Based on the comparisons, we observe certain fashion insights that can add value to the study of fashion matching. As a byproduct, we have released the code and the parameters involved to benefit other researchers.
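The idea of attentive knowledge distillation can be illustrated with a minimal sketch. It assumes a commonly used rule-distillation setup rather than the exact model of this work: a teacher distribution is obtained by down-weighting the student's outcomes that violate the rules, per-rule attention weights come from a softmax over relevance scores, and the student is trained on a mix of the label loss and the imitation loss. All function names, the two example rules, and the parameters `strength` and `imitation` are hypothetical.

```python
import math


def softmax(scores):
    """Turn unnormalized relevance scores into attention weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def teacher_distribution(student_probs, rule_scores, rule_logits, strength=1.0):
    """Build a rule-regularized teacher from the student's prediction.

    student_probs: student probabilities over the possible outcomes.
    rule_scores:   rule_scores[l][y] in [0, 1], how well outcome y satisfies rule l
                   (soft values allow for fuzzy rules).
    rule_logits:   per-sample relevance score of each rule (assumed to come from
                   an attention module, which is not modeled here).
    strength:      hypothetical regularization strength.
    """
    attention = softmax(rule_logits)
    unnormalized = []
    for y, p in enumerate(student_probs):
        # Attention-weighted penalty for violating the rules with outcome y.
        penalty = sum(a * (1.0 - r[y]) for a, r in zip(attention, rule_scores))
        unnormalized.append(p * math.exp(-strength * penalty))
    z = sum(unnormalized)
    return [u / z for u in unnormalized]


def cross_entropy(target, predicted, eps=1e-12):
    return -sum(t * math.log(max(p, eps)) for t, p in zip(target, predicted))


def distillation_loss(student_probs, label, teacher_probs, imitation=0.5):
    """Mix the ground-truth loss with the loss for imitating the teacher."""
    one_hot = [1.0 if y == label else 0.0 for y in range(len(student_probs))]
    hard = cross_entropy(one_hot, student_probs)
    soft = cross_entropy(teacher_probs, student_probs)
    return (1.0 - imitation) * hard + imitation * soft


# Two outcomes: 0 = "the items match", 1 = "the items do not match".
student = [0.5, 0.5]
rules = [[1.0, 0.0],   # hypothetical rule 1 favors outcome 0
         [0.0, 1.0]]   # hypothetical rule 2 favors outcome 1
teacher = teacher_distribution(student, rules, rule_logits=[2.0, 0.0])
loss = distillation_loss(student, label=0, teacher_probs=teacher)
```

In this toy case the first rule receives more attention, so the teacher shifts probability toward the outcome that rule favors. With equal attention the two opposing rules cancel out and the teacher coincides with the student.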