Abstract
Machine learning and data analytics tasks in practice require several consecutive processing steps. RapidMiner is a widely used software tool for the development and execution of such analytics workflows. Unlike many other algorithm toolkits, it comprises a visual editor that allows the user to design processes on a conceptual level. This conceptual and visual approach helps the user to abstract from the technical details during the development phase and to retain a focus on the core modeling task. However, the large set of preimplemented data analysis and machine learning operations available in the tool, together with their logical dependencies, can be overwhelming, particularly for novice users.
In this work, we present an add-on to the RapidMiner framework that supports the user during the modeling phase by recommending additional operations to insert into the currently developed machine learning workflow. First, we propose different recommendation techniques and evaluate them in an offline setting using a pool of several thousand existing workflows. Second, we present the results of a laboratory study, which show that our tool helps users to significantly increase the efficiency of the modeling process. Finally, we report on analyses using data that were collected during the real-world deployment of the plug-in component and compare the results of the live deployment of the tool with the results obtained through an offline analysis and a replay simulation.
Index Terms
- Supporting the Design of Machine Learning Workflows with a Recommendation System