Long-form survey on utility mining published online in IEEE TKDE, a top data mining journal
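A long-form survey on utility mining written by Dr. Wensheng Gan of Harbin Institute of Technology (Shenzhen) was recently published online, after 18 months of peer review, in IEEE Transactions on Knowledge and Data Engineering (SCI, IF: 3.438, CCF A), a top journal in databases, data mining, and related fields. DOI: 10.1109/TKDE.2019.2942594. Harbin Institute of Technology (Shenzhen) is the first-author affiliation of the paper. The contributors include Wensheng Gan of HIT Shenzhen; Prof. Jerry Chun-Wei Lin of Western Norway University of Applied Sciences (formerly an associate professor at HIT Shenzhen, who left in June 2018); Prof. Philippe Fournier-Viger of HIT Shenzhen; Prof. Han-Chieh Chao of National Dong Hwa University in Taiwan; and Prof. Philip S. Yu of the University of Illinois at Chicago, among others. The survey covers utility-oriented pattern mining in detail: its research background and significance, application cases, classic research problems, the taxonomy and principles of its algorithms, and the current state of research, offering a thorough review, an explanation of the underlying principles, an analysis of the state of the art, and a summary. It is the first long-form survey with Harbin Institute of Technology (Shenzhen) as the first-author affiliation to appear in IEEE TKDE since the journal was founded in 1989.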
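IEEE Transactions on Knowledge and Data Engineering (IEEE TKDE; SCI, IF: 3.438, CCF A) is among the most influential international journals in databases, data mining, and related fields, and is a CCF class A journal. The China Computer Federation (CCF) ranks IEEE TKDE as one of the four class A international journals in the database / data mining / content retrieval area. "Class A refers to the very small number of top international journals and conferences, in which Chinese scholars are encouraged to make breakthroughs." The journal publishes 12 issues per year, with about 200 articles in total.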
Paper title: A survey of utility-oriented pattern mining
Article link: https://ieeexplore.ieee.org/document/8845637
Authors: Wensheng Gan, Jerry Chun-Wei Lin*, Philippe Fournier-Viger, Han-Chieh Chao, Vincent S. Tseng, Philip S. Yu
Abstract:
The main purpose of data mining and analytics is to find novel, potentially useful patterns that can be utilized in real-world applications to derive beneficial knowledge. For identifying and evaluating the usefulness of different kinds of patterns, many techniques and constraints have been proposed, such as support, confidence, sequence order, and utility parameters (e.g., weight, price, profit, quantity, satisfaction, etc.). In recent years, there has been an increasing demand for utility-oriented pattern mining (UPM, or called utility mining). UPM is a vital task, with numerous high-impact applications, including cross-marketing, e-commerce, finance, medical, and biomedical applications. This survey aims to provide a general, comprehensive, and structured overview of the state-of-the-art methods of UPM. First, we introduce an in-depth understanding of UPM, including concepts, examples, and comparisons with related concepts. A taxonomy of the most common and state-of-the-art approaches for mining different kinds of high-utility patterns is presented in detail, including Apriori-based, tree-based, projection-based, vertical-/horizontal-data-format-based, and other hybrid approaches. A comprehensive review of advanced topics of existing high-utility pattern mining techniques is offered, with a discussion of their pros and cons. Finally, we present several well-known open-source software packages for UPM. We conclude our survey with a discussion on open and practical challenges in this field.