References
[1] 涂子沛.数据之巅:大数据革命,历史、现实与未来[M].北京:中信出版社,2014.
[2] WU X D, ZHU X Q, WU G Q, et al.Data Mining with Big Data[J].IEEE Transactions on Knowledge and Data Engineering,2014,26:97-107.
[3] QUINLAN J R.C4.5:Programs for machine learning[M].San Mateo:Morgan Kaufmann Publishers,1993.
[4] HART P.The condensed nearest neighbor rule[J].IEEE Transactions on Information Theory,1968,14:515-516.
[5] DOMINGOS P, PAZZANI M.On the optimality of the simple Bayesian classifier under zero-one loss[J].Machine Learning,1997,29:103-130.
[6] RUMELHART D E, HINTON G E, WILLIAMS R J.Learning representations by back-propagating errors[J].Nature,1986,323:533-536.
[7] PRESS S J, WILSON S.Choosing between logistic regression and discriminant analysis[J].Journal of the American Statistical Association,1978,73(364):699-705.
[8] VAPNIK V.The nature of statistical learning theory[M].New York:Springer,1995.
[9] HUANG G B, ZHOU H, DING X, et al.Extreme learning machine for regression and multiclass classification[J].IEEE Transactions on Systems, Man, and Cybernetics, Part B:Cybernetics,2012,42:513-529.
[10] HE H, GARCIA E A.Learning from imbalanced data[J].IEEE Transactions on Knowledge and Data Engineering,2009,21(9):1263-1284.
[11] SUN Y M, WONG A K C, KAMEL M S.Classification of imbalanced data:a review[J].International Journal of Pattern Recognition and Artificial Intelligence,2009,23(4):687-719.
[12] YU H, SUN C, YANG W, et al.A review of class imbalance learning methods in Bioinformatics[J].Current Bioinformatics,2015,10:360-369.
[13] JAPKOWICZ N.Workshop on Learning from Imbalanced Data Sets[C].Proceedings of the 17th American Association for Artificial Intelligence, Austin, Texas, USA,2000.
[14] CHAWLA N V, JAPKOWICZ N, KOLCZ A.Workshop on Learning from Imbalanced Data Sets II[C].Proceedings of the 20th International Conference of Machine Learning, Washington, USA,2003.
[15] CHAWLA N V, JAPKOWICZ N, KOLCZ A.Editorial:Special Issue on Learning from Imbalanced Data Sets[J].ACM SIGKDD Explorations Newsletter,2004,6:1-6.
[16] CHAWLA N V, JAPKOWICZ N, ZHOU Z H.Workshop on Data Mining When Classes are Imbalanced and Errors Have Costs[C].Proceedings of the 13th Pacific-Asia Knowledge Discovery and Data Mining Conference, Bangkok, Thailand,2009.
[17] YANG Q, WU X.10 challenging problems in data mining research[J].International Journal of Information Technology and Decision Making,2006,5(4):597-604.
[18] MENA L, GONZALEZ J A.Symbolic one-class learning from imbalanced datasets:application in medical diagnosis[J].International Journal on Artificial Intelligence Tools,2009,18:273-309.
[19] WANG S, YAO X.Multi-class Imbalance Problems:Analysis and Potential Solutions[J].IEEE Transactions on Systems, Man, and Cybernetics, Part B:Cybernetics,2012,42(4):1119-1130.
[20] LIN M, TANG K, YAO X.Dynamic sampling approach to training neural networks for multiclass imbalance classification[J].IEEE Transactions on Neural Networks and Learning Systems,2013,24(4):647-660.
[21] TANG Y, ZHANG Y Q, CHAWLA N V, et al.SVMs modeling for highly imbalanced classification[J].IEEE Transactions on Systems, Man, and Cybernetics, Part B:Cybernetics,2009,39(1):281-288.
[22] JAPKOWICZ N, STEPHEN S.The class imbalance problem:A systematic study[J]. Intelligent Data Analysis,2002,6(5):429-450.
[23] JO T, JAPKOWICZ N.Class imbalances versus small disjuncts[J].ACM SIGKDD Explorations Newsletter,2004,6(1):40-49.
[24] ELKAN C.The foundations of cost-sensitive learning[C].Proceedings of the 17th International Joint Conference of Artificial Intelligence, Seattle, Washington, USA, 2001:973-978.
[25] LING C, LI C.Data mining for direct marketing:problems and solutions[C].Proceedings of the 4th ACM International Conference of Knowledge Discovery and Data Mining,1998:73-79.
[26] CHAWLA N V, BOWYER K W, HALL L O, et al.SMOTE:Synthetic Minority Over-Sampling Technique[J].Journal of Artificial Intelligence Research,2002,16:321-357.
[27] HAN H, WANG W Y, MAO B H.Borderline-SMOTE:A New Over-Sampling Method in Imbalanced Data Sets Learning[C].Proceedings of the 2005 International Conference of Intelligent Computing, Hefei, China,2005:878-887.
[28] KUBAT M, MATWIN S.Addressing the Curse of Imbalanced Training Sets:One-Sided Selection[C].Proceedings of the 14th International Conference of Machine Learning, Nashville, Tennessee, USA,1997:179-186.
[29] HE H, BAI Y, GARCIA E A, et al.ADASYN:Adaptive Synthetic Sampling Approach for Imbalanced Learning[C].Proceedings of the 2008 International Joint Conference of Neural Networks, Hong Kong, China,2008:1322-1328.
[30] YEN S J, LEE Y S.Cluster-based under-sampling approaches for imbalanced data distributions[J].Expert Systems with Applications,2009,36(3):5718-5727.
[31] YU H, NI J, ZHAO J.ACOSampling:An ant colony optimization-based undersampling method for classifying imbalanced DNA microarray data[J].Neurocomputing,2013, 101:309-318.
[32] ZHANG H, LI M.RWO-Sampling:A random walk over-sampling approach to imbalanced data classification[J].Information Fusion,2014,20:99-116.
[33] DAS B, KRISHNAN N C, COOK D J.RACOG and wRACOG:Two Probabilistic Oversampling Techniques[J].IEEE Transactions on Knowledge and Data Engineering, 2015,27:222-234.
[34] ZHOU Z H, LIU X Y.On Multi-class Cost-Sensitive Learning[J].Computational Intelligence,2010,26(3):232-257.
[35] DRUMMOND C, HOLTE R C.Exploiting the Cost (In)sensitivity of Decision Tree Splitting Criteria[C].Proceedings of the 17th International Conference of Machine Learning, Stanford, CA, USA,2000:239-246.
[36] VEROPOULOS K, CAMPBELL C, CRISTIANINI N.Controlling the sensitivity of support vector machines[C].Proceedings of the International Joint Conference of Artificial Intelligence,1999:55-60.
[37] ZONG W, HUANG G B, CHEN Y.Weighted extreme learning machine for imbalance learning[J].Neurocomputing,2013,101:229-242.
[38] BATUWITA R, PALADE V.FSVM-CIL:Fuzzy Support Vector Machines for Class Imbalance Learning[J].IEEE Transactions on Fuzzy Systems,2010,18:558-571.
[39] 于化龙,祁云嵩,杨习贝,等.类不平衡模糊加权极限学习机算法研究[J].计算机科学与探索,2016, doi:10.3778/j.issn.1673-9418.1603094,1-13.
[40] SUN Y, KAMEL M S, WONG A K C, et al.Cost-Sensitive Boosting for Classification of Imbalanced Data[J].Pattern Recognition,2007,40(12):3358-3378.
[41] FAN W, STOLFO S J, ZHANG J, et al.AdaCost:Misclassification Cost-Sensitive Boosting[C].Proceedings of the 16th International Conference of Machine Learning, Bled, Slovenia,1999:97-105.
[42] DOMINGOS P.MetaCost:A General Method for Making Classifiers Cost-Sensitive[C].Proceedings of the 5th ACM SIGKDD International Conference of Knowledge Discovery and Data Mining, San Diego, CA, USA,1999:155-164.
[43] ZHOU Z H, LIU X Y.Training cost-sensitive neural networks with methods addressing the class imbalance problem[J].IEEE Transactions on Knowledge and Data Engineering,2006,18:63-77.
[44] LIN W J, CHEN J J.Class-imbalanced classifiers for high-dimensional data[J]. Briefings in Bioinformatics,2013,14:13-26.
[45] YU H, MU C, SUN C, et al.Support Vector Machine-Based Optimized Decision Threshold Adjustment Strategy for Classifying Imbalanced Data[J].Knowledge-Based Systems,2015,76:67-78.
[46] YU H, SUN C, YANG X, et al.ODOC-ELM:Optimal decision outputs compensation-based extreme learning machine for classifying imbalanced data[J].Knowledge-Based Systems,2016,92:55-70.
[47] TAO D, TANG X, LI X, et al.Asymmetric Bagging and Random Subspace for Support Vector Machines-Based Relevance Feedback in Image Retrieval[J].IEEE Transactions on Pattern Analysis and Machine Intelligence,2006,28(7):1088-1099.
[48] SUN Z, SONG Q, ZHU X, et al.A novel ensemble method for classifying imbalanced data[J].Pattern Recognition,2015,48:1623-1637.
[49] YU H, NI J.An Improved Ensemble Learning Method for Classifying High-dimensional and Imbalanced Biomedicine Data[J].IEEE/ACM Transactions on Computational Biology and Bioinformatics,2014,11:657-666.
[50] CHAWLA N V, LAZAREVIC A, HALL L O, et al.SMOTEBoost:Improving prediction of the minority class in boosting[C].Proceedings of 7th European Conference on Principles and Practice of Knowledge Discovery in Databases,2003:107-119.
[51] SEIFFERT C, KHOSHGOFTAAR T M, VAN HULSE J, et al.RUSBoost:a hybrid approach to alleviating class imbalance[J].IEEE Transactions on Systems, Man, and Cybernetics, Part A:Systems and Humans,2010,40:185-197.
[52] LIU X Y, WU J, ZHOU Z H.Exploratory Undersampling for Class-Imbalance Learning[J].IEEE Transactions on Systems, Man, and Cybernetics, Part B:Cybernetics,2009,39:539-550.
[53] KHOSHGOFTAAR T M, GOLAWALA M, VAN HULSE J.An empirical study of learning from imbalanced data using random forest[C].Proceedings of the 19th IEEE International Conference on Tools with Artificial Intelligence, IEEE Press,2007:310-317.
[54] KHALILIA M, CHAKRABORTY S, POPESCU M.Predicting disease risks from highly imbalanced data using random forest[J].BMC Medical Informatics and Decision Making,2011,11(1):1.
[55] DIEZ-PASTOR J F, RODRIGUEZ J J, GARCIA-OSORIO C I, et al.Diversity techniques improve the performance of the best imbalance learning ensembles[J].Information Sciences,2015,325:98-117.
[56] ERTEKIN S, HUANG J, GILES C L.Active learning for class imbalance problem[C].Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval, ACM Press,2007:823-824.
[57] ERTEKIN S, HUANG J, BOTTOU L, et al.Learning on the border:active learning in imbalanced data classification[C].Proceedings of the sixteenth ACM conference on information and knowledge management, ACM Press,2007:127-136.
[58] TOMANEK K, HAHN U.Reducing class imbalance during active learning for named entity annotation[C].Proceedings of the fifth international conference on Knowledge capture, ACM Press,2009:105-112.
[59] 潘志松,陈斌,缪志敏,等.One-Class分类器研究[J].电子学报,2009,37:2496-2503.
[60] MENA L, GONZALEZ J A.Symbolic one-class learning from imbalanced datasets:applications in medical diagnosis[J].International Journal on Artificial Intelligence Tools,2009,18:273-309.
[61] MALDONADO S, MONTECINOS C.Robust classification of imbalanced data using one-class and two-class SVM-based multiclassifiers[J].Intelligent Data Analysis, 2014,18:95-112.
[62] THOMAS C.Improving intrusion detection for imbalanced network traffic[J]. Security and Communication Networks,2013,6(3):309-324.
[63] WEI W, LI J, CAO L, et al.Effective detection of sophisticated online banking fraud on extremely imbalanced data[J].World Wide Web,2013,16:449-475.
[64] LOY C C, XIANG T, GONG S.Stream-based Active Unusual Event Detection[C].Proceedings of the 10th Asian Conference on Computer Vision, Queenstown, New Zealand,2010:161-175.
[65] TANG Y, KRASSER S, JUDGE P.Fast and Effective Spam Sender Detection with Granular SVM on Highly Imbalanced Mail Server Behavior Data[C].Proceedings of the 2nd IEEE International Conference on Collaborative Computing:Networking, Applications and Worksharing, IEEE Press,2006:1-6.
[66] LIU Y, LOH H T, SUN A.Imbalanced text classification:A term weighting approach[J].Expert Systems with Applications,2009,36(1):690-701.
[67] GARCIA-PEDRAJAS N, PEREZ-RODRIGUEZ J, GARCIA-PEDRAJAS M, et al.Class imbalance methods for translation initiation site recognition in DNA sequences[J].Knowledge-Based Systems,2012,25:22-34.
[68] WANG S, YAO X.Using class imbalance learning for software defect prediction[J]. IEEE Transactions on Reliability,2013,62(2):434-443.
[69] SEIFFERT C, KHOSHGOFTAAR T M, VAN HULSE J.Improving software-quality predictions with data sampling and boosting[J].IEEE Transactions on Systems, Man, and Cybernetics, Part A:Systems and Humans,2009,39(6):1283-1294.
[70] SUN T.Imbalanced Hyperspectral Image Classification Based on Maximum Margin[J].IEEE Geoscience and Remote Sensing Letters,2015,12(3):522-526.