# --------------------------------------------
# CITATION file created with {cffr} R package
# See also: https://docs.ropensci.org/cffr/
# --------------------------------------------

cff-version: 1.2.0
message: 'To cite package "ForestDisc" in publications use:'
type: software
license: GPL-3.0-or-later
title: 'ForestDisc: Forest Discretization'
version: 0.1.0
doi: 10.32614/CRAN.package.ForestDisc
abstract: Supervised, multivariate, and non-parametric discretization algorithm based
  on tree ensembles learning and moment matching optimization. This version of the
  algorithm relies on the random forest algorithm to learn a large set of split points
  that conserves the relationship between attributes and the target class, and on
  moment matching optimization to transform this set into a reduced number of cut
  points matching as well as possible statistical properties of the initial set of
  split points. For each attribute to be discretized, the set S of its related split
  points extracted through random forest is mapped to a reduced set C of cut points
  of size k. This mapping relies on minimizing, for each continuous attribute to be
  discretized, the distance between the four first moments of S and the four first
  moments of C subject to some constraints. This non-linear optimization problem is
  solved using k values ranging from 2 to 'max_splits', and the best solution returned
  corresponds to the value k whose optimum solution is the lowest one over the different
  realizations. ForestDisc is a generalization of the RFDisc discretization method
  initially proposed by Berrado and Runger (2009), and improved by Berrado et al.
  in 2012 by adopting the idea of moment matching optimization related by Hoyland
  and Wallace (2001).
authors:
- family-names: Maïssae
  given-names: Haddouchi
  email: maissaem7@gmail.com
repository: https://hmais.r-universe.dev
commit: ea960566ba7b25142c37c3ae0b444b5dbecc5a71
date-released: '2020-03-19'
contact:
- family-names: Maïssae
  given-names: Haddouchi
  email: maissaem7@gmail.com