SM05: Enhanced feature selection for website classification

The goal of this study is to propose a heuristic upgrade to existing feature selection (FS) functions that would improve multi-class, single-label classification by exploiting information already available in the training dataset. The proposed FIP (Flat Inverse Particularity) function is based on the assumption that features with a small website-level frequency and a high page-level frequency within a single website represent undesired noise in the training set.
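The two frequencies named in this assumption can be illustrated with a minimal sketch. It is not the FIP function itself, whose formula is not given here; it only counts, for each feature, the number of websites that contain it and the largest share of pages it covers within any single website, then flags features that are rare across websites but common within one. The function names, the input layout (website name mapped to a list of tokenized pages), and both thresholds are assumptions made for illustration.

```python
from collections import defaultdict


def frequency_profile(sites):
    """Return {term: (site_frequency, max_page_ratio)}.

    sites: dict mapping a website name to a list of pages, where each
    page is an iterable of terms (hypothetical input layout).
    site_frequency: number of websites containing the term.
    max_page_ratio: largest fraction of pages within one website
    that contain the term.
    """
    site_count = defaultdict(int)
    max_page_ratio = defaultdict(float)
    for pages in sites.values():
        page_count = defaultdict(int)
        for page in pages:
            for term in set(page):  # count each term once per page
                page_count[term] += 1
        for term, n in page_count.items():
            site_count[term] += 1
            max_page_ratio[term] = max(max_page_ratio[term], n / len(pages))
    return {t: (site_count[t], max_page_ratio[t]) for t in site_count}


def noise_candidates(sites, max_sites=1, min_page_ratio=0.8):
    """Flag terms seen on few websites but on most pages of one website.

    Both thresholds are illustrative assumptions, not values from the study.
    """
    profile = frequency_profile(sites)
    return sorted(
        term
        for term, (site_freq, page_ratio) in profile.items()
        if site_freq <= max_sites and page_ratio >= min_page_ratio
    )


# Toy data: "acme" appears on every page of one website and nowhere else.
sites = {
    "a.example": [["acme", "steel", "pipes"], ["acme", "contact"], ["acme", "steel"]],
    "b.example": [["steel", "beams"], ["steel", "contact"], ["bolt"]],
}
print(noise_candidates(sites))
```

In this toy data only "acme" is flagged: it occurs on a single website and on all of its pages, which matches the pattern of a company name or navigation label rather than a category-descriptive term.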

imbWBI: Web Business Intelligence libraries of the imbVeles Framework. Console Tool used: v0.5

