(draft version – full version is coming soon)
The web classification mechanism of imbWBI (Web Business Intelligence libraries of the imbVeles Framework) is called the “Industry Term Model” (ITM), because it describes categories (industries) as semantic clouds of lemma terms extracted from the web sites of a training set. The ITM performs single-label, multi-class classification of web sites after it has been trained on a list of domains grouped into their corresponding categories.
Here we cover the basic operations you can perform from the imbWBI Console Tool application. The ITM is supported in the imbWBI Console Tool as an imbACE Console plugin named “itm”, so all ITM-related commands carry the prefix “itm.”.
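To make the idea of a category described as a cloud of lemma terms more concrete, here is a minimal Python sketch. It is not the ITM implementation: the function names, the relative-frequency weighting and the sum-of-weights scoring are all assumptions chosen for illustration, and the training data is made up. It only shows the general shape of single-label, multi-class classification, where each site receives exactly one of several category labels.

```python
from collections import Counter

def build_term_clouds(training_set):
    """Build one weighted term cloud per category.

    training_set maps a category name to a list of documents,
    where each document is a list of (already lemmatized) terms.
    The weighting (relative term frequency) is an illustrative assumption.
    """
    clouds = {}
    for category, documents in training_set.items():
        counts = Counter()
        for terms in documents:
            counts.update(terms)
        total = sum(counts.values())
        if total == 0:
            clouds[category] = {}
            continue
        clouds[category] = {term: n / total for term, n in counts.items()}
    return clouds

def classify(terms, clouds):
    """Return the single category whose cloud best matches the given terms."""
    scores = {
        category: sum(cloud.get(term, 0.0) for term in terms)
        for category, cloud in clouds.items()
    }
    return max(scores, key=scores.get)

# Hypothetical training data: two categories, two "sites" each
training_set = {
    "heating": [["boiler", "radiator", "heat"], ["heat", "pump", "boiler"]],
    "furniture": [["chair", "table", "wood"], ["sofa", "wood", "chair"]],
}
clouds = build_term_clouds(training_set)
label = classify(["wood", "table", "lamp"], clouds)
print(label)
```

Terms that appear in no cloud (such as "lamp" here) simply contribute nothing to any score.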
Creating a new ITM project
Let’s create a new Industry Term Model project:
// this will create a new project called "test"
itm.Open "test";
// here we save the newly created project, so that the folder structure and all configuration files are created in the imbWBI\projects\itmPlugin\test subfolder
itm.Save;
// here we shut down the console application
Quit;
Now, examine the folder [location of the imbWBI Console Tool]\projects\itmPlugin\test
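If you prefer to inspect the generated folder structure from a script rather than a file manager, a small Python helper along these lines can print it. The relative path used below is an assumption; replace it with the actual location of your Console Tool installation.

```python
import os

def list_tree(root):
    """Return the relative paths of all folders and files under root, sorted."""
    entries = []
    for current, dirs, files in os.walk(root):
        for name in dirs + files:
            full = os.path.join(current, name)
            entries.append(os.path.relpath(full, root))
    return sorted(entries)

if __name__ == "__main__":
    # Assumption: the script runs from the folder that contains imbWBI
    project_dir = os.path.join("imbWBI", "projects", "itmPlugin", "test")
    for entry in list_tree(project_dir):
        print(entry)
```

If the folder does not exist, the helper returns an empty list and nothing is printed.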
Constructing the Mining Context for the categories
// We are opening a project named [itm01]
itm.Open "itm01";
// Requesting a crawl of the [constructions] category, telling the system not to clear the existing MC repository, to run in debug mode, and to execute the crawl script right after the imbWEM script is generated.
itm.CrawlScript name="constructions";clearRepo=false;debug=true;autorun=true;
itm.CrawlScript name="cooling";clearRepo=false;debug=true;autorun=true;
itm.CrawlScript name="energetics";clearRepo=false;debug=true;autorun=true;
// Here we are using the implicit syntax of ACE Script, with default values for the other parameters of the method
itm.CrawlScript "heating";
// Again, we are using the implicit syntax of ACE Script, but specifying values for all parameters of the method
itm.CrawlScript "furniture";false;false;true;
// Saving the project
itm.Save;
The crawl script for the imbWEM plugin is automatically generated and executed:
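The difference between the explicit (name=value) and implicit (positional) argument syntax used in the commands above can be illustrated with a short Python sketch. This is not how imbACE parses scripts. It is a naive illustration that assumes the positional order of the itm.CrawlScript parameters is name, clearRepo, debug, autorun (as the explicit examples suggest) and that values contain no ";" or "=" characters.

```python
# Assumption: positional order inferred from the explicit examples
CRAWL_SCRIPT_PARAMS = ["name", "clearRepo", "debug", "autorun"]

def parse_value(text):
    """Convert true/false to bool, otherwise strip surrounding quotes."""
    text = text.strip()
    if text.lower() in ("true", "false"):
        return text.lower() == "true"
    return text.strip('"')

def parse_arguments(argument_text, parameter_names):
    """Map explicit ('a=1;b=2;') or implicit ('1;2;') arguments to a dict."""
    parts = [p.strip() for p in argument_text.split(";") if p.strip()]
    arguments = {}
    for position, part in enumerate(parts):
        if "=" in part:
            # Explicit syntax: the parameter name is given
            key, value = part.split("=", 1)
            arguments[key.strip()] = parse_value(value)
        else:
            # Implicit syntax: the name follows from the position
            arguments[parameter_names[position]] = parse_value(part)
    return arguments

explicit = parse_arguments(
    'name="constructions";clearRepo=false;debug=true;autorun=true;',
    CRAWL_SCRIPT_PARAMS,
)
implicit = parse_arguments('"furniture";false;false;true;', CRAWL_SCRIPT_PARAMS)
partial = parse_arguments('"heating";', CRAWL_SCRIPT_PARAMS)
print(explicit)
print(implicit)
print(partial)
```

In the partial case only the name is set, and the remaining parameters would fall back to whatever defaults the method defines.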
// This is auto-generated script to build MC Repository for Industry Term Model Project
// Date 12/31/2017
// Defining job
wem.Job "MCRepo for constructions";"Building MCRepo for ITMP itm01";true;"";1;
// Loading web domains
wem.SampleFile "G:\imbWBI_Test\projects\imbWBIToolState_jobs\imbWBIToolState\constructions_crawl.txt",false,"Domains of constructions",true,0,-1,True;
// Creates new instance of built-in crawler
wem.Crawler classname="SM_LTS";LT_t=1;I_max=50;PL_max=15;PS_c=10;instanceNameSufix="_MC";primLanguage="serbian";secLanguage="english";
// Configuring Crawl Job Engine
wem.CrawlJobEngineSettings TC_max=2;Tdl_max=20;Tll_max=50;Tcjl_max=120;
// Opens new session with the Index Engine
wem.OpenSession experimentSession="itm01_constructions";IndexID="itm01";useJobSettings=false;crawlFolderNameTemplate="*";
// Opens new session with the Mining Context manager
mcm.Open repo="constructions_const"; log_msg="MCRepo construction for constructions"; debug=True;
// Adds plugin
wem.plugin plugin_classname="reportPlugIn_CrawlToMC";
// Runs the crawl job
wem.Run;
// Closes the currently opened Mining Context session
mcm.Close log_msg="Ending MCRepo construction for constructions"; doReport=true; debug=True;
Performing an experiment
Performing series of template-composite experiments
Generating secondary reports
Check the imbWBI API Documentation:
http://doc.veles.rs/