imbWBI.ConsoleTool Console Reference (v0.3.1)

025 Plugin: wds

Class imbWEM.Core.consolePlugin.webDatasetPlugin

imbWBI.ConsoleTool Console.webDatasetPlugin (imbWBI: Web Business Intelligence libraries of imbVeles Framework)

This is an imbACE advanced console plugin for webDatasetPlugin

1 wds.ExtractDomainList

Extracts a sample list for crawling from an existing dataset

It loads the specified dataset and extracts the domain list from it

Command arguments:

ID  Name     Type     Default  Comment
01  DataSet  String   word     Path to dataset
02  Output   String            Path where domain list should be saved
03  debug    Boolean  True     Values: True, False

Example :

wds.ExtractDomainList DataSet="word";Output="";debug=True;
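
The example lines in this reference share one visible pattern: the command name, a space, then `Name=value;` pairs with strings in double quotes and Booleans written as True or False. The following is a minimal Python sketch that builds such a line. The helper name `format_command` is hypothetical and not part of imbWBI, and the quoting rules are inferred only from the examples shown here, not from the console's actual parser.

```python
def format_command(name, **args):
    """Render a command line in the Name=value; style used by the examples.

    Assumption: strings are double-quoted, Booleans are True/False, and
    every argument ends with a semicolon, as seen in this reference.
    """
    parts = []
    for key, value in args.items():
        if isinstance(value, bool):  # check bool first: bool is a subclass of int
            rendered = "True" if value else "False"
        elif isinstance(value, str):
            rendered = f'"{value}"'
        else:
            rendered = str(value)
        parts.append(f"{key}={rendered};")
    return f"{name} {''.join(parts)}"


line = format_command("wds.ExtractDomainList", DataSet="word", Output="", debug=True)
print(line)
```

The printed line can then be pasted into the console or written to a script file; whether the console accepts other quoting styles is not covered by this reference.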

2 wds.ExtractURLsFromDataset

Extracts all crawled URLs from the dataset

It creates a single txt file with the list of all URLs crawled in the dataset

Command arguments:

ID  Name         Type     Default  Comment
01  runName      String   word     Name of the report folder
02  datasetPath  String            Path to the dataset to report on, when it is not the currently loaded one
03  debug        Boolean  True     Values: True, False

Example :

wds.ExtractURLsFromDataset runName="word";datasetPath="";debug=True;

3 wds.GetDomains

Executes a subset compilation and sets the result as the active sample list

It queries domains from the dataset source, using the specified subset compilation

Command arguments:

ID  Name               Type     Default                   Comment
01  subsetCompilation  String   ODPBusinessDistantTopics  Name of the subset compilation to activate
02  saveFile           Boolean  True                      If true, saves the result to the sample list file. Values: True, False
03  construct          Boolean  True                      If true, prepares the output WebDocumentCategory directory to store crawled content. Values: True, False
04  limit              Int32    -1                        Upper limit for crawl size

Example :

wds.GetDomains subsetCompilation="ODPBusinessDistantTopics";saveFile=True;construct=True;limit=-1;
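
To check a command line against the argument table before running it, the line can be split back into typed values. The sketch below is a hypothetical Python helper (`parse_command` is not part of imbWBI). It assumes the `Name=value;` layout seen in the examples, assumes quoted strings contain no semicolons, and treats any unquoted value other than True or False as an Int32-style integer.

```python
def parse_command(line):
    """Split a console command line into (name, {argument: value}).

    Assumptions: arguments are separated by ';', strings are double-quoted
    and contain no ';', and other unquoted values are integers.
    """
    name, _, arg_text = line.strip().partition(" ")
    args = {}
    for chunk in arg_text.split(";"):
        if not chunk.strip():
            continue  # skip the empty piece after the final ';'
        key, _, raw = chunk.partition("=")
        raw = raw.strip()
        if len(raw) >= 2 and raw.startswith('"') and raw.endswith('"'):
            value = raw[1:-1]
        elif raw in ("True", "False"):
            value = raw == "True"
        else:
            value = int(raw)
        args[key.strip()] = value
    return name, args


name, args = parse_command(
    'wds.GetDomains subsetCompilation="ODPBusinessDistantTopics";'
    "saveFile=True;construct=True;limit=-1;"
)
print(name, args)
```

Here `args["limit"]` comes back as the integer -1, the default listed in the table above.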

4 wds.InitDatasets

Performs initialization of the main dataset sources

It connects to the WebKB and ODP data sources and checks their state

Example :

wds.InitDatasets 

5 wds.LoadDomainCategory

Loads a WebDomainCategory tree from the specified path

It searches the specified path and loads the hierarchical domain list

Command arguments:

ID  Name  Type    Default  Comment
01  path  String  word     WebDomainCategory root folder to load from

Example :

wds.LoadDomainCategory path="word";

6 wds.Test

It will run several diagnostic procedures

What it will do?

Example :

wds.Test 
