025 Plugin: wds
Class imbWEM.Core.consolePlugin.webDatasetPlugin
imbWBI: Web Business Intelligence libraries of imbVeles Framework. ConsoleTool Console.webDatasetPlugin
This is the imbACE advanced console plugin for webDatasetPlugin.
1 wds.ExtractDomainList
Extracts a sample list for crawling from an existing dataset
It loads the specified dataset and extracts the domain list from it
Command arguments:
ID | Name | Type | Default | Comment |
---|---|---|---|---|
01 | DataSet | String | word | Path to dataset |
02 | Output | String | | Path where domain list should be saved |
03 | debug | Boolean | True | Values: True, False |
Example :
wds.ExtractDomainList DataSet="word";Output="";debug=True;
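The examples in this reference share one shape: a plugin prefix, a dot, the command name, then `name=value;` pairs with string values in double quotes. The Python sketch below shows how such a line could be split into its parts. It is not part of imbACE: `parse_command` is a hypothetical helper, and the grammar is inferred only from the examples shown here, so the real console may accept other forms.

```python
import re

# Hypothetical helper, not an imbACE API. The pattern mirrors the
# 'name=value;' pairs used by the examples in this reference.
_ARG = re.compile(r'(\w+)=("([^"]*)"|[^;]*);')


def parse_command(line):
    """Split 'prefix.Command name=value;...' into (prefix, command, args)."""
    head, _, tail = line.strip().partition(" ")
    prefix, _, command = head.partition(".")
    args = {}
    for name, raw, quoted in _ARG.findall(tail):
        if raw.startswith('"'):
            args[name] = quoted            # quoted string, quotes removed
        elif raw in ("True", "False"):
            args[name] = raw == "True"     # Boolean argument
        else:
            try:
                args[name] = int(raw)      # Int32 argument such as limit=-1
            except ValueError:
                args[name] = raw           # anything else is kept as text
    return prefix, command, args


if __name__ == "__main__":
    print(parse_command('wds.ExtractDomainList DataSet="word";Output="";debug=True;'))
```

A command without arguments, such as `wds.Test`, yields an empty argument dictionary.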
2 wds.ExtractURLsFromDataset
Extracts all crawled URLs from the dataset
It creates a single txt file listing all URLs crawled for the dataset
Command arguments:
ID | Name | Type | Default | Comment |
---|---|---|---|---|
01 | runName | String | word | Name of the report folder |
02 | datasetPath | String | | Path to the dataset to report on, when it is not the currently loaded one |
03 | debug | Boolean | True | Values: True, False |
Example :
wds.ExtractURLsFromDataset runName="word";datasetPath="";debug=True;
3 wds.GetDomains
Executes a subset compilation and sets the result as the active sample list
It queries domains from the dataset source, using the specified subset compilation
Command arguments:
ID | Name | Type | Default | Comment |
---|---|---|---|---|
01 | subsetCompilation | String | ODPBusinessDistantTopics | name of the subset compilation to activate |
02 | saveFile | Boolean | True | If true, saves the result to the sample list file. Values: True, False |
03 | construct | Boolean | True | If true, prepares the output WebDocumentCategory directory to store crawled content. Values: True, False |
04 | limit | Int32 | -1 | Upper limit for crawl size |
Example :
wds.GetDomains subsetCompilation="ODPBusinessDistantTopics";saveFile=True;construct=True;limit=-1;
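When a command is issued with non-default values, the line can be assembled programmatically. The Python sketch below builds a command line in the same `name=value;` form as the example above. `format_command` is a hypothetical helper, not an imbACE API, and the value 500 for `limit` is only an illustrative choice.

```python
def format_command(prefix, command, **args):
    """Render 'prefix.Command name=value;...' from keyword arguments."""
    parts = []
    for name, value in args.items():
        if isinstance(value, bool):        # test bool first: bool is a subclass of int
            text = "True" if value else "False"
        elif isinstance(value, int):
            text = str(value)              # Int32 arguments are written bare
        else:
            text = '"%s"' % value          # strings are wrapped in double quotes
        parts.append("%s=%s;" % (name, text))
    head = "%s.%s" % (prefix, command)
    return head + (" " + "".join(parts) if parts else "")


if __name__ == "__main__":
    # Illustrative values only: same subset compilation, crawl size capped at 500
    print(format_command(
        "wds", "GetDomains",
        subsetCompilation="ODPBusinessDistantTopics",
        saveFile=True, construct=True, limit=500,
    ))
```

Arguments are emitted in the order they are passed, which here matches the ID order in the table.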
4 wds.InitDatasets
Performs initialization of the main dataset sources
It connects to the WebKB and ODP data sources and checks their state
Example :
wds.InitDatasets
5 wds.LoadDomainCategory
Loads the WebDomainCategory tree from the specified path
It searches the specified path and loads the hierarchical domain list
Command arguments:
ID | Name | Type | Default | Comment |
---|---|---|---|---|
01 | path | String | word | WebDomainCategory root folder to load from |
Example :
wds.LoadDomainCategory path="word";
6 wds.Test
It will run several diagnostic procedures
What it will do?
Example :
wds.Test