imbWBI: Web Business Intelligence libraries of the imbVeles Framework. The framework is envisioned as a set of:
- high-level business intelligence oriented components
- mid-level web content processing toolkit
It is developed both as a research toolkit and as a development toolkit for real-world applications.
Planned scope of the application-level functionality:
- Business Entities Classification
- Business Entities Clustering
- Web Harvesting – data extraction based on a well-known content structure model (i.e. template data extraction)
- Web Data Mining – a more intelligent approach to Web Data Extraction, using general, non-wrapper techniques (mainly NLP techniques and pipeline model logic)
- Ontology-based information extraction
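To make the Web Harvesting item more concrete, here is a minimal sketch of template data extraction, written in Python with only the standard library. It is not imbWBI code and does not reflect its API: the `TEMPLATE` mapping, the `TemplateExtractor` class, the `harvest` function, and the tag and class names are all hypothetical, chosen only to show how a known page layout can drive extraction.

```python
from html.parser import HTMLParser

# Hypothetical template: maps a field name to the (tag, class) pair that
# holds its value on pages sharing a known layout.
TEMPLATE = {
    "company_name": ("h1", "company"),
    "city": ("span", "city"),
}


class TemplateExtractor(HTMLParser):
    """Collects the text of elements that match the template."""

    def __init__(self, template):
        super().__init__()
        self.template = template
        self.record = {}
        self._current = None  # name of the field currently being captured

    def handle_starttag(self, tag, attrs):
        # An element may carry several classes; split them before matching.
        classes = (dict(attrs).get("class") or "").split()
        for field, (want_tag, want_class) in self.template.items():
            if tag == want_tag and want_class in classes:
                self._current = field
                self.record.setdefault(field, "")

    def handle_data(self, data):
        if self._current is not None:
            self.record[self._current] += data

    def handle_endtag(self, tag):
        # Stop capturing when the matched element closes (no nesting support).
        if self._current is not None and tag == self.template[self._current][0]:
            self.record[self._current] = self.record[self._current].strip()
            self._current = None


def harvest(html, template=TEMPLATE):
    """Return a dict of field values extracted from one page."""
    parser = TemplateExtractor(template)
    parser.feed(html)
    parser.close()
    return parser.record


page = (
    '<html><body><h1 class="company">Acme d.o.o.</h1>'
    '<p>Based in <span class="city main"> Belgrade </span></p></body></html>'
)
print(harvest(page))
```

A sketch like this works only while pages follow the assumed layout, which is the limitation that the Web Data Mining item addresses with non-wrapper techniques.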
The supporting layer includes the following distinct components:
- Web Site Digestion – pre-processing of web site content from the MCRepository, as a step toward Information Extraction
- Domain-specific ontology construction
- Category knowledge constructor