Abstract: The paper addresses the semantic processing of web documents by proposing an approach that combines an extracted semantic document model with a domain-related knowledge base. The knowledge base is populated with learnt classification rules that categorize documents into topics. Classification reduces the dimensionality of the document feature space. The semantic model of the retrieved web documents is semantically labeled by querying a domain ontology and is then processed with a content-based classification method.
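
As a rough illustration of the pipeline the abstract outlines, the sketch below labels document terms with concepts from a toy ontology and then applies topic rules to the resulting concept set. It is not the paper's implementation: the `ONTOLOGY` mapping, the `RULES` table, and the function names `semantic_model` and `classify` are all hypothetical, and the "all required concepts present" rule form is an assumption made for the example. Note how the concept-level model is smaller than the raw term space, which is one simple way to read the dimensionality reduction mentioned above.

```python
import re
from collections import Counter

# Hypothetical toy ontology: surface term -> domain concept.
# A real system would query a domain ontology instead of a dict.
ONTOLOGY = {
    "laptop": "Computer",
    "notebook": "Computer",
    "cpu": "Hardware",
    "processor": "Hardware",
    "python": "ProgrammingLanguage",
    "java": "ProgrammingLanguage",
}

# Hypothetical "learnt" rules: topic -> concepts that must all be present.
RULES = {
    "hardware": {"Computer", "Hardware"},
    "software": {"ProgrammingLanguage"},
}


def semantic_model(text):
    """Map the terms of a document to ontology concepts and count them."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(ONTOLOGY[t] for t in tokens if t in ONTOLOGY)


def classify(text):
    """Return the sorted topics whose required concepts all occur in the document."""
    concepts = set(semantic_model(text))
    return sorted(topic for topic, required in RULES.items() if required <= concepts)


if __name__ == "__main__":
    print(semantic_model("The notebook has a fast processor"))
    print(classify("The notebook has a fast processor"))
```

Several distinct terms ("laptop", "notebook") collapse into a single concept ("Computer"), so the rules operate on a handful of concepts rather than on every word in the vocabulary.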