27 December 2019 Research on Ocean Government Data Extraction and Clustering Based on XML Document Similarity Technology
Baofeng Yao, Lei Wang, Shijun Liu
Abstract

Yao, B.; Wang, L., and Liu, S., 2019. Research on ocean government data extraction and clustering based on XML document similarity technology. In: Li, L.; Wan, X., and Huang, X. (eds.), Recent Developments in Practices and Research on Coastal Regions: Transportation, Environment and Economy. Journal of Coastal Research, Special Issue No. 98, pp. 259–262. Coconut Creek (Florida), ISSN 0749-0208.

For component clustering, a recursive algorithm is proposed that computes the similarity between components from their extensible markup language (XML) descriptions and effectively measures both the structural and the semantic information contained in an XML description document. A document similarity matrix is then constructed, the high-dimensional samples are mapped to a two-dimensional plane by a genetic algorithm, and the k-means algorithm is used to cluster the mapped samples to obtain a globally optimal component clustering. Finally, experiments are carried out on a component library test model. The experimental results show that the component clustering algorithm based on XML similarity is feasible and effective for component queries.
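
The abstract does not give the similarity formula, so the following Python sketch is only one plausible reading of a recursive XML similarity and a document similarity matrix, not the authors' algorithm. The function names, the `tag_weight` parameter, the best-match scoring of child elements, and the sample documents are all assumptions made for illustration. The sketch compares tag names and nesting only; it does not model semantic similarity, the genetic-algorithm mapping to two dimensions, or the k-means step.

```python
import xml.etree.ElementTree as ET


def similarity(a, b, tag_weight=0.5):
    """Recursive structural similarity in [0, 1] between two XML elements.

    Hypothetical scoring: tag_weight is an assumed mixing factor between
    the tag comparison of the two nodes and the similarity of their children.
    """
    tag_score = 1.0 if a.tag == b.tag else 0.0
    kids_a, kids_b = list(a), list(b)
    if not kids_a and not kids_b:
        # Two leaves: only the tags can be compared.
        return tag_score
    if not kids_a or not kids_b:
        # One side has children and the other does not: no child credit.
        return tag_weight * tag_score
    # Match every child to its most similar counterpart, in both
    # directions, so that similarity(a, b) == similarity(b, a).
    best_a = sum(max(similarity(x, y, tag_weight) for y in kids_b) for x in kids_a)
    best_b = sum(max(similarity(x, y, tag_weight) for x in kids_a) for y in kids_b)
    child_score = (best_a + best_b) / (len(kids_a) + len(kids_b))
    return tag_weight * tag_score + (1.0 - tag_weight) * child_score


def similarity_matrix(docs):
    """Pairwise similarity matrix for a list of XML strings."""
    roots = [ET.fromstring(doc) for doc in docs]
    return [[similarity(p, q) for q in roots] for p in roots]


# Made-up component descriptions, for illustration only.
docs = [
    "<component><name/><interface><method/></interface></component>",
    "<component><name/><interface><method/><event/></interface></component>",
    "<dataset><title/></dataset>",
]
matrix = similarity_matrix(docs)
for row in matrix:
    print(["%.3f" % value for value in row])
```

A matrix of this kind (or a distance derived from it, such as one minus the similarity) is the sort of input that a dimensionality-reduction step followed by k-means would consume.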

©Coastal Education and Research Foundation, Inc. 2019
Baofeng Yao, Lei Wang, and Shijun Liu "Research on Ocean Government Data Extraction and Clustering Based on XML Document Similarity Technology," Journal of Coastal Research 98(sp1), 259-262, (27 December 2019). https://doi.org/10.2112/SI98-064.1
Received: 9 June 2019; Accepted: 26 August 2019; Published: 27 December 2019
KEYWORDS
Component
genetic algorithm
semantic similarity