Mine and Lattice Web Data using NLP
Gaurav Sharma
Gaurav Sharma, Research & Development Software Engineering, OnePlus India.
Manuscript received on 26 June 2022 | Revised Manuscript received on 01 July 2022 | Manuscript Accepted on 15 July 2022 | Manuscript published on 30 July 2022 | PP: 103-108 | Volume-11 Issue-2, July 2022 | Retrieval Number: 100.1/ijrte.B71450711222 | DOI: 10.35940/ijrte.B7145.0711222
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: This paper is an exercise in web scraping, natural language processing (NLP), and named entity recognition (NER). For the NLP, we primarily use the open-source Python libraries NLTK and spaCy. The paper is meant to demonstrate a use case for web scraping and NLP, not to serve as a complete beginner's tutorial on either technique. We extracted links from a web page, then used those links to extract further content from the website. We then used that content to extract and enrich information using external APIs, ML clustering algorithms, and NLP.
Keywords: NLP, Information Extraction, Segmentation, Named Entity Recognition (NER), Gaussian NB, SVM, spaCy.
Scope of the Article: Natural Language Processing
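The first step described in the abstract, extracting links from a web page so they can be followed for more content, might look roughly like the sketch below. It is an illustration only, not the author's implementation: it uses Python's standard-library HTML parser rather than any particular scraping library, and the class name `LinkExtractor`, the helper `extract_links`, the sample HTML, and the example URLs are all hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collect absolute URLs from the href attribute of <a> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the URL of the page itself.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return every link found in the given HTML string, in document order."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    return parser.links


# Hypothetical page content and URL, used only for illustration.
sample_html = (
    '<p><a href="/posts/1">One</a> '
    '<a href="https://example.org/x">X</a></p>'
)
links = extract_links(sample_html, "https://example.com/index.html")
print(links)
```

In a full pipeline, each collected link would be fetched in turn and the resulting text passed to NLTK or spaCy for entity recognition; those steps are omitted here because they depend on network access and installed language models.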