LSA Based Text Summarization
Shrabanti Mandal¹, Girish Kumar Singh²

¹ Shrabanti Mandal, Department of Computer Science & Applications, Dr. Harisingh Gour Central University, Sagar, MP, India.
² Girish Kumar Singh*, Department of Computer Science & Applications, Dr. Harisingh Gour Central University, Sagar, MP, India.

Manuscript received on May 25, 2020. | Revised Manuscript received on June 29, 2020. | Manuscript published on July 30, 2020. | PP: 150-156 | Volume-9 Issue-2, July 2020. | Retrieval Number: B3288079220/2020©BEIESP | DOI: 10.35940/ijrte.B3288.079220
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)

Abstract: In this study we propose an automatic single-document text summarization technique that combines Latent Semantic Analysis (LSA) with a diversity constraint. The technique uses query-based sentence ranking. Because we do not consider an Information Retrieval (IR) setting, we generate the query ourselves using TF-IDF (Term Frequency-Inverse Document Frequency): the query vector is built from the terms with high IDF. LSA uses vectorial semantics to analyze the relationships between documents in a corpus, or between sentences within a document, and the key terms they carry, by producing a set of concepts related to the documents and terms. In this way LSA helps to represent the latent structure of documents. Sentences are selected from the document using Latent Semantic Indexing (LSI), which orders the sentences by score. Traditionally the highest-scoring sentences are chosen for the summary; here we also calculate the diversity between the chosen sentences before producing the final summary, since a good summary should have the maximum level of diversity. The proposed technique is evaluated on the Opinosis Dataset 1.0.
Keywords: Text summarization, LSA, SVD, LSI, diversity constraint.
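
The pipeline outlined in the abstract (TF-IDF query, LSA via SVD, LSI-style ranking, diversity constraint) can be illustrated with a minimal sketch. This is not the authors' implementation; the abstract does not give these details, so the following are all assumptions made for illustration: the function names, the tokenizer, the smoothed IDF formula, the number of query terms, the latent rank, the use of cosine similarity as the diversity measure, and the `max_sim` threshold. It relies on NumPy for the SVD.

```python
import math
import re
from collections import Counter

import numpy as np


def tokenize(sentence):
    """Lowercase a sentence and split it into word tokens (assumed tokenizer)."""
    return re.findall(r"[a-z']+", sentence.lower())


def cosine(a, b):
    """Cosine similarity; defined as 0.0 when either vector is all zeros."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0


def summarize(sentences, n_summary=2, n_query_terms=3, rank=2, max_sim=0.8):
    """Return the sorted indices of the sentences selected for the summary."""
    tokens = [tokenize(s) for s in sentences]
    vocab = sorted({t for toks in tokens for t in toks})
    row = {t: i for i, t in enumerate(vocab)}
    n = len(sentences)

    # Term-by-sentence TF-IDF matrix: each sentence plays the role of a document.
    df = Counter(t for toks in tokens for t in set(toks))
    idf = np.array([math.log(n / df[t]) + 1.0 for t in vocab])
    A = np.zeros((len(vocab), n))
    for j, toks in enumerate(tokens):
        for t, count in Counter(toks).items():
            A[row[t], j] = count * idf[row[t]]

    # Pseudo-query built from the highest-IDF terms (the count is an assumption).
    top_terms = np.argsort(-idf, kind="stable")[:n_query_terms]
    q = np.zeros(len(vocab))
    q[top_terms] = idf[top_terms]

    # LSA: SVD of A, then project sentences and query onto the top-k concepts.
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    k = min(rank, len(s))
    sent_vecs = (U[:, :k].T @ A).T  # one k-dimensional row per sentence
    q_vec = U[:, :k].T @ q

    # LSI-style ranking: score each sentence by similarity to the query.
    scores = [cosine(v, q_vec) for v in sent_vecs]
    ranked = sorted(range(n), key=lambda j: -scores[j])

    # Greedy diversity constraint: skip sentences too similar to chosen ones.
    chosen = []
    for j in ranked:
        if len(chosen) == n_summary:
            break
        if all(cosine(sent_vecs[j], sent_vecs[i]) < max_sim for i in chosen):
            chosen.append(j)
    return sorted(chosen)


if __name__ == "__main__":
    reviews = [
        "The battery life is very good.",
        "The battery life is very good.",
        "The screen is bright and sharp.",
        "Shipping was slow.",
    ]
    for i in summarize(reviews):
        print(reviews[i])
```

In this sketch the diversity step is a simple greedy filter: sentences are visited in score order and one is kept only if its latent-space cosine similarity to every already chosen sentence is below `max_sim`. Other diversity measures would fit the same structure.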