Leveraging Django and Redis using Web Scraping
K. M. Anandkumar1, Abhinav R2, Abhinav Raman3, Abilash R4

1Dr. K. M. Anandkumar, Professor, Department of Computer Science and Engineering, Easwari Engineering College, Chennai, India.
2Abhinav R, UG Student, Department of Computer Science and Engineering, Easwari Engineering College, Chennai, India.
3Abhinav Raman, UG Student, Department of Computer Science and Engineering, Easwari Engineering College, Chennai, India.
4Abilash R, UG Student, Department of Computer Science and Engineering, Easwari Engineering College, Chennai, India.

Manuscript received on April 30, 2020. | Revised Manuscript received on May 06, 2020. | Manuscript published on May 30, 2020. | PP: 2103-2105 | Volume-9 Issue-1, May 2020. | Retrieval Number: A1916059120/2020©BEIESP | DOI: 10.35940/ijrte.A1916.059120
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)

Abstract: Web scraping, also known as data scraping, is the extraction of data from websites. Scraping software may access the World Wide Web directly over the Hypertext Transfer Protocol (HTTP) or through a web browser. As web development technology has advanced, many frameworks have come into use, and almost all websites are now dynamic, with their content served from a content management system (CMS). This makes data extraction difficult, since there is no common template to extract against. Hence, we use RSS. Rich Site Summary (RSS) is a kind of timeline that gives users and applications access to website updates in a standardized, computer-readable format. This project uses RSS to extract data from websites and serve it to users in a robust and easy way. What distinguishes this project is server-side caching, which lets it serve users almost instantaneously without repeating the data extraction from the requested site. This is done using Redis and Django.
Keywords: Python, Django, Redis, RSS
Scope of the Article: Web Mining
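
The caching idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names InMemoryCache, parse_rss, get_feed_items, fake_fetch, and SAMPLE_RSS are hypothetical, and a plain in-memory dictionary stands in for Redis so the example runs without a server. In a Django project, the same check-then-store pattern would typically go through Django's cache framework configured with a Redis backend.

```python
import time
import xml.etree.ElementTree as ET

# Hypothetical sample feed; a real deployment would download this over HTTP.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Feed</title>
    <item><title>First post</title><link>https://example.com/1</link></item>
    <item><title>Second post</title><link>https://example.com/2</link></item>
  </channel>
</rss>"""


class InMemoryCache:
    """Assumed stand-in for Redis: key/value storage with a time-to-live."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            # Entry has expired, so drop it and report a miss.
            del self._store[key]
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._store[key] = (value, time.monotonic() + ttl_seconds)


def parse_rss(xml_text):
    """Extract the title and link of every <item> in an RSS 2.0 document."""
    root = ET.fromstring(xml_text)
    return [
        {"title": item.findtext("title"), "link": item.findtext("link")}
        for item in root.iter("item")
    ]


def get_feed_items(url, cache, fetch, ttl_seconds=300):
    """Return parsed feed items, fetching and parsing only on a cache miss."""
    key = "feed:" + url
    cached = cache.get(key)
    if cached is not None:
        return cached  # served from cache, no extraction needed
    items = parse_rss(fetch(url))
    cache.set(key, items, ttl_seconds)
    return items


# Usage: a fake fetcher records how often the "site" is actually contacted.
calls = []


def fake_fetch(url):
    calls.append(url)
    return SAMPLE_RSS


cache = InMemoryCache()
first = get_feed_items("https://example.com/rss", cache, fake_fetch)
second = get_feed_items("https://example.com/rss", cache, fake_fetch)
```

The second call returns the cached items without invoking the fetcher again, which is the behaviour the abstract attributes to server-side caching. The five-minute time-to-live is an arbitrary assumption; a shorter value trades speed for fresher feed data.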