News contextualization using web scraping techniques and api(s)

International Journal of Development Research

Abstract: 

News contextualization means gathering the different views and articles on a specific news topic from multiple sources. The proposed solution integrates the textual and pictorial information about a topic that can be found on various social networking sites and news sites and displays it all in a single place. The system takes a search keyword from the user and retrieves the related data from news sources and social networking sites using web scraping techniques, APIs, or both. The advantage is that the user does not have to repeat the search on each source: a single query searches a predefined set of commonly used sources and shows the results together, which saves the hassle of opening a new page for every source. The social networking data may also help the user see how different people view the topic. For example, if the search keyword is ‘xyz scam’, the server runs this keyword against social networking sites such as Facebook and Twitter and finds the posts and tweets related to ‘xyz scam’. YouTube can also be searched for videos related to the query, and news about ‘xyz scam’ can be drawn from reliable news websites. Finally, all of this data, i.e. the Facebook posts, the tweets, the YouTube videos and the news articles, is displayed on a single page with separate sections for posts, tweets, videos and news. This can be implemented by using web scraping techniques to extract data from websites. Some high-profile social networking websites also provide APIs for data extraction, which may make retrieving relevant data easier. In this paper, we explore some web scraping techniques and APIs that can be used for news contextualization.
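
The flow described in the abstract (one keyword, several sources, one page with a section per source) can be illustrated with a minimal Python sketch. It is not the paper's implementation. The sample HTML, the assumption that headlines sit in `<h2>` elements, and the names `HeadlineParser`, `scrape_headlines`, `news_source`, `posts_source`, `contextualize` and `render` are all hypothetical and chosen only for illustration. The social networking source is a stub that returns canned text, where a real system would call the site's API.

```python
from html.parser import HTMLParser
from typing import Callable, Dict, List


class HeadlineParser(HTMLParser):
    """Collects the text of every <h2> element (assumed to hold headlines)."""

    def __init__(self) -> None:
        super().__init__()
        self._in_h2 = False
        self.headlines: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "h2":
            self._in_h2 = True
            self.headlines.append("")

    def handle_endtag(self, tag):
        if tag == "h2":
            self._in_h2 = False

    def handle_data(self, data):
        # Text inside nested tags such as <b> is still part of the headline.
        if self._in_h2:
            self.headlines[-1] += data


def scrape_headlines(html: str, keyword: str) -> List[str]:
    """Return headlines from the HTML that contain the keyword (case-insensitive)."""
    parser = HeadlineParser()
    parser.feed(html)
    parser.close()
    needle = keyword.lower()
    return [h.strip() for h in parser.headlines if needle in h.lower()]


# Hypothetical stand-in for a page downloaded from a news website.
SAMPLE_NEWS_HTML = """
<html><body>
<h2>Court hears xyz scam case</h2>
<h2>Weather: rain expected</h2>
<h2>Minister responds to <b>XYZ scam</b> claims</h2>
</body></html>
"""


def news_source(keyword: str) -> List[str]:
    """Web scraping path: parse (sample) HTML and keep matching headlines."""
    return scrape_headlines(SAMPLE_NEWS_HTML, keyword)


def posts_source(keyword: str) -> List[str]:
    """API path: a stub with canned posts instead of a real API call."""
    canned = ["Anyone following the xyz scam?", "Lunch photos"]
    return [p for p in canned if keyword.lower() in p.lower()]


# Each section of the result page is backed by one source function.
SOURCES: Dict[str, Callable[[str], List[str]]] = {
    "news": news_source,
    "posts": posts_source,
}


def contextualize(keyword: str,
                  sources: Dict[str, Callable[[str], List[str]]]) -> Dict[str, List[str]]:
    """Query every source once with the same keyword and group results by section."""
    return {section: fetch(keyword) for section, fetch in sources.items()}


def render(results: Dict[str, List[str]]) -> str:
    """Format the grouped results as plain text, one section per source."""
    lines: List[str] = []
    for section, items in results.items():
        lines.append(section.upper())
        lines.extend(f"  - {item}" for item in items)
    return "\n".join(lines)
```

Calling `render(contextualize("xyz scam", SOURCES))` produces one text block with a NEWS section and a POSTS section. Adding a further source, such as a video search, would only require registering another function in `SOURCES`.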
