Recommender System Based on Machine Learning Using Users' Geo-Tagged Images On Social Media

Paper Type: Free Essay Subject: Information Systems
Wordcount: 4292 words Published: 23rd Sep 2019

While touring unfamiliar cities, travelers often need advice in choosing, from a wide range of options, the sightseeing and tourism attractions that suit their personal interests. Because they lack information about an unfamiliar city, visitors often turn to travel websites for more precise information about the places they intend to visit. Nevertheless, as this information is fragmentary and reflects the biased attitudes and opinions of its authors, visitors may not be able to determine efficiently which city attractions would appeal to them most. Furthermore, travelers can usually find descriptions of only the best-known attractions and may miss lesser-known ones that they would in fact enjoy. As trip planning can be a complicated process, there is a continuing demand for a travel recommendation system that can make suitable recommendations according to individual interests and spatiotemporal circumstances. In addition, social media applications that use location-based services have recently become more widespread. Consequently, the growing number of location-based social media applications generates enormous amounts of social media data that include geographic location information. This research studies new methods for building a travel recommender system based on geotagged photos so that tourists can make better decisions.

The use of Artificial Intelligence (AI) is key to bridging the gap between travel activity recommendation and tagged photos.

Research objectives:

- Investigate the use of data science to bridge the gap between travel websites and tagged photos.

- Design a technique to effectively create mappings between travel activities and relevant photos.


- A new model to connect travel websites with tagged photos.

- An evaluation of the usefulness of the model in improving the recommender system.


  1. Recommender Systems

Recommender systems use artificial intelligence techniques to provide users with item suggestions. They were proposed in 1992, when Tapestry, the first recommender system, emerged. They fall into three main categories according to how they drive recommendations: collaborative, content-based, and hybrid filtering (Adomavicius & Tuzhilin, 2005). First, recommender systems that take a collaborative approach consider information about other users when preparing a recommendation. Such a system identifies other users with similar tastes and uses their opinions to recommend items to the current user (Adomavicius & Tuzhilin, 2005). Lately, researchers have focused on developing K-nearest-neighbor methods based on collaborative filtering. For instance, Memon et al. (2015) applied kernel density estimation to model users’ geographical travel choices and then measured the Kullback-Leibler divergence to represent user similarity. Clements et al. (2010) used a user-based collaborative filtering approach to analyze a geotagged Flickr dataset and predict user travel behaviour. Next, recommender systems with a content-based filtering method base their suggestions on the item data they can access. These methods normally base their predictions on the user’s own data and ignore contributions from other users, unlike collaborative methods (Min & Han, 2005; Celma & Serra, 2008). Notwithstanding the success of these two filtering approaches, several limitations have been recognized. Some of the problems connected with content-based filtering techniques are restricted content analysis, overspecialization, and sparsity of information (Adomavicius & Tuzhilin, 2005). Collaborative methods, in turn, suffer from cold-start, sparsity and scalability problems. These difficulties usually diminish the quality of the suggestions.

To reduce some of these problems, hybrid filtering has been introduced; it combines two or more filtering methods in different ways in order to enhance the accuracy and performance of recommender systems (Göksedef & Gündüz-Öğüdücü, 2010). The combination is meant to harness the strengths of the individual techniques while offsetting their respective weaknesses (Al-Shamri & Bharadwaj, 2008). Cheng et al. (2011) suggested a route recommendation system based on a Bayesian classifier trained on a Flickr dataset. Moreover, some researchers have studied ensemble learning models to achieve more accurate results. Subramaniyaswamy et al. (2015) introduced an algorithm coupling the AdaBoost (adaptive boosting) classifier with a Bayesian classifier to create a travel recommendation system.

  1. Travel recommendation using location-based data

Research on travel recommendation systems has concentrated on two kinds of recommendation: route recommendation and location recommendation. Route recommendation systems provide a route plan, ordered as a time series, based on user records. For instance, De Choudhury et al. (2010) collected Flickr photos to find attractions and used a graph-based algorithm to create recommended routes for travelers. Cai, Lee, and Lee (2018) used a sequential pattern mining method on geotagged pictures to determine users’ semantic trajectory patterns. The principal goal of route recommendation is to find a user’s travel pattern and then provide recommendations based on it. Nevertheless, geotagged social media data are not a suitable dataset for route recommendation: because these geotagged photos are scattered and taken over considerable intervals of time, there is not sufficient information to determine an exact travel pattern for most people.

Conversely, a location recommendation system produces just one recommended attraction, or a list of candidate attractions that is not ordered as a time series. Additionally, location recommendation concentrates on building an individual interest profile. Some studies have defined approaches based on geotagged social media data for providing location suggestions. Cao et al. (2010) recommended popular locations by identifying representative tags and images, and numerous memory-based methods have been used for location recommendation. These all center on enhancing collaborative filtering with geotagged photos and context information. Nevertheless, they sidestep the cold-start and sparsity problems caused by social media data by simply dropping sparse users, rather than solving them.

  1. Geo-tagged social media data

Geo-tagged social media data are first-hand data that directly reflect users’ interests. Despite being noisy and disorganized, geotagged social media data are huge and take different forms (Sun, Huang, Peng, Chen, & Liu, 2018). In particular, Flickr is one of the most famous photo-sharing websites in the world. It holds tens of millions of geotagged photos uploaded by tourists and photographers from all over the world. Accordingly, many researchers have built recommendation systems based on Flickr datasets. In-depth analysis of geographical preferences from geo-tagged social media data has become a subject of interest in recent geographic information science (GIS) research (Zhou, Xu & Kimmons, 2015). For instance, geo-tagged photos have been widely employed for tourist destination mining in recent years (Shao, Zhang & Li, 2017). Other research used a hybrid ensemble learning method for tourist route recommendations based on geo-tagged social networks (Wan, Hong, Huang, Peng & Li, 2018).



  1. Data generation process


To carry out this task, I first need to collect data from the Internet. Data collection is the most important part of this project. Both commercial location-based services and location-sharing services have been implemented for particular platforms. Most of these services provide APIs, which can be used to retrieve information from their systems. This makes them content providers for researchers and business analysts wishing to analyze the accessible data. As an illustration, Twitter data is among the most commonly used data in studies ranging from finding eyewitness tweets during crises to connecting social networks. Hence, I will use Twitter for the reasons below:

A dataset gathered from Twitter contains 22 million geo-tagged tweets collected via Twitter APIs. The geo-tagged information was posted from more than 1,200 applications (Cheng, Caverlee, Lee & Sui, 2011). Twitter provides two types of APIs: the streaming API and the Firehose API. The streaming API has been used for social media and network analysis in order to understand how users behave. It returns a sample of roughly 1% of all tweets, while the Firehose API provides access to the full stream of public tweets. The downside of the Firehose API is the cost of infrastructure, bandwidth and data storage, which is critical for such a large amount of data. A request can combine multiple keywords or hashtags with multiple geographic areas, and is typically defined by the user based on the needs of the analysis. The data is stored in databases and filtered according to the system requirements. A typical filtering process could include filtering for geotagged content, filtering out multiple posts from single users and determining the content ratios of dominant urban areas. Afterwards, the system can format and display the results in a suitable form. Tables, clusters on maps and graphs are the most common visualization types.
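As an illustration of the filtering step described above, the following is a minimal Python sketch, not the pipeline of any particular study. The record fields (`user`, `lat`, `lon`, `text`) and the per-user cap `max_per_user` are hypothetical names chosen for this example; they are not the schema of the Twitter APIs.

```python
from collections import defaultdict

def filter_geotagged(posts, max_per_user=2):
    """Keep only posts that carry coordinates, with at most max_per_user per user."""
    kept = []
    per_user = defaultdict(int)
    for post in posts:
        # Drop content that has no geotag.
        if post.get("lat") is None or post.get("lon") is None:
            continue
        # Limit the influence of very active users (hypothetical cap).
        if per_user[post["user"]] >= max_per_user:
            continue
        per_user[post["user"]] += 1
        kept.append(post)
    return kept

# Toy records, invented for illustration only.
sample_posts = [
    {"user": "a", "lat": 51.5, "lon": -0.12, "text": "#bigben"},
    {"user": "a", "lat": None, "lon": None, "text": "no location"},
    {"user": "a", "lat": 51.5, "lon": -0.12, "text": "#bigben again"},
    {"user": "a", "lat": 51.5, "lon": -0.13, "text": "third geotagged post"},
    {"user": "b", "lat": 48.86, "lon": 2.35, "text": "#louvre"},
]
filtered = filter_geotagged(sample_posts)
```

In this sketch the cap is applied in arrival order; a real system might instead sample each user's posts at random or weight them.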

After collecting the data, I will remove incomplete records. To further clean the dataset, I will remove redundant photos, which occur when a user takes many pictures of one attraction within a short time. I will then perform geospatial clustering on the geo-tagged social media data to obtain user points of interest (POIs). Next, the model will assemble extended tourist–location combinations by coupling travelers’ location histories and the associated temperatures, weather conditions, seasons and holidays with the attributes of user POIs. I will then determine the attributes to use in the machine learning algorithm. Many attributes may affect the recommendation system, for example ‘tags’, ‘date Taken’, ‘latitude’, ‘longitude’, and the user’s ‘age’ and ‘gender’. I will start with some basic attributes and add others as needed.
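The cleaning and clustering steps above can be sketched in Python as follows. This is only a rough illustration under stated assumptions: the photo fields (`user`, `taken`, `lat`, `lon`), the 30-minute window and the distance thresholds are hypothetical, and the greedy radius-based grouping is a simple stand-in for whichever geospatial clustering method is finally chosen (a density-based algorithm would be a common alternative).

```python
import math
from datetime import datetime, timedelta

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometres."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))

def drop_bursts(photos, window=timedelta(minutes=30), radius_km=0.2):
    """Drop a photo if the same user's last kept photo is close in both time and space."""
    kept, last = [], {}
    for p in sorted(photos, key=lambda p: (p["user"], p["taken"])):
        prev = last.get(p["user"])
        if (prev and p["taken"] - prev["taken"] <= window
                and haversine_km(prev["lat"], prev["lon"], p["lat"], p["lon"]) <= radius_km):
            continue
        kept.append(p)
        last[p["user"]] = p
    return kept

def cluster_pois(photos, radius_km=0.5):
    """Greedy grouping: join the first cluster whose centre is within radius_km, else start one."""
    clusters = []
    for p in photos:
        for c in clusters:
            if haversine_km(c["lat"], c["lon"], p["lat"], p["lon"]) <= radius_km:
                c["members"].append(p)
                n = len(c["members"])
                # Incremental update of the cluster centre (running mean).
                c["lat"] += (p["lat"] - c["lat"]) / n
                c["lon"] += (p["lon"] - c["lon"]) / n
                break
        else:
            clusters.append({"lat": p["lat"], "lon": p["lon"], "members": [p]})
    return clusters

# Toy data, invented for illustration: three quick shots by user "a" at one spot.
photos = [
    {"user": "a", "taken": datetime(2019, 5, 1, 10, 0), "lat": 48.8584, "lon": 2.2945},
    {"user": "a", "taken": datetime(2019, 5, 1, 10, 5), "lat": 48.8584, "lon": 2.2945},
    {"user": "a", "taken": datetime(2019, 5, 1, 10, 10), "lat": 48.8584, "lon": 2.2945},
    {"user": "b", "taken": datetime(2019, 5, 1, 11, 0), "lat": 48.8584, "lon": 2.2945},
    {"user": "a", "taken": datetime(2019, 5, 1, 14, 0), "lat": 48.8606, "lon": 2.3376},
]
unique_photos = drop_bursts(photos)
pois = cluster_pois(unique_photos)
```

Here the three near-simultaneous photos by one user collapse into a single record, and the remaining photos form two candidate POIs.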

  1. Recommender System Methods


The major methodologies used by recommender systems in location-based social networks fall into the following three groups: 1) Content-based recommendation, which uses data from a user’s profile, such as age, gender, and preferred cuisines, together with the features of locations, such as the categories and tags associated with a location, in order to make recommendations. 2) Link analysis-based recommendation, which applies link analysis models such as hypertext induced topic search (HITS) and PageRank to identify experienced users and interesting locations. 3) Collaborative filtering (CF) recommendation, which derives a user’s preferences from historical behavior such as location history. In this research, content-based and collaborative filtering-based recommendations are chosen for the following reasons:

  1. Content-based recommendations

Content-based recommendation systems match user preferences, discovered from users’ profiles, with features extracted from locations, such as tags and categories, to make recommendations. These systems require accurate and structured information for both the user profiles and the location features to make high-quality recommendations. With this approach, the system is robust against the cold-start problem for both new users and new locations: as long as a newly added user or location has appropriate descriptive content, it can be handled effectively.
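A minimal sketch of this matching idea is given below. It assumes that both the user profile and each location are described by a plain set of tags and uses Jaccard overlap as the matching score; this measure, the tag sets and the location names are illustrative assumptions, not a method prescribed by the literature cited here.

```python
def jaccard(a, b):
    """Overlap between two tag sets: |A ∩ B| / |A ∪ B| (0.0 when both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def rank_locations(user_tags, locations):
    """Return location names ordered by decreasing tag overlap with the user profile."""
    scored = [(jaccard(user_tags, tags), name) for name, tags in locations.items()]
    # Sort by score (descending), then by name so that ties are deterministic.
    return [name for score, name in sorted(scored, key=lambda s: (-s[0], s[1]))]

# Hypothetical locations and tags, for illustration only.
locations = {
    "city_museum": {"art", "history", "indoor"},
    "harbour_walk": {"outdoor", "sea", "walking"},
    "old_town": {"history", "walking", "food"},
}
ranking = rank_locations({"art", "history", "indoor", "food"}, locations)
```

Because the score depends only on descriptive tags, a brand-new location with tags can be ranked immediately, which is the cold-start robustness noted above.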

  1. Collaborative filtering-based recommendations

Collaborative filtering (CF) is widely used in conventional recommendation systems (Adomavicius & Tuzhilin, 2005). The intuition behind extending the CF model to recommendations in location-based social networks is that a user is more likely to visit a location that is preferred by similar users, which is what makes this approach semantic. As used by recommender systems in location-based social networks, the approach consists of three processes: 1) candidate selection, 2) similarity inference, and 3) recommendation score prediction.

Candidate selection: The first step of CF-based recommendation systems is to select a subset of candidate nodes to reduce the computational overhead. Traditional CF-based recommendation algorithms use the most similar users (or locations, activities, etc.) as the candidates. CF-based recommender systems in location-based social networks can also use geographic bounds and associations to constrain the candidate selection process. A spatial range can be computed to narrow down the candidate locations.
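The spatial-range idea can be illustrated with a short Python sketch that keeps only the locations within a given radius of the user's position. The function names, the 5 km radius and the approximate coordinates are assumptions made for this example.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometres."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))

def candidates_within(center, locations, radius_km):
    """Select candidate locations whose distance from center is at most radius_km."""
    lat, lon = center
    return [name for name, (plat, plon) in locations.items()
            if haversine_km(lat, lon, plat, plon) <= radius_km]

# Approximate coordinates, used only as sample input.
locations = {
    "british_museum": (51.5194, -0.1270),
    "tower_of_london": (51.5081, -0.0759),
    "stonehenge": (51.1789, -1.8262),
}
nearby = candidates_within((51.5074, -0.1278), locations, radius_km=5.0)
```

Only the candidates returned by this step would then be passed on to similarity inference and score prediction.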

Similarity inferences: Similarities between users (or locations, activities, etc.) are inferred from users’ ratings and location histories in location-based social networks. The traditional CF models can be divided into two subgroups: 1) user-based models using similarity measures between each pair of users 2) item-based models that use similarity measures between each pair of items (media content, activities, etc.).

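The following equation gives a simple user similarity computation for users u and u′ using the cosine similarity function in a user-based CF model:

$$\mathrm{sim}(u, u') = \frac{\sum_{o \in O} r(o,u)\, r(o,u')}{\sqrt{\sum_{o \in O} r(o,u)^2}\,\sqrt{\sum_{o \in O} r(o,u')^2}}$$

where r(o,u) is the rating user u gives to each object o in the set of all objects O.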
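For concreteness, cosine similarity between two users can be computed as in the sketch below. It assumes that each user's ratings are stored as a dictionary from object to rating and that an unrated object counts as 0; both are assumptions of this example.

```python
import math

def cosine_similarity(ratings_u, ratings_v):
    """Cosine similarity between two users' rating dictionaries (object -> rating)."""
    # Objects rated by only one of the two users contribute 0 to the dot product.
    dot = sum(r * ratings_v.get(o, 0.0) for o, r in ratings_u.items())
    norm_u = math.sqrt(sum(r * r for r in ratings_u.values()))
    norm_v = math.sqrt(sum(r * r for r in ratings_v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # similarity is undefined for an empty profile; treated as 0 here
    return dot / (norm_u * norm_v)

same = cosine_similarity({"museum": 3, "park": 4}, {"museum": 3, "park": 4})
disjoint = cosine_similarity({"museum": 1}, {"park": 1})
partial = cosine_similarity({"museum": 1, "park": 1}, {"museum": 1})
```

Identical profiles give a similarity of 1, profiles with no object in common give 0, and partially overlapping profiles fall in between.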

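Recommendation score prediction: Finally, CF systems predict a recommendation score for each object o_i (locations, social media, etc.) in the candidate set. These scores are calculated from ratings given by the set of users (U) and the similarity measures between individual users. In its common similarity-weighted form, the predicted score of object o_i for user u is:

$$\hat{r}(o_i, u) = \frac{\sum_{u' \in U} \mathrm{sim}(u, u')\, r(o_i, u')}{\sum_{u' \in U} \left|\mathrm{sim}(u, u')\right|}$$

where sim(u, u′) is the similarity between users u and u′ and r(o_i, u′) is the rating user u′ gives to object o_i.
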
The advantages of the collaborative filtering models are that there is no need to maintain well-structured descriptions of items (locations, activities, etc.) or users, and they take advantage of community opinions which provide high quality recommendations.
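The score prediction step described above can be sketched as a similarity-weighted average of other users' ratings. This is one common formulation, shown under the assumption that ratings are stored as nested dictionaries; the helper names and the toy ratings are hypothetical.

```python
import math

def cosine(ru, rv):
    """Cosine similarity between two rating dictionaries (object -> rating)."""
    dot = sum(r * rv.get(o, 0.0) for o, r in ru.items())
    nu = math.sqrt(sum(r * r for r in ru.values()))
    nv = math.sqrt(sum(r * r for r in rv.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def predict_score(target, obj, ratings, similarity=cosine):
    """Similarity-weighted average of the ratings other users gave to obj."""
    num = den = 0.0
    for other, other_ratings in ratings.items():
        if other == target or obj not in other_ratings:
            continue  # only users who actually rated obj contribute
        s = similarity(ratings[target], other_ratings)
        num += s * other_ratings[obj]
        den += abs(s)
    return num / den if den else None  # None: no usable neighbours

# Toy ratings, invented for illustration.
ratings = {
    "u": {"a": 5, "b": 3},
    "v": {"a": 5, "b": 3, "c": 4},
    "w": {"a": 1, "c": 2},
}
score = predict_score("u", "c", ratings)
missing = predict_score("u", "z", ratings)
```

In this toy case user v is more similar to u than w is, so the predicted score for object c lies closer to v's rating of 4 than to w's rating of 2.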

I will integrate CF and content-based recommendation into a hybrid recommender system in order to offset the weaknesses of each.
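How the two approaches are combined is still open; a weighted blend is one simple possibility, sketched below. It assumes that both scores have been scaled to the same range, and the weight `alpha` and the fallback rule are hypothetical choices made for illustration.

```python
def hybrid_score(cf_score, content_score, alpha=0.7):
    """Blend a CF score with a content-based score; both assumed to lie in [0, 1]."""
    if cf_score is None:
        # Cold start: no usable neighbours, so rely on content alone.
        return content_score
    return alpha * cf_score + (1 - alpha) * content_score

blended = hybrid_score(0.8, 0.4, alpha=0.5)
cold_start = hybrid_score(None, 0.4)
```

The fallback branch shows where the content-based component compensates for the cold-start weakness of CF, while the blend lets community opinion dominate when it is available.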

  1. Evaluation methods

Recommender systems in location-based social networks have typically used two methods to assess the effectiveness of their recommendations: user studies, and precision and recall ratios.

User studies: In order to conduct a user study of a recommender system, multiple subjects are invited to use the recommender system and evaluate its performance (Zheng et al., 2010). For each recommendation task, the subjects need to evaluate the top-k recommendations suggested by the recommendation system.

To create a baseline for this assessment, all the feedback provided by the subjects is collected to create an ideal ranking list. As recommendations are based on result rankings, the normalized discounted cumulative gain (nDCG) is used to measure the effectiveness of the recommendation list (Manning, Raghavan & Schütze, 2008). nDCG is also commonly used in information retrieval to measure search engine performance. A higher nDCG value indicates that more of the relevant items appear near the top of the results list.
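As a brief illustration, nDCG can be computed as in the following sketch. It assumes graded relevance scores listed in the order the system ranked the items and uses the linear-gain form rel / log2(rank + 1); other variants, such as an exponential gain, also exist.

```python
import math

def dcg(relevances):
    """Discounted cumulative gain with linear gain and a log2 rank discount."""
    return sum(rel / math.log2(rank + 1) for rank, rel in enumerate(relevances, start=1))

def ndcg(relevances, k=None):
    """DCG of the top-k list divided by the DCG of the ideal (sorted) top-k list."""
    top = relevances[:k]
    ideal = sorted(relevances, reverse=True)[:k]
    ideal_dcg = dcg(ideal)
    return dcg(top) / ideal_dcg if ideal_dcg else 0.0

perfect = ndcg([3, 2, 1, 0])   # items already in the ideal order
reversed_order = ndcg([0, 1, 2, 3])
```

A list that matches the ideal ranking scores 1, and the score falls as relevant items move further down the list.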

Precision and recall ratios: Precision and recall ratios are also used to evaluate the effectiveness of recommendations in location-based social networks (Ye et al., 2011). To use this evaluation method, a user’s location history is divided into two parts: the location history generated within a query area, which is used as ground truth, and the rest of the user’s location history, which is used as a training set to learn the user’s preferences and build the recommendation model. The model is then evaluated on its ability, given only the training data (the location history outside the query region), to suggest those sites within the query region that the user has actually visited.
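The split and the two ratios can be sketched as follows. The record fields, the bounding box used as the query region and the sample recommendations are hypothetical and serve only to show the mechanics.

```python
def split_history(history, in_query_region):
    """Visits inside the query region become ground truth; the rest is training data."""
    ground_truth = {v["place"] for v in history if in_query_region(v)}
    training = [v for v in history if not in_query_region(v)]
    return ground_truth, training

def precision_recall(recommended, ground_truth):
    """Precision = hits / recommendations made; recall = hits / places actually visited."""
    hits = sum(1 for place in recommended if place in ground_truth)
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(ground_truth) if ground_truth else 0.0
    return precision, recall

def in_query_box(visit):
    # Hypothetical rectangular query region.
    return 51.3 <= visit["lat"] <= 51.7 and -0.5 <= visit["lon"] <= 0.3

# Toy location history, invented for illustration.
history = [
    {"place": "museum", "lat": 51.52, "lon": -0.13},
    {"place": "tower", "lat": 51.51, "lon": -0.08},
    {"place": "louvre", "lat": 48.86, "lon": 2.34},
]
ground_truth, training = split_history(history, in_query_box)
precision, recall = precision_recall(["museum", "bridge", "palace", "market"], ground_truth)
```

In this toy case one of the four recommended places was actually visited, and it is one of the two visited places inside the region.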


The following is a draft timetable of research progression. I also intend to draft more detailed annual and quarterly plans.

March 2019: Commencement of candidature

March 2019 – August 2019: Relevant training and skills update; literature review

September 2019 – February 2020: Identifying sources for collecting data; full research proposal

March 2020 – August 2020: Collecting data; working on algorithms

September 2020 – November 2020: Data analysis and writing

December 2020 – February 2021: Editing and revision

March 2021: Submission of thesis




  • Adomavicius, G., & Tuzhilin, A. (2005). Toward the next generation of recommender systems: A survey of the state-of-the-art and possible extensions. IEEE Transactions on Knowledge and Data Engineering, 17(6), 734-749.
  • Al-Shamri, M. Y. H., & Bharadwaj, K. K. (2008). Fuzzy-genetic approach to recommender systems based on a novel hybrid user model. Expert systems with applications, 35(3), 1386-1399.
  • Cai, G., Lee, K., & Lee, I. (2018). Itinerary recommender system with semantic trajectory pattern mining from geo-tagged photos. Expert Systems with Applications, 94, 32-40.
  • Cao, L., Luo, J., Gallagher, A. C., Jin, X., Han, J., & Huang, T. S. (2010, March). A worldwide tourism recommendation system based on geotagged web photos. In ICASSP (pp. 2274-2277).
  • Celma, Ò., & Serra, X. (2008). FOAFing the music: Bridging the semantic gap in music recommendation. Web Semantics: Science, Services and Agents on the World Wide Web, 6(4), 250-256.
  • Cheng, A. J., Chen, Y. Y., Huang, Y. T., Hsu, W. H., & Liao, H. Y. M. (2011, November). Personalized travel recommendation by mining people attributes from community-contributed photos. In Proceedings of the 19th ACM international conference on Multimedia (pp. 83-92). ACM.
  • Cheng, Z., Caverlee, J., Lee, K., & Sui, D. Z. (2011). Exploring millions of footprints in location sharing services. ICWSM2011, 81-88.
  • Clements, M., Serdyukov, P., De Vries, A. P., & Reinders, M. J. (2010). Using flickr geotags to predict user travel behaviour. In Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval (pp. 851-852). ACM.
  • De Choudhury, M., Feldman, M., Amer-Yahia, S., Golbandi, N., Lempel, R., & Yu, C. (2010, June). Automatic construction of travel itineraries using social breadcrumbs. In Proceedings of the 21st ACM conference on Hypertext and hypermedia (pp. 35-44). ACM.
  • Göksedef, M., & Gündüz-Öğüdücü, Ş. (2010). Combination of Web page recommender systems. Expert Systems with Applications, 37(4), 2911-2922.
  • Min, S. H., & Han, I. (2005). Detection of the customer time-variant pattern for improving recommender systems. Expert Systems with Applications, 28(2), 189-199.
  • Memon, I., Chen, L., Majid, A., Lv, M., Hussain, I., & Chen, G. (2015). Travel recommendation using geo-tagged photos in social media for tourist. Wireless Personal Communications, 80(4), 1347-1362.
  • Manning, C. D., Raghavan, P., & Schütze, H. (2008). Introduction to information retrieval. Cambridge University Press. Cambridge, UK.
  • Shao, H., Zhang, Y., & Li, W. (2017). Extraction and analysis of city’s tourism districts based on social media data. Computers, Environment and Urban Systems, 65, 66-78.
  • Subramaniyaswamy, V., Vijayakumar, V., Logesh, R., & Indragandhi, V. (2015). Intelligent travel recommendation system by mining attributes from community contributed photos. Procedia Computer Science, 50, 447-455.
  • Sun, X., Huang, Z., Peng, X., Chen, Y., & Liu, Y. (2018). Building a model-based personalised recommendation approach for tourist attractions from geotagged social media data. International Journal of Digital Earth, 1-18.
  • Wan, L., Hong, Y., Huang, Z., Peng, X., & Li, R. (2018). A hybrid ensemble learning method for tourist route recommendations based on geo-tagged social networks. International Journal of Geographical Information Science, 1-22.
  • Ye, M., Yin, P., Lee, W. C., & Lee, D. L. (2011). Exploiting geographical influence for collaborative point-of-interest recommendation. In Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval (pp. 325-334). ACM.
  • Zheng, V. W., Zheng, Y., Xie, X., & Yang, Q. (2010). Collaborative location and activity recommendations with GPS history data. In Proceedings of the 19th international conference on World wide web (pp. 1029-1038). ACM.
  • Zhou, X., Xu, C., & Kimmons, B. (2015). Detecting tourism destinations using scalable geospatial analysis based on cloud computing platform. Computers, Environment and Urban Systems, 54, 144-153.

