Big Data

Why is this method important?
The integration of Big Data and analytical tools allows researchers to capture real-time tourist behaviors and process vast amounts of unstructured data (text, images, video) that traditional surveys cannot handle. These methods are crucial for “nowcasting” tourism demand, understanding sentiment at scale, and automating the extraction of semantic meaning from user-generated content. AI and machine learning provide the computational power necessary to identify complex, non-linear patterns in high-dimensional datasets, offering actionable insights for experience design and destination management.
 
What has our team done so far? 
We have been at the forefront of applying AI and machine learning to big data tourism research. We pioneered the use of search engine query data (e.g., Google Trends, Baidu Index) and web traffic data for high-frequency tourism demand forecasting. In the realm of unstructured data, we employed Convolutional Neural Networks (CNN), a deep learning algorithm, to classify health-related issues from social media posts and thereby monitor tourist experiences under air pollution. We also used Support Vector Machines (SVM) to measure topic matching in management responses. Recently, we leveraged Large Language Models (LLMs), specifically GPT-4o, to perform Chinese word segmentation and construct a tourism lexicon, enabling the quantification of “Government Attention to Tourism” from policy documents.

Recommendation

Yang Yang, Qianwen Yin, Gyusang Hwang, Sai Liang, Dejin Yang

2026

International Journal of Contemporary Hospitality Management

Lee, Eunji; Yang, Yang

2026

Journal of Travel & Tourism Marketing

Yang, Yang; Tan, Karen Pei-Sze; Liu, Yi Vanessa

2026

Tourism Management

Related Presentations

Spatial Analytics

Workshop on Informatics, Data Science, and Economics in Hospitality and Tourism Research

University of Houston, Houston, TX

COVID19tourism Index and its application in tourism management

University of Perpignan

Perpignan, France

Machine Learning and Artificial Intelligence Research in Tourism and Hospitality

University of Macau

Macau (Online)

Tourist behavior analysis using online user generated data

Kyung Hee University

Seoul, Korea (Online)

Related Resources

Restaurant Week Impact Explorer

Tool link: https://uflyy.github.io/restaurant-week/

Academic Reference:

Yang, Y., Yin, Q., Hwang, G. K., Liang, S., & Yang, D. (forthcoming). Restaurant week paradox: Asymmetric effects of event-based marketing on online engagement. International Journal of Contemporary Hospitality Management.

1. Geographic Map View

  • Year Filtering: Use the dropdown in the top right corner to filter participation status by a specific year.
  • Frequency Mapping: If “All Years” is selected, the size (radius) of the cherry-red markers expands dynamically based on the total number of years a restaurant has participated.
  • Interaction: Click on any marker to open an information popup with historical baseline data and predictive metrics.
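
The frequency mapping described above can be sketched as follows. This is a minimal illustration, not the tool's actual code: the participation records, the function name marker_radii, and the base and step radius values are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical participation records: (restaurant_id, year).
records = [
    ("r1", 2019), ("r1", 2020), ("r1", 2021),
    ("r2", 2020),
    ("r3", 2019), ("r3", 2021),
]


def marker_radii(records, year=None, base=4.0, step=2.0):
    """Return {restaurant_id: marker radius}.

    year=None stands for "All Years": the radius grows with the number of
    distinct years a restaurant participated. With a specific year, every
    restaurant that participated in that year gets the base radius.
    The base and step values are illustrative assumptions.
    """
    if year is None:
        # set() removes duplicate (id, year) pairs before counting years.
        counts = Counter(rid for rid, _ in set(records))
        return {rid: base + step * (n - 1) for rid, n in counts.items()}
    return {rid: base for rid, y in records if y == year}


all_years = marker_radii(records)
only_2020 = marker_radii(records, year=2020)
```

A linear step per additional year is only one possible choice; a square-root scaling would keep marker area, rather than radius, proportional to participation frequency.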

2. Cohort Comparison

  • Purpose: Compares the structural and baseline performance differences between restaurants that participated in Restaurant Week (RW) and those that did not.
  • Visual Variables: Recharts-based grouped bar charts are used to visualize the mean values of quantitative variables (Rating, Review Volume, Local/First-time mix) and the proportional makeup of categorical variables (Price Level, Fine Dining status).
  • Frequency Histogram: When filtering by “All Years”, a unique frequency distribution chart activates, revealing the historical retention and loyalty of participating restaurants.
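
The cohort comparison above reduces to group means for quantitative variables and within-group shares for categorical ones. The sketch below shows that computation on a few made-up rows; the field names and values are hypothetical and do not come from the tool's data.

```python
from collections import Counter
from statistics import mean

# Hypothetical restaurant rows; field names are assumptions for illustration.
restaurants = [
    {"participated": True, "rating": 4.5, "price_level": "$$$"},
    {"participated": True, "rating": 4.0, "price_level": "$$"},
    {"participated": False, "rating": 3.5, "price_level": "$$"},
    {"participated": False, "rating": 4.0, "price_level": "$"},
]


def cohort_summary(rows, numeric_key, categorical_key):
    """Summarize participants (True) and non-participants (False).

    Returns the mean of a quantitative variable and the proportional
    makeup of a categorical variable for each cohort.
    """
    summary = {}
    for flag in (True, False):
        group = [r for r in rows if r["participated"] is flag]
        counts = Counter(r[categorical_key] for r in group)
        summary[flag] = {
            "mean": mean(r[numeric_key] for r in group),
            "shares": {k: v / len(group) for k, v in counts.items()},
        }
    return summary


summary = cohort_summary(restaurants, "rating", "price_level")
```

Each cohort's mean and shares would then feed one pair of bars in a grouped bar chart.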

3. Single Unit Simulation

  • Purpose: A sandbox environment detached from the map data. It allows you to model hypothetical scenarios based on econometric estimation formulas.
  • Controls: Adjust the baseline consumer mix (Local Reviewers and First-Time Reviewers) and toggle the Fine Dining status to see real-time marginal treatment effects.
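
One way such a simulation could work is to evaluate a marginal effect that varies linearly with the consumer mix and the Fine Dining indicator. The sketch below assumes that linear-interaction form, and every coefficient in it is a placeholder chosen for the example; none of them is an estimate from the underlying paper or the tool.

```python
from dataclasses import dataclass


@dataclass
class Coefficients:
    """Placeholder coefficients for a hypothetical interaction specification."""
    base: float
    local: float
    first_time: float
    fine_dining: float


# Placeholder values only; NOT estimates from the paper.
PLACEHOLDER = Coefficients(base=0.10, local=-0.20, first_time=0.30, fine_dining=0.05)


def marginal_effect(local_share, first_time_share, fine_dining, coef=PLACEHOLDER):
    """Marginal treatment effect under an assumed linear-interaction form:

    effect = base + local * local_share
                  + first_time * first_time_share
                  + fine_dining * 1[fine dining]
    """
    if not (0.0 <= local_share <= 1.0 and 0.0 <= first_time_share <= 1.0):
        raise ValueError("shares must lie between 0 and 1")
    return (
        coef.base
        + coef.local * local_share
        + coef.first_time * first_time_share
        + coef.fine_dining * (1.0 if fine_dining else 0.0)
    )


effect = marginal_effect(local_share=0.5, first_time_share=0.5, fine_dining=True)
```

Moving a slider corresponds to calling marginal_effect again with a new share, which is why the displayed effect can update in real time.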

Bike sharing and tourism impact tool

Tool link: https://uflyy.github.io/bike-sharing/

The tool is an interactive, data-driven dashboard designed to visualize the synergistic relationship between urban micro-mobility and tourism. Grounded in empirical econometric research, the tool uses Chicago as a case study to demonstrate how bike-sharing systems impact the demand and visitor experience of nearby tourist attractions. It features interactive spatial mapping and a predictive policy simulator, bridging the gap between academic research and smart city tourism management.

Rating Adjustment Tool

Tool link: https://uflyy.github.io/rating-adjustment/

The Rating Adjustment Tool is an analytical web application designed to standardize hotel online reviews by correcting for “scaling heterogeneity,” the phenomenon where different types of reviewers interpret and use rating scales differently. Powered by a Hierarchical Ordered Probit (HOPIT) model grounded in peer-reviewed academic research, the tool mathematically controls for systematic response biases tied to traveler demographics (such as age and gender) and trip characteristics (such as travel type and reviewer expertise). It offers individual review adjustments, hotel-level aggregate score calculations, and batch CSV processing. Together, these features translate raw user ratings into comparable latent scores and standardized 1–5 metrics, supporting fairer and more accurate hotel evaluations.
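
The idea of scaling heterogeneity can be illustrated with a small sketch. It uses a common HOPIT parameterization in which the first cut point is linear in reviewer covariates and each later cut point adds a positive, exponentiated increment. The covariate (a business-trip dummy), the coefficient values, and the function names are all assumptions for illustration; they are not the estimates or the code behind the tool.

```python
import math
from bisect import bisect_left

# Hypothetical threshold coefficients for a 5-point scale (4 cut points).
# Each row multiplies the covariate vector z = (1, is_business_trip).
GAMMAS = [(-1.0, 0.5), (0.0, 0.0), (0.0, 0.0), (0.0, 0.0)]


def thresholds(z, gammas):
    """Reviewer-specific cut points.

    tau_1 = gamma_1 . z
    tau_k = tau_{k-1} + exp(gamma_k . z)   for k > 1 (keeps cut points ordered)
    """
    def dot(g):
        return sum(gi * zi for gi, zi in zip(g, z))

    taus = [dot(gammas[0])]
    for g in gammas[1:]:
        taus.append(taus[-1] + math.exp(dot(g)))
    return taus


def rating_from_latent(latent, taus):
    """Map a latent score to a 1..5 rating: 1 + number of cut points below it."""
    return bisect_left(taus, latent) + 1


leisure_taus = thresholds((1, 0), GAMMAS)
business_taus = thresholds((1, 1), GAMMAS)

# The same latent evaluation yields different observed ratings.
leisure_rating = rating_from_latent(1.2, leisure_taus)
business_rating = rating_from_latent(1.2, business_taus)
```

With these placeholder coefficients, the hypothetical business traveler has higher cut points and so reports a lower rating for the same latent evaluation. Adjusting a rating amounts to reversing this mapping and re-expressing the latent score against one common set of cut points.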

This rating adjustment tool is built on the theoretical framework and empirical results from the following paper:

Leung, X. Y., & Yang, Y. (2020). Are all five points equal? Scaling heterogeneity in hotel online ratings. International Journal of Hospitality Management, 88, 102539.