Big Data

Why is this method important?
The integration of Big Data and analytical tools allows researchers to capture tourist behaviors in real time and to process vast amounts of unstructured data (text, images, video) that traditional surveys cannot handle. These methods are crucial for “nowcasting” tourism demand, understanding sentiment at scale, and automating the extraction of semantic meaning from user-generated content. AI and machine learning provide the modeling tools needed to identify complex, non-linear patterns in high-dimensional datasets, offering actionable insights for experience design and destination management.
 
What has our team done so far? 
We have been at the forefront of applying AI and machine learning to big data tourism research. We pioneered the use of search engine query data (e.g., Google Trends, Baidu Index) and web traffic data for high-frequency tourism demand forecasting. In the realm of unstructured data, we employed Convolutional Neural Networks (CNN), a deep learning algorithm, to classify health-related issues in social media posts and thereby monitor tourist experiences under air pollution. We also used Support Vector Machines (SVM) to measure topic matching in management responses. Recently, we leveraged Large Language Models (LLMs), specifically GPT-4o, to perform Chinese word segmentation and construct a tourism lexicon, enabling the quantification of “Government Attention to Tourism” from policy documents.
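To make the lexicon-based idea concrete, here is a minimal Python sketch of one way attention could be quantified: the share of a document's tokens that appear in a tourism lexicon. This is an illustrative assumption, not the procedure used in the published study. The function name `attention_share`, the sample lexicon, and the pre-segmented tokens are all hypothetical, and the segmentation step (performed by an LLM in the work described above) is taken as already done.

```python
from collections import Counter


def attention_share(tokens, lexicon):
    """Return the fraction of tokens that belong to the lexicon (0.0 if empty)."""
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    # Count every occurrence of every token that is a lexicon term.
    hits = sum(n for term, n in counts.items() if term in lexicon)
    return hits / len(tokens)


# Hypothetical lexicon and already-segmented policy text (illustrative only).
tourism_lexicon = {"旅游", "景区", "游客"}
report_tokens = ["政府", "推动", "旅游", "发展", "提升", "景区", "服务", "质量"]

share = attention_share(report_tokens, tourism_lexicon)
print(f"Tourism attention share: {share:.3f}")
```

A simple share like this is easy to compare across documents of different lengths, although a real index might weight terms or normalize by document type.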

Recommendation

Yang, Lisi; Yang, Yang; Huang, Xijia; Yan, Kai

2026

TOURISM MANAGEMENT

Zhang, Xiaowei; Huang, Xingyu; Yang, Yang; Liu, Wei

2026

TOURISM MANAGEMENT

Zhang, Ziqiong; Yang, Yang; Wang, Xueyan; Wang, Chuxin; Zhang, Zili

2026

INFORMATION & MANAGEMENT

Related Presentations

Spatial Analytics

Workshop on Informatics, Data Science, and Economics in Hospitality and Tourism Research

University of Houston, Houston, TX

COVID19tourism Index and its application in tourism management

University of Perpignan

Perpignan, France

Machine Learning and Artificial Intelligence Research in Tourism and Hospitality

University of Macau

Macau (Online)

Tourist behavior analysis using online user generated data

Kyung Hee University

Seoul, Korea (Online)

Related Resources

Government Attention to Tourism Data

This web-based application is designed to provide researchers and policymakers with an interactive platform for querying and analyzing data regarding Government Attention to Tourism (GAT) in China.

Based on the research findings of Yang, Yang, Huang & Yan (2026), this tool visualizes the spatiotemporal relationships and statistical correlations between the GAT index and various tourism economic indicators. The interface is bilingual (Chinese and English); click the language toggle button (中文/EN) in the top-right corner of the page to switch languages in real time.

Key Features:

Multidimensional Data Coverage: Includes GAT indices and 11 key tourism economic indicators at both the provincial and prefectural (city) levels.

Interactive Visualization: Provides trend analysis, spatial distribution maps, and statistical correlation scatter plots.

Data Resources: Supports viewing summaries of raw data and downloading data.

 

Pulse of American Domestic Tourism

The “Pulse of American Domestic Tourism” project serves as a digital monitor for the nation’s internal mobility. By mining transportation-derived mobility data, we develop a comprehensive matrix of tourism flows connecting American metropolitan statistical areas (MSAs). This data-driven approach reveals the rhythmic shifts in visitor demand and regional connectivity. Crucially, we ground these digital insights through extensive cross-validation with household survey data, creating a verified, high-resolution framework for understanding the evolving landscape of domestic travel.
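The two steps described above, aggregating mobility records into an origin-destination flow matrix and checking it against survey data, can be sketched in Python as follows. Everything here is a simplifying assumption for illustration: the trip records, area labels, survey figures, and helper names (`build_flow_matrix`, `pearson`) are hypothetical, and the project's actual data sources and validation procedure are not specified on this page.

```python
from collections import defaultdict
from math import sqrt


def build_flow_matrix(trips):
    """Sum trip counts into a {(origin, destination): total} mapping."""
    flows = defaultdict(int)
    for origin, destination, count in trips:
        if origin != destination:  # keep only travel between different areas
            flows[(origin, destination)] += count
    return dict(flows)


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Hypothetical mobility records: (origin area, destination area, trips).
trips = [
    ("Houston", "Dallas", 120), ("Houston", "Dallas", 80),
    ("Dallas", "Houston", 150), ("Houston", "Austin", 90),
    ("Austin", "Austin", 40),  # within-area movement, dropped
]
flows = build_flow_matrix(trips)

# Hypothetical survey-based trip counts for the same origin-destination pairs.
survey = {("Houston", "Dallas"): 21, ("Dallas", "Houston"): 14,
          ("Houston", "Austin"): 10}
pairs = sorted(survey)
r = pearson([flows[p] for p in pairs], [survey[p] for p in pairs])
print(f"Correlation between mobility flows and survey counts: {r:.2f}")
```

A high correlation across origin-destination pairs would suggest that the mobility-derived flows track the survey benchmark; in practice one would use many more pairs and possibly rank-based or scale-adjusted measures.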

COVID19tourism Index

The COVID19tourism index was developed to monitor the pandemic’s multifaceted impact on the global tourism industry. It comprises five distinct sub-indices, each designed to track the specific effects of COVID-19 on a different aspect of tourism activity.

Dashboard Utility

The index enables destinations to perform three primary functions:
• Evaluate Recovery: Destinations can use the tool to assess their current recovery status.
• Forecast: The tool allows users to produce rigorous forecasts of tourism trends.
• Benchmark: Destinations can use the index to benchmark their performance against potential competitors.

Link to the COVID19tourism Index Dashboard

Link to download the data