Coding with Keras for Transfer Learning: Measuring Impact of Built Environments on Health Part III

Posted on March 11, 2020 (updated April 22, 2020) by Huilin Zhu

By Huilin Zhu

My digital project’s research question explores how the built environment affects people’s health, especially body weight, in the state of Pennsylvania. To measure the built environment consistently, I have been downloading satellite images through Google’s Static Maps API. My first blog post introduced how to use Python and the Google Static Maps API to download all the tile images. I then stitched the tiles together by census tract, imported the resulting images, and converted each image into a 3-dimensional array in Python.
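For reference, here is a minimal sketch of how a single tile can be fetched from the Static Maps API; the coordinates, zoom level, and API key below are placeholders, and the full tiling workflow is covered in the first blog post.

import requests

# Placeholder values; see the first blog post for the full tiling workflow
API_KEY = 'YOUR_API_KEY'
params = {
    'center': '40.6084,-75.4902',   # a point in Allentown, PA
    'zoom': 18,
    'size': '224x224',
    'maptype': 'satellite',
    'key': API_KEY,
}
response = requests.get('https://maps.googleapis.com/maps/api/staticmap', params=params)
with open('tile.png', 'wb') as f:
    f.write(response.content)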

The next step is to use a pre-trained Convolutional Neural Network (CNN) to extract features of the built environment from the satellite maps. For object recognition, I am using a deep convolutional network trained on the ImageNet dataset and developed by Oxford’s renowned Visual Geometry Group (VGG). My previous blog post, “Measuring the Impact of Built Environments on Health Part II: Use of Transfer Learning,” discussed why and how I use the VGG model as the pre-trained model to extract features of the built environment. In this post, I will focus on how to use Python to apply the VGG16 model for transfer learning.

Installation

I installed TensorFlow and Keras to implement VGG16. It is a good idea to upgrade pip before installing them:

pip install --upgrade pip
pip install tensorflow
pip install keras
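To confirm the installation, you can check the package versions from a Python session:

import tensorflow as tf
import keras

# Both imports should succeed and print version strings
print(tf.__version__)
print(keras.__version__)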

Set up a model

Keras provides both the 16-layer and 19-layer versions via the VGG16 and VGG19 classes; my study uses the VGG16 model. VGG16 is trained on ImageNet, whose images are classified into animals, geological formations, natural objects, and many other categories. My input images, by contrast, are Google satellite images showing parks, highways, green streets, crosswalks, and housing. Because of this mismatch between the images the pre-trained model was built on and my own input data, I will extract features from the second fully connected layer instead of the final prediction layer.

The model can be created as follows:

from keras.applications.vgg16 import VGG16

# load VGG16 with ImageNet weights; inputs are 224x224 RGB images
model = VGG16(weights='imagenet', input_shape=(224,224,3))
model.summary()  # print the layer architecture

My project uses the second fully connected layer of the VGG16 network to extract information about the built environment. (If you instead want to identify the specific objects that exist in an image, you should use the final prediction layer as the extraction layer.) The following code shows how I use the second fully connected layer, ‘fc2’, to extract information from each image.

from keras.models import Model

layer_name = 'fc2'  # the second fully connected layer
# build a model whose output is the 4096-dimensional 'fc2' activations
intermediate_layer_model = Model(inputs=model.input,
                                 outputs=model.get_layer(layer_name).output)
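As an aside, here is a minimal sketch of the prediction-layer route mentioned above; img_array stands in for a single 224x224x3 satellite image array and is a placeholder, not part of my pipeline.

import numpy as np
from keras.applications.vgg16 import preprocess_input, decode_predictions

x = preprocess_input(np.expand_dims(img_array, axis=0))  # add a batch dimension
preds = model.predict(x)
print(decode_predictions(preds, top=3)[0])  # top 3 ImageNet classes with probabilities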

Extract the features of the maps

All images are loaded and converted to 3-dimensional arrays in Python. Each item in x_list is the name of a variable holding all the NumPy arrays for one specific census tract. For example, x_4202000-42077001800 refers to a list containing all the arrays in the census tract whose Place_tractID is 4202000-42077001800. Calling eval(x_list[0]) returns the NumPy arrays for each image in the first census tract in Allentown.
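As a minimal sketch of how such arrays can be built (assuming the stitched images for one tract are stored as image files, with their paths collected in a list), the hypothetical helper below loads each image at VGG16’s expected size and stacks the results:

import numpy as np
from keras.preprocessing import image

def load_tract_images(paths):
    # load each image at 224x224 (VGG16's input size) and stack into (n, 224, 224, 3)
    arrays = [image.img_to_array(image.load_img(p, target_size=(224, 224))) for p in paths]
    return np.stack(arrays)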

The following code shows how to implement the VGG pre-trained model to extract the features of the built environment in each census tract.

import pandas as pd

createVar = locals()  # allows creating variables named after each tract
data_sum = []
for i in x_list:
    # get 4096 variables for each image in the tract
    createVar['layer_output_' + i] = intermediate_layer_model.predict(eval(i))
    # take the mean of the 4096 variables over all images in the census tract
    createVar['mean_' + i] = pd.DataFrame(eval('layer_output_' + i)).mean(axis=0)
    k = eval('mean_' + i)
    k['Place_TractID'] = i
    if len(data_sum) == 0:
        data_sum = k
    else:
        data_sum = pd.concat([data_sum, k], axis=1, ignore_index=True)
data_sum  # show the results
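The createVar/eval pattern works, but creating variables dynamically through locals() is fragile. A dictionary keyed by tract ID is a more idiomatic way to get the same result; this sketch assumes tract_arrays maps each Place_tractID string to its stacked image array:

features = {}
for tract_id, batch in tract_arrays.items():
    layer_output = intermediate_layer_model.predict(batch)  # shape (n_images, 4096)
    features[tract_id] = layer_output.mean(axis=0)          # average over images
data_sum = pd.DataFrame(features)  # columns = tracts, rows = 4096 features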

Finally, data_sum contains 4096 variables for each census tract. The table below shows part of this output.

The first column is the index, and the other 26 columns each represent a census tract in Allentown. The last row of the table shows the Place_TractID for each census tract. Each column contains 4096 variables that together represent the features of the built environment in that census tract.

For example, x_4202000-42077001800 identifies the first census tract. The value of the first variable in that tract is 0.39, and the value of variable 4096 is 0.32. These 4096 variables have no individual interpretation, but together they capture characteristics of the built environment such as color, gradients, edges, heights, and lengths.

Future Research

So far, I have 4096 variables for each census tract in Allentown. Allentown has only 26 census tracts, which makes the sample very small, so I will download satellite images for the main cities in Pennsylvania and compute the variables for each of their census tracts. Then I will combine the 4096 variables with the overweight percentage in each census tract and run a statistical analysis. Other control variables may also be taken into account, such as median household income, percentage male and female, racial composition (White, Black, Asian, and other races), and the percentage of households under the poverty line.
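As a rough sketch of the planned analysis (the DataFrame tract_df and its column names are hypothetical), an ordinary least squares regression with statsmodels might look like the following; fitting it requires many more census tracts than predictors, or a dimensionality-reduction step first.

import statsmodels.api as sm

feature_cols = ['f%d' % i for i in range(4096)]  # the 4096 built-environment variables
controls = ['median_income', 'pct_male', 'pct_white', 'pct_poverty']
X = sm.add_constant(tract_df[feature_cols + controls])
results = sm.OLS(tract_df['overweight_pct'], X).fit()
print(results.summary())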

 
