You must earn a “Satisfactory” on all parts of the assignment to earn a “Satisfactory” on the assignment.
The assignment must be submitted through GitHub Classrooms. Each student receives one “free pass” for not submitting assignments via specified channels, after which you will receive a “Not Yet” mark.
Read each part of the assignment carefully, and use the check boxes to ensure you’ve addressed all elements of the assignment!
Learning outcomes
This assignment will reinforce key concepts in geospatial analysis by practicing the following:
- load vector/raster data
- simple raster operations
- simple vector operations
- spatial joins
Instructions
- Clone repository from GitHub Classrooms
- Download data from here
- Unzip data and place in repository
- Edit Quarto document with responses
- Push final edits before deadline
Your repository should have the following structure:
EDS223-HW3
│   README.md
│   Rmd/Proj files
│
└───data
    │   gis_osm_buildings_a_free_1.gpkg
    │   gis_osm_roads_free_1.gpkg
    │
    └───ACS_2019_5YR_TRACT_48_TEXAS.gdb
    |   │   census tract gdb files
    |
    └───VNP46A1
    |   │   VIIRS data files
Background
Climate change is increasing the frequency and intensity of extreme weather events, with devastating impacts. “In February 2021, the state of Texas suffered a major power crisis, which came about as a result of three severe winter storms sweeping across the United States on February 10–11, 13–17, and 15–20.” [1] For more background, check out these engineering and political perspectives.
In this assignment, you will identify the impacts of this series of extreme winter storms by estimating the number of homes in the Houston metropolitan area that lost power and investigating whether or not these impacts were disproportionately felt.
Your analysis will be based on remotely-sensed night lights data, acquired from the Visible Infrared Imaging Radiometer Suite (VIIRS) onboard the Suomi satellite. In particular, you will use the VNP46A1 product to detect differences in night lights before and after the storm to identify areas that lost electric power.
To determine the number of homes that lost power, you will link (spatially join) these areas with OpenStreetMap data on buildings and roads.
To investigate potential socioeconomic factors that influenced recovery, you will link your analysis with data from the US Census Bureau.
Description
For this assignment, you must produce the following:
Data details
Night lights
Use NASA’s Worldview to explore the data around the day of the storm. There are several days with too much cloud cover to be useful, but 2021-02-07 and 2021-02-16 provide two clear, contrasting images to visualize the extent of the power outage in Texas.
VIIRS data is distributed through NASA’s Level-1 and Atmospheric Archive & Distribution System Distributed Active Archive Center (LAADS DAAC). Many NASA Earth data products are distributed in 10x10 degree tiles in sinusoidal equal-area projection. Tiles are identified by their horizontal and vertical position in the grid. Houston lies on the border of tiles h08v05 and h08v06. We therefore need to download two tiles per date.
As you’re learning in EDS 220, accessing, downloading, and preparing remote sensing data is a skill in its own right! To prevent this assignment from being a large data wrangling challenge, we have downloaded and prepped the following files for you to work with, stored in the VNP46A1 folder.
Data files:
- VNP46A1.A2021038.h08v05.001.2021039064328.tif: tile h08v05, collected on 2021-02-07
- VNP46A1.A2021038.h08v06.001.2021039064329.tif: tile h08v06, collected on 2021-02-07
- VNP46A1.A2021047.h08v05.001.2021048091106.tif: tile h08v05, collected on 2021-02-16
- VNP46A1.A2021047.h08v06.001.2021048091105.tif: tile h08v06, collected on 2021-02-16
Roads
Data file: gis_osm_roads_free_1.gpkg
Typically, highways account for a large portion of the night lights observable from space (see Google’s Earth at Night). To avoid falsely identifying areas with reduced traffic as areas without power, we will ignore areas near highways.
OpenStreetMap (OSM) is a collaborative project which creates publicly available geographic data of the world. Ingesting this data into a database where it can be subsetted and processed is a large undertaking. Fortunately, third-party companies redistribute OSM data. We used Geofabrik’s download sites to retrieve a shapefile of all highways in Texas and prepared a GeoPackage (.gpkg file) containing just the subset of roads that intersect the Houston metropolitan area.
Houses
Data file: gis_osm_buildings_a_free_1.gpkg
We can also obtain building data from OpenStreetMap. We again downloaded from Geofabrik and prepared a GeoPackage containing only houses in the Houston metropolitan area.
Socioeconomic
We cannot readily get socioeconomic information for every home, so instead we obtained data from the U.S. Census Bureau’s American Community Survey for census tracts in 2019. The folder ACS_2019_5YR_TRACT_48_TEXAS.gdb is an ArcGIS “file geodatabase”, a multi-file proprietary format that’s roughly analogous to a GeoPackage file.
You can use st_layers() to explore the contents of the geodatabase. Each layer contains a subset of the fields documented in the ACS metadata.
The geodatabase contains a layer holding the geometry information (ACS_2019_5YR_TRACT_48_TEXAS), separate from the layers holding the ACS attributes. You have to combine the geometry with the attributes to get a feature layer that sf can use.
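As a rough sketch of that combination step, the helper below reads the geometry layer and one attribute layer and joins them on a tract identifier. The helper name, the attribute layer you pass in, and the key column names (`GEOID_Data` and `GEOID`) are assumptions for illustration; check them against the output of st_layers() and names() for your copy of the data.

```r
library(sf)
library(dplyr)

# Path taken from the repository structure above; adjust if yours differs.
gdb_path <- "data/ACS_2019_5YR_TRACT_48_TEXAS.gdb"

# st_layers(gdb_path)  # uncomment to list the layers in the geodatabase

# Hypothetical helper: joins one attribute layer onto the tract geometries.
# `attr_layer`, `geom_key`, and `attr_key` are assumptions to verify yourself.
read_acs_tracts <- function(gdb_path, attr_layer,
                            geom_key = "GEOID_Data", attr_key = "GEOID") {
  geoms <- st_read(gdb_path, layer = "ACS_2019_5YR_TRACT_48_TEXAS", quiet = TRUE)
  # Attribute layers have no geometry, so st_read() returns a plain data frame
  attrs <- st_read(gdb_path, layer = attr_layer, quiet = TRUE)
  # left_join() on an sf object keeps the geometry column
  left_join(geoms, attrs, by = setNames(attr_key, geom_key))
}
```

Joining onto the geometry layer (rather than the other way around) keeps the result an sf object, so it can be mapped or spatially joined directly.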
Make sure to check that these datasets have the same coordinate reference systems! If not, transform them to match.
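One possible way to make this check routine is a small helper that compares coordinate reference systems and transforms when they differ. This is a sketch rather than a required approach: the function name `match_crs` is hypothetical, and the toy points and EPSG codes below are only there to make the example runnable.

```r
library(sf)

# Hypothetical helper: returns `x` in the CRS of `target`, warning when a
# transformation was needed and stopping when either CRS is missing.
match_crs <- function(x, target) {
  if (is.na(st_crs(x)) || is.na(st_crs(target))) {
    stop("Both inputs need a defined CRS before they can be compared.")
  }
  if (st_crs(x) != st_crs(target)) {
    warning("CRS mismatch: transforming `x` to the CRS of `target`.")
    x <- st_transform(x, st_crs(target))
  }
  x
}

# Toy example: the same point stored in EPSG:4326 and in EPSG:3857
pt_wgs84 <- st_sfc(st_point(c(-95.4, 29.8)), crs = 4326)
pt_merc <- st_transform(pt_wgs84, 3857)

# The warning is suppressed here only to keep the toy example quiet
pt_back <- suppressWarnings(match_crs(pt_merc, pt_wgs84))
st_crs(pt_back) == st_crs(pt_wgs84)
```

A helper like this also gives you a natural place for the custom warning and error messages mentioned in the rubric.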
Workflow outline
To complete the tasks of this assignment, you will need to break your analysis into the following key steps:
- find locations that experienced a blackout by creating a mask
- exclude highways from analysis
- identify homes that experienced blackouts by combining the locations of homes and blackouts
- identify the census tracts likely impacted by blackout
Below is guidance and suggestions for each of these steps.
For improved computational efficiency and easier interoperability with sf, I recommend using the stars package for raster handling.
Create blackout mask
To identify places that experienced a blackout, you should create a “mask” that indicates for each cell whether or not it experienced a blackout.
- hint: this will require creating a raster object for each day (2021-02-07 and 2021-02-16)
- hint: use st_as_sf() to convert from a raster to a vector and fix any invalid geometries with st_make_valid()
- (-96.5, 29), (-96.5, 30.5), (-94.5, 30.5), (-94.5, 29)
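The masking idea can be sketched on toy data as follows. Everything numeric here is an assumption for illustration: the two 2x2 rasters stand in for the mosaicked tiles of each date, and `drop_threshold` is a placeholder value, not a prescribed cutoff. The bounding box treats the coordinates listed above as longitude/latitude in EPSG:4326, which is also an assumption.

```r
library(stars)
library(sf)

# Assumed threshold on the drop in night light intensity (illustrative only)
drop_threshold <- 200

# Toy 2x2 rasters standing in for the night lights on each date
before <- st_as_stars(matrix(c(500, 300, 250, 100), nrow = 2))
after  <- st_as_stars(matrix(c(100, 290,  20,  90), nrow = 2))

# Drop in intensity between the two dates
light_drop <- before - after

# Keep only cells whose drop exceeds the threshold; everything else becomes NA
blackout <- light_drop
blackout[light_drop <= drop_threshold] <- NA

# Vectorize the mask (NA cells are dropped by default) and repair geometries
blackout_sf <- st_make_valid(st_as_sf(blackout))

# Bounding box built from the listed coordinates, assumed to be lon/lat
houston <- st_as_sfc(st_bbox(
  c(xmin = -96.5, ymin = 29, xmax = -94.5, ymax = 30.5),
  crs = st_crs(4326)
))
```

With real, georeferenced data, something like `blackout_sf[houston, ]` would then subset the vectorized mask to that box, provided both objects share a CRS.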
Exclude highways from the cropped blackout mask
Highways may have experienced changes in their night light intensities that are unrelated to the storm. Therefore, you should exclude any locations within 200 meters of all highways in the Houston area.
- hint: you may need to use st_union
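A minimal sketch of the exclusion step, run on toy geometries: buffer the highways, dissolve the buffers into one geometry with st_union, and remove that area from the blackout mask. The helper names are hypothetical, and EPSG:3083 is used only as an example of a projected CRS with units of meters; the 200 m distance comes from the text above.

```r
library(sf)

# Hypothetical helper: one dissolved buffer around all highway lines.
# Assumes `highways` is in a projected CRS whose units are meters.
buffer_highways <- function(highways, buffer_m = 200) {
  st_union(st_buffer(highways, dist = buffer_m))
}

# Hypothetical helper: keeps the parts of `blackout` outside the buffer.
exclude_highways <- function(blackout, buffer) {
  st_difference(st_transform(blackout, st_crs(buffer)), buffer)
}

# Toy data: a 1 km square "blackout" crossed by one straight "highway"
blackout <- st_sf(
  id = 1,
  geometry = st_sfc(
    st_polygon(list(rbind(c(0, 0), c(1000, 0), c(1000, 1000), c(0, 1000), c(0, 0)))),
    crs = 3083
  )
)
highway <- st_sfc(st_linestring(rbind(c(0, 500), c(1000, 500))), crs = 3083)

buffer <- buffer_highways(highway)
blackout_no_highways <- exclude_highways(blackout, buffer)
```

Dissolving the buffers first means the difference is computed against a single geometry instead of once per road segment.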
The roads GeoPackage includes data on roads other than highways. However, we can avoid reading in data we don’t need by taking advantage of st_read’s ability to subset using a SQL query.
Below is a SQL query that can be used as an argument in st_read:
"SELECT * FROM gis_osm_roads_free_1 WHERE fclass='motorway'"
Identify homes likely impacted by blackouts
The buildings GeoPackage includes data on many types of buildings. As with the roads data, we can avoid reading in data we don’t need.
Below is a SQL query that can be used as an argument in st_read:
"SELECT *
FROM gis_osm_buildings_a_free_1
WHERE (type IS NULL AND name IS NULL)
OR type IN ('residential', 'apartments', 'house', 'static_caravan', 'detached')"
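As an illustration, the query can be passed through the `query` argument of st_read, and the resulting homes can then be filtered to those that intersect the blackout areas. The helper names and the file path are assumptions based on the repository structure above; the toy geometries exist only to make the filter step runnable without the data.

```r
library(sf)

# The query from the text, assembled as a single string
homes_query <- paste(
  "SELECT * FROM gis_osm_buildings_a_free_1",
  "WHERE (type IS NULL AND name IS NULL)",
  "OR type IN ('residential', 'apartments', 'house', 'static_caravan', 'detached')"
)

# Hypothetical helper: reads only the rows matched by the query.
# Example call (path assumed): read_homes("data/gis_osm_buildings_a_free_1.gpkg")
read_homes <- function(path, query = homes_query) {
  st_read(path, query = query, quiet = TRUE)
}

# Hypothetical helper: keeps homes that intersect any blackout polygon.
homes_in_blackout <- function(homes, blackout) {
  homes <- st_transform(homes, st_crs(blackout))
  st_filter(homes, blackout, .predicate = st_intersects)
}

# Toy data: one blackout square and two "homes", one inside and one outside
blackout <- st_sf(geometry = st_sfc(
  st_polygon(list(rbind(c(0, 0), c(10, 0), c(10, 10), c(0, 10), c(0, 0)))),
  crs = 3083
))
homes <- st_sf(
  id = 1:2,
  geometry = st_sfc(st_point(c(5, 5)), st_point(c(50, 50)), crs = 3083)
)
impacted <- homes_in_blackout(homes, blackout)
```

Counting the rows of the filtered object then gives an estimate of the number of impacted homes.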
Identify the census tracts likely impacted by blackout
- hint: make sure to join by geometry ID
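One way to sketch this step is to flag each tract according to whether it contains at least one impacted home. The helper name and the `blackout` column are hypothetical, and the toy tracts below stand in for the ACS geometries.

```r
library(sf)

# Hypothetical helper: adds a logical column marking tracts that contain at
# least one impacted home. The column name `blackout` is an assumption.
flag_tracts <- function(tracts, impacted_homes) {
  impacted_homes <- st_transform(impacted_homes, st_crs(tracts))
  tracts$blackout <- lengths(st_intersects(tracts, impacted_homes)) > 0
  tracts
}

# Toy data: two square "tracts" and one "home" inside the first
square <- function(x0) {
  st_polygon(list(rbind(
    c(x0, 0), c(x0 + 1, 0), c(x0 + 1, 1), c(x0, 1), c(x0, 0)
  )))
}
tracts <- st_sf(id = 1:2, geometry = st_sfc(square(0), square(2), crs = 3083))
home <- st_sf(geometry = st_sfc(st_point(c(0.5, 0.5)), crs = 3083))

flagged <- flag_tracts(tracts, home)
```

Keeping every tract and adding a flag, rather than filtering, makes it straightforward to compare impacted and unimpacted tracts afterwards.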
Rubric (specifications)
Your output should stand alone: someone unfamiliar with the assignment should be able to understand your analysis, including the decisions made in selecting your approach and your interpretation of the results.
Assignments will be deemed “Satisfactory” based on the following criteria:
Data analysis
- custom warning and error messages (e.g. warning() and stop(); resources from EDS 221)
- informative comments (resource from EDS 220)
- BONUS: unit tests (e.g. using {testthat}; resources from EDS 221)
Plots, tables, and maps
- an informative title
- legends with legible titles, including units
- color scales that are accessible (i.e. make intuitive sense) and appropriate to the data (i.e. discrete vs. continuous)
- for maps: indication of scale and orientation (i.e. graticules/gridlines or scale bar and compass)
- for figures: axes with legible labels and titles, including units
- All tables should be rendered with legible titles and styling (e.g. {kableExtra})
Written reflections
Professional output
- document header with title, name, and date (ideally the date rendered)
- all packages are loaded together at the top of the document
- all unnecessary/distracting warnings and messages are suppressed
- include informative code comments when appropriate (resource from EDS 220)
- folding code or sourcing separate scripts when appropriate to direct reader’s attention
- succinct documentation between steps, such as section headers, descriptions for an analysis and map/plot interpretation
- complete and detailed data citations
- all defined variables are used in analysis (no variables are defined that are not used)
See examples of professional and unprofessional output on the Assignments page
Footnotes
[1] Wikipedia. 2021. “2021 Texas power crisis.” Last modified October 2, 2021. https://en.wikipedia.org/wiki/2021_Texas_power_crisis