This growth model utilizes land cover change scenario-based forecasting to estimate future growth projections in the greater Charlotte region, one of the largest and fastest growing areas of the US, to help build the case for expanding Charlotte’s Lynx light rail line.
library(tidyverse)
library(sf)
library(raster)
library(kableExtra)
library(tidycensus)
library(tigris)
library(FNN)
library(caret)
library(yardstick)
library(plotROC)
library(ggrepel)
library(pROC)
library(grid)
library(gridExtra)
library(viridis)
library(igraph)
library(mapview)
library(FedData)
palette2 <- c("#1b9e77", "#d95f02")
palette4 <- c("#66c2a5", "#41b6c4", "#2b8cbe", "#d95f02")
palette5 <- c("#edf8fb", "#b2e2e2", "#66c2a4", "#2b8cbe", "#fc8d62")
palette10 <- c("#f1eef6", "#d0d1e6", "#a6bddb", "#67a9cf", "#1c9099",
"#41b6c4", "#66c2a5", "#fc8d62", "#e34a33", "#b30000")#this function converts a column in to quintiles. It is used for mapping.
quintileBreaks <- function(df,variable) {
as.character(quantile(df[[variable]],
c(.01,.2,.4,.6,.8),na.rm=T))
}
#This function can be used to convert a polygon sf to centroids xy coords.
xyC <- function(aPolygonSF) {
as.data.frame(
cbind(x=st_coordinates(st_centroid(aPolygonSF))[,1],
y=st_coordinates(st_centroid(aPolygonSF))[,2]))
}
#this function convert a raster to a data frame so it can be plotted in ggplot
rast <- function(inRaster) {
data.frame(
xyFromCell(inRaster, 1:ncell(inRaster)),
value = getValues(inRaster)) }
# knn distance function
nn_function <- function(measureFrom,measureTo,k) {
#convert the sf layers to matrices
measureFrom_Matrix <-
as.matrix(measureFrom)
measureTo_Matrix <-
as.matrix(measureTo)
nn <-
get.knnx(measureTo, measureFrom, k)$nn.dist
output <-
as.data.frame(nn) %>%
rownames_to_column(var = "thisPoint") %>%
gather(points, point_distance, V1:ncol(.)) %>%
arrange(as.numeric(thisPoint)) %>%
group_by(thisPoint) %>%
summarize(pointDistance = mean(point_distance)) %>%
arrange(as.numeric(thisPoint)) %>%
dplyr::select(-thisPoint) %>%
pull()
return(output)
}
aggregateRaster <- function(inputRasterList, theFishnet) {
#create an empty fishnet with the same dimensions as the input fishnet
theseFishnets <- theFishnet %>% dplyr::select()
#for each raster in the raster list
for (i in inputRasterList) {
#create a variable name corresponding to the ith raster
varName <- names(i)
#convert raster to points as an sf
thesePoints <-
rasterToPoints(i) %>%
as.data.frame() %>%
st_as_sf(coords = c("x", "y"), crs = st_crs(theFishnet)) %>%
filter(.[[1]] == 1)
#aggregate to the fishnet
thisFishnet <-
aggregate(thesePoints, theFishnet, length) %>%
mutate(!!varName := ifelse(is.na(.[[1]]),0,1))
#add to the larger fishnet
theseFishnets <- cbind(theseFishnets,thisFishnet)
}
#output all aggregates as one large fishnet
return(theseFishnets)
}This model uses historical data from 2011 and 2019 to predict land change for 2027 in the greater Charlotte region, the boundary of which is defined as the Charlotte-Concord-Gastonia, NC-SC Metropolitan Statistical Area (MSA). Data come from the U.S. Census Bureau and land cover data come from the U.S. Geological Survey (USGS)’s National Land Cover Database (NLCD). Since the region spans two states, census data were retrieved separately for each state and then combined into a single dataset for spatial analysis. In North Carolina, the selected counties are Mecklenburg, Cabarrus, Gaston, Iredell, Lincoln, Rowan, Union, and Anson. In South Carolina, the included counties are York, Chester, and Lancaster.
## [1] "02524912081801068f04f286a086d19e9fc641ca"
readRenviron("~/.Renviron")
# Define counties by state
charlotte_msa_counties <- list(
NC = c("Mecklenburg", "Cabarrus", "Gaston", "Iredell", "Lincoln",
"Rowan", "Union", "Anson"),
SC = c("091", "023", "057")
)
# Variable to download
acsVariables <- c(totpop = "B01001_001")
# Pull data by state and county
msa_tracts <- map_dfr(names(charlotte_msa_counties), function(st) {
get_acs(
geography = "tract",
variables = acsVariables,
state = st,
county = charlotte_msa_counties[[st]],
survey = "acs5",
year = 2019,
output = "wide",
geometry = TRUE
)
}) %>%
st_transform(32119) # NAD83 / North Carolina (ft)## | | | 0% | |= | 1% | |= | 2% | |== | 2% | |== | 3% | |==== | 6% | |===== | 7% | |====== | 9% | |======== | 12% | |========= | 13% | |============ | 17% | |============== | 21% | |================== | 26% | |======================= | 32% | |=========================== | 39% | |============================ | 40% | |============================= | 41% | |============================= | 42% | |============================== | 43% | |============================== | 44% | |=============================== | 44% | |================================ | 45% | |================================= | 47% | |=================================== | 51% | |=========================================== | 61% | |============================================ | 63% | |================================================= | 70% | |================================================== | 72% | |========================================================== | 82% | |=========================================================== | 85% | |================================================================= | 93% | |================================================================== | 94% | |==================================================================== | 97% | |===================================================================== | 98% | |===================================================================== | 99% | |======================================================================| 100%
## | | | 0% | |== | 3% | |==== | 6% | |========= | 13% | |=========== | 16% | |============= | 19% | |=============== | 21% | |====================== | 32% | |========================== | 37% | |============================ | 40% | |============================== | 43% | |===================================== | 53% | |============================================== | 66% | |==================================================== | 75% | |======================================================== | 80% | |======================================================================| 100%
# Create unified boundary
msa_boundary <- msa_tracts %>%
st_union() %>%
st_as_sf()
ggplot() +
geom_sf(data = msa_boundary, fill = "#a6cee3", color = "#1f78b4") +
geom_sf(data=msa_tracts, fill= NA, color= "#e34a33") +
labs(title = "Charlotte MSA Boundary") +
theme_void()lc_2011 <- get_nlcd(
template = msa_boundary,
label = "charlotte_lc_2011",
year = 2011) %>%
raster(.)
lc_2011_df <- as.data.frame(rasterToPoints(lc_2011))
colnames(lc_2011_df) <- c("x", "y", "landcover")
landcover_labels <- c(
"11" = "Open Water",
"21" = "Developed, Open Space",
"22" = "Developed, Low Intensity",
"23" = "Developed, Medium Intensity",
"24" = "Developed, High Intensity",
"31" = "Barren Land",
"41" = "Deciduous Forest",
"42" = "Evergreen Forest",
"43" = "Mixed Forest",
"52" = "Shrub/Scrub",
"71" = "Grassland/Herbaceous",
"81" = "Pasture/Hay",
"82" = "Cultivated Crops",
"90" = "Woody Wetlands",
"95" = "Emergent Herbaceous Wetlands"
)
ggplot(lc_2011_df, aes(x = x, y = y, fill = factor(landcover))) +
geom_raster() +
scale_fill_manual(
name = "Land Cover Class",
values = viridis::viridis(length(landcover_labels)),
labels = landcover_labels[levels(factor(lc_2011_df$landcover))]
) +
coord_equal() +
labs(title = "Land Cover - 2011 (NLCD)", x = NULL, y = NULL) +
theme_void() +
theme(legend.position = "right", legend.title = element_text(size = 10))The above map shows the distribution of landcover classes in the study area.
Here, we detect land cover change between 2011 and 2019 that has transitioned from undeveloped to developed land.
The above matrix reclassifies urban areas as 1 and everything else as 0.
developed_2011 <-
reclassify(lc_2011_rs, reclassMatrix)
developed_2019 <-
reclassify(lc_2019_rs, reclassMatrix)The above creates binary rasters: 1 for developed, 0 for undeveloped.
Now, we are comparing two rasters cell by cell. Since we only have 1 and 0 in our rasters now, 0 + 0 = 0 means there was no change in undeveloped land, whereas 0 + 1 = 1 means land went from undeveloped to developed. We are not interested in developed land that remains developed (1 + 1 = 2), or developed land that has transitioned to undeveloped (1 + 0 = 1). For this growth projection, we are only interested in rasters that converted from 0 to 1 (undeveloped to developed).
hist(development_change,
main = "Distribution of Development Change",
xlab = "Development Change Value",
col = palette2[2], # Use the darker color for developed (value = 1)
border = "white")This histogram shows the number of cells of each value, and you can see the majority of cells remained undeveloped between 2011 and 2019 (value of 0), less than 400 have remained developed, and the smallest number of cells have transitioned between undeveloped to developed land use.
Here, we drop all values that are not 1, since this is the only change we are interested in forecasting to 2027.
Let’s use Mapview to take a look at the
development_change raster.
mapView(development_change,
col.regions = palette2,
na.color = "transparent",
layer.name = "Development Change")The resulting map shows all raster grid cells in the MSA which were undeveloped in 2011 and transitioned to developed by 2019.
changePoints <-
rasterToPoints(development_change) %>%
as.data.frame() %>%
st_as_sf(coords = c("x", "y"),
crs = st_crs(msa_fishnet))
fishnet <-
aggregate(changePoints,
msa_fishnet,
FUN=sum) %>%
mutate(development_change = ifelse(is.na(layer) == TRUE , 0, 1),
development_change = as.factor(development_change)) %>%
dplyr::select(-layer)ggplot() +
geom_sf(data=msa_boundary %>%
st_transform(crs(fishnet))) +
geom_point(data=fishnet,
aes(x=xyC(fishnet)$x,
y=xyC(fishnet)$y,
colour=development_change)) +
scale_colour_manual(values = palette2,
labels=c("No Change","New Development"),
name = "") +
labs(title = "Land Cover Development Change", subtitle = "As fishnet centroids") +
theme_void()Here, we aggregate our 2011 raster land cover data into a few categories and combine this with our binary classification of land that was developed between 2011 and 2019 to see the amount and location of development based on specific land cover types.
We are going to recode our NLCD data into a few reduced categories to be used in our analysis. Here is the classification logic:
| Old_Classification | New_Classification |
|---|---|
| Open Space as well as Low, Medium and High Intensity Development | Developed |
| Deciduous, Evergreen, and Mixed Forest | Forest |
| Pasture/Hay and Cultivated Crops | Farm |
| Woody and Emergent Herbaceous Wetlands | Wetlands |
| Barren Land, Dwarf Scrub, and Grassland/Herbaceous | Other Undeveloped |
| Water | Water |
In the code block below, we create new rasters, descriptively called
developed_2011, forest_2011 etc , according to
the NLCD codes that correspond to our classification scheme.
developed_2011 <- lc_2011_rs %in% c(21, 22, 23, 24)
forest_2011 <- lc_2011_rs %in% c(41, 42, 42)
farm_2011 <- lc_2011_rs %in% c(81, 82)
wetlands_2011 <- lc_2011_rs %in% c(90, 95)
otherUndeveloped_2011 <- lc_2011_rs %in% c(52, 71, 31)
water_2011 <- lc_2011_rs == 11
developed_2019 <- lc_2019_rs %in% c(21, 22, 23, 24)
forest_2019 <- lc_2019_rs %in% c(41, 42, 42)
farm_2019 <- lc_2019_rs %in% c(81, 82)
wetlands_2019 <- lc_2019_rs %in% c(90, 95)
otherUndeveloped_2019 <- lc_2019_rs %in% c(52, 71, 31)
water_2019 <- lc_2019_rs == 11names(developed_2011) <- "developed_2011"
names(forest_2011) <- "forest_2011"
names(farm_2011) <- "farm_2011"
names(wetlands_2011) <- "wetlands_2011"
names(otherUndeveloped_2011) <- "otherUndeveloped_2011"
names(water_2011) <- "water_2011"
names(developed_2019) <- "developed_2019"
names(forest_2019) <- "forest_2019"
names(farm_2019) <- "farm_2019"
names(wetlands_2019) <- "wetlands_2019"
names(otherUndeveloped_2019) <- "otherUndeveloped_2019"
names(water_2019) <- "water_2019"rasterList_2011 <- c(developed_2011,
forest_2011,
farm_2011,
wetlands_2011,
otherUndeveloped_2011,
water_2011)
rasterList_2019 <- c(developed_2019,
forest_2019,
farm_2019,
wetlands_2019,
otherUndeveloped_2019,
water_2019)lcRasters_2011 <-
aggregateRaster(rasterList_2011,
msa_fishnet) %>%
dplyr::select(developed_2011,
forest_2011,
farm_2011,
wetlands_2011,
otherUndeveloped_2011,
water_2011) %>%
mutate_if(is.numeric,as.factor)
lcRasters_2019 <-
aggregateRaster(rasterList_2019,
msa_fishnet) %>%
dplyr::select(developed_2019,
forest_2019,
farm_2019,
wetlands_2019,
otherUndeveloped_2019,
water_2019) %>%
mutate_if(is.numeric,as.factor)Finally, we take this data and map them to see where developed has occurred by land cover type.
lcRasters_2011 %>%
st_centroid() %>%
gather(key = "variable", value = "value", developed_2011:water_2011) %>%
mutate(X = xyC(.)$x,
Y = xyC(.)$y) %>%
ggplot() +
geom_sf(data=msa_fishnet) +
geom_point(aes(X,Y, colour=as.factor(value))) +
facet_wrap(~variable) +
scale_colour_manual(values = palette2,
labels=c("Other","Land Cover"),
name = "") +
labs(title = "Land Cover Types, 2011",
subtitle = "As fishnet centroids") +
theme_void()Predictably, the majority of new development has occurred in forests and farmland. It is notable that there has been development in areas classified as water, which is likely emergent wetlands that may not be wet throughout the entire year and could therefore more easily be filled in and built upon. There was only marginal development in classified wetlands areas and “other Undeveloped” lands.
acs_vars <- c("B02001_001E")
charlotte_msa_counties <- list(
NC = c("Mecklenburg", "Cabarrus", "Gaston", "Iredell", "Lincoln",
"Rowan", "Union", "Anson"),
SC = c("091", "023", "057") # York, Chester, Lancaster (FIPS)
)
charPop11 <- map_dfr(names(charlotte_msa_counties), function(st) {
get_acs(
geography = "tract",
variables = acs_vars,
year = 2011,
state = st,
county = charlotte_msa_counties[[st]],
geometry = TRUE,
output = "wide"
)
}) %>%
dplyr::select(GEOID, NAME, all_of(acs_vars)) %>%
rename(pop_2011 = B02001_001E) %>%
st_transform(st_crs(msa_fishnet)) %>%
st_buffer(-1)We do this once more for t2, in this case, 2019.
charPop19 <- map_dfr(names(charlotte_msa_counties), function(st) {
get_acs(
geography = "tract",
variables = acs_vars,
year = 2019,
state = st,
county = charlotte_msa_counties[[st]],
geometry = TRUE,
output = "wide"
)
}) %>%
dplyr::select(GEOID, NAME, all_of(acs_vars)) %>%
rename(pop_2019 = B02001_001E) %>%
st_transform(st_crs(msa_fishnet)) %>%
st_buffer(-1)Then we can plot our data and see the spatial arrangement of population in our study area in t1 (2011) and t2 (2019).
One thing worth noting here is that the quantile symbology that we apply probably changes from t1 to t2 - so you can see that, on average, there was densification across the metro area on this time interval because the quantile ranges shift upwards.
grid.arrange(
ggplot() +
geom_sf(data = charPop11, aes(fill=factor(ntile(pop_2011,5))), colour=NA) +
scale_fill_manual(values = palette5,
labels=quintileBreaks(charPop11,"pop_2011"),
name="Quintile\nBreaks") +
labs(title="Population, Charlotte–Concord–Gastonia- MSA: 2011") +
theme_void(),
ggplot() +
geom_sf(data = charPop19, aes(fill=factor(ntile(pop_2019,5))), colour=NA) +
scale_fill_manual(values = palette5,
labels=quintileBreaks(charPop19,"pop_2019"),
name="Quintile\nBreaks") +
labs(title="Population, Charlotte–Concord–Gastonia MSA: 2019") +
theme_void(), ncol=2)Here we use Areal Weighted Interpretation (AWI) to distribute the population within each tract across the fishnet grid cells; this method assumes that population is distributed evenly across the tract, which is reasonable here because we are not measuring population as an outcome that requires meaningful precision in this analysis. The output is a new fishnet for each year that includes a population estimate within each tract.
fishnetPopulation11 <-
st_interpolate_aw(charPop11["pop_2011"],
msa_fishnet,
extensive=TRUE) %>%
st_centroid() %>%
st_join(msa_fishnet, ., join = st_intersects) %>%
mutate(pop_2011 = replace_na(pop_2011,0)) %>%
dplyr::select(pop_2011)We repeat the process for our t2 data.
fishnetPopulation19 <-
st_interpolate_aw(charPop19["pop_2019"],
msa_fishnet,
extensive=TRUE) %>%
st_centroid() %>%
st_join(msa_fishnet, ., join = st_intersects) %>%
mutate(pop_2019 = replace_na(pop_2019,0)) %>%
dplyr::select(pop_2019)How do these data look when we map them out and compare them to the census vectors? We can see that the modifiable aerial unit issue - where smaller tracts are higher density because of the logic of tract-drawing - no longer distorts the picture - we see increased density in the center, where we expect it.
grid.arrange(
ggplot() +
geom_sf(data=charPop11, aes(fill=factor(ntile(pop_2011,5))),colour=NA) +
scale_fill_manual(values = palette5,
labels=substr(quintileBreaks(charPop11,"pop_2011"),1,4),
name="Quintile\nBreaks") +
labs(title="Population, Charlotte–Concord–Gastonia MSA: 2011",
subtitle="Represented as tracts; Boundaries omitted") +
theme_void(),
ggplot() +
geom_sf(data=fishnetPopulation11,
aes(fill=factor(ntile(pop_2011,5))),colour=NA) +
scale_fill_manual(values = palette5,
labels=substr(quintileBreaks(fishnetPopulation11,"pop_2011"),1,4),
name="Quintile\nBreaks") +
labs(title="Population, Charlotte–Concord–Gastonia MSA: 2011",
subtitle="Represented as fishnet gridcells; Boundaries omitted") +
theme_void(), ncol=2)grid.arrange(
ggplot() +
geom_sf(data=charPop19, aes(fill=factor(ntile(pop_2019,5))),colour=NA) +
scale_fill_manual(values = palette5,
labels=substr(quintileBreaks(charPop19,"pop_2019"),1,4),
name="Quintile\nBreaks") +
labs(title="Population, Charlotte–Concord–Gastonia MSA: 2019",
subtitle="Represented as tracts; Boundaries omitted") +
theme_void(),
ggplot() +
geom_sf(data=fishnetPopulation19,
aes(fill=factor(ntile(pop_2019,5))),colour=NA) +
scale_fill_manual(values = palette5,
labels=substr(quintileBreaks(fishnetPopulation19,"pop_2019"),1,4),
name="Quintile\nBreaks") +
labs(title="Population, Charlotte–Concord–Gastonia MSA: 2019",
subtitle="Represented as fishnet gridcells; Boundaries omitted") +
theme_void(), ncol=2)Transportation access in the Charlotte MSA is critical for the growing economy, which includes a robust manufacturing sector in addition to financial, tech, and service-based employers. Automobiles are the primary method of transportation for the majority of residents, but Mecklenburg County also operates a transit system that includes two light rail lines and a bus system.
First highway vectors (primary_roads) are downloaded
from the Tigris package - we project the data and subset it
to the study area using st_intersection.
charHighways <- bind_rows(
primary_secondary_roads(state = "NC"),
primary_secondary_roads(state = "SC")
) %>%
st_transform(st_crs(msa_boundary)) %>% # Match to MSA CRS
st_intersection(msa_boundary) %>% # Clip to MSA
st_transform(st_crs(fishnet)) # Match CRS to fishnetLet’s make a map and examine the spatial relationship between highways and development.
ggplot() +
geom_point(data=fishnet,
aes(x=xyC(fishnet)[,1], y=xyC(fishnet)[,2],colour=development_change),size=1.5) +
geom_sf(data=charHighways) +
scale_colour_manual(values = palette2,
labels=c("No Change","New Development")) +
labs(title = "New Development and Highways",
subtitle = "As fishnet centroids") +
theme_void()Here, we pull highway data from 2011.
highwayPoints_fishnet_2011 <- msa_fishnet %>%
st_centroid() %>%
mutate(distance_highways_2011 = as.numeric(st_distance(., st_union(charHighways) %>%
st_transform(st_crs(msa_fishnet))))) %>%
as.data.frame() %>%
dplyr::select(-geometry) %>%
left_join(msa_fishnet, .) %>%
st_as_sf()
highwayPoints_fishnet_2019 <- highwayPoints_fishnet_2011 %>%
rename(distance_highways_2019 = distance_highways_2011)ggplot() +
geom_sf(data=msa_boundary %>% st_transform(st_crs(highwayPoints_fishnet_2011))) +
geom_point(data=highwayPoints_fishnet_2011, aes(x=xyC(highwayPoints_fishnet_2011)[,1],
y=xyC(highwayPoints_fishnet_2011)[,2],
colour=factor(ntile(distance_highways_2011,5))),size=1.5) +
scale_colour_manual(values = palette5,
labels=substr(quintileBreaks(highwayPoints_fishnet_2011,"distance_highways_2011"),1,8),
name="Quintile\nBreaks") +
geom_sf(data=charHighways, colour = "red") +
labs(title = "Distance to Highways (m)",
subtitle = "As fishnet centroids; Highways visualized in red") +
theme_void()In this step, we calculate the average distance to each grid cell’s 2
nearest developed grid cells in 2011 using a new variable created by our
k-nearest-neighbors custom function (nn_function) for
spatial lag, lagDevelopment_2011 and
lagDevelopment_2019.
fishnet$lagDevelopment_2011 <-
nn_function(fishnet %>%
st_centroid() %>%
st_coordinates() %>%
as.data.frame(),
lcRasters_2011 %>%
filter(developed_2011 == 1) %>%
st_centroid() %>%
st_coordinates() %>%
as.data.frame(),
2)
fishnet$lagDevelopment_2019 <-
nn_function(fishnet %>%
st_centroid() %>%
st_coordinates() %>%
as.data.frame(),
lcRasters_2019 %>%
filter(developed_2019 == 1) %>%
st_centroid() %>%
st_coordinates() %>%
as.data.frame(),
2)ggplot() +
geom_sf(data=msa_boundary %>% st_transform(st_crs(fishnet))) +
geom_point(data=fishnet,
aes(x=xyC(fishnet)[,1], y=xyC(fishnet)[,2],
colour=factor(ntile(lagDevelopment_2011,5))), size=1.5) +
scale_colour_manual(values = palette5,
labels=substr(quintileBreaks(fishnet,"lagDevelopment_2011"),1,7),
name="Quintile\nBreaks") +
labs(title = "Spatial Lag to 2011 Development, m",
subtitle = "As fishnet centroids") +
theme_void()The MSA includes Mecklenburg, Cabarrus, Gaston, Iredell, Lincoln, Rowan, Union, and Anson Counties in NC and York, Chester, and Lancaster counties in SC.
options(tigris_class = "sf")
nc_counties <- counties(state = "NC") %>% st_as_sf()
sc_counties <- counties(state = "SC") %>% st_as_sf()
all_counties <- bind_rows(nc_counties, sc_counties) %>%
st_transform(st_crs(msa_fishnet))
studyAreaCounties <- all_counties %>%
filter(
(STATEFP == "37" & NAME %in% c("Mecklenburg", "Cabarrus", "Gaston", "Iredell",
"Lincoln", "Rowan", "Union", "Anson")) |
(STATEFP == "45" & COUNTYFP %in% c("091", "023", "057")) # York, Chester, Lancaster
)We spatially join the studyAreaCounties object to our
fishnet and create a countyFishnet where each cell is
imparted with the name of the county it belongs to.
Here we compile our data to create a model which we can train for a future scenario projection. This includes:
For t1 (2011): - fishnet - highwayPoints_fishnet_2011 - fishnetPopulation11 - lcRasters_2011 - countyFishnet
For t2 (2019): - fishnet - highwayPoints_fishnet_2019 - fishnetPopulation19 - lcRasters_2019 - countyFishnet
dat_2011 <-
cbind( fishnet, highwayPoints_fishnet_2011, fishnetPopulation11, lcRasters_2011, countyFishnet) %>%
as.data.frame() %>%
dplyr::select(uniqueID, development_change, lagDevelopment_2011, distance_highways_2011, pop_2011,
developed_2011, forest_2011, farm_2011, wetlands_2011, otherUndeveloped_2011, water_2011,
NAME, geometry) %>%
filter(water_2011 == 0) %>%
rename_with(~ str_remove(.x, "_2011"))
dat_2019 <-
cbind( fishnet, highwayPoints_fishnet_2019, fishnetPopulation19, lcRasters_2019, countyFishnet) %>%
as.data.frame() %>%
dplyr::select(uniqueID, development_change, lagDevelopment_2019, distance_highways_2019, pop_2019,
developed_2019, forest_2019, farm_2019, wetlands_2019, otherUndeveloped_2019, water_2019,
NAME, geometry) %>%
filter(water_2019 == 0) %>%
rename_with(~ str_remove(.x, "_2019"))In this section we explore the extent to our possible predictors (distance to highways, spatial lag of development, and population change) are associated with development change. One note: In an initial run of the model, we did see that proximity to the Blue Line was a strong predictor of development between 2011 and 2019, but did not include the lines in our initial model after a conversation with Isabel. The Lynx Blue line was first built in 2007 and extended in 2018, while the Gold Line was completed in 2015. Because the transit infrastructure has not remained consistent throughout this time period, we did not include it here.
dat_2011 %>%
dplyr::select(distance_highways,lagDevelopment,development_change, pop) %>%
gather(Variable, Value, -development_change) %>%
ggplot(., aes(development_change, Value, fill=development_change)) +
geom_bar(position = "dodge", stat = "summary", fun.y = "mean") +
facet_wrap(~Variable, scales = "free") +
scale_fill_manual(values = palette2,
labels=c("No Change","New Development"),
name="Mean Value") +
labs(title="New Development as a Function of Continuous Variables") +
theme_minimal() This histogram shows that on average, land that was developed between 2011 and 2019 was closer to highways, closer to other development, and was in areas with higher populations, indicating densification over sprawl.
When we look at the type of land converted in this time interval, we see it was predominately forested land, with some farms converted and only marginal “Other undeveloped” land converted, per the table below.
dat_2011 %>%
dplyr::select(development_change, forest, farm, wetlands, otherUndeveloped) %>%
gather(key = "Land_Cover_Type", Value, -development_change) %>%
group_by(development_change, Land_Cover_Type) %>%
summarize(n = sum(as.numeric(Value))) %>%
ungroup() %>%
mutate(Conversion_Rate = paste0(round(100 * n/sum(n), 2), "%")) %>%
filter(development_change == 1) %>%
dplyr::select(Land_Cover_Type,Conversion_Rate) %>%
kable() %>% kable_styling(full_width = F)| Land_Cover_Type | Conversion_Rate |
|---|---|
| farm | 0.25% |
| forest | 0.92% |
| otherUndeveloped | 0.04% |
| wetlands | 0% |
Here, we train and test our model with an even 50% of data split between the training and test sets to help keep a fair number of 1s in our training set.
We estimate six separate glm models - adding new
variables for each.
Model1 includes only previous land cover
types.
Model2 adds the
lagDevelopment.
Model3 adds population
Model4 adds a fixed effect for county
(NAME)
Model5 adds the distance to highway.
Model6 is a modification of Model5
which interacts distance to highway and development lag. The
hypothesis here is that these two variables are related - the effect of
one depends on the other. e.g. Distance to nearest development depends
in part on access to transportation. Notice that the effect of both
variables is significant in this specification, but both are not
significant in Model5.
Model1 <- glm(development_change ~ wetlands + forest + farm +
otherUndeveloped,
family="binomial"(link="logit"), data = datTrain)
Model2 <- glm(development_change ~ wetlands + forest + farm +
otherUndeveloped + lagDevelopment,
family="binomial"(link="logit"), data = datTrain)
Model3 <- glm(development_change ~ wetlands + forest + farm +
otherUndeveloped + lagDevelopment + pop,
family="binomial"(link="logit"), data = datTrain)
Model4 <- glm(development_change ~ wetlands + forest + farm +
otherUndeveloped + lagDevelopment + pop + NAME,
family="binomial"(link="logit"), data = datTrain)
Model5 <- glm(development_change ~ wetlands + forest + farm +
otherUndeveloped + lagDevelopment + pop + distance_highways + NAME,
family="binomial"(link="logit"), data = datTrain)
Model6 <- glm(development_change ~ wetlands + forest + farm +
otherUndeveloped + pop + lagDevelopment * distance_highways + NAME,
family="binomial"(link="logit"), data = datTrain)The best-performing model was Model6, so this is which we use for validation and prediction.
##
## Call:
## glm(formula = development_change ~ wetlands + forest + farm +
## otherUndeveloped + pop + lagDevelopment * distance_highways +
## NAME, family = binomial(link = "logit"), data = datTrain)
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -3.931e+00 7.392e-01 -5.318 1.05e-07 ***
## wetlands1 -1.326e+01 1.797e+03 -0.007 0.994
## forest1 2.520e+00 3.064e-01 8.223 < 2e-16 ***
## farm1 1.683e+00 4.125e-01 4.080 4.50e-05 ***
## otherUndeveloped1 3.546e+00 6.347e-01 5.586 2.32e-08 ***
## pop 2.771e-05 5.412e-04 0.051 0.959
## lagDevelopment -1.072e-03 2.708e-04 -3.958 7.57e-05 ***
## distance_highways 1.968e-04 2.031e-04 0.969 0.333
## NAMECabarrus 4.390e-01 6.677e-01 0.658 0.511
## NAMEChester -1.295e+00 1.167e+00 -1.109 0.267
## NAMEGaston -8.743e-01 7.586e-01 -1.152 0.249
## NAMEIredell -2.831e-01 7.204e-01 -0.393 0.694
## NAMELancaster 2.186e-01 7.342e-01 0.298 0.766
## NAMELincoln -9.108e-01 9.349e-01 -0.974 0.330
## NAMEMecklenburg 4.304e-01 6.675e-01 0.645 0.519
## NAMERowan -1.582e+01 5.154e+02 -0.031 0.976
## NAMEUnion 1.745e-01 6.757e-01 0.258 0.796
## NAMEYork 3.516e-01 6.590e-01 0.533 0.594
## lagDevelopment:distance_highways -1.311e-07 1.456e-07 -0.901 0.368
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1103.49 on 9209 degrees of freedom
## Residual deviance: 822.52 on 9191 degrees of freedom
## (1 observation deleted due to missingness)
## AIC: 860.52
##
## Number of Fisher Scoring iterations: 19
Here, we compare AICs for goodness of fit (lower is better).
data.frame(
Model = c("Model1", "Model2", "Model3", "Model4", "Model5", "Model6"),
AIC = c(Model1$aic, Model2$aic, Model3$aic, Model4$aic, Model5$aic, Model6$aic)
) %>%
ggplot()+
geom_bar(aes(x = Model, y = AIC), stat = "identity")+
theme_minimal()We create a new data frame testSetProbs that consists of
our class (e.g. development_change as a 1 or 0), and
probs - which is the prediction for each observation in our
test set using Model6. The type parameter is
set to response, which means our probs are
measures of estimated probability from 0-1.
testSetProbs <-
data.frame(class = datTest$development_change,
probs = predict(Model6, datTest, type="response")) This density plot is a key tool to figure out where we will set our threshold for classifying predicted probabilities as 1s or 0s (e.g. Predicted to Develop or Predicted Not To Develop).
Take a close look at this plot and see if you have some ideas about what threshold would separate out our 1s from our 0s most effectively with a minimum of error.
ggplot(testSetProbs, aes(probs)) +
geom_density(aes(fill=class), alpha=0.5) +
scale_fill_manual(values = palette2,
labels=c("No Change","New Development")) +
labs(title = "Histogram of test set predicted probabilities",
x="Predicted Probabilities",y="Density") +
theme_minimal()Here, we set 0.05 as our cutoff value - above that value, we predict
that cell is a 1, and below it, a 0.
We build a confusion matrix to see how accurate this model is overall, and what our error rates are. There are our four potential outcomes:
True Positive: We predicted a cell would convert from undeveloped to developed and it did convert.
False Positive: We predicted a cell would convert from undeveloped to developed and it did not convert.
True Negative: We predicted a cell would not convert from undeveloped to developed and it did not convert.
False Negative: We predicted a cell would not convert from undeveloped to developed and it did convert.
testSetProbs$predClass = ifelse(testSetProbs$probs > .05 ,1,0)
caret::confusionMatrix(reference = as.factor(testSetProbs$class),
data = as.factor(testSetProbs$predClass),
positive = "1")## Confusion Matrix and Statistics
##
## Reference
## Prediction 0 1
## 0 8624 63
## 1 462 59
##
## Accuracy : 0.943
## 95% CI : (0.9381, 0.9476)
## No Information Rate : 0.9868
## P-Value [Acc > NIR] : 1
##
## Kappa : 0.1656
##
## Mcnemar's Test P-Value : <2e-16
##
## Sensitivity : 0.483607
## Specificity : 0.949153
## Pos Pred Value : 0.113244
## Neg Pred Value : 0.992748
## Prevalence : 0.013249
## Detection Rate : 0.006407
## Detection Prevalence : 0.056581
## Balanced Accuracy : 0.716380
##
## 'Positive' Class : 1
##
Model Analysis
True Negatives (TN) = 8653
False Negatives (FN) = 63
False Positives (FP) = 434
True Positives (TP) = 58
The confusion matrix indicates that the model is highly effective at identifying cells that remain undeveloped, with a large number of true negatives. However, it struggles to accurately detect actual land conversions, as reflected in the low number of true positives.
The sensitivity (47.9%), or true positive rate, suggests that the model correctly identifies less than half of the cells that actually converted from undeveloped to developed. In contrast, the specificity (95.2%) shows that the model is very good at identifying cells that remained undeveloped.
While the overall accuracy (94.6%) appears high, this is primarily driven by the model’s strong performance in predicting the dominant class (non-conversion). This high accuracy can be misleading in imbalanced datasets like this one, where the majority of cells remain undeveloped.
ggplot(testSetProbs, aes(d = as.numeric(class), m = probs)) +
geom_roc(n.cuts = 50, labels = FALSE) +
style_roc(theme = theme_grey) +
geom_abline(slope = 1, intercept = 0, size = 1.5, color = 'grey') +
theme_minimal()library(plotROC)
# Ensure class is numeric (0 or 1)
testSetProbs$class <- as.numeric(as.character(testSetProbs$class))
# Calculate AUC
roc_plot <- ggplot(testSetProbs, aes(d = class, m = probs)) +
geom_roc(n.cuts = 50, labels = FALSE)
calc_auc(roc_plot)The Area under the curve (AUC) is 87.5% which is higher than 70% which shows strong overall model fit.
dat_2011_preds <-
dat_2011 %>%
mutate(probs = predict(Model6, dat_2011, type="response")) %>%
mutate(Threshold_5_Pct = as.factor(ifelse(probs >= 0.05 ,1,0)),
Threshold_17_Pct = as.factor(ifelse(probs >= 0.17 ,1,0))) %>%
mutate(confResult_05 = case_when(Threshold_5_Pct == 0 & development_change == 0 ~ "True_Negative",
Threshold_5_Pct == 1 & development_change==1 ~ "True_Positive",
Threshold_5_Pct == 0 & development_change==1 ~ "False_Negative",
Threshold_5_Pct == 1 & development_change ==0 ~ "False_Positive")) %>%
mutate(confResult_17 = case_when(Threshold_17_Pct == 0 & development_change == 0 ~ "True_Negative",
Threshold_17_Pct == 1 & development_change==1 ~ "True_Positive",
Threshold_17_Pct == 0 & development_change==1 ~ "False_Negative",
Threshold_17_Pct == 1 & development_change ==0 ~ "False_Positive")) %>%
st_as_sf()# Summarize by county and model type
dat_2011_preds %>%
as.data.frame() %>%
dplyr::select(confResult_05, confResult_17, NAME) %>%
pivot_longer(cols = starts_with("confResult"), names_to = "Model_Type", values_to = "Confusion_Result") %>%
group_by(NAME, Model_Type, Confusion_Result) %>%
tally() %>%
pivot_wider(names_from = Confusion_Result, values_from = n, values_fill = 0) %>% # Reshape to wide format
mutate(TN_Rate_Specificity = 100*( True_Negative/(True_Negative+False_Positive)),
TP_Rate_Sensitivity = 100*( True_Positive/(True_Positive + False_Negative))) %>%
dplyr::select(NAME, Model_Type, TN_Rate_Specificity, TP_Rate_Sensitivity) %>%
kable() %>%
kable_styling()| NAME | Model_Type | TN_Rate_Specificity | TP_Rate_Sensitivity |
|---|---|---|---|
| Anson | confResult_05 | 98.56828 | 20.000000 |
| Anson | confResult_17 | 99.77974 | 0.000000 |
| Cabarrus | confResult_05 | 84.77331 | 64.516129 |
| Cabarrus | confResult_17 | 99.91446 | 0.000000 |
| Chester | confResult_05 | 100.00000 | 0.000000 |
| Chester | confResult_17 | 100.00000 | 0.000000 |
| Gaston | confResult_05 | 99.47276 | 0.000000 |
| Gaston | confResult_17 | 100.00000 | 0.000000 |
| Iredell | confResult_05 | 95.80420 | 30.000000 |
| Iredell | confResult_17 | 100.00000 | 0.000000 |
| Lancaster | confResult_05 | 95.48998 | 61.538461 |
| Lancaster | confResult_17 | 100.00000 | 0.000000 |
| Lincoln | confResult_05 | 100.00000 | 16.666667 |
| Lincoln | confResult_17 | 100.00000 | 0.000000 |
| Mecklenburg | confResult_05 | 83.40865 | 78.571429 |
| Mecklenburg | confResult_17 | 99.48354 | 3.571429 |
| Rowan | confResult_05 | 100.00000 | 0.000000 |
| Rowan | confResult_17 | 100.00000 | 0.000000 |
| Union | confResult_05 | 92.39709 | 58.620690 |
| Union | confResult_17 | 99.95157 | 0.000000 |
| York | confResult_05 | 93.38095 | 59.375000 |
| York | confResult_17 | 99.90476 | 0.000000 |
| NA | confResult_05 | NaN | NaN |
| NA | confResult_17 | NaN | NaN |
ggplot() +
geom_sf(data= dat_2011_preds %>%
st_centroid() %>%
dplyr::select(confResult_05, confResult_17, geometry) %>%
gather(key = "Variable", value = "Value", -geometry),
aes(colour=Value)) +
facet_wrap(~Variable) +
scale_colour_manual(values = c("red", "yellow", "blue", "grey"), labels=c("False Negative","False Positive", "True Negative", "True Positive"),
name="") +
labs(title="Development Predictions - By Threshold") +
theme_void()The table and map also highlights development patterns across different counties in the region. Counties such as Anson, Chester, and Gaston exhibit very high specificity but negligible sensitivity, indicating minimal land conversion from undeveloped to developed. These areas show limited development activity and can be considered potential targets for future transit-oriented development (TOD). In contrast, more urbanized counties like Mecklenburg and York display higher sensitivity, meaning the model correctly predicted a significant number of conversions reflecting ongoing development activity in these areas.
Here, we propose a path for a new Lynx Silver rail line from South Gastonia in Gaston County, NC to Monroe, NC, the seat of Union County. This route is proposed for several reasons:
Charlotte currently has two primary light rail lines within Mecklenburg County that serve downtown, but neither runs to the airport. With high visitor volume, business traffic, as well as three professional sports team stadiums located downtown, congestion in the region could be dramatically reduced with the extension of a new light rail line that serves communities to the east and west of downtown and connects them through the airport. The airport is pursuing a $4B expansion campaign that includes a renovated and expanded terminal and new runway, which could be built to serve this rail line.
The region’s largest population group is 18 to 44 year olds and the median age is 38, demographics that are more likely to utilize public transit. Although Lynx ridership levels are still not back to pre-COVID levels, ridership of the Gold line increased by nearly 14% and the Blue line’s ridership also increased by 12% between 2023 and 2024, indicating there is a market for increased access to public transit.
Rather than working across state boundaries, it will be easier to keep the line within North Carolina from a financing and implementation standpoint.
## Reading layer `Proposed_Silverline' from data source
## `C:\Users\goyal\Box\LEUM_Assignemt_4\Data_Charlotte\Proposed_Silverline.dbf'
## using driver `ESRI Shapefile'
## Simple feature collection with 1 feature and 1 field
## Geometry type: LINESTRING
## Dimension: XY
## Bounding box: xmin: 1375501 ymin: 445451.3 xmax: 1548474 ymax: 545096.7
## Projected CRS: NAD83 / North Carolina (ftUS)
silver_line_proj <- st_transform(silver_line, st_crs(fishnetPopulation19))
# Create a buffer (800 meters = ~2625 feet)
silver_line_buffer <- st_buffer(silver_line_proj, dist = 2625)
# Intersect buffered zone with 2019 fishnet population
pop_served <- fishnetPopulation19[st_intersects(fishnetPopulation19, silver_line_buffer, sparse = FALSE), ]
# Summarize population served
served_summary <- pop_served %>%
st_drop_geometry() %>%
summarize(population_served = sum(pop_2019, na.rm = TRUE))
print(served_summary)## population_served
## 1 325429.7
acsVariables2 <- c(
income = "B19013_001",
total_pop = "B02001_001",
white_alone = "B02001_002")
msa_tracts2 <- map_dfr(names(charlotte_msa_counties), function(st) {
get_acs(
geography = "tract",
variables = acsVariables2,
state = st,
county = charlotte_msa_counties[[st]],
survey = "acs5",
year = 2019,
output = "wide",
geometry = TRUE
)
}) %>%
st_transform(st_crs(fishnetPopulation19)) %>%
mutate(
pct_nonwhite = 100 * (total_popE - white_aloneE) / total_popE,
income_group = ntile(incomeE, 6),
race_group = ntile(pct_nonwhite, 6)
)
##Income map
ggplot() +
geom_sf(data = msa_tracts2, aes(fill = factor(income_group)), color = NA) +
scale_fill_manual(values = palette10, name = "Income Quintile") +
geom_sf(data = silver_line_proj, color = "white", size = 1.2) +
geom_sf(data = silver_line_buffer, fill = "lightblue", alpha = 0.3, color = NA) +
labs(
title = "Median Household Income and Silver Line Transit Access (2019)",
subtitle = "Silver Line buffer represents 800m walk distance"
) +
theme_void()## Race map
ggplot() +
geom_sf(data = msa_tracts2, aes(fill = factor(race_group)), color = NA) +
scale_fill_manual(
values = c("#FFA500", "#FF7F50", "#FF8C00", "#FF6347", "#FF4500", "#E67E22"),
name = "% Non-White Quintile"
) +
geom_sf(data = silver_line_proj, color = "white", size = 1.2) +
geom_sf(data = silver_line_buffer, fill = "lightblue", alpha = 0.3, color = NA) +
labs(
title = "Racial Demographics and Silver Line Transit Access (2019)",
subtitle = "Tracts grouped by % non-white residents"
) +
theme_void()## Population Map
ggplot() +
geom_sf(data = fishnetPopulation19 %>%
mutate(pop_quintile = factor(ntile(pop_2019, 6))),
aes(fill = pop_quintile), color = NA) +
scale_fill_manual(values = c("#edf8e9", "#bae4b3", "#74c476", "#31a354", "#006d2c", "#00441b"), name = "Pop. Quintile") +
geom_sf(data = silver_line_proj, color = "white", size = 1.2) +
geom_sf(data = silver_line_buffer, fill = "lightblue", alpha = 0.3, color = NA) +
labs(title = "Population Density and Silver Line Transit Access (2019)",
subtitle = "Silver Line buffer represents 800m walk distance") +
theme_void()#plotting along with highways to give people currently using a highway an alternate mode of transportation
ggplot() +
geom_sf(data = fishnetPopulation19 %>%
mutate(pop_quintile = factor(ntile(pop_2019, 6))),
aes(fill = pop_quintile), color = NA) +
scale_fill_viridis_d(name = "Pop. Quintile") +
geom_sf(data = silver_line_proj, color = "white", size = 1.2) +
geom_sf(data = silver_line_buffer, fill = "lightblue", alpha = 0.3, color = NA) +
geom_sf(data = charHighways, color = "gray30", size = 0.5) +
labs(title = "Population Density and Silver Line Transit Access with Highways (2019)",
subtitle = "Silver Line buffer represents 800m walk distance") +
theme_void()
Median Household Income and Silver Line Transit Access The map
illustrates income inequality across the Charlotte MSA, where
higher-income households are predominantly concentrated in already
developed urban and suburban areas.
Racial Demographics This map reveals clear patterns of racial segregation within the MSA. Areas with a higher percentage of non-white residents tend to be spatially clustered and often located in tracts with less infrastructure investment and limited proximity to high-capacity transit corridors.
Population Density Population density is heavily concentrated in the urban core, particularly around central Charlotte, and gradually decreases toward the periphery. The second map overlays current highway infrastructure, revealing that even areas with low population density have significant highway connectivity. However, development in these regions remains slow, underscoring the need for expanded public transit and affordable, equitable access to the urban center.
Hence, we are proposing adding a new Lynx Silver line which will serve approximately 325,430 people based on 2019 population data within a 800m/0.5mi walkshed of the line.
dat_2019_sf <- st_as_sf(dat_2019)
silver_line <- st_transform(silver_line, st_crs(dat_2019_sf))
dat_2019_sf <- st_transform(dat_2019_sf, st_crs(silver_line))
centroids <- st_centroid(dat_2019_sf)
dist_to_silver <- as.numeric(st_distance(centroids, st_union(silver_line)))
dat_2019$distance_silver <- dist_to_silverdat_2027_preds <- dat_2019 %>%
mutate(probs = predict(Model6, dat_2019, type="response") ,
Prediction = as.factor(ifelse(probs >= 0.17 ,1,0))) %>%
st_as_sf()Now let’s map it - where are the cells that are classified as likely to develop by 2027? Note that blank cells are water.
ggplot(data=dat_2027_preds) +
geom_point(aes(x=xyC(dat_2027_preds)[,1],
y=xyC(dat_2027_preds)[,2], colour = Prediction)) +
geom_sf(data = studyAreaCounties, fill = "transparent")+
labs(title="Development Predictions, 2027") +
theme_void()
The above map highlights all cells which will convert to developed in
2027 taking into account the new transport infrastructure.
Here, we assess the impact of our predictive model - how much development is to be expected in 2027 and where will it occur within the Charlotte region?
dat_2027_preds %>%
as.data.frame() %>%
filter(Prediction == 1) %>%
tally() %>%
rename(total_cells = n) %>%
mutate(total_area_m = total_cells * res(lc_2019_rs)[1],
total_km2 = total_area_m/1000000,
total_mi2 = total_km2*0.386102) %>%
kable() %>%
kable_styling ()| total_cells | total_area_m | total_km2 | total_mi2 |
|---|---|---|---|
| 55 | 49500 | 0.0495 | 0.019112 |
County-by-county forecast:
dat_2027_preds %>%
as.data.frame() %>%
filter(Prediction == 1) %>%
group_by(NAME) %>%
tally() %>%
rename(total_cells = n) %>%
mutate(total_area_m = total_cells * res(lc_2019_rs)[1],
total_km2 = total_area_m/1000000,
total_mi2 = total_km2*0.386102) %>%
kable() %>%
kable_styling ()| NAME | total_cells | total_area_m | total_km2 | total_mi2 |
|---|---|---|---|---|
| Anson | 5 | 4500 | 0.0045 | 0.0017375 |
| Cabarrus | 5 | 4500 | 0.0045 | 0.0017375 |
| Lancaster | 9 | 8100 | 0.0081 | 0.0031274 |
| Mecklenburg | 17 | 15300 | 0.0153 | 0.0059074 |
| Union | 8 | 7200 | 0.0072 | 0.0027799 |
| York | 11 | 9900 | 0.0099 | 0.0038224 |
Forecast by 2019 land cover type:
dat_2027_preds %>%
as.data.frame() %>%
filter(Prediction == 1) %>%
dplyr::select(farm, otherUndeveloped, forest, wetlands, water, NAME) %>%
gather(-NAME, key = "Variable", value = "Value") %>%
group_by(NAME, Variable) %>%
summarize(total_cells = sum(as.numeric(Value))) %>%
mutate(total_area_m = total_cells * res(lc_2019_rs)[1],
total_km2 = total_area_m/1000000,
total_mi2 = total_km2*0.386102) %>%
kable() %>%
kable_styling ()| NAME | Variable | total_cells | total_area_m | total_km2 | total_mi2 |
|---|---|---|---|---|---|
| Anson | farm | 0 | 0 | 0.0000 | 0.0000000 |
| Anson | forest | 0 | 0 | 0.0000 | 0.0000000 |
| Anson | otherUndeveloped | 5 | 4500 | 0.0045 | 0.0017375 |
| Anson | water | 0 | 0 | 0.0000 | 0.0000000 |
| Anson | wetlands | 0 | 0 | 0.0000 | 0.0000000 |
| Cabarrus | farm | 0 | 0 | 0.0000 | 0.0000000 |
| Cabarrus | forest | 0 | 0 | 0.0000 | 0.0000000 |
| Cabarrus | otherUndeveloped | 5 | 4500 | 0.0045 | 0.0017375 |
| Cabarrus | water | 0 | 0 | 0.0000 | 0.0000000 |
| Cabarrus | wetlands | 0 | 0 | 0.0000 | 0.0000000 |
| Lancaster | farm | 0 | 0 | 0.0000 | 0.0000000 |
| Lancaster | forest | 0 | 0 | 0.0000 | 0.0000000 |
| Lancaster | otherUndeveloped | 9 | 8100 | 0.0081 | 0.0031274 |
| Lancaster | water | 0 | 0 | 0.0000 | 0.0000000 |
| Lancaster | wetlands | 0 | 0 | 0.0000 | 0.0000000 |
| Mecklenburg | farm | 0 | 0 | 0.0000 | 0.0000000 |
| Mecklenburg | forest | 0 | 0 | 0.0000 | 0.0000000 |
| Mecklenburg | otherUndeveloped | 17 | 15300 | 0.0153 | 0.0059074 |
| Mecklenburg | water | 0 | 0 | 0.0000 | 0.0000000 |
| Mecklenburg | wetlands | 0 | 0 | 0.0000 | 0.0000000 |
| Union | farm | 0 | 0 | 0.0000 | 0.0000000 |
| Union | forest | 0 | 0 | 0.0000 | 0.0000000 |
| Union | otherUndeveloped | 8 | 7200 | 0.0072 | 0.0027799 |
| Union | water | 0 | 0 | 0.0000 | 0.0000000 |
| Union | wetlands | 0 | 0 | 0.0000 | 0.0000000 |
| York | farm | 0 | 0 | 0.0000 | 0.0000000 |
| York | forest | 0 | 0 | 0.0000 | 0.0000000 |
| York | otherUndeveloped | 11 | 9900 | 0.0099 | 0.0038224 |
| York | water | 0 | 0 | 0.0000 | 0.0000000 |
| York | wetlands | 0 | 0 | 0.0000 | 0.0000000 |
Planning context According to the U.S. Census Bureau, the city of Charlotte itself is the 15th most populous city in the US and the MSA is the largest metro area in the Carolinas and fourth largest in the southeastern US. According to the Charlotte Regional Business Alliance, the region’s population has grown by 20% since 2010 and is projected to grow an additional 12% by 2030, with an estimated 117 people moving into the region daily. The region has seen a 16% labor force growth since 2010 and has a lower unemployment rate and higher year-over-year employment change than the US on average, as well as the state averages for NC and SC.
Population and economic growth in the greater Charlotte region are due in part to the city’s role as a major financial center and transportation hub, with headquarters for Bank of America and Truist Financial, as well as the largest employment hub for Wells Fargo. Charlotte’s airport is the sixth busiest in the world, with nonstop destinations to 188 locations and an economic impact of nearly $40 billion.
Charlotte has two primary light rail lines within Mecklenburg County that serve downtown, but neither runs to the airport. With high visitor volume, business traffic, as well as three professional sports team stadiums located downtown, congestion in the region could be dramatically reduced with the extension of a new light rail line that serves communities to the east and west of downtown and connects them through the airport. The airport is pursuing a $4B expansion campaign that includes a renovated and expanded terminal and new runway, which could be built to serve this rail line.
Analysis Recommendations 1. Assess a Phased Rail Construction Approach and enable Transit-Oriented Development
In this analysis, we have forecasted potential development in 2027 in the broader Charlotte region based on land cover change between 2011 and 2019 and proposed creating a new Silver light rail line that would connect Charlotte’s downtown, airport, and communities to the west and east. Because the model was not trained with transit data (existing Blue and Gold Lynx lines), our results show limited support for this full route. We did not include the Lynx transit infrastructure because it has not remained consistent throughout our study time period (2011 - 2019), after discussing with Isabel. However, in an initial run of the model, we did see that proximity to the Blue Line was a strong predictor of development between 2011 and 2019. In retrospect, we should have incorporated at least the initial Blue Line (constructed 2007) into our model, along with highways, to better assess the influence of transit-oriented development. Unfortunately, we ran out of time to go back and try this again!
However, our current model does show that development in 2017 is expected to be concentrated in Mecklenburg County (which is where Charlotte’s airport and downtown are located). It is possible that adopting a phased construction approach could prove effective, focusing first on connecting downtown and the airport aligned with the already planned airport terminal expansion. The county’s public transit authority is mostly funded by a half-cent sales tax for transit paid by shoppers in Mecklenburg County, which has surged from $108M in FY19 to roughly $165 million in FY24; the rest comes from the federal government, the city, and fares. If the region wanted to finance a new Silver Lynx line, it could be financed with an increased tax rate (must be passed by voters) or by creating Tax Increment Financing (TIF) districts around rail stops to help finance expansion of the line and encourage more transit-oriented development (TOD) within a 0.5mi/800mi distance of the line.
2. Consider potential hidden impacts to wetlands. It is notable that the most common land cover type that experienced development change between 2011 and 2019 is “Other Undeveloped” lands. Additional analysis of soil type or other features of these areas would be extremely useful to determine more specifics. Although these lands could have been already-cleared sites being prepped for construction, it is also very possible that they were emergent wetlands which do not consistently appear wet or vegetated throughout the year. Wetlands are critical ecosystems that provide a variety of services and benefits like wildlife habitat and fisheries breeding grounds; recreation and tourism; flood risk reduction through erosion control, storm surge buffering, and water storage; and water filtering which improves water quality. Each wetland varies in its hydrology, plant community composition, soil types, biogeochemical processes, and other environmental factors. According to the North Carolina Division of Water Resources, riverine forest wetlands, mountain bogs, and pocosins are particularly common wetlands types in the Piedmont Region of the Carolinas, which may be obscured in the NLCD as forest, shrub, or bare land cover.
Wetlands across the US are facing increasing threats of degradation and loss thanks to development, hydrological alteration like levees and dams, invasive species, and sea level rise. In 2023, the North Carolina Legislature passed a law that substantially reduced protections for wetlands, making them more at risk from development pressures. To prevent the loss of these critical habitats and reduce flood risk for residents, implementing additional local restriction on development of wetlands would be beneficial to make up for the protections lost at the state level. In addition to local ordinances, the region could also pursue a wetlands mitigation bank to ensure there is no net loss of wetlands from development.
3. Protect prime forest and farm lands from development The majority of development our model predicts in 2027 will impact forest lands in Mecklenburg and Lancaster Counties. In order to restrict the impacts of this development on natural systems, the counties could consider implementing tree canopy ordinances that require a certain level of tree canopy coverage to be maintaned by developers. Some areas of Virginia do this already, and it helps to ensure tree replacement along with new development. Additionally, land conservations are an effective tool to help put high-quality land under permanent development restrictions. Both SC and NC offer tax credits to incentivize landowners to sell their development rights. Both counties could identify and contact the owners of prime forest and farm land to see if they would be interested in preserving their lands.