lydemapr: an R package to map Lycorma delicatula
The spotted lanternfly (SLF; Lycorma delicatula, White 1841) is an agricultural pest native to China and Southeast Asia, first discovered in the United States in 2014 in Berks County, PA. Since then, this planthopper has spread throughout the Mid-Atlantic and Midwest regions of the country, threatening the wine and fruit industries and damaging ornamental trees.
Since its first discovery, many sources have collected data on the presence/absence and population density of this species in order to monitor its spread and impact. The lydemapr package contains two anonymized datasets (at 1 km² and 10 km² resolution) resulting from an effort to combine, organize, and aggregate all available sources of data. In addition, the package provides functions to visualize the data within R.
The lydemapr package was built to increase accessibility to key data on this species of interest, and to improve the reproducibility and consistency of modeling efforts.
We are constantly looking to expand the data sources in order to fully represent SLF’s presence and abundance in the US. If you wish to contribute to this effort, please contact the package authors.
To begin, let’s take a look at the data structure:
## # A tibble: 6 × 14
##   source  year bio_year latitude longitude state lyde_present lyde_established
##   <chr>  <dbl>    <dbl>    <dbl>     <dbl> <chr> <lgl>        <lgl>
## 1 inat    2015     2015     40.4     -75.7 PA    TRUE         FALSE
## 2 inat    2016     2016     40.3     -75.6 PA    TRUE         FALSE
## 3 inat    2016     2016     40.4     -75.5 PA    TRUE         FALSE
## 4 inat    2016     2016     40.4     -75.6 PA    TRUE         FALSE
## 5 inat    2016     2016     40.4     -75.7 PA    TRUE         FALSE
## 6 inat    2016     2016     40.5     -75.6 PA    TRUE         FALSE
## # ℹ 6 more variables: lyde_density <fct>, source_agency <chr>,
## #   collection_method <chr>, pointID <chr>, rounded_longitude_10k <dbl>,
## #   rounded_latitude_10k <dbl>
Each data point contains information on its source and specific dataset of origin (“source_agency”). The data is organized by year (specified as both calendar “year” and “bio_year”, the biological year running from May 1st to April 30th), coordinates, and state. Additional columns define whether SLF was found during the survey at that location (even as an anecdotal individual record, “lyde_present”), whether an established population was found there (“lyde_established”), and what the estimated population density of SLF was (“lyde_density”). For additional information on the variables included, please consult the help file associated with the data by typing ?lyde in the R console. A metadata file can also be found in the compressed folder lyde_data.zip contained in download_data/.
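To make the distinction between calendar year and biological year concrete, here is a minimal sketch of how a biological year could be derived from an observation date. The helper bio_year_from_date() is hypothetical and not part of lydemapr. It assumes that a biological year is labeled by the calendar year in which it starts (May 1st), so that records from January through April are assigned to the previous year. The package's own derivation may differ.

```r
# Hypothetical helper (not part of lydemapr): assign a biological year to a date.
# Assumption: the biological year starts on May 1st and is labeled by the
# calendar year in which it starts.
bio_year_from_date <- function(date) {
  date <- as.Date(date)
  yr <- as.integer(format(date, "%Y"))
  mo <- as.integer(format(date, "%m"))
  # Records from January to April belong to the biological year
  # that began the previous May
  ifelse(mo >= 5L, yr, yr - 1L)
}

# Made-up observation dates for illustration
obs_dates <- as.Date(c("2015-09-20", "2016-02-10", "2016-04-30", "2016-05-01"))
data.frame(date = obs_dates, bio_year = bio_year_from_date(obs_dates))
```

Under this assumption, an egg mass found in February 2016 is grouped with the 2015 biological year, alongside the adults recorded the previous fall.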
The package function lyde_summary() breaks the data down into a quick summary, organized along different axes. Here we look at the data split across year and state. Note that the data is arranged according to the biological year of SLF, not the calendar year. This allows egg masses discovered during the winter months, which were laid during the previous calendar year’s summer or fall, to be counted in the year in which they were laid.
# data by Year and State
knitr::kable(lyde_summary(year_type = "biological"))
State | 2014 | 2015 | 2016 | 2017 | 2018 | 2019 | 2020 | 2021 | 2022 | 2023 |
---|---|---|---|---|---|---|---|---|---|---|
AZ | 0 | 0 | 0 | 0 | 0 | 10 | 139 | 120 | 197 | 326 |
CA | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 |
CT | 0 | 0 | 0 | 0 | 0 | 3 | 2081 | 1442 | 1438 | 821 |
DC | 0 | 0 | 0 | 0 | 8 | 21 | 10 | 5 | 0 | 39 |
DE | 0 | 0 | 0 | 0 | 1075 | 2208 | 4546 | 6962 | 5175 | 2632 |
FL | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 |
IL | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 16 |
IN | 0 | 0 | 1 | 0 | 79 | 101 | 102 | 352 | 98 | 128 |
Indiana | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 33 | 79 |
KS | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 21 | 0 | 0 |
KY | 0 | 0 | 0 | 0 | 0 | 3 | 2 | 20 | 165 | 161 |
MA | 0 | 0 | 0 | 0 | 0 | 0 | 893 | 2859 | 2056 | 1743 |
MD | 0 | 0 | 0 | 1 | 39 | 2399 | 17408 | 4734 | 1470 | 1882 |
ME | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 20 | 85 | 19 |
MI | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 133 | 307 | 339 |
MO | 0 | 0 | 0 | 0 | 0 | 15 | 18 | 0 | 0 | 37 |
NC | 0 | 0 | 0 | 0 | 0 | 14067 | 5 | 110 | 4819 | 886 |
NH | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 60 | 11 |
NJ | 0 | 0 | 0 | 0 | 2443 | 9529 | 13075 | 83704 | 39397 | 1510 |
NM | 0 | 0 | 0 | 0 | 0 | 0 | 10 | 28 | 26 | 7 |
NY | 0 | 0 | 0 | 0 | 18474 | 27046 | 18228 | 12869 | 19634 | 8498 |
OH | 0 | 0 | 0 | 0 | 0 | 0 | 681 | 575 | 304 | 303 |
OR | 0 | 0 | 0 | 0 | 0 | 0 | 92 | 15 | 73 | 0 |
PA | 372 | 7677 | 9269 | 9231 | 77055 | 150180 | 90476 | 69436 | 79699 | 66875 |
RI | 0 | 0 | 0 | 0 | 0 | 0 | 45 | 18 | 285 | 264 |
SC | 0 | 0 | 0 | 0 | 0 | 2 | 7 | 49 | 70 | 39 |
TN | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 582 |
TX | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 75 | 218 |
UT | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 |
VA | 0 | 0 | 0 | 2 | 1523 | 4353 | 4100 | 2574 | 2748 | 2833 |
VT | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 25 | 1 |
WI | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
WV | 0 | 0 | 0 | 0 | 3 | 995 | 2368 | 2101 | 1450 | 777 |
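A table of this shape can also be approximated by hand with base R, which helps when a custom grouping is needed. The sketch below uses made-up toy records with the column names of the lyde data; it is not the implementation of lyde_summary(). Because the table above lists both an IN and an Indiana row, which suggests that state labels may not be fully harmonized across sources, the sketch also shows one possible way to recode full state names to abbreviations before counting.

```r
# Toy records, made up for illustration; column names follow the lyde data
toy <- data.frame(
  state    = c("PA", "PA", "IN", "Indiana", "Indiana", "NJ"),
  bio_year = c(2014, 2015, 2022, 2022, 2023, 2018),
  stringsAsFactors = FALSE
)

# Recode full state names to two-letter abbreviations using base R's built-in
# state.name / state.abb vectors; labels that are already abbreviations
# (or are not US state names, e.g. "DC") are left unchanged
idx <- match(toy$state, state.name)
toy$state <- ifelse(is.na(idx), toy$state, state.abb[idx])

# Count records per state (rows) and biological year (columns)
counts <- table(state = toy$state, bio_year = toy$bio_year)
counts
```

Wrapping the result in knitr::kable(), as in the chunk above, would render it as a formatted table.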
Two functions allow the user to plot the data: map_spread() and map_yearly().
The first function produces a snapshot of the SLF spread in the United States, with reference to the sampling effort associated with surveying the spread. Surveys finding an established population are plotted on the map as filled tiles, color coded by the year of first discovery. Surveys finding no established population are plotted as grey tiles.
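The idea of coloring each tile by its year of first discovery can be illustrated with a short base R sketch. It uses made-up toy records with the column names of the lyde data and assumes that the year of first discovery is the earliest biological year with an established population in a tile. How map_spread() computes this internally is not shown here and may differ.

```r
# Toy survey records, made up for illustration
toy <- data.frame(
  rounded_longitude_10k = c(-75.7, -75.7, -75.6, -75.6, -75.5),
  rounded_latitude_10k  = c(40.4, 40.4, 40.3, 40.3, 40.5),
  bio_year              = c(2016, 2015, 2017, 2018, 2016),
  lyde_established      = c(TRUE, TRUE, FALSE, TRUE, FALSE)
)

# Keep only surveys that found an established population
established <- toy[toy$lyde_established, ]

# Earliest year of establishment for each 10k tile
first_found <- aggregate(
  bio_year ~ rounded_longitude_10k + rounded_latitude_10k,
  data = established,
  FUN = min
)
first_found
```

Tiles that appear in toy but not in first_found were surveyed without finding an established population; these correspond to the grey tiles on the map.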
As the map might take a long time to display within R, we encourage the user to assign the map to an object and save it as a PDF file instead, as shown below.
# assigning the map
map_1 <- map_spread()
The map can be saved as a pdf file at high resolution.
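One way to do this is with the base R pdf() graphics device, sketched below. So that the example runs without lydemapr installed, a placeholder plot stands in for the map. With the package loaded, the plot() call would be replaced by print(map_1), assuming that map_1 is a printable plot object. The file name and page dimensions are arbitrary choices for illustration.

```r
# Write to a temporary file here; in practice, choose your own path
out_file <- file.path(tempdir(), "slf_map.pdf")

# Open a PDF graphics device (width and height are in inches)
pdf(out_file, width = 10, height = 8)

# Placeholder plot so that the sketch runs on its own.
# Assumption: with lydemapr loaded, replace this call with print(map_1)
plot(c(-78, -74), c(38, 42), type = "n",
     xlab = "Longitude", ylab = "Latitude", main = "Placeholder map")

# Close the device to finish writing the file
invisible(dev.off())

file.exists(out_file)
```

Since PDF is a vector format, the saved map can be zoomed without loss of detail, which is convenient for dense 1 km² data.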
# If executing this line while running the vignette manually,
# be advised that it might take a considerable amount of time
# for the map to be displayed.
# It's advised to visualize the pdf file saved above.
map_1
By default, the function displays data aggregated at the 10 km² resolution (Figure 1). The function can be customized to show the data at a higher spatial resolution (1 km²) by setting the resolution argument to “1k”. This will take considerably longer, so saving the result as a PDF file is preferable in this instance as well.
map_2 <- map_spread(resolution = "1k")
# If executing this line while running the vignette manually,
# be advised that it might take a considerable amount of time
# for the map to be displayed.
# It's advised to visualize the pdf file saved above.
map_2
The function displays data slightly differently at the 1 km² resolution (Figure 2). At 10 km², the data is plotted as filled tiles, which represents the grid in which the data is organized more clearly. As 1 km tiles are much smaller, survey locations at this resolution are instead displayed as points on the map.
If the user wishes to visualize the data for a smaller area of the United States, the function allows them to specify which area should be mapped by setting the zoom argument to “custom” and specifying the boundaries of the mapped area through xlim_coord (longitude) and ylim_coord (latitude), expressed as longitude and latitude coordinates in the WGS84 coordinate system. Here’s an example of how this can be achieved.
# assigning object
map_3 <- map_spread(resolution = "1k",
                    zoom = "custom",
                    xlim_coord = c(-78, -74),
                    ylim_coord = c(38, 42))
# If executing this line while running the vignette manually,
# be advised that it might take a considerable amount of time
# for the map to be displayed.
# It's advised to visualize the pdf file saved above.
map_3
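Before drawing a custom map, it can be useful to check which records fall inside the chosen window. The sketch below applies the same longitude and latitude limits to made-up toy coordinates using base R. It only illustrates the bounding-box logic and is not how map_spread() handles the zoom internally, which may simply crop the view.

```r
# Toy coordinates, made up for illustration
toy <- data.frame(
  longitude = c(-75.7, -80.1, -74.2, -77.9),
  latitude  = c(40.4, 40.0, 43.5, 38.2)
)

# Same limits as in the map_spread() call above
xlim_coord <- c(-78, -74)
ylim_coord <- c(38, 42)

# TRUE for records inside the bounding box (limits included)
inside <- toy$longitude >= xlim_coord[1] & toy$longitude <= xlim_coord[2] &
  toy$latitude >= ylim_coord[1] & toy$latitude <= ylim_coord[2]

toy[inside, ]
```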
The second function, map_yearly(), allows the user to visualize the progression of SLF establishment, with a focus on the estimated population density through time. Note that the data here is not cumulative: each panel of the figure shows only the data from that year.
# running year-specific map
# assigning object
map_4 <- map_yearly(ncol = 2)
# If executing this line while running the vignette manually,
# be advised that it might take a considerable amount of time
# for the map to be displayed.
# It's advised to visualize the pdf file saved above.
# map_4
Temple University, sebastiano.debona@gmail.com
Temple University, mrhelmus@temple.edu