Dataset produced by aggregating and anonymizing several sources, containing presence/absence data for spotted lanternfly (SLF) in the United States, as well as establishment status and population density.

lyde

Format

A data frame with 658,390 observations and 14 variables:

source

Character variable defining the source of the data.

year

Integer value defining the calendar year when the information was collected.

bio_year

Integer value defining the biological year when the information was collected, based on the SLF life cycle. A biological year starts on May 1st and ends on April 30th.
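The mapping from a collection date to a biological year can be sketched as follows. This is a minimal Python illustration, not the code used to build the dataset. The function name `bio_year_from_date` is hypothetical, and the sketch assumes that a biological year is labeled by the calendar year in which it starts (May 1st), which the description above does not state explicitly.

```python
from datetime import date


def bio_year_from_date(d: date) -> int:
    """Return the biological year for a collection date (hypothetical helper).

    Assumption: a biological year running from May 1st to April 30th is
    labeled by the calendar year in which it starts.
    """
    # Dates from January 1st to April 30th belong to the biological year
    # that started in May of the previous calendar year.
    return d.year if d.month >= 5 else d.year - 1
```

Under this assumption, `year` and `bio_year` agree for records collected from May through December and differ by one for records collected from January through April.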

latitude

Latitude of the data point, expressed in decimal degrees in the WGS84 coordinate system.

longitude

Longitude of the data point, expressed in decimal degrees in the WGS84 coordinate system.

state

Character variable defining the state where the data was collected, abbreviated with the Census-official two-letter code.

lyde_present

Logical value (`TRUE`/`FALSE`) defining whether records are present for spotted lanternfly at the site at the time of survey. These might include regulatory incidents where a single live individual or a small number of dead individuals were observed at the site, but no signs of established population could be detected.

lyde_established

Logical value (`TRUE`/`FALSE`) defining whether signs of an established population are present at the site at the time of survey. These include a minimum of two live individuals or the presence of an egg mass.

lyde_density

Ordinal variable defining the population density of spotted lanternfly at the site, either estimated directly as an ordinal category by the data collector or derived from count data. The categories are: `"Unpopulated"`, indicating the absence of an established population at the site (but not excluding the presence of spotted lanternfly in the form of regulatory incidents); `"Low"`, indicating an established population is present but at low density, reflecting at most about 30 individuals or a single egg mass; `"Medium"`, indicating the population is established and at higher density, but still small enough for individuals to be counted during a survey visit (a few hundred at most); `"High"`, indicating the population is established and thriving, and the area is generally infested, to a degree where counting individuals within a survey visit would be infeasible.
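For records where the category is derived from count data, the logic described above could look roughly like the following Python sketch. It is not the procedure used to build the dataset: the function `density_from_counts` is hypothetical, the cutoffs of 30 and 300 are assumed stand-ins for "about 30" and "a few hundred", and the treatment of sites with more than one egg mass (classified here as at least `"Medium"`) is an assumption as well.

```python
# Ordinal order of the categories, from lowest to highest density.
DENSITY_LEVELS = ("Unpopulated", "Low", "Medium", "High")


def density_from_counts(live_count: int, egg_masses: int = 0,
                        low_max: int = 30, medium_max: int = 300) -> str:
    """Map counts to a density category (hypothetical helper).

    low_max and medium_max are assumed cutoffs, not values documented
    for the dataset.
    """
    # Establishment rule from lyde_established: at least two live
    # individuals, or at least one egg mass.
    established = live_count >= 2 or egg_masses >= 1
    if not established:
        return "Unpopulated"
    # "Low": at most about 30 individuals or a single egg mass.
    if live_count <= low_max and egg_masses <= 1:
        return "Low"
    # "Medium": still countable during a survey visit.
    if live_count <= medium_max:
        return "Medium"
    return "High"
```

Note that a single live individual with no egg mass maps to `"Unpopulated"` even though `lyde_present` would be `TRUE` for that record, which matches the regulatory-incident case described above.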

source_agency

Agency/organization/platform responsible for data collection. `DDA`: Delaware Dept of Agriculture; `iNaturalist`: inaturalist.org; `ISDA`: Indiana State Dept of Agriculture; `MDA`: Maryland Dept of Agriculture; `NJDA_Public_reporting`: New Jersey Dept of Agriculture public reporting tool platform; `NYSDAM`: New York State Dept of Agriculture and Markets; `PDA_Public_reporting`: Pennsylvania Dept of Agriculture public reporting tool platform; `USDA`: United States Dept of Agriculture; `VA_Tech_Coop_Ext`: Virginia Polytechnic Institute and State University, and Virginia Cooperative Extension; `VDA`: Virginia Dept of Agriculture.

collection_method

Character string defining the method used for data collection. `field_survey/management` for data points collected by professionals during field operations; `individual_reporting` for data points collected by individuals through public reporting tools, iNaturalist observations, or citizen science projects.

pointID

Character string uniquely identifying each data point.

rounded_longitude_10k

Longitude of the centroid of the closest 10 km² grid cell, expressed in decimal degrees (WGS84 coordinate system), used to rarefy the dataset at a coarser resolution.

rounded_latitude_10k

Latitude of the centroid of the closest 10 km² grid cell, expressed in decimal degrees (WGS84 coordinate system), used to rarefy the dataset at a coarser resolution.
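One possible way to use the rounded coordinates for rarefaction is to collapse all points that share a grid cell and biological year into a single summary record. The Python sketch below uses only the standard library and made-up example records. The function `rarefy_by_cell` is hypothetical, and the summary rule (a cell counts as present or established if any of its points is) is an assumption, not a documented procedure.

```python
from collections import defaultdict


def rarefy_by_cell(records):
    """Collapse point records to one summary per grid cell and bio_year.

    Hypothetical helper: a cell is flagged present/established if any
    point inside it is flagged (assumed summary rule).
    """
    cells = defaultdict(lambda: {"n_points": 0,
                                 "lyde_present": False,
                                 "lyde_established": False})
    for r in records:
        # Points sharing rounded coordinates and bio_year fall in one group.
        key = (r["rounded_longitude_10k"], r["rounded_latitude_10k"], r["bio_year"])
        cell = cells[key]
        cell["n_points"] += 1
        cell["lyde_present"] = cell["lyde_present"] or r["lyde_present"]
        cell["lyde_established"] = cell["lyde_established"] or r["lyde_established"]
    return dict(cells)


# Made-up records for illustration only (not taken from the dataset).
example = [
    {"rounded_longitude_10k": -75.5, "rounded_latitude_10k": 40.0, "bio_year": 2020,
     "lyde_present": True, "lyde_established": False},
    {"rounded_longitude_10k": -75.5, "rounded_latitude_10k": 40.0, "bio_year": 2020,
     "lyde_present": True, "lyde_established": True},
    {"rounded_longitude_10k": -76.0, "rounded_latitude_10k": 40.5, "bio_year": 2020,
     "lyde_present": False, "lyde_established": False},
]
summary = rarefy_by_cell(example)
```

Grouping by `bio_year` as well as by cell keeps surveys from different seasons separate; dropping it from the key would instead summarize each cell across the whole study period.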