Summary

  • File names should be easy to understand, give information about the data, and be consistent

Surprisingly, file naming is one of the most important aspects of data management. File names should include all the information someone needs to know what a data file is and which project it is or was used for. All components of file names should be in lower case (except for abbreviations and initials) and should use “_” instead of spaces. NOTE: file names should be unique so that no two files have the same name across the entire project.
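The two rules that are easiest to check automatically are "no spaces" and "no repeated names across the project". The following is a minimal Python sketch of such a check, not an official iEco tool. The function name `find_naming_problems` and the example paths are hypothetical, and comparing full file names including the extension is an assumption.

```python
from collections import defaultdict
from pathlib import PurePath


def find_naming_problems(paths):
    """Return (names containing spaces, names used by more than one file)."""
    locations = defaultdict(list)
    for path in paths:
        # Group every path by its file name, ignoring the folder it sits in.
        locations[PurePath(path).name].append(str(path))

    with_spaces = sorted(name for name in locations if " " in name)
    duplicates = {name: found for name, found in locations.items() if len(found) > 1}
    return with_spaces, duplicates


# Hypothetical project listing, used only to illustrate the check.
example_paths = [
    "data/SLF_occurrence_MRH_raw.csv",
    "backup/SLF_occurrence_MRH_raw.csv",
    "data/SLF occurrence MRH v1.csv",
]
with_spaces, duplicates = find_naming_problems(example_paths)
print(with_spaces)         # ['SLF occurrence MRH v1.csv']
print(sorted(duplicates))  # ['SLF_occurrence_MRH_raw.csv']
```

To run the check on a real project folder, the list of paths could come from `pathlib.Path(root).rglob("*")`, keeping only the entries for which `is_file()` is true.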


The iEco File Naming Convention

\[<Project \space Name>\_<Data \space Type>\_<Author \space Initials>\_<Version>\]

Project Name: the first part of the file name should be the name of the project you are working on, or its agreed-upon abbreviation (typically the same as the R package name). For example, caribmacro is the abbreviation (and package name) for the Caribbean herpetology macrosystems project. Make sure to check with the PIs about the name and abbreviation of your project.

Data Type: states what the data are. For example, for the occurrence of spotted lanternflies, the data type would be ‘occurrence’. For manuscripts and other types of files, this is the name of the specific file. For example, for the manuscript on the drivers of herp species richness in the Caribbean, the data type would be ‘sr_drivers_manuscript’.

Author Initials: the three initials (if applicable) of the person who created the data file (and, for manuscripts, of the person(s) who edited the file). For example, if Matthew Richard Helmus created the spotted lanternfly occurrence data, then the author initials should be ‘MRH’.

Version: the version of the data, used for version control. For raw data the version should be ‘raw’, for published data the version should be ‘final’, and for all other versions it should be v0, v1, …, vN, where N is the number of the last version before final. See the Version Control section for more information about versions.

Therefore, the file name for version v2 of the occurrence data that Matt Helmus made for the SLF project would be:

\[SLF\_occurrence\_MRH\_v2\]

And the file name for the raw and final versions of the occurrence data Matt Helmus created would be:

\[SLF\_occurrence\_MRH\_raw\]

\[SLF\_occurrence\_MRH\_final\]
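The convention above can be sketched as a small helper that joins the four components with underscores and rejects anything that breaks the stated rules. This is an illustrative Python sketch under assumptions: the function name `build_file_name` is hypothetical, and the version check (`raw`, `final`, or `v` followed by a number) is one reading of the version rule described above.

```python
import re


def build_file_name(project, data_type, initials, version):
    """Join the four components with underscores after basic checks."""
    parts = [project, data_type, initials, version]
    for part in parts:
        # Every component must be present and must not contain spaces.
        if not part or " " in part:
            raise ValueError(f"component must be non-empty with no spaces: {part!r}")
    # Assumed reading of the version rule: 'raw', 'final', or v0, v1, ...
    if not re.fullmatch(r"raw|final|v\d+", version):
        raise ValueError(f"version must be 'raw', 'final', or v<number>: {version!r}")
    return "_".join(parts)


print(build_file_name("SLF", "occurrence", "MRH", "v2"))     # SLF_occurrence_MRH_v2
print(build_file_name("SLF", "occurrence", "MRH", "raw"))    # SLF_occurrence_MRH_raw
print(build_file_name("SLF", "occurrence", "MRH", "final"))  # SLF_occurrence_MRH_final
```

Building names through one function rather than typing them by hand keeps the order of the parts consistent across a project.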


Dates in file names

If the file is for a specific date, such as the photo backups of the raw data sheets discussed in the Data Collection section or dictated observation recordings, you should include the date in YYYY-MM-DD format before the author initials. For example:

\[SLF\_survey\_photobackup\_2020-04-24\_MRH\_raw\]

If you have multiple date-specific data files from a single date, add something to the name to make each file unique (e.g. site, observer, time, survey number, etc.).
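A dated file name can be sketched in Python as follows. This is a hedged example, not a lab-provided function: `build_dated_file_name` is a hypothetical name, and placing the distinguishing site label at the end of the data type is an assumption, since the text does not say where the extra detail should go.

```python
from datetime import date


def build_dated_file_name(project, data_type, day, initials, version):
    """Place the ISO date (YYYY-MM-DD) just before the author initials."""
    return "_".join([project, data_type, day.isoformat(), initials, version])


print(build_dated_file_name("SLF", "survey_photobackup", date(2020, 4, 24), "MRH", "raw"))
# SLF_survey_photobackup_2020-04-24_MRH_raw

# Two files from the same date, told apart by a hypothetical site label.
for site in ("site1", "site2"):
    print(build_dated_file_name("SLF", f"survey_photobackup_{site}", date(2020, 4, 24), "MRH", "raw"))
```

Using `date.isoformat()` guarantees zero-padded months and days, so names for the same project and data type sort chronologically.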


The order of the parts of the file name is important for organizing and sorting files. With the iEco Lab naming convention, all of the SLF data will be grouped together, and the data types within the SLF project will be grouped and ordered by ascending version or date.
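One caveat is that a plain alphabetical sort places v10 before v2 and final before raw. When files need to be listed programmatically in version order, a custom sort key can help. The sketch below is an assumption-laden illustration: the function name `version_sort_key` is hypothetical, the intended order is assumed to be raw, then v0, v1, …, then final, and file extensions are assumed to have been removed already.

```python
import re


def version_sort_key(file_name):
    """Sort by everything before the version, then raw < v0 < v1 < ... < final."""
    stem, _, version = file_name.rpartition("_")
    if version == "raw":
        rank = (0, 0)
    elif version == "final":
        rank = (2, 0)
    else:
        match = re.fullmatch(r"v(\d+)", version)
        if match is None:
            raise ValueError(f"unrecognized version in {file_name!r}")
        # Compare version numbers as integers so v10 comes after v2.
        rank = (1, int(match.group(1)))
    return (stem, rank)


names = [
    "SLF_occurrence_MRH_v10",
    "SLF_survey_photobackup_2020-04-24_MRH_raw",
    "SLF_occurrence_MRH_final",
    "SLF_occurrence_MRH_v2",
    "SLF_occurrence_MRH_raw",
]
for name in sorted(names, key=version_sort_key):
    print(name)
# SLF_occurrence_MRH_raw
# SLF_occurrence_MRH_v2
# SLF_occurrence_MRH_v10
# SLF_occurrence_MRH_final
# SLF_survey_photobackup_2020-04-24_MRH_raw
```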


Other File Types

For some data, such as those created through ArcGIS, file names can be limited in the number of characters they may contain. In these situations, the Project Lead and the PI(s) should discuss a naming convention for those specific cases. One possible alternative is to house these files within a folder that is named according to the lab naming convention; the individual files within the folder can then have simpler names and versioning attached to them.

For other common data types the iEco Lab has created the following naming conventions:

| File Type | Convention |
|---|---|
| Base Naming Convention | \(<Project \space Name>\_<Data \space Type>\_<Author \space Initials>\_<Version>\) |
| Audio-Visual Files | \(<Project \space Name>\_<Data \space Type>\_<Date>\_<Version>\) |
| Spatial Data | \(<Project \space Name>\_<Data \space Type>\_<Projection>\_<Version>\) |
| Manuscripts | \(<Project \space Name>\_<Manuscript \space Name>\_<Author \space Initials>\_<Version>\) |
| Edited Manuscripts | \(<Project \space Name>\_<Manuscript \space Name>\_<Author \space Initials>\_<Version>\_<Editor \space Initials>\) |

If a type of data is not in the above table, then you should use the base convention. In all of these conventions, the version is only needed if the data will be edited or modified in any way. This includes modifications made within statistical software, even if the modified data are not saved. Additionally, these conventions can be changed if needed. However, if a naming convention is changed, then the new convention(s) must be recorded in a text file within the root project folder.
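The conventions for the different file types can be sketched as ordered lists of fields, with the base convention as the fallback and the version left out when it is not needed. This Python sketch is illustrative only: the dictionary keys, the field names, the function `build_name`, and the example values ‘WGS84’ and ‘ABC’ are all hypothetical.

```python
# Field order for each file type, following the conventions listed above.
CONVENTIONS = {
    "base": ("project", "data_type", "author_initials", "version"),
    "audio_visual": ("project", "data_type", "date", "version"),
    "spatial": ("project", "data_type", "projection", "version"),
    "manuscript": ("project", "manuscript_name", "author_initials", "version"),
    "edited_manuscript": (
        "project", "manuscript_name", "author_initials", "version", "editor_initials",
    ),
}


def build_name(file_type, **fields):
    """Build a file name for file_type; unlisted types use the base convention."""
    order = CONVENTIONS.get(file_type, CONVENTIONS["base"])
    unexpected = sorted(set(fields) - set(order))
    if unexpected:
        raise ValueError(f"unexpected fields for {file_type!r}: {unexpected}")
    parts = []
    for name in order:
        if name in fields:
            parts.append(fields[name])
        elif name != "version":  # only the version may be left out
            raise ValueError(f"missing field for {file_type!r}: {name}")
    return "_".join(parts)


print(build_name("spatial", project="SLF", data_type="occurrence",
                 projection="WGS84", version="v1"))
# SLF_occurrence_WGS84_v1
print(build_name("edited_manuscript", project="caribmacro",
                 manuscript_name="sr_drivers_manuscript",
                 author_initials="MRH", version="v1", editor_initials="ABC"))
# caribmacro_sr_drivers_manuscript_MRH_v1_ABC
```

If a project changes a convention, updating the corresponding entry in a single table like this keeps the code in step with the text file that records the change.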


Glossary

References

Additional Resources

