Geological Sciences


Supplementary files are separate attachments that enhance or support your thesis. They may include computer code, research data, audio or video files, and images or maps that are not part of the primary PDF.

Following best practices makes it easier for you to find, use, and machine-analyze your data, and ultimately to upload it to an online archive. It also makes it easier for collaborators, or researchers not involved with the project, to understand and use your data in the future.

Attribution: Best Practices for Supplementary Files from Earth and Planetary Sciences Research Guide, UC Santa Cruz University Library, used under license CC BY 3.0 / Adapted in format and layout


Include a brief descriptive document (often called a Readme.txt file) to help others understand your additional files and data.

File Naming Systems & Organizations

  • Decide on a naming convention before data collection starts
  • Use consistent, descriptive file names. Make it easy to predict what a file contains.
  • Develop a file naming scheme that makes sense to you.
  • Consider including:
    • Project name or project number
    • Name of file creator
    • Sequence ID
    • Accession Number
    • Location or spatial coordinates
    • Date or date range of project
    • Version number of file
  • Consider how files sort when deciding what element of the file name will go first.
  • Establish a folder hierarchy that aligns with the project. Example: [Project] / [Experiment] / [Instrument or Type of File]
  • Include an explanation of your naming convention along with any abbreviations or codes in your readme.txt file. 
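
As an illustration of the folder hierarchy and readme suggestions above, here is a minimal Python sketch. The function name make_project_tree, the project, experiment, and file-type names, and the readme wording are all hypothetical placeholders, not part of the original guide.

```python
from pathlib import Path
import tempfile


def make_project_tree(root: Path, project: str, experiments: list[str],
                      file_types: list[str]) -> list[Path]:
    """Create [Project]/[Experiment]/[Instrument or Type of File] folders
    and a readme.txt stub at the project level."""
    project_dir = root / project
    created = []
    for experiment in experiments:
        for file_type in file_types:
            folder = project_dir / experiment / file_type
            folder.mkdir(parents=True, exist_ok=True)
            created.append(folder)

    readme = project_dir / "readme.txt"
    if not readme.exists():
        # Placeholder text: replace with your own convention and codes.
        readme.write_text(
            "Naming convention: [project]_[site]_[YYYY-MM-DD]_[sequence].[ext]\n"
            "Abbreviations: (list the codes used in file and folder names)\n",
            encoding="utf-8",
        )
    return created


# Demonstration in a temporary directory (all names are hypothetical).
with tempfile.TemporaryDirectory() as tmp:
    folders = make_project_tree(Path(tmp), "ML_cores",
                                ["exp01", "exp02"], ["xrf", "photos"])
    for folder in folders:
        print(folder.relative_to(tmp))
```

Creating the empty hierarchy before data collection starts is one way to keep the structure consistent; the readme stub is only a reminder to document the convention actually chosen.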

File Naming Conventions

  • Keep the filename short (aim for fewer than 25 characters)
  • Use underscores instead of spaces
  • Avoid special characters such as: " / \ : * ? < > [ ] & $ .
  • Use the dating convention: YYYY-MM-DD or YYMMDD
  • Use the 3-letter file extension to indicate the file format, such as .txt, .pdf, or .csv.
  • When using numbers, use leading zeros so files sort in sequential order. Use 001, 002, … 020, 021 … instead of 1, 2 … 20, 21 …
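
The conventions above can be sketched as a small Python helper. This is only one possible scheme: the function names, the order of the parts, and the project and site codes ("ML", "S03") are assumptions made for illustration.

```python
import re
from datetime import date

# Whitespace plus the special characters listed above.
_UNSAFE = re.compile(r'[\s"/\\:*?<>\[\]&$.]+')


def clean_part(text: str) -> str:
    """Replace runs of spaces and special characters with one underscore."""
    return _UNSAFE.sub("_", text.strip()).strip("_")


def build_filename(project: str, site: str, day: date, seq: int,
                   ext: str = "csv") -> str:
    """Join the parts with underscores, using a YYYY-MM-DD date and a
    three-digit, zero-padded sequence number."""
    parts = [clean_part(project), clean_part(site),
             day.isoformat(), f"{seq:03d}"]
    return "_".join(parts) + "." + ext.lstrip(".").lower()


# Hypothetical project and site codes.
print(build_filename("ML", "S03", date(2023, 7, 4), 7))
# -> ML_S03_2023-07-04_007.csv

# Zero-padded names sort in numeric order; unpadded names do not.
print(sorted(["core_10", "core_2", "core_1"]))     # core_1, core_10, core_2
print(sorted(["core_010", "core_002", "core_001"]))  # core_001, core_002, core_010
```

Longer project or site names quickly push the result past the suggested length, so short codes explained in the readme file tend to work better than full names.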

Case Study: File Naming Done Well (pdf file) - examples of methods for naming files. File names can include study site, water depth, date, and more. Find Case Study example 2 at Name files - Data best practices and case studies - Guides at Stanford University.

Some Recommended File Formats

Whenever possible use uncompressed, non-proprietary (open) formats.

  • Containers: TAR, GZIP, ZIP
  • Databases: XML, CSV
  • Geospatial: SHP, DBF, GeoJSON, KML, NetCDF, GeoTIFF/TIFF, HDF-EOS
  • Moving images: MOV, AVI, MXF
  • Presentations: PDF
  • Sounds: WAV, AIFF, MXF
  • Statistics: ASCII, DTA, POR, SAS, SAV
  • Still images: TIFF, JPEG 2000, JPEG, PDF
  • Tabular data: CSV
  • Text: XML, PDF/A, HTML, ASCII, UTF-8
  • Web archive: WARC

For more guidance on appropriate formats, see the Library of Congress’ Recommended Formats Statement and Archivematica’s Format Policies for access and preservation.


Data can be analyzed more efficiently and understood better in the future if it is set up from the start for a machine to read:


Do:

  • Provide CSV files (readable by nearly all spreadsheet software) when possible
  • Make the top row a header with variable names
  • Put a value in each cell so rows are associated with headings and can stand alone as a recording of your observation, count, etc.  
  • Consider how to convey null values so they are not mistaken for zero results
  • Express dates using accepted standards, e.g. YYYYMMDD (for examples, see the documentation for Temporal Extents Best Practice from Earthdata Wiki or Working with Dates and Times By Using the ISO 8601 from SAS)
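
A minimal Python sketch of these recommendations, using only the standard library: a header row of variable names, a value in every cell, an explicit code for missing values, and ISO 8601 dates. The column names, the sample rows, and the choice of "NA" as the missing-value code are assumptions for illustration; whichever code you choose should be documented in your readme file.

```python
import csv
import io
from datetime import date

NULL = "NA"  # assumed missing-value code, so a gap is never read as zero

# Hypothetical observations; None marks a measurement that was not taken.
rows = [
    {"site_id": "S01", "sample_date": date(2023, 7, 4), "depth_m": 1.5, "count": 0},
    {"site_id": "S02", "sample_date": date(2023, 7, 5), "depth_m": None, "count": 12},
]


def to_cell(value):
    """Convert a Python value into the text stored in the CSV cell."""
    if value is None:
        return NULL
    if isinstance(value, date):
        return value.isoformat()  # ISO 8601 extended form, YYYY-MM-DD
    return value


# Written to an in-memory buffer here; a real script would open a file.
buffer = io.StringIO()
writer = csv.DictWriter(
    buffer,
    fieldnames=["site_id", "sample_date", "depth_m", "count"],
    lineterminator="\n",
)
writer.writeheader()  # top row holds the variable names
for row in rows:
    writer.writerow({key: to_cell(value) for key, value in row.items()})

csv_text = buffer.getvalue()
print(csv_text)
# site_id,sample_date,depth_m,count
# S01,2023-07-04,1.5,0
# S02,2023-07-05,NA,12
```

Note that the true zero count in the first row and the missing depth in the second row remain distinguishable in the output.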


Don’t:

  • Include notes in cells (add notes to a separate readme file)
  • Use formatting, comments, or color coding to convey information; they don’t translate well to other software
  • Use blank spaces or symbols in column names
  • Leave cells blank (blanks can be misinterpreted as zeros)
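
Two of the pitfalls in this last list, blank cells and column names containing spaces or symbols, can be caught with a short script before a file is archived. The sketch below is one possible check rather than a standard tool; the function name check_csv, the rule for acceptable column names, and the sample data are assumptions.

```python
import csv
import io
import re

# Assumed rule: names start with a letter, then letters, digits, underscores.
NAME_OK = re.compile(r"^[A-Za-z][A-Za-z0-9_]*$")


def check_csv(text: str) -> list[str]:
    """Return a list of human-readable problems found in CSV text."""
    problems = []
    reader = csv.reader(io.StringIO(text))
    header = next(reader, [])
    for name in header:
        if not NAME_OK.match(name):
            problems.append(f"column name {name!r} has spaces or symbols")
    # Row 1 is the header, so data rows are numbered from 2.
    for row_no, row in enumerate(reader, start=2):
        if len(row) != len(header):
            problems.append(
                f"row {row_no}: expected {len(header)} cells, found {len(row)}"
            )
        for name, cell in zip(header, row):
            if cell.strip() == "":
                problems.append(f"row {row_no}: blank cell in column {name!r}")
    return problems


# Hypothetical sample with one bad column name and one blank cell.
sample = "site id,depth_m\nS01,1.5\nS02,\n"
for problem in check_csv(sample):
    print(problem)
# column name 'site id' has spaces or symbols
# row 3: blank cell in column 'depth_m'
```

Notes, comments, and color coding are lost when a spreadsheet is exported to CSV, so a check like this cannot detect them; that information belongs in the readme file.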