Working with Data

Finding, evaluating, and analyzing data

Wondering how to use datasets to enhance your research, or find data to make an infographic for your class? This guide will introduce you to data, including how to find it, how to prepare it, and how to analyze it.

Data, at its most basic level, is a representation of information from the real world. It is important to remember that data is collected by people who make decisions about what to include, omit, visualize, analyze, and present. As with any form of information, when you encounter a claim backed up by data, consider the authority of the source as you evaluate that claim.

If you have questions about data management, reach out to our Research Data Team.

Isn't all data just numbers?

No! Much of the data we see on a regular basis is numeric (aka quantitative), but data can also be: 

  • Qualitative - such as transcripts of video interviews
  • Geospatial - such as latitude and longitude coordinates
  • Text - such as website reviews or novels 
  • Images - such as page scans or photographs

Different types of data appear in different fields of study, but many can be useful in multiple domains. For example, if you're a historian who usually works with text data, how might you incorporate geospatial data? Or, if you usually work with financial data, what kinds of text data could you incorporate in your research?

Graphic illustrating common data types, including spatial, text, image, and spreadsheet data.

Data classifications

Data can generally be classified into two categories: structured and unstructured. 

Structured data

Just like it sounds, structured data is organized in some way. Structured data is commonly found in spreadsheet formats (file extensions like CSV, XLSX, or TSV). Spreadsheet data is also called tabular data. Other structured formats include HTML, XML, and JSON.
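
To see what "organized" means in practice, here is a minimal Python sketch that reads the same small table from CSV and from JSON using only the standard library. The town names and population figures are invented for illustration, and the function names are hypothetical; this is one possible approach, not the only way to load structured data.

```python
import csv
import io
import json

# Hypothetical sample data, invented for illustration only.
CSV_TEXT = "town,population\nTown A,30000\nTown B,25000\n"
JSON_TEXT = (
    '[{"town": "Town A", "population": 30000},'
    ' {"town": "Town B", "population": 25000}]'
)


def read_csv_rows(text):
    """Parse CSV text into a list of dictionaries, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def read_json_rows(text):
    """Parse JSON text into Python lists and dictionaries."""
    return json.loads(text)


csv_rows = read_csv_rows(CSV_TEXT)
json_rows = read_json_rows(JSON_TEXT)

# CSV values arrive as strings, so convert them before doing arithmetic.
total = sum(int(row["population"]) for row in csv_rows)
print(total)
```

Because each value sits under a named column or key, a program can find it without any guesswork. Note that JSON keeps numbers as numbers, while CSV stores everything as text.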

Unstructured data

Unstructured data is less organized than structured data. This can include groups of images (file extensions like JPG, PNG, or TIFF) or plain text files (file extensions like TXT). For example, if you have a JPG file of a data table, it may not be machine-readable, which limits how well it can be used for data analysis and visualization. Many computational methods require data to be put into a structured format before analysis can begin, but some data analysis tools will do this step for you.
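
As a rough illustration of putting unstructured data into a structured format, the Python sketch below turns a snippet of plain text into a small table of word counts. The review text is made up for this example, and the `word_counts` function is a hypothetical helper; real projects often need more careful text cleaning than this.

```python
import re
from collections import Counter

# Hypothetical snippet of unstructured text, invented for illustration.
REVIEW_TEXT = "Great service. Great food, slow service."


def word_counts(text):
    """Lowercase the text, split it into words, and count each word."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)


counts = word_counts(REVIEW_TEXT)

# Sort by count (highest first), then alphabetically, to get tabular rows.
table = sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))

for word, n in table:
    print(f"{word}\t{n}")
```

The result is a two-column table (word, count) that could be saved as a TSV file and opened in a spreadsheet or analysis tool.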