ENGL413: Document Design (Topics in Professional Writing)

How do you know if data is reliable and accurate?

When starting a search for data, there are a few key concepts you must keep in mind.

  • Data is ALWAYS created or influenced by people. We tend to think of data as objective, especially if it's been created by a computer (like a timestamp on a file) or is numeric. However, even if it's been created by a computer, a person still decided what information to collect, how to display, store, and share that information, or how to modify or clean the data. Knowing this will help you choose reliable and accurate data. 
  • Data is a representation of the real world. Think of data as a "report" or "summary", and not the thing itself. For example, demographic data for one zip code tells you characteristics of the people who live there, such as age, race, or occupation. The data represents the people, but it should not be used as a proxy or 1:1 representation for the people themselves. As such, data can only ever give one part of the story. Other information is necessary to provide a full picture and will strengthen your narrative.
  • Pay attention to where your data comes from. Who created it? Thinking of the first point, why did they create it and what factors may have affected the data you're seeing? Generally, data collected and provided by governmental or non-profit organizations and scholars can be trusted to be accurate. If you're unsure, look into the organization -- do you trust them to give an unbiased view and provide accurate information?

Finding Data

Once you start looking for data, think about what kind of data you need. Do you need summary statistics (i.e., not raw data)? Do you need raw numeric data to run your own analysis? Do you need plain text to create a word cloud or perform other text mining methods?
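If plain text is what you need, it may help to see what a basic text mining step looks like. Word clouds are generally built from word frequencies, so the short Python sketch below counts how often each word appears. The sample sentence and the variable names are made up for illustration, and the very simple word-splitting rule is an assumption, not a recommended method.

```python
import re
from collections import Counter

# Hypothetical sample text; in practice this would be the plain text you collected.
sample_text = "Data is created by people. People decide what data to collect."

# Lowercase the text and pull out runs of letters (a deliberately simple rule).
words = re.findall(r"[a-z']+", sample_text.lower())

# Count how many times each word occurs.
counts = Counter(words)

# Show the three most frequent words and their counts.
print(counts.most_common(3))
```

A word cloud tool would typically draw the words with the highest counts in the largest type.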

It's useful to first think of what information you need (I need information on climate change funding) and then think of what data you need (I need a spreadsheet that breaks down each country's expenditures on climate change initiatives by year). 

A few sources for data are highlighted on this guide. If you can't find data on your topic in these sources, check whether a research guide for your subject includes a Data and Statistics section, or reach out to a librarian for help.

Important note: Whenever possible, download your data in Comma-Separated Values (CSV) format (extension .csv). Some tools accept Excel formats (extension .xls or .xlsx), but all will accept CSV files. Excel formats are designed around the Excel software program, whereas CSV is a plain-text format that is widely recognized and used for tabular data.
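Because a CSV file is plain text, many tools and programming languages can read it without special software. As a minimal sketch, the Python example below reads CSV text using the standard csv module. The column names and values are hypothetical, loosely based on the climate change expenditures example above, and the text is embedded in the code only so the example runs without a downloaded file.

```python
import csv
import io

# Hypothetical sample data standing in for the contents of a downloaded .csv file.
SAMPLE_CSV = """country,year,expenditure_usd_millions
Freedonia,2020,120
Freedonia,2021,135
Sylvania,2020,80
"""

def read_rows(text):
    """Parse CSV text into a list of dictionaries keyed by column header."""
    return list(csv.DictReader(io.StringIO(text)))

rows = read_rows(SAMPLE_CSV)

# Each row is a dictionary; every value is read as text, including numbers.
for row in rows:
    print(row["country"], row["year"], row["expenditure_usd_millions"])
```

To read an actual file, you could pass an opened file to csv.DictReader in place of the io.StringIO object. Note that numeric columns come in as text and need to be converted before any calculations.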

What kind of numeric data are you looking at?

For numeric and statistical data, know what you're looking at. If you see averages, trends, or medians, you're likely looking at data that's already been analyzed (someone has run statistical processes on the raw data). Depending on your project, summary statistics may be all you need, or you may need the raw data so you can run your own analysis.
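To make the difference concrete, the Python sketch below starts from a small set of raw values and produces summary statistics from them. The numbers are invented for illustration (imagine the ages of seven survey respondents), and the variable names are hypothetical.

```python
import statistics

# Hypothetical raw data: one value per observation.
raw_ages = [23, 35, 35, 41, 58, 62, 19]

# Summary statistics: a few numbers that describe the raw data as a whole.
summary = {
    "count": len(raw_ages),
    "mean": statistics.mean(raw_ages),
    "median": statistics.median(raw_ages),
}

print(summary)
```

The process only works in one direction. With the raw values you can calculate any summary you like, but a published mean or median alone cannot be turned back into the individual values.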