Cleaning Data with OpenRefine: Data Mining and Discovery

A guide to using the OpenRefine program to organize messy datasets

Data Mining and Discovery

The first step when working with data is to get it into a state where you can easily see what kind of data you have. These are some of the first actions you may take with a new dataset:


  • Reordering columns for better contextualization (for example, keeping names and locations together)

  • Sorting data (alphabetically or numerically)

  • Faceting data, which lists the unique terms in a selected column (useful for noticing irregularities or typos)
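
In OpenRefine these three actions are done through the program's interface. To make it concrete what each one does to a table, here is a minimal Python sketch using only the standard library. It is an illustration, not how OpenRefine works internally: the sample rows, the column names, and the helper functions `reorder_columns`, `sort_rows`, and `facet` are all invented for this example.

```python
from collections import Counter

# Hypothetical sample data: the columns and values are made up for illustration.
rows = [
    {"id": 3, "city": "Boston", "name": "Ana", "state": "MA"},
    {"id": 1, "city": "boston", "name": "Raj", "state": "MA"},
    {"id": 2, "city": "Austin", "name": "Lee", "state": "TX"},
    {"id": 4, "city": "Boston", "name": "Kim", "state": "MA"},
]


def reorder_columns(rows, order):
    """Return copies of the rows with columns in the given order."""
    return [{col: row[col] for col in order} for row in rows]


def sort_rows(rows, column):
    """Return the rows sorted by one column (alphabetically or numerically)."""
    return sorted(rows, key=lambda row: row[column])


def facet(rows, column):
    """Count how often each unique value appears in a column."""
    return Counter(row[column] for row in rows)


# Keep the location columns (city, state) next to each other.
reordered = reorder_columns(rows, ["id", "name", "city", "state"])

# Sort numerically by id.
sorted_rows = sort_rows(reordered, "id")

# Facet the city column to list its unique values and their counts.
city_facet = facet(sorted_rows, "city")

for value, count in sorted(city_facet.items()):
    print(value, count)
```

In the facet output, "Boston" and "boston" appear as two separate values. That is the kind of irregularity a facet makes easy to notice.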