Cleaning Data with OpenRefine: Tips and Further Resources

A guide to using the OpenRefine program to organize messy datasets

Tips and Best Practices

  • Name your projects clearly and save frequently. By default, OpenRefine autosaves your work every 5 minutes.

  • Document your cleaning steps: the operation history can be extracted as JSON from the Undo/Redo tab and applied to another project.

  • Use facets to quickly spot inconsistencies or outliers.

  • Use the Undo/Redo panel liberally: every recorded step can be rolled back.

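  • Use GREL (General Refine Expression Language, OpenRefine’s expression language) to build powerful transformations. For example, value.trim() removes leading and trailing whitespace from each cell.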
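
To make the facet and GREL tips above more concrete, here is a minimal Python sketch of the same idea outside OpenRefine. It is not OpenRefine's own code: the sample values and the helper names text_facet and clean are invented for illustration, and Python's str.title() only approximates what the GREL expression value.trim().toTitlecase() would do inside the program.

```python
from collections import Counter

# Hypothetical messy column values, invented for this illustration.
publishers = ["Elsevier", "elsevier ", " Elsevier", "Springer", "SPRINGER", "Wiley"]


def text_facet(values):
    """Count each distinct value, similar in spirit to a text facet sorted by count."""
    return Counter(values).most_common()


def clean(value):
    """Rough Python stand-in for the GREL expression value.trim().toTitlecase()."""
    return value.strip().title()


# Facet the raw column: every spelling variant shows up as its own entry.
before = text_facet(publishers)

# Apply the cleanup to every cell, then facet again to confirm the variants merged.
after = text_facet([clean(v) for v in publishers])

print("Distinct values before cleaning:", len(before))
print("Distinct values after cleaning:", len(after))
for value, count in after:
    print(f"{value}: {count}")
```

Faceting before and after a transformation is a quick check that the cleanup did what you intended: the number of distinct values should drop, and no unexpected new values should appear.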

Tutorials and Learning Resources

Alternative and Additional Data Cleaning Tools

Datasets

Scholarly Publications & Analytics Librarian