Cleaning Data with OpenRefine: Sorting Data

A guide to using the OpenRefine program to organize messy datasets

Sorting One Column

Select the column you would like to sort. Click the down arrow next to the column header. On the dropdown menu, select Sort.

 

 

A window will pop up allowing you to select whether you are sorting text, numbers, dates, or booleans. You can also choose the order in which the data is sorted and where to place blank or error cells. Once you have chosen your options, click OK.
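The choice of data type matters because the same values can land in a different order depending on how they are compared. The short Python sketch below is only an illustration of that idea, not OpenRefine's own code; the cell values and the helper names `sort_as_text` and `sort_as_numbers` are made up for the example, and blanks are simply placed last.

```python
# Hypothetical cell values, all stored as text; "" stands for a blank cell.
cells = ["10", "9", "100", "", "25"]

def sort_as_text(values):
    """Compare values character by character and place blanks last."""
    non_blank = [v for v in values if v != ""]
    blanks = [v for v in values if v == ""]
    return sorted(non_blank) + blanks

def sort_as_numbers(values):
    """Compare values by their numeric value and place blanks last."""
    non_blank = [v for v in values if v != ""]
    blanks = [v for v in values if v == ""]
    return sorted(non_blank, key=float) + blanks

text_order = sort_as_text(cells)       # "100" sorts before "25" as text
number_order = sort_as_numbers(cells)  # 9 sorts before 10 as a number

print(text_order)
print(number_order)
```

If a column of numbers comes out in an unexpected order, sorting it as text instead of as numbers is a likely cause.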

 

 

The column will now be sorted according to your selections.

Sorting Multiple Columns

First, sort the primary column by following the steps for sorting a single column.

Next, open the Sort menu on the secondary column. In the popup window, make sure the box next to "Sort by this column alone" is deselected.

 

 

Click OK and the nested sort will be applied. Rows are ordered by the first column, and rows that share a value in the first column are then ordered by the second.
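As a rough illustration of what a nested sort does, the Python sketch below orders a small, made-up list of books by author and then by title within each author. It is an analogy rather than OpenRefine's implementation, and the field names `author` and `title` are assumptions chosen for the example.

```python
# Hypothetical rows; each dictionary stands for one row in the project.
books = [
    {"author": "Morrison", "title": "Sula"},
    {"author": "Austen", "title": "Persuasion"},
    {"author": "Morrison", "title": "Beloved"},
    {"author": "Austen", "title": "Emma"},
]

# The key is a pair: rows are compared by author first,
# and the title is only consulted when two authors are equal.
nested = sorted(books, key=lambda row: (row["author"], row["title"]))

titles = [row["title"] for row in nested]
print(titles)
```

Swapping the two entries in the key would sort by title first, which is why the order in which sorts are applied matters.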

Additional Information

Sorting does not permanently change the order of the data. To make the order permanent, navigate to the tab above the column headers, next to "5, 10, 25, 50 records." Clicking Sort there should show a dropdown menu:

  • Reorder rows permanently allows you to make the sort permanent
  • Remove sort reverts the data back to the original order

Unless you choose to reorder rows permanently, OpenRefine does not automatically save your sorting order upon export. 
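The difference between a temporary sort and a permanent reorder is loosely similar to the difference between a sorted copy and an in-place sort in a programming language. The Python sketch below shows that analogy with made-up values; it does not describe how OpenRefine stores data internally.

```python
# Hypothetical rows in their original order.
rows = ["pear", "apple", "fig"]

# Temporary: a sorted view is produced, but the stored order is untouched.
sorted_view = sorted(rows)
rows_after_view = list(rows)  # copy of the stored order, still the original

# Permanent: the stored order itself is rewritten.
rows.sort()

print(sorted_view)
print(rows_after_view)
print(rows)
```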

 

 

This same dropdown menu will show the current sorting method(s). If you have applied more than one, they take effect in the order they were applied (e.g., sorting a list of books by Author, then by Title).