Cleaning Data with OpenRefine: Faceting Data
Faceting Data
Faceting allows you to quickly view the unique values in a column, edit those values, and narrow your display to only the rows that match a selected value.
To display facets:
- Go to the column you would like to analyze and click the down arrow button on the column header.
- On the drop-down menu, select Facet, then choose a facet type:
  - Text - Choose if the column contains text.
  - Numeric - Choose if the column contains numbers, such as currency or prices.
  - Timeline - Choose if the column contains dates.
  - Scatterplot - Choose if the column contains numerical data that can be plotted on a scatterplot. OpenRefine will generate a scatterplot in a new window.
- A facet window will appear in the pane to the left of the grid view. If you don't see it, switch the tab at the top of that pane from Undo / Redo to Facet / Filter.
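To make the idea concrete, a text facet can be thought of as a count of each unique value in a column, plus a filter on the value you select. The Python sketch below illustrates that idea only; it is not how OpenRefine is implemented. The sample values and the function names `text_facet` and `filter_rows` are made up for this illustration.

```python
from collections import Counter

# Hypothetical sample values standing in for one column of a dataset.
industries = [
    "Education", "Healthcare", "education",
    "Healthcare", "Educaton", "Finance",
]

def text_facet(values):
    """Return (value, count) pairs sorted by value, similar to a facet listing."""
    counts = Counter(values)
    return sorted(counts.items())

def filter_rows(values, choice):
    """Return the indexes of the rows whose value equals the selected choice."""
    return [i for i, v in enumerate(values) if v == choice]

facet = text_facet(industries)
for value, count in facet:
    print(f"{value}: {count}")

# Selecting a facet value narrows the view to the matching rows.
healthcare_rows = filter_rows(industries, "Healthcare")
```

Note that "Education", "education", and "Educaton" are counted as three separate values. This is why a text facet is a quick way to spot inconsistent entries.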
For this example, we are faceting the column "What industry do you work in?" to get a general sense of how many respondents fall into each category.
We click on the column header arrow, then select Facet > Text facet.
We can see there are 14 unique categories in the data. However, some of them contain typos that will need to be corrected. You can correct a value directly from this view by hovering over it and clicking the edit link that appears. If you are working with a large dataset, it is best practice to use the Cluster function to group and merge variants like these instead of editing them one at a time. The Cluster button is also accessible at the top of the side pane.
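As a rough illustration of what clustering does, the Python sketch below groups values that reduce to the same simplified key. It is loosely modeled on the "fingerprint" key collision method that OpenRefine documents, but it is a simplified approximation and not OpenRefine's actual code. The sample values and the function names `fingerprint` and `cluster` are assumptions made for this example.

```python
import string
import unicodedata
from collections import defaultdict

def fingerprint(value):
    """Build a simplified key: trim, lowercase, strip accents and punctuation,
    then sort and de-duplicate the whitespace-separated tokens."""
    s = value.strip().lower()
    # Decompose accented characters and drop anything that is not ASCII.
    s = unicodedata.normalize("NFKD", s).encode("ascii", "ignore").decode("ascii")
    # Remove ASCII punctuation.
    s = s.translate(str.maketrans("", "", string.punctuation))
    tokens = sorted(set(s.split()))
    return " ".join(tokens)

def cluster(values):
    """Group distinct values that share a key; keep groups with 2+ members."""
    groups = defaultdict(set)
    for v in values:
        groups[fingerprint(v)].add(v)
    return [sorted(g) for g in groups.values() if len(g) > 1]

# Hypothetical survey answers with inconsistent formatting.
answers = ["Health Care", "health care ", "Care, Health",
           "Education", "education.", "Finance"]

clusters = cluster(answers)
for group in clusters:
    print(group)
```

A key-based approach like this catches differences in case, spacing, punctuation, and word order, but not misspellings such as a dropped letter. OpenRefine's Cluster dialog also offers nearest neighbor methods for those cases.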