Cleaning Data with OpenRefine: Splitting Column Values
A guide to using the OpenRefine program to organize messy datasets
Splitting Column Values
You may find data that would be better split into separate columns to aid in organization.
Click on the down arrow next to the column header of the column you would like to split. On the dropdown menu, go to Edit column > Split into several columns.
In the popup box, choose how you want the column to be split.
- Separator - Cells containing multiple values are typically separated by a space ( ), a dash (-), a comma (,), or some other punctuation mark. OpenRefine will split the column at each occurrence of the chosen separator. If you only want a certain number of columns, type the maximum number of columns in the next box.
- Field Lengths - Allows you to enter a comma-separated list of integers, each giving the width in characters of one new column. This is most useful for splitting up fixed-width values such as dates.
- Example: A column contains dates in the YYYY/MM/DD format. To separate the year, month, and day, in the box under Field Lengths, type "4, 1, 2, 1, 2"
- 4 = YYYY
- 1 = /
- 2 = MM
- 1 = /
- 2 = DD
- OpenRefine will split the column into five columns holding the year, a slash, the month, a slash, and the day. The two slash columns can then be deleted.
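To make the two split modes concrete, here is a minimal Python sketch of the logic they describe. It is an illustration only, not OpenRefine's own code: the function names `split_by_separator` and `split_by_field_lengths` are hypothetical, and the handling of leftover text (kept in the last piece when a column maximum is set, dropped when the field lengths run out) is an assumption that may differ from what OpenRefine does.

```python
def split_by_separator(value, separator, max_columns=None):
    """Split a cell value at each occurrence of a literal separator.

    Assumption: when max_columns is given, any remaining text stays
    in the last piece. OpenRefine's behavior may differ.
    """
    if max_columns is None:
        return value.split(separator)
    # str.split takes a maximum number of splits,
    # so N columns corresponds to N - 1 splits.
    return value.split(separator, max_columns - 1)


def split_by_field_lengths(value, lengths):
    """Cut a cell value into consecutive fixed-width pieces.

    Assumption: characters beyond the listed lengths are dropped.
    """
    pieces = []
    start = 0
    for length in lengths:
        pieces.append(value[start:start + length])
        start += length
    return pieces


date = "2023/04/15"

# Separator mode: the slashes are consumed, leaving three pieces.
print(split_by_separator(date, "/"))

# Field-length mode with 4, 1, 2, 1, 2: the slashes become their own pieces.
print(split_by_field_lengths(date, [4, 1, 2, 1, 2]))
```

For a date like this, splitting on the "/" separator gives the three parts directly, while the field-length approach produces five pieces and leaves the two slash pieces to be removed afterward. Field lengths are the better fit when the values have fixed widths but no reliable separator.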