OpenRefine capitalize first letter of all entries within a cell on a TSV

How do I capitalize the first letter of the first word of each entry within a cell in a TSV? For example, is there a function that changes 'Dogs||cats||fish' to 'Dogs||Cats||Fish' and does the same for every cell in the TSV?

b2m On

I am assuming your dataset looks like this:

All   Column 1            Column 2
      Dogs||cats||fish    Apples||oranges
      Elephants||tigers   Peaches||bananas

Usually you would just apply toTitlecase() to each cell. That does not work on your data, because the entries are separated by || rather than by whitespace, so toTitlecase() treats each cell as a single word.

So you could either replace your separator || with whitespace, apply toTitlecase(), and then replace the whitespace with your separator again:

value.replace("||", "   ").toTitlecase().replace("   ", "||")

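Note: I am explicitly replacing || with three spaces so that the reverse replacement does not touch the single spaces inside multi-word entries like Great white shark. Keep in mind that toTitlecase() capitalizes every whitespace-separated word and lowercases the remaining letters, so that entry becomes Great White Shark.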

Or you could split on your separator, apply toTitlecase() to each part, and join the results back together:

forEach(value.split("||"), v, v.toTitlecase()).join("||")

To apply this to every cell in the whole dataset, you can use the "All => Transform" dialog. "All" is the pseudo column on the far left, where you can also star or flag individual rows.
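
If only the first letter of each entry should change (so that Great white shark stays as it is), the split, transform, and join logic can be sketched in plain Python. This is a minimal illustration rather than a tested OpenRefine recipe; the function name capitalize_entries and its sep parameter are made up for this example.

```python
def capitalize_entries(cell, sep="||"):
    """Capitalize the first character of every sep-delimited entry in cell."""
    entries = cell.split(sep)
    # Upper-case only the first character and leave the rest of each entry
    # untouched; slicing keeps empty entries from raising an IndexError.
    fixed = [entry[:1].upper() + entry[1:] for entry in entries]
    return sep.join(fixed)


print(capitalize_entries("Dogs||cats||fish"))           # Dogs||Cats||Fish
print(capitalize_entries("Great white shark||tigers"))  # Great white shark||Tigers
```

OpenRefine's expression editor also offers Python / Jython as a language. There the cell content arrives as value and the expression has to end with a return statement, so the same idea would presumably shrink to a single return line built around value.split("||"); I have not verified that variant inside OpenRefine.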