Annotate.io

Cleaning up text is painful. Send us examples of what you have and what you want, and we'll learn how to transform the original into the cleaned version. Alpha to follow in early 2015:

Sign up for alpha notification

You provide a small set of examples and we build a converter that you can use online to clean up large amounts of data robustly, without building your own natural language processing and machine learning pipeline. You'll save time (no need to fuss with regular expressions!), and we can handle one-off or long-running conversion jobs. You get to focus on pulling value out of your data rather than investing weeks cleaning it up.

Normalise job-advert salary fields:

Job adverts are often filled in by humans, so fields such as salary are written in a variety of forms. We can learn the mapping required to normalise your examples into a consistent format:

From                        To
"To 53k w/benefits"         "53000"
"30000 OTE plus bonus"      "30000"
"40k-50k"                   "40000,50000"

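For comparison, here is a minimal hand-written baseline for the three salary rows above. It is only a sketch of the kind of rule-writing a learned converter is meant to replace, not how Annotate.io works; the function name `normalise_salary` and the regular expression are assumptions made for illustration.

```python
import re

# Hypothetical hand-written rule, NOT Annotate.io's learned converter.
# Matches a number optionally followed by "k" (thousands), e.g. "53k" or "30000".
_AMOUNT = re.compile(r"(\d+(?:\.\d+)?)(k\b)?", re.IGNORECASE)


def normalise_salary(text: str) -> str:
    """Return every amount found in `text` as whole numbers joined by commas."""
    amounts = []
    for number, thousands in _AMOUNT.findall(text):
        value = float(number) * (1000 if thousands else 1)
        amounts.append(str(int(value)))
    return ",".join(amounts)


if __name__ == "__main__":
    for raw in ["To 53k w/benefits", "30000 OTE plus bonus", "40k-50k"]:
        print(repr(raw), "->", repr(normalise_salary(raw)))
```

A rule like this covers the three examples but breaks on forms it was not written for (for instance "fifty thousand" or "£40,000"), which is why the examples-in, converter-out approach is attractive.
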
Convert eCommerce results into a consistent form:

eCommerce pages are often scraped from a variety of database sources, and each product is written using a variety of units or common synonyms. We transform these into an easily understood format:

From                                        To
32 inch widescreen                          32"
32-in. TV                                   32"
Thirty-three inch beautiful widescreen      33"
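
To make the target format concrete, the screen-size rows above can be reproduced with a small hand-written rule. This is a sketch under stated assumptions, not Annotate.io's method: the name `normalise_size`, the word tables, and the list of inch synonyms are all hypothetical and cover only the forms shown in the table.

```python
import re
from typing import Optional

# Hypothetical lookup tables for spelled-out numbers up to ninety-nine.
UNITS = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
         "six": 6, "seven": 7, "eight": 8, "nine": 9}
TENS = {"twenty": 20, "thirty": 30, "forty": 40, "fifty": 50,
        "sixty": 60, "seventy": 70, "eighty": 80, "ninety": 90}
WORD_VALUES = {**UNITS, **TENS}

# Assumed synonyms for the unit; a real feed would have more.
INCH_TOKENS = {"inch", "inches", "in"}


def normalise_size(text: str) -> Optional[str]:
    """Return the screen size as e.g. '32"', or None if no size is found."""
    tokens = re.findall(r"[a-z]+|\d+", text.lower())
    for i, token in enumerate(tokens):
        if token not in INCH_TOKENS or i == 0:
            continue
        previous = tokens[i - 1]
        if previous.isdigit():
            return f'{int(previous)}"'
        # Sum spelled-out parts walking backwards, e.g. "thirty" + "three".
        total, j = 0, i - 1
        while j >= 0 and tokens[j] in WORD_VALUES:
            total += WORD_VALUES[tokens[j]]
            j -= 1
        if total:
            return f'{total}"'
    return None


if __name__ == "__main__":
    for raw in ["32 inch widescreen", "32-in. TV",
                "Thirty-three inch beautiful widescreen"]:
        print(repr(raw), "->", normalise_size(raw))
```

Each new synonym or unit means another hand-edited table entry, which is the maintenance cost that learning the mapping from examples is meant to avoid.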

Alpha coming early in 2015:

Sign up for alpha notification