It’s been a few weeks since my last post, so there’s a fair bit to catch up on. I’ll start with what’s been going on with VillageRatings.
I spent a while grappling with adding a large database of UK placenames and locations to VillageRatings. This turned out to be quite involved, mainly because the dataset I used needed a lot of cleaning up and tweaking. Issues included converting between grid references and latitude and longitude, and spreadsheet software creaking under the strain of a database with tens of thousands of entries. The biggest challenge, though, was dealing with multiple entries for the same place with slight variations in spelling. Unfortunately, map locations were not unique to a place, so removing the duplicates was not easy to automate. Ultimately it taught me the importance of starting with good-quality input data when you’re dealing with large datasets.
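For anyone facing the same kind of duplicate problem, here is a minimal Python sketch of one way to flag likely duplicates for manual review. It is an illustration, not the process used for VillageRatings. It assumes each entry is a name plus an easting and northing in metres, and the function names, thresholds, and sample entries are all made up for the example. The idea is to flag pairs of entries that are both close together on the map and nearly identical in spelling, so a person only has to check the shortlist.

```python
from difflib import SequenceMatcher
from math import hypot


def normalise(name):
    # Lower-case and drop everything except letters and digits, so that
    # hyphens, spaces and full stops do not count as spelling differences.
    return "".join(ch for ch in name.lower() if ch.isalnum())


def likely_duplicates(places, max_distance_m=1000, min_similarity=0.85):
    """Return index pairs (i, j) that look like the same place.

    places is a list of (name, easting, northing) tuples, with the
    coordinates in metres. Both thresholds are arbitrary starting points
    (assumptions), not tuned values.
    """
    flagged = []
    for i in range(len(places)):
        name_a, east_a, north_a = places[i]
        for j in range(i + 1, len(places)):
            name_b, east_b, north_b = places[j]
            # Skip pairs that are too far apart to be the same place.
            if hypot(east_a - east_b, north_a - north_b) > max_distance_m:
                continue
            similarity = SequenceMatcher(
                None, normalise(name_a), normalise(name_b)
            ).ratio()
            if similarity >= min_similarity:
                flagged.append((i, j))
    return flagged


# Made-up sample entries, purely for illustration.
places = [
    ("Stoke-on-Wold", 435800, 316700),
    ("Stoke on Wold", 435900, 316650),
    ("Stoke Magna", 456300, 290500),
    ("Newton", 435850, 316700),
]
pairs = likely_duplicates(places)
print(pairs)
```

Because nearby entries with different names are left alone, this copes with locations that are shared by more than one place. The double loop compares every pair, so for tens of thousands of entries it would be worth bucketing entries by grid square first and only comparing within neighbouring squares.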