When you’re building an application for travellers, you come up against the problem of getting enough data into the system for it to be useful. There are 455 cities with a population of more than one million people, 1054 with more than 500,000 and 2851 with more than 150,000.
Or to look at the problem a different way - the most visited cities in the world…
Most visited cities
This table at Wikipedia shows the top 10 most visited cities to be Paris, London, Bangkok, Singapore, Kuala Lumpur, New York City, Dubai, Istanbul, Hong Kong and Shanghai.
Crawling data for these cities
I’m not sure what the takeaway from this data is. Seeding 2851 cities with data seems very doable in a week of crawling, and seeding the top 10 visited cities with deep data is also achievable. It’d be interesting to compare the list of the top 2851 cities with the hometowns of the users of weheartplaces.
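To put a rough number on "a week of crawling", here is a back-of-the-envelope sketch in Python. The city counts come from the figures above; the one-request-per-second crawl rate and the function names are my own assumptions for illustration, not measurements from a real crawler.

```python
# Back-of-the-envelope crawl budget for one week of crawling.
# The request rate passed in below is an assumed, polite crawl rate,
# not a measured figure.

SECONDS_PER_WEEK = 7 * 24 * 60 * 60


def seconds_per_city(num_cities, crawl_seconds=SECONDS_PER_WEEK):
    """Wall-clock seconds available per city if the week is split evenly."""
    return crawl_seconds / num_cities


def requests_per_city(num_cities, requests_per_second,
                      crawl_seconds=SECONDS_PER_WEEK):
    """Requests available per city at a given (assumed) crawl rate."""
    return crawl_seconds * requests_per_second / num_cities


if __name__ == "__main__":
    # City counts quoted in the post.
    tiers = [
        ("top 10 most visited", 10),
        ("over 1,000,000 people", 455),
        ("over 500,000 people", 1054),
        ("over 150,000 people", 2851),
    ]
    for label, count in tiers:
        budget = requests_per_city(count, requests_per_second=1.0)
        print(f"{label}: about {budget:.0f} requests per city")
```

At one request per second, the 2851-city tier gets a little over 200 requests per city, which is enough for shallow seeding, while the top 10 cities get tens of thousands each, which is where the "deep data" would come from.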
The real takeaway
It’s better to encourage users to add content to ‘empty’ cities than to expect every city to have a persuasive dataset when you launch your travel site.