Here’s how we created our sample datasets:
- We searched Google for postal codes and major cities in each country.
- We used these to determine the geolocation (latitude and longitude) of each location.
- We generated new coordinates at distances of 100 to 4,000 meters from each original location.
- We reverse-geocoded these new coordinates to verify that they corresponded to actual addresses.
- We continued generating and reverse-geocoding coordinates until we had at least 10,000 unique, real addresses with geolocations for each country.
- We searched Google for popular baby names (by gender) in recent years for each country.
- We also searched for common surnames in each country.
- We created a combined list of first and last names (by gender) and used these to generate email addresses at the @hotmail.com, @gmail.com, and @outlook.com domains.
- We randomly generated birthdates for ages between 21 and 65 (working population).
- We searched for mobile phone number prefixes in each country and used these to generate random phone numbers with at least 13 digits.
- We used information about the average (median) income and per-person family income in each country to generate [YearlyIncomeInUSD], varying the values within a set margin of error.
- We selected values between 3% and 12% of [YearlyIncomeInUSD] to generate [CustomerLifetimeValueInUSD].
- We searched for credit card number specifications for MasterCard, Maestro, and Visa, and used these to generate random credit card numbers that pass the Luhn check. We also generated random CVV numbers and expiry dates between one month and 10 years from now.
- We generated random IP addresses by joining four random numbers, each no greater than 254, with dots.
- We randomly generated values for [NumberOfOrders].
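
To give a feel for the coordinate step in the list above, here is a minimal Python sketch of one way to pick a point 100 to 4,000 meters from a known location. It is not our production code: the function names, the spherical Earth radius, and the choice of a uniformly random distance and bearing are all assumptions made for illustration. The reverse-geocoding check is left out because it depends on whichever geocoding service you use.

```python
import math
import random

EARTH_RADIUS_M = 6_371_000  # mean Earth radius (assumption: spherical Earth)


def random_nearby_point(lat, lon, min_m=100.0, max_m=4000.0, rng=random):
    """Return (lat, lon) in degrees at a random bearing and distance from the origin."""
    distance = rng.uniform(min_m, max_m)       # meters from the original location
    bearing = rng.uniform(0.0, 2.0 * math.pi)  # direction of travel, in radians
    delta = distance / EARTH_RADIUS_M          # angular distance in radians

    lat1, lon1 = math.radians(lat), math.radians(lon)
    # Standard great-circle "destination point" formula
    lat2 = math.asin(
        math.sin(lat1) * math.cos(delta)
        + math.cos(lat1) * math.sin(delta) * math.cos(bearing)
    )
    lon2 = lon1 + math.atan2(
        math.sin(bearing) * math.sin(delta) * math.cos(lat1),
        math.cos(delta) - math.sin(lat1) * math.sin(lat2),
    )
    # Normalize longitude to the range [-180, 180)
    lon2_deg = (math.degrees(lon2) + 540.0) % 360.0 - 180.0
    return math.degrees(lat2), lon2_deg


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))
```

Each candidate returned by `random_nearby_point` would then be reverse-geocoded and kept only if it resolves to a real, not yet seen address. Note that drawing the distance uniformly places more points near the center than a uniform-by-area draw would, which may or may not matter for your dataset.
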
While it may be a lot of work, this process should allow you to create a high-quality sample dataset with real addresses in a few weeks’ time. Alternatively, you can simply purchase one of our sample datasets and receive instant delivery via email along with a receipt.
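
If you do build your own, the credit card step is a good one to automate first. The sketch below shows one possible way to produce numbers that pass the Luhn check: generate random digits after a prefix, then append the computed check digit. The function names are hypothetical, and the default prefix `"4"` with a length of 16 is an assumption standing in for a Visa-style number; other brands use different prefixes and lengths, so look those up in the relevant specifications.

```python
import random


def luhn_check_digit(partial: str) -> str:
    """Compute the Luhn check digit for a digit string that lacks its final digit."""
    total = 0
    # Walk right to left; the rightmost digit of the partial number is doubled first
    for i, ch in enumerate(reversed(partial)):
        d = int(ch)
        if i % 2 == 0:
            d *= 2
            if d > 9:
                d -= 9  # same as summing the two digits of the doubled value
        total += d
    return str((10 - total % 10) % 10)


def is_luhn_valid(number: str) -> bool:
    """Return True if the full digit string passes the Luhn check."""
    total = 0
    # Walk right to left; every second digit (starting left of the check digit) is doubled
    for i, ch in enumerate(reversed(number)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def random_card_number(prefix: str = "4", length: int = 16, rng=random) -> str:
    """Build a random number with the given prefix and a valid Luhn check digit."""
    # Assumption: prefix and length are supplied by the caller per card brand
    filler = "".join(str(rng.randrange(10)) for _ in range(length - len(prefix) - 1))
    body = prefix + filler
    return body + luhn_check_digit(body)
```

Calling `random_card_number()` returns a 16-digit string for which `is_luhn_valid` is true. Such numbers only look well-formed; they are not tied to any real account.
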