Masking location data before sharing it with 3rd parties – case study

The customer is a fund that owns multiple apartments for short-term rental in Poland. The fund is actively investing in new locations and collects data on apartment and hotel pricing in local areas. Data collection is relatively expensive, and some addresses point to individuals, so there is a risk of a GDPR complaint. The data must be shared with consultants, whose role is to identify prospective locations for new investments. The customer could not be sure that those consultancies would not pass the data to competitors, and did not want to hand over real addresses with real pricing. We proposed data masking as a trade-off solution for two reasons:

  • consultants can still use the masked data to find prospective locations, because it has almost the same spatial and statistical properties as the raw data while presenting simulated locations and pricing
  • if the data leaks, no real addresses are compromised

Example

We start with an initial set of coordinates and the daily short-term rental price at each location.

Using the Camouflage API for obfuscation, we can transform the addresses into a statistically and spatially similar dataset that is not the same as the original.
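To make the idea concrete, here is a minimal sketch of one common masking approach: each point is moved by a random distance in a random direction, and each price gets a small random adjustment. This is an illustration only and is not how the Camouflage API works internally. The function names, the displacement radii, the noise level, and the sample records are all assumptions made up for this example.

```python
import math
import random

EARTH_RADIUS_M = 6_371_000.0


def mask_point(lat, lon, r_min_m, r_max_m, rng):
    """Move a point by a random distance in [r_min_m, r_max_m] along a random bearing."""
    distance = rng.uniform(r_min_m, r_max_m)
    bearing = rng.uniform(0.0, 2.0 * math.pi)
    # Small-offset approximation: convert a metric offset to degrees.
    d_lat = (distance * math.cos(bearing)) / EARTH_RADIUS_M
    d_lon = (distance * math.sin(bearing)) / (
        EARTH_RADIUS_M * math.cos(math.radians(lat))
    )
    return lat + math.degrees(d_lat), lon + math.degrees(d_lon)


def mask_records(records, r_min_m=100.0, r_max_m=500.0, price_noise=0.05, seed=None):
    """Return masked copies of (lat, lon, price) records.

    Hypothetical defaults: 100-500 m displacement, prices changed by up to 5%.
    """
    rng = random.Random(seed)
    masked = []
    for lat, lon, price in records:
        new_lat, new_lon = mask_point(lat, lon, r_min_m, r_max_m, rng)
        factor = 1.0 + rng.uniform(-price_noise, price_noise)
        masked.append((new_lat, new_lon, round(price * factor, 2)))
    return masked


# Made-up sample records (lat, lon, daily price in PLN), not the customer's data.
sample = [
    (52.2297, 21.0122, 320.0),
    (50.0647, 19.9450, 280.0),
    (54.3520, 18.6466, 350.0),
]
masked = mask_records(sample, seed=42)
for original, simulated in zip(sample, masked):
    print(original, "->", simulated)
```

The minimum displacement keeps a masked point from landing on its original address, while the maximum keeps it in the same neighbourhood so that local price patterns survive. A production service would likely do more than this, for example preserving spatial correlation between neighbouring prices.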

The spatial distribution of locations is slightly different, which is enough to prevent identification of exact locations. The same applies to pricing: the spatial and statistical properties of prices stay close to the baseline dataset. You can run analytics and build AI models on this artificial dataset without worrying about leaking proprietary data.
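One simple way to check that masked prices stay close to the baseline is to compare summary statistics of the two datasets. The sketch below is a rough illustration under stated assumptions: the helper name `relative_gap` and both price lists are invented for the example, and comparing mean and standard deviation is only one of several reasonable checks.

```python
import statistics


def relative_gap(baseline, masked):
    """Relative difference in mean and sample standard deviation of two samples."""
    base_mean = statistics.mean(baseline)
    base_std = statistics.stdev(baseline)
    mean_gap = abs(statistics.mean(masked) - base_mean) / base_mean
    std_gap = abs(statistics.stdev(masked) - base_std) / base_std
    return mean_gap, std_gap


# Made-up daily prices in PLN, for illustration only.
baseline_prices = [320.0, 280.0, 350.0, 300.0, 410.0]
masked_prices = [326.0, 275.0, 344.0, 309.0, 402.0]

mean_gap, std_gap = relative_gap(baseline_prices, masked_prices)
print(f"mean differs by {mean_gap:.1%}, standard deviation by {std_gap:.1%}")
```

The same kind of comparison can be applied to spatial measures, such as the average distance between neighbouring locations, to confirm that the masked dataset keeps the overall layout of the original.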

Are you interested in the Obfuscation API for your products? Use the contact form on the main page to apply for access to the Camouflage API.

