The customer has a large transactional dataset that includes clients’ addresses. This kind of dataset is sensitive because it links physical locations to spending patterns. A data breach could expose the clients to theft and robbery, and would violate GDPR and similar regulations. Yet the customer wanted to share this data with an external company to build a location-based recommendation system. How did we solve this issue?
We cannot show the customer’s transactional dataset here, but the apartment price dataset provided by AirBnB is a good approximation. Let’s go through an example in Paris. We will mask the input data by:
- shuffling coordinates, so that client addresses are no longer accessible
- transforming prices (values) while preserving their ranges, so that pricing (transactions) in each neighborhood keeps its patterns
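
To make the two masking steps concrete, here is a minimal Python sketch. It is not the Camouflage API: it swaps in two simple stand-in techniques, a random displacement of each coordinate by at most a given distance and a random redraw of each price within its own price band. The band limits, the function names, and the 200 m shift used in the example are all assumptions made for illustration.

```python
import math
import random

# Hypothetical price bands (whole euros), similar to color ranges on a price map.
PRICE_BANDS = [(0, 50), (50, 100), (100, 200), (200, 500)]

METERS_PER_DEG_LAT = 111_320  # approximate length of one degree of latitude


def jitter_location(lat, lon, max_shift_m, rng):
    """Move a point by a random offset of at most max_shift_m meters."""
    # sqrt() makes the offsets uniform over the disc instead of bunching at the center.
    distance = max_shift_m * math.sqrt(rng.random())
    angle = rng.uniform(0, 2 * math.pi)
    dlat = distance * math.cos(angle) / METERS_PER_DEG_LAT
    # A degree of longitude shrinks with latitude (fine for Paris, not for the poles).
    dlon = distance * math.sin(angle) / (METERS_PER_DEG_LAT * math.cos(math.radians(lat)))
    return lat + dlat, lon + dlon


def mask_price(price, rng):
    """Replace a price with a random whole-euro value from the same band."""
    for low, high in PRICE_BANDS:
        if low <= price < high:
            return rng.randrange(low, high)
    return price  # outside every band: left unchanged in this sketch


rng = random.Random(42)  # fixed seed only to make the example reproducible
masked_location = jitter_location(48.8566, 2.3522, 200, rng)  # a point in central Paris
masked_price = mask_price(120, rng)  # stays within the 100-200 band
```

Because each price is redrawn only within its band, a map colored by band looks the same before and after masking, while the exact values and exact addresses no longer match the originals.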
Example
The initial data comes from AirBnB and shows nightly prices of stays in Paris. It is similar to the transactional data in the customer’s database.

Using the Camouflage obfuscation API, we can transform the addresses to obtain a dataset that is statistically and spatially similar to the original, but not identical to it.

The spatial distribution of locations is slightly different. The difference is clearly visible in the image below, where real (baseline) locations are depicted as stars and transformed locations as triangles, while the pricing color range stays the same. You may notice that the deviation between local price ranges is very low: stars and triangles that are near each other fall in the same range.

This is enough to prevent identification of the exact locations. The same applies to pricing: the spatial and statistical properties of the prices are close to those of the baseline dataset. (How close? That is another topic, but for the statistics geeks the important fact is that the local pricing gradient stays the same while the variance is slightly lower; the dataset goes through a low-pass filter.) You can perform analytics on this artificial dataset and build AI models without worrying about leaking proprietary data.
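
The low-pass effect mentioned above can be illustrated with a toy Python sketch. It assumes a plain k-nearest-neighbor mean as the smoothing step, which is a stand-in chosen for illustration and not necessarily what the Camouflage API does; the function name `smooth_prices`, the points, and the prices are all made up.

```python
import math
import statistics


def smooth_prices(points, prices, k=3):
    """Replace each price with the mean over its k nearest points (itself included)."""
    smoothed = []
    for px, py in points:
        # Indices of all points, ordered by distance to the current point.
        order = sorted(
            range(len(points)),
            key=lambda j: math.hypot(points[j][0] - px, points[j][1] - py),
        )
        nearest = order[:k]
        smoothed.append(sum(prices[j] for j in nearest) / len(nearest))
    return smoothed


# Five made-up listings along one street, with made-up nightly prices.
points = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0)]
prices = [100, 120, 90, 130, 110]
smoothed = smooth_prices(points, prices)

print("variance before:", statistics.pvariance(prices))
print("variance after: ", statistics.pvariance(smoothed))
```

On this toy data the smoothed prices stay inside the original minimum and maximum, and their variance is lower than that of the originals. A local mean always stays within the original range; the drop in variance is what one typically sees with this kind of smoothing, though it is not guaranteed for every dataset.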
Are you interested in the Obfuscation API for your products? Then use the contact form on the main page, where you can apply for Camouflage API access.
