Data Leakage

How third-party advertising tools can use your data to make your competition smarter.

By Content Manager David Vranicar

The side effects of advertising technology tools, or adtech, are well known. These tools might slow load times, or pose security risks, or not adapt to mobile, or display adverts to robots (and charge you for it).

Another byproduct of the adtech ecosystem might be even more important: data leakage, which is when advertising tools use data from a company’s website or app to optimise advertising for different, competing websites and apps.

Most websites have the same goal: Generate traffic and conversions.

Adtech tools share that goal. Adtech, however, is less concerned with where that conversion takes place.

If an adtech provider identifies someone who intends to buy a certain product from a certain company, then it can target that user with relevant offers – but from different providers.

When this happens, data from one site is quite literally being used to drive sales on a different site. That is data leakage. And in most cases, it is profit leakage.

This report will examine the phenomenon of data leakage. First, we will take a look at the types of third-party advertising tools that companies – ecommerce, publishers, telecom and more – use on their websites. Next, we will explore the sort of data that adtech tools collect and make available, as well as what happens to that data once it "leaks". And finally, we will see what it looks like when data from one website is used to help other ones.

What exactly is adtech?

Third-party advertising tools work by collecting user data on websites, and then leveraging that data to optimise advertising elsewhere on the web.

This way, if someone uses an online travel portal to plan a trip to Paris, then the adtech used by that travel portal can help place reminders, discounts and special offers all over the Internet encouraging that shopper to pull the trigger on a purchase. Adtech enables companies to communicate with interested users just about anywhere.

In years past, before adtech’s rapid evolution, digital advertising was context-centric, not user-centric.

This advertising space at the top of CNN’s Travel section would likely be occupied by companies that dealt with travel – regardless of who happened to be seeing the advertisement. Advertising was based on the assumption that someone on this page must have at least some interest in travel.

Assumptions, however, have been replaced by data. Now instead of snatching up space based on likely interests, companies on each side of the equation – the one placing the ad and the one with the ad space – have tools to match up ads with proven interest.

Adtech tools on Website A collect data about users on Website A. This data is then made available to ad networks used by websites throughout the web. When those networks identify Website A users at a different website, relevant ads are displayed.
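To make that flow concrete, here is a minimal sketch of a shared data pool. It is a toy model, not any real ad network's API: the class name AdNetwork, its methods and the cookie-style user ID are all assumptions made for illustration.

```python
class AdNetwork:
    """Toy model of an ad network: a shared interest pool plus an ad inventory."""

    def __init__(self):
        self.interest_pool = {}  # user_id -> set of interests observed on any site
        self.inventory = {}      # interest -> list of ad texts

    def record_visit(self, user_id, interest):
        # Hypothetical call made by the adtech tag embedded on Website A.
        self.interest_pool.setdefault(user_id, set()).add(interest)

    def register_ad(self, interest, ad_text):
        # An advertiser books an ad against an interest.
        self.inventory.setdefault(interest, []).append(ad_text)

    def ads_for(self, user_id):
        # Called when the same user loads a page on any other site in the network.
        matched = []
        for interest in sorted(self.interest_pool.get(user_id, set())):
            matched.extend(self.inventory.get(interest, []))
        return matched


network = AdNetwork()
network.register_ad("paris-trip", "Website A: flights to Paris")
network.record_visit("cookie-123", "paris-trip")   # user browses Website A
print(network.ads_for("cookie-123"))               # same user, different website
```

In this sketch, the site where the ad finally appears never has to know anything about the user. It only has to ask the network.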


As a result, travel companies can target travel-related users, not just travel-related websites. Everything is dynamic. That person interested in a trip to Paris might see travel ads whether they are at CNN Travel or Dictionary.com or BBC Food.

Adtech tools know what users have been looking at and, by extension, what they intend to buy.

How adtech works: an experiment

To dissect how these tools work, we created an imaginary online shopper.

He had three primary interests: basketball, pants and smartphones. Therefore, when adtech tools gathered data about his browsing habits, they would see someone who intended to buy basketball gear, pants and a new phone.

Before the shopping began, we created a sterile surfing environment. A brand new browser:

Without any history, cookies or extensions:


Next, it was time to browse; Adidas.de was the first stop.

Our shopper did not waste any time with jackets or women’s clothing. Instead, it was all about basketball.

That meant putting basketball items in the basket…

… and then putting different basketball items in the basket…

… and then putting different items in the basket yet again…

After a few visits, Adidas started to recommend products without our shopper having to make a single click. Indeed, recommendations highlighted what was seen last time…


… and even offered recommendations based on what appeared to be a raging obsession with basketball:


This is precisely the type of customised, user-centric experience that Adidas (or any other company) wants to create on its website. Customers love this personalisation. And in a different era, that would be as good as it gets: A company knows what you are looking for, and will make personalised offers when you are on its website.

With today’s advertising technology, however, the paradigm has shifted.

The personalisation at Adidas.de carried over to other domains in a way that would have been impossible before the rise of modern-day adtech. Those same Adidas.de offers started popping up all over the web.

As a matter of fact, because our shopper had displayed zero interest in anything besides Adidas basketball, there was nary an ad anywhere that didn’t have three stripes on it:

How do The Guardian and The New York Times and Dictionary.com all know that our shopper is interested in Adidas basketball? Well, user data that was collected by adtech tools on one website – Adidas.de – ends up in a data pool. This pool is available to ad networks, which are designed to place relevant ads for relevant users.

That is how basketball recommendations at Adidas’ website turn into Adidas basketball recommendations on The New York Times’ website.

And no doubt, this phenomenon is happening in every corner of the web.

Here, there, everywhere

Adtech is not unique to sports apparel. Other industries are saturated with these tools, as well.

Look, for example, at a few of Germany’s biggest telecommunications players. Each company sells mobile phones, each company offers data plans and each company has a fat list of adtech tools [1].

Adtech tools are highlighted; other plugins include tools for analytics, heatmapping and more.


You see the same adtech sharing in the world of high-end fashion. Whether it is a company that offers an array of original products:


Or a company that has a luxury specialty:


Or a company that does not make its own products, but is instead a hub for multiple fashion giants:


As you can see, the tools that fuel sports apparel advertising also fuel telecom advertising and high-end fashion advertising.

After scouring Adidas.de’s basketball store, we had our shopper look for a phone – in particular, an iPhone and a generous monthly data plan from a specific telecom provider.

So our shopper browsed Apple phones, chose a cool new iPhone 6s, picked out the best data plan and ended up with a substantial cart:


Now, as we saw at Adidas.de, our shopper suffers from a bit of indecision. Therefore, after selecting that iPhone, our shopper took it out of his cart and picked another, then another, then another. It took more than one week and dozens upon dozens of page views; there was lots of data for the adtech tools to work with.

The same pattern unfolded while shopping for snazzy clothes: Pants were added to, and removed from, a shopping cart over the course of several days.

With a trio of massive baskets, our shopping profile was complete. We had someone who was determined to buy basketball gear, a new phone and some high-end clothing.

And adtech tools took notice.

When you ad it all up

After the phone and clothes shopping, Adidas lost its monopoly on advertisements around the web. Here is how it looked:

The third-party-tool ecosystem placed ads throughout the web, doing everything possible to turn those baskets into purchases.

But after a few weeks, things changed: Advertising was no longer based on existing baskets or favourite brands. It was based on the type of items that our shopper intended to buy.

He was no longer an Adidas basketball fan; he was a basketball fan. He was no longer in the market for an iPhone from that particular store; he was in the market for a high-end smartphone from anyone. And no longer did clothes start and stop with those nice pants; more fashion giants got into the act.

And this is exactly what data leakage looks like: The intentions of our shopper were used to fuel advertisements for companies whose websites our shopper had never seen.

In the process, data that was accumulated by the three original companies was used to optimise advertisements for different, competing companies.

Adtech tools on Website A send data to ad networks, where it eventually becomes available to other companies that utilise the same networks. Ads from those other companies can then be shown to relevant users, even if those users have never been to their websites. That is how Website B – which is in direct competition with Website A – leverages a different company’s data.
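The leak itself can be sketched in a few lines. This is a simplified illustration under our own assumptions: the Ad record, the leaked_ads function and the site names are hypothetical, and real networks match users on far richer signals than a single category.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ad:
    advertiser: str  # the site that paid for the ad
    category: str    # the interest the ad is targeted at


def leaked_ads(visits, ads, user_id):
    """Return ads matched to user_id from advertisers whose sites the user never visited.

    visits: (user_id, site, category) tuples collected by adtech tags on each site.
    ads: Ad records booked through the same network.
    """
    sites_seen = {site for uid, site, _ in visits if uid == user_id}
    categories = {cat for uid, _, cat in visits if uid == user_id}
    return [
        ad for ad in ads
        if ad.category in categories and ad.advertiser not in sites_seen
    ]


visits = [("shopper", "website-a", "basketball")]
ads = [
    Ad("website-a", "basketball"),  # retargeting by the site that was visited
    Ad("website-b", "basketball"),  # competitor in the same category
    Ad("website-c", "cookware"),    # unrelated category, never matched
]
print(leaked_ads(visits, ads, "shopper"))
```

Only Website B's ad comes back: it matches an interest that was observed solely on Website A.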


Of course, a company itself would never turn over this user-centric data to a competitor. But the advertising tools used and shared by all of these companies have no such reservations. Data from one site joined a data pool that was available to a seemingly endless number of other sites.

Here are some ads that our shopper saw a week after first browsing iPhones and data plans, none of which come from the company whose website our shopper actually visited:

Same thing with pants. This ad, for example…

… led to this landing page:


Hosen, by the way, is German for pants. So this ad leveraged our shopper's interest not just in high-end clothes, but in the precise type of high-end clothes.

There were other clothing options, too. Take shoes, for instance:


Those shoe ads are in direct competition with our original shop.

From the original shop:

And from the competition:

Eventually, our shopper started seeing competitors’ ads on the same page:


Adidas was not spared, either. Those Adidas-only pages from a few weeks earlier suddenly featured Nike ads from online shops like GetSupplied and Brands4Friends:


Adidas ended up sharing the same page with competitors:

As time went on, competitors’ ads persisted and even expanded. O2 and Vodafone were joined by other phone-and-data-plan providers like Base and Simyo and LowerPrice.us:


Data out, data in


Make no mistake: Data leakage is a two-way street. If Company B’s adtech tools can identify potential customers even if they were never at Company B’s website, then other companies’ adtech tools can do the same.

In that sense, companies that fall victim to data leakage are also in a position to benefit from data leakage. Yes, some valuable user data is reaching other companies. But their data is there for the taking, too.

It is also important to note that adtech providers that help Company B gain from the efforts of Company A are not necessarily trying to hide it. The dissemination of data is written into contracts for all of these companies. For example:

[The adtech provider] is the sole owner of the Service Data and the Campaign Data and may use either for any purpose allowed by Applicable Law.


And:

The Client authorizes [the adtech provider]: (i) to collect, use, analyze and process the Client Data, to combine the Client Data with [adtech provider] Data… to improve [adtech provider] Technology […] and other [adtech provider] products, programs and/or services...


And finally:

[Adtech provider] is an independent contractor and not an agent of Client in the performance of this Agreement. This Agreement is not [to] be interpreted as evidence of an association, joint venture, partnership, or franchise between the parties.

But even if the benefits cut both ways, and even if adtech providers are relatively transparent, data leakage causes companies to lose control of the one thing that is most valuable to them. Data.

Rival basketball, phone and clothing vendors all found our shopper without ever actually finding him. They didn’t engage him on social media (he had no accounts). They didn’t reach him via email (he had no email, either).

They found him because – and only because – other sites found him first.

Data is the foundation of digital commerce. Companies need to know what people want. And how many they want. And how frequently. Websites that collect and study this data will get ahead. The problem with data leakage is that as long as some websites use these tools, other websites can benefit without having to lift a finger.

In the end, our shopper never made any purchases. There were simply too many choices.


For more on data leakage, check out this blog post from Webtrekk CEO Christian Sauer:

If you don’t want to share your customers, think twice before sharing your data.

Press inquiries should be directed to Marketing Manager Julia Gölles.

You can contact the author of this report at marketing@webtrekk.com.

Footnotes

[1] List of adtech tools comes from Ghostery, a browser extension that scans websites for adtech, analytics and other third-party tools.