Smoke and mirrors: Google announces a “major shakeup in advertising strategy” and plays the crowd for all it’s worth

4 March 2021 (Brussels, Belgium) – Yesterday, Google confirmed that it plans to phase out the practice of letting companies track users across the web using cookies by next year. Then, it pinky promised not to create an equally invasive workaround despite the potential hit to its money-printing advertising businesses.

Cookies are little crumbs of digital information that companies, advertisers, and websites have used to track your movement across the web. To a certain extent, the modern advertising businesses of internet giants like Google and Facebook are built on the buttery backs of cookies. More details about cookies in a moment.

But in recent years, browsers including Safari and Firefox have restricted the use of cookies out of growing concern over user privacy. Google’s uber-popular web browser, Chrome, was one of the last major browsers to still support the practice. But now that Chrome (with its 60% market share) is phasing out cookies for good, the one-time tracking snack of choice is heading down the garbage disposal.

But what’s really happening is that the mechanisms that power them are changing. From the Wall Street Journal:

Google’s heft means the change could reshape the digital ad business, where many companies rely on tracking individuals to target their ads, measure the ads’ effectiveness and stop fraud. Google accounted for 52% of last year’s global digital ad spending of $292 billion, according to Jounce Media, a digital ad consultancy.

In fact, not only does Google have a way to target ads nearly as effectively, its method only truly works at Google scale; that 52% number is probably going up. Continuing with the Wall Street Journal story:

Google had already announced last year that in 2022 it would remove the most widely used such tracking technology, called third-party cookies. But now the company is saying it won’t build alternative tracking technologies, or use those being developed by other entities, for its own ad buying tools to replace third-party cookies.

Correct. It’s not. But it is actually doing something else.

 

So, for those of us who follow this stuff, what Google really should have said was “we’ve worked out how to do this without you realizing it, so we’re going to stop doing it with cookies”. So let’s unpack this a bit.

Cookies, at a fundamental level, are about cloud-based tracking. It is as if you carry a name card: you show up at a website, they take down your information, and they send it on up to a centralized server along with information about the site you are visiting, what you did on the site, etc.; every website collects the same name card and sends the same sort of information to the same centralized server. Let’s suppose this centralized server is called “Google”. This information is attached to your profile, along with whatever data points Google can collect on its own properties. Which is a massive amount: everything from searches to Maps data to mobile app activity via its SDKs (software development kits) to Android and a whole lot more.
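To make the name card analogy concrete, here is a minimal sketch in Python. Everything in it is hypothetical and only for illustration: the `CentralTracker` class, the `card-123` style identifiers, and the example site names are my own stand-ins, not anything Google actually runs.

```python
from collections import defaultdict


class CentralTracker:
    """Hypothetical stand-in for the centralized server in the analogy."""

    def __init__(self):
        # cookie_id -> list of (site, action) events gathered across all sites
        self.profiles = defaultdict(list)

    def report(self, cookie_id, site, action):
        # Every participating site sends the same "name card" (cookie_id)
        # along with what the visitor did there.
        self.profiles[cookie_id].append((site, action))

    def sites_seen(self, cookie_id):
        # The cross-site view that no single website has on its own.
        return sorted({site for site, _ in self.profiles[cookie_id]})


tracker = CentralTracker()
tracker.report("card-123", "news.example", "read article")
tracker.report("card-123", "shoes.example", "viewed sneakers")
tracker.report("card-456", "news.example", "read article")

print(tracker.sites_seen("card-123"))  # ['news.example', 'shoes.example']
```

The point of the sketch is that the profile lives on the server: the same identifier shows up everywhere, so the server can stitch the visits together.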

Google then turns around and sells inventory to advertisers, both on its own properties and on 3rd-party ones. Notice that I said inventory, and not data; advertisers don’t know you, what websites you visited, or anything else, and in fact they don’t care. The goal of an advertiser is to achieve some sort of business goal, from app installs to e-commerce to brand awareness; the way it works is that an advertiser tells Google the goal it wishes to accomplish and how much it is willing to pay to accomplish it, and then Google harnesses its mountain of data to find the exact right users to advertise to. Incredibly enough, this happens in fractions of a second the moment you arrive on a website; your name card is also how Google knows which ads to show you.

From the Wall Street Journal article:

Instead, Google says it will use new technologies it has been developing with others in what it calls a “privacy sandbox” to target ads without collecting information about individuals from multiple websites. One such technology analyzes users’ browsing habits on their devices, and allows advertisers to target aggregated groups of users with similar interests, or “cohorts,” rather than individual users. Google said in January that it plans to begin open testing of ad buying using that technology in the second quarter.

Google’s implementation of this “privacy sandbox” is called “Federated Learning of Cohorts” (FLoC), and is detailed on this GitHub page (it’s open source). As that page explains:

Browsers would need a way to form clusters that are both useful and private: Useful by collecting people with similar enough interests and producing labels suitable for machine learning, and private by forming large clusters that don’t reveal information that’s too personal, when the clusters are created, or when they are used. A FLoC cohort is a short name that is shared by a large number (thousands) of people, derived by the browser from its user’s browsing history. The browser updates the cohort over time as its user traverses the web…

The browser uses machine learning algorithms to develop a cohort based on the sites that an individual visits. The algorithms might be based on the URLs of the visited sites, on the content of those pages, or other factors. The central idea is that these input features to the algorithm, including the web history, are kept local on the browser and are not uploaded elsewhere — the browser only exposes the generated cohort. The browser ensures that cohorts are well distributed, so that each represents thousands of people.

In a nutshell, here is how this works:

• Given its huge amounts of first-party data, Google has the ingredients to create the best machine learning training sets in the world. The company will use these training sets to create machine learning models that fit data to some arbitrary number of cohorts. The company says the cohorts will contain thousands of people.

• Google will then place those machine learning models in Chrome and Android, both of which have overwhelming share.

• Chrome and Android will keep track of every website you visit, run the resultant data through those machine learning models, mark you as being a part of one of those cohorts (you can’t be a part of multiple cohorts), and report your cohort to whatever website you visit.

In other words, whereas cookies were used for cloud-based tracking, cohorts will be used for browser-based tracking, and instead of matching ad inventory to your profile in the cloud, Google will match ad inventory to your profile in your browser (which never uploads your personal data).
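The browser-based side can be sketched the same way. This is a toy under loud assumptions: I use a SimHash-style hash over visited domains in place of the Google-trained machine learning models described above, and `COHORT_BITS`, `compute_cohort`, and `Browser` are hypothetical names of my own. The only thing it is meant to show is the shape of the arrangement: the history stays on the device and only a short cohort ID comes out.

```python
import hashlib

COHORT_BITS = 8  # assumption: 2**8 possible cohorts, chosen only for illustration


def _hash_bits(domain, n_bits):
    # Stable per-domain bit pattern derived from a SHA-256 digest.
    digest = hashlib.sha256(domain.encode("utf-8")).digest()
    value = int.from_bytes(digest[:8], "big")
    return [(value >> i) & 1 for i in range(n_bits)]


def compute_cohort(history, n_bits=COHORT_BITS):
    """SimHash-style cohort: each visited domain votes on every output bit."""
    totals = [0] * n_bits
    for domain in set(history):
        for i, bit in enumerate(_hash_bits(domain, n_bits)):
            totals[i] += 1 if bit else -1
    # A bit is set when more domains voted 1 than 0 for that position.
    return sum(1 << i for i, total in enumerate(totals) if total > 0)


class Browser:
    def __init__(self):
        self._history = []  # stays on the device, never uploaded in this sketch

    def visit(self, domain):
        self._history.append(domain)

    def cohort_id(self):
        # The only value a website gets to see.
        return compute_cohort(self._history)


alice, bob = Browser(), Browser()
for domain in ["news.example", "shoes.example", "running.example"]:
    alice.visit(domain)
for domain in ["running.example", "news.example", "shoes.example"]:
    bob.visit(domain)

print(alice.cohort_id() == bob.cohort_id())  # True: same sites, same cohort
```

Note what the sketch leaves out: a real system would also have to make sure each cohort is large enough, which a bare hash does not guarantee.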

From a privacy perspective, this is at first glance a great solution; Google no longer has all of your individualized data. As the EFF pointed out a year ago, though, there are important ways in which this approach is worse:

A flock name would essentially be a behavioral credit score: a tattoo on your digital forehead that gives a succinct summary of who you are, what you like, where you go, what you buy, and with whom you associate. The flock names will likely be inscrutable to users, but could reveal incredibly sensitive information to third parties. Trackers will be able to use that information however they want, including to augment their own behind-the-scenes profiles of users.

Google says that the browser can choose to leave “sensitive” data from browsing history out of the learning process. But, as the company itself acknowledges, different data is sensitive to different people; a one-size-fits-all approach to privacy will leave many users at risk. Additionally, many sites currently choose to respect their users’ privacy by refraining from working with third-party trackers. FLoC would rob these websites of such a choice.

Furthermore, flock names will be more meaningful to those who are already capable of observing activity around the web. Companies with access to large tracking networks will be able to draw their own conclusions about the ways that users from a certain flock tend to behave. Discriminatory advertisers will be able to identify and filter out flocks which represent vulnerable populations. Predatory lenders will learn which flocks are most prone to financial hardship.

FLoC is the opposite of privacy-preserving technology. Today, trackers follow you around the web, skulking in the digital shadows in order to guess at what kind of person you might be. In Google’s future, they will sit back, relax, and let your browser do the work for them.

As Ben Thompson points out, this sure feels like winning a battle to lose the war; again, in contrast to the deceptive way in which online tracking is represented, no one actually cares about or wants individual-level data, which is only useful in the context of a data factory. Advertisers want to achieve business goals, and Google wants to make money, and if the best way to satisfy the privacy industry is to require users to carry around easier-to-understand-and-act-on group labels instead of relatively worthless name cards then so be it.

And this gets to the question of competition. To go back to the Wall Street Journal:

Some analysts said Google could stand to benefit from the end of cross-website tracking because it is less reliant on data from other companies. Instead, it collects a large amount of data directly from users of its services, such as YouTube or Google Search. Google says it will still use that data, called “first-party” data, when targeting ads to be shown on its own websites. Many large advertisers also have a lot of first-party data on their customers.

This is definitely true; Google’s owned-and-operated advertising won’t really be affected, which is great news for the company given that 84% of its revenue comes from its own properties — which, of course, will have the same access to the cohort data as anyone else, just in conjunction with huge amounts of first-party data.

First-party data is going to be a massive competitive advantage going forward.

Moreover, while other sites will be able to make best guesses at what different cohorts represent, Google will understand them much more precisely given that it will be its own machine learning models generating them, based on its own private data sets and big picture understanding of what is driving what type of business results for advertisers. Plus, Google will be doing this at far greater scale than anyone else, which means its data advantage will only compound.

Ultimately, what makes all of this work is the fact that Google owns the software where users browse the web, whether that be Chrome on PC/Mac/iOS, or Android. True, most iOS browsing happens via Safari, but Safari has already cut off 3rd-party cookies by default; Google’s position on iOS isn’t really much worse than it was previously (Apple, meanwhile, is reportedly hiring aggressively to build out its own browser-and-device-based ad solution).

Meanwhile, everyone that doesn’t have a browser or an operating system is in much worse shape, particularly Facebook, which will get much less useful data from Chrome once the third-party cookie ban goes into effect. Facebook will of course come to understand the cohorts better than anyone outside of Google, but it already is a data factory so it definitely prefers individual-level data.

Unfortunately for the social media giant, this is the state of the privacy debate: Google can own over half of the digital advertising market, cut its direct rival off at the knees, and receive widespread praise for having done so, even as users give out less personally identifiable information in exchange for being more easily profiled.

Ummm .. hurrah?

 

If there is any positive side to this, it is that you are no longer individually identifiable. “You” don’t have a tattoo on “your” digital forehead; advertisers can only see the tattoo. As each person is assigned to a single cohort, and cohorts contain thousands of people, there’s just no way targeting can be as fine-grained. Surely we’ve all seen the investigative pieces where it was possible to target a single individual with ads if you knew enough about them. That level of privacy invasion may be gone, and as people have multiple interests it’s very unlikely that the targeting will be good enough to single out any individual niche interest or condition.

All of this is unquestionably good for Google but they’re designing the replacement for cookies, so how could it not be? I need to think through the new “profile” bits a bit more.

And as I have noted in previous posts, most companies simply won’t be able to understand the FLoCs and will have to figure out how different FLoCs behave on their own sites over time (the few data privacy “experts” I have spoken with are clueless).

Facebook, on the other hand, although losing out on individual-level data, will know exactly who is in which FLoC the first time they log in from a browser that exposes a FLoC ID. They’ll be able to build models to reverse engineer the FLoCs and know almost as much as Google does about them in a relatively short timespan. I’m not sure that exactly counts as “cut off at the knees”. The only disadvantage here is that Facebook will get less individual-level data from outside Facebook itself as time goes on, so Google’s advantage should grow over time.
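Here is a rough sketch of what that reverse engineering could look like for any site with logged-in users and its own first-party data. It is an illustration only: `profile_cohorts`, the `min_users` threshold, and the sample observations are all hypothetical, and a real effort would presumably use proper models rather than a simple count.

```python
from collections import Counter, defaultdict


def profile_cohorts(observations, min_users=2):
    """Guess what each cohort ID 'means' from first-party data.

    observations: iterable of (user_id, cohort_id, interests) tuples, where
    interests come from the site's own knowledge of its logged-in users.
    Returns {cohort_id: most common interest} for cohorts seen often enough.
    """
    users = defaultdict(set)
    interest_counts = defaultdict(Counter)
    for user_id, cohort_id, interests in observations:
        if user_id in users[cohort_id]:
            continue  # count each user once per cohort
        users[cohort_id].add(user_id)
        interest_counts[cohort_id].update(set(interests))
    return {
        cohort: counts.most_common(1)[0][0]
        for cohort, counts in interest_counts.items()
        if len(users[cohort]) >= min_users and counts
    }


observations = [
    ("u1", 17, ["running", "travel"]),
    ("u2", 17, ["running"]),
    ("u3", 17, ["running", "cooking"]),
    ("u4", 42, ["loans"]),
    ("u5", 42, ["loans", "travel"]),
    ("u6", 99, ["gardening"]),  # only one user seen, so cohort 99 is skipped
]

print(profile_cohorts(observations))  # {17: 'running', 42: 'loans'}
```

The more logged-in users a site sees per cohort, the sharper these guesses get, which is why scale matters so much here.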
