E143 - Dr. Augustine Fou, FouAnalytics, Independent Ad Fraud Researcher, Marketing Science Consulting Group, Inc.
SUMMARY KEYWORDS
websites, data, advertisers, people, ads, cookie, privacy, bots, ad tech companies, tracking, consent, visited, parameters, sites, buying, privacy regulations, targeting, analytics, fingerprinting, digital marketing
SPEAKERS
Debbie Reynolds, Augustine Fou
Debbie Reynolds 00:00
Personal views and opinions expressed by our podcast guests are their own and are not legal advice or official statements by their organizations. Hello, my name is Debbie Reynolds; they call me "The Data Diva". This is "The Data Diva" Talks Privacy podcast, where we discuss Data Privacy issues with industry leaders around the world with information that businesses need to know now. I have a special guest on the show. Dr. Augustine Fou is the CEO of FouAnalytics. I've been a huge fan of yours; I read all your work and the newsletter content you put out. You're obviously very deep into analytics, but also into how that plays out in marketing, how people use the Internet, and attribution for ad campaigns and things like that. So, tell me a bit about your journey and what got you interested in doing analytics at this level.
Augustine Fou 01:05
Thanks, Debbie. Glad to be here with you. I've been doing analytics, you can almost think of it as all my life. So my Ph.D. is actually in chemical engineering and materials science from MIT. So I take a very kind of scientific orientation to things. So even in marketing, where a lot of things can be thought of as subjective or creative. When we moved into digital about 25 years ago, there's a lot more metrics and a lot more analytics to work with. So that became a sweet spot for me. All right, I'm looking at the analytics, trying to see if we can figure out if the digital marketing is working or not. But there's also some dangers lurking in there. Because you know, the area that I study right now is ad fraud and bots, and the bots are not only causing the ads to load, but they're also clicking on them. So if you're looking at some analytics, like click-through rates or the number of clicks you got, sometimes those could actually be misleading, right? You might actually get more clicks from bots than you do from humans. So you might actually think that campaign’s working really, really well when it's actually not, right? So my focus right now has been to try to find the fraud, try to find the bots, and help marketers clean up those campaigns. And, you know, that's why I've been focused on the analytics side of things for many, many years now.
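To make the click-inflation point above concrete, here is a minimal sketch with invented numbers (not FouAnalytics data) showing how bot clicks can make a campaign's click-through rate look healthy even when human engagement is tiny:

```python
# Illustrative only: made-up numbers showing how bot clicks inflate CTR.

def ctr(clicks: int, impressions: int) -> float:
    """Click-through rate as a percentage."""
    return 100.0 * clicks / impressions

# Hypothetical campaign: 1,000,000 impressions, 30% of them loaded by bots.
human_impressions = 700_000
bot_impressions = 300_000

human_clicks = 700     # humans rarely click display ads (~0.1%)
bot_clicks = 9_000     # bots are often scripted to click (~3%)

blended = ctr(human_clicks + bot_clicks, human_impressions + bot_impressions)
human_only = ctr(human_clicks, human_impressions)

print(f"Reported CTR (humans + bots): {blended:.2f}%")    # ~0.97%
print(f"CTR from humans only:         {human_only:.2f}%") # ~0.10%
```

Looking only at the blended number, the campaign appears to be performing roughly ten times better than it actually is with humans.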
Debbie Reynolds 02:27
Wow, that's, that's a jump in terms of what your degree is in and what you're interested in. I guess I'm similar in that way. I was a philosophy major, somehow, I ended up in technology.
Augustine Fou 02:40
Well, it's good. I think we, you know, we bring different perspectives to it. And you know, you've heard of the scientific method, which is, you know, like, let's come up with some hypotheses and then run experiments to prove or disprove. So I've had the luxury of doing that. Over the years, I've been a small business owner, been using free platforms like Google Analytics for many years, and now recently built my own platform focused on analytics for digital media. It's unlike GA, Google Analytics, or Adobe Analytics, which are meant for your websites; we now need to measure the ads and see where they went and also see whether bots or humans caused those ads to run. So FouAnalytics is that platform, which I've been building for the last 10 years. And I think the scientific method is very much needed now. Because we still have to ask and probe the data, right? We're seeing a lot of clicks, but are we actually getting any sales from it? So we have to move beyond the quantity metrics or the vanity metrics. Often, you know, those big numbers are very attractive, right? Just like in a video game. If you get a higher score, you win. In this case, a lot of marketers are looking at higher click-through rates and higher numbers of clicks, and they think their campaign is winning or doing better. But you really have to look a little bit deeper than that to figure out if it's actually driving any sales or incremental sales. So you know, bringing the scientific method to the art of marketing is kind of the way I think about it.
Debbie Reynolds 04:18
Tell me a little bit about just marketing in general on the Internet. I think we see a lot of people, especially nowadays, with things like Apple and app transparency, where they're having people have opt-in rights to be able to get different marketing and things like that. Tell me about the importance of marketing on the Internet and why you do what you do, and why that's important to people and to brands.
Augustine Fou 04:52
So let me take a little step back since I've been observing digital marketing from the earliest days, so the mid-90s. So, basically, in those days, we would have websites like Yahoo.com. And you can think of them as giant portals because they aggregated all types of information, from sports scores to news to weather, so on and so forth, stock quotes, and things like that. So it became a central destination that a lot of people went to. And in the good old days, they basically took the magazine advertising model, right, the ad-supported model, and just translated it to webpages. So on the top of Yahoo, you would have a banner ad, one on the right-hand side, and maybe one on the bottom of the page. So Web 1.0 was basically taking offline stuff like content and putting it on websites and doing the exact same revenue model, using advertising to support the content. And then, when we moved into Web 2.0, we saw Blogger and WordPress come along. And so those blogging platforms allowed more and more people to get content online. So previously, without blogging platforms, people had to know how to code HTML, make websites, and put content online. And that was kind of limited to whoever was a web developer back then. So the rise of the blogging platforms allowed more and more people to get content online. And then we fast forward to what we now know as social networks. So I'm going to call that Web 3.0. So in the early days, it was MySpace and Friendster, and then more recently, it's Facebook and Twitter. And in those cases, you don't even have to write a long blog post; you could just tweet 140 characters, or 280 characters now, and get some content online and easily post pictures. So again, it became easier and easier for more people to get content online. So we saw a dramatic increase in the number of sites and the amount of content online. Now keep in mind all of that was still ad-supported, right? So when we have what we now call the long tail of websites, so millions upon millions of super tiny niche websites, most of the large advertisers are not going to go negotiate advertising deals with each one of those small sites, right? They probably wouldn't have even heard of them before. So that gave rise to what we now know as ad exchanges. So those exchanges are kind of like Wall Street. So the ad exchanges bring together buyers and sellers of shares of ad impressions, just like Wall Street brings together buyers and sellers of shares of stock. So there's a direct parallel to Wall Street. But when the ad exchanges became the dominant form of buying media, it actually created a giant loophole for fraudsters to get into the mix, right, because previously, when a large advertiser was buying ads from a large publisher, like the New York Times, or Washington Post, or Wall Street Journal, you would sit across the table and talk to someone, like a sales rep. But now, when you're buying from the ad exchange, you really don't get to see who represents those small websites, right? There may be tens of thousands of small websites or hundreds of thousands of small websites that you never interact with. Yet you're buying ads from the exchange, and your ads end up on those small websites. So the loophole that I mentioned before is where the exchanges were allowing fake websites and fraudsters into the exchange, right? So a fraudster can easily set up 10,000 websites at a time, right? Just think about using a WordPress template, right? They're just using the templated approach.
And they can create thousands upon thousands of fake websites. And so those exchanges immediately start making revenue because they're selling ads to large, unsuspecting advertisers. Okay, so that's kind of what gave rise to the fraud and what gave rise to this kind of programmatic media buying. And as more and more dollars flow through the exchanges, they're now flowing to these unknown websites. Now along with that, we saw the rise of ad targeting. And this is where we're going to intersect with privacy issues, right? So when you're showing ads on all those different websites, the advertisers were starting to worry about waste. So, oh, we want to make sure our ads are targeted to the people who might be interested in them, because we don't know where the ads are going right now. So the rise of ad targeting means that more and more of the websites had to start collecting information about the users. And unlike people being logged into Gmail or people being logged into Facebook, when you're visiting a content website, like when you're looking up a recipe online, you're typically not logged into that site. Okay, so those are all drive-by visitors, is the way I call it, right? They're not logged into the site. So in those cases, they don't actually know who you are. The site doesn't have your login, right? They don't know who you are. So they're basically focused on the cookie, right? So when these people visit a variety of different websites, ad exchanges and ad networks, now new ad tech companies, are setting cookies on those users so that they can kind of track where they're going. And they can now derive some information about these anonymous users, like, oh, if they visited a whole bunch of recipe websites, they must be a home cook, or they must love recipes or something, right? So they're trying to derive some parameters about who those people are and what they like. Because once they have that, they can sell that data to advertisers to use for targeting. So this gave rise to this flywheel of websites adding more and more JavaScript tags to their websites in order to collect more and more data about users. Now all of that was unregulated. So you can almost think of that as an unchecked, unregulated industry where anything goes; you can collect anything and everything about those people, set whatever cookies you want, and track them. So after that happened for 10 years, we started realizing, oh, there might be some privacy implications, because maybe people don't want to be tracked that way, maybe they were not aware they were being tracked. So let me be a little more specific. When a person went to the New York Times, they knew they were interacting with that site. And if, for example, the New York Times asks them for permission to track them or set a cookie, I'm sure they're usually fine with that because they chose to go to the New York Times website. But what they didn't know is that there might be 100 to 200 other ad tech companies with JavaScript code that's been added to the New York Times website that is now also tracking the user without their knowledge and definitely without their consent. Right. So they don't know that there's 100 other ad tech companies tracking them, right? They thought they were interacting with the New York Times, the publisher. So I've written about that over the years.
And you know, like, these big publishers didn't have a privacy problem 10 years ago, but now they do, because of all the other third-party ad tech JavaScript tracking tags, right? So I am careful to differentiate first-party tracking versus third-party tracking, right? It's the third-party tracking by unknown ad tech companies. That's the problem. Right? Not the first-party tracking that the New York Times does, or the New York Times setting a cookie because the person wants to stay logged into their session on the New York Times. Okay, those are fine. But all the third-party tracking that's unknown and certainly unconsented is the issue right now. So let me kind of tie this back to the evolution of digital marketing. Okay, so now a lot of advertisers believe that with this additional targeting and tracking, they can make their ads more targeted and more relevant. And now, I will tell you that from my own research, the ads are not terribly targeted or relevant. But it's kind of hard to believe, so we have to just kind of think about your own common sense, right? So for the listeners, when was the last time you saw an ad that was highly relevant to you? And also, when was the last time you saw an ad that you thought was creepy? That was following you around the internet? Right? So a lot of the stories that I hear are, oh, these ads are so creepy, right? I just looked at the sunglasses on Amazon. And in the next five minutes, those same exact sunglasses show up in a banner ad somewhere on another site that I'm visiting. Right? How does that happen? It's because Amazon sold your data, or some of these other companies bought the data from Amazon and used it to retarget you down to the item that you looked at. Okay. So there's a lot of problems with ads, right? So when the consumers are experiencing what they feel are creepy ads, that's an example of ad targeting working; they didn't know that they were tracked; they didn't know that the item that they looked at was being recorded by somebody. And then that item showed up in a banner ad somewhere. So that's the feeling of creepiness that gives you a sense of how ad targeting works. But a lot of the other information that is used as targeting parameters by the advertisers is all derived, or inferred. Okay, so it's inferred from the website visitation patterns. So let me use two simplistic examples to illustrate when it does work, and then when that kind of inference starts to fall apart. So if, for example, an anonymous user visited Playboy, Maxim, Sports Illustrated, or ESPN, you can kind of surmise that that was a male user. Okay? If the anonymous user visited Victoria's Secret, Venus shaving products, and other female-oriented websites, you can probably surmise that was a female user, right, without that person logging in or telling you. But what do you infer when a person is visiting Amazon.com? Or what do you infer about who they are and what they like when they're visiting walmart.com or CNN, for that matter, right? So, other than a very few simplistic examples, the inferences get extremely bad. Okay, so that's what we're finding. In one study conducted in 2018, we saw that even for gender, the ad targeting parameters that were being offered for sale were less accurate than random. So, for example, if you wanted to target males only, you'd have a 50-50 chance of targeting correctly if you just did spray and pray. If you purchased the targeting data, it was only correct in 42% of your impressions, meaning it was less accurate than random.
And then when you added just one more parameter, age, so gender plus age, it was down to 12% accuracy. So only about one in ten ads was actually targeted correctly to those two parameters. So what a lot of advertisers don't realize is that even though they are paying extra for all these targeting parameters, it's actually making their campaigns worse because the accuracy of the information being used for targeting is actually not that good. In fact, it's outright bad. So hopefully, that kind of gives you the history of what brings us up to date now; right, this is what's being done. So a lot of advertisers are paying extra to buy those targeting parameters. And we're now seeing the ripple effects of privacy regulations starting to be enforced. Right. So obviously, GDPR has been around for a number of years, as has CCPA, and some of the new California regulations and privacy regulations in other states are starting to not only get on the books but also get enforced. And in those cases, I actually think it's a good thing because it's going to force the advertisers to wean themselves off this bad data that they believe is helping their digital advertising. Now, let me tie this back to the fraud side of things, right, the bot activity; not only are the bots causing the ads to load, but they're also clicking on them. So what you're observing is that the higher click-through rates in digital campaigns are not coming from better targeting. They're actually coming from more bots. Okay, so do you see how it was kind of a circular thing where the advertiser said, oh, let me go do more programmatic advertising, let me go buy more targeting parameters, and oh, I saw higher click-through rates. But all of that is actually due to fraudulent activity from the bots; it doesn't mean that the campaigns are working better. So hopefully, as we see the ripple effects coming from the privacy regulations, more and more of the advertisers will start to wean themselves off of buying targeting parameters in programmatic. And in doing so, they might actually see their digital marketing improve in performance, meaning actual outcomes.
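A small worked calculation makes the accuracy point above easier to see. The accuracy figures (50% random, 42% purchased gender data, 12% gender plus age) are the ones quoted in the conversation; the CPM and data-fee numbers below are invented for illustration only:

```python
# Illustrative cost per correctly targeted impression.
# Accuracy figures come from the study quoted above; CPMs and data fees are hypothetical.

def cost_per_on_target(cpm: float, data_fee_cpm: float, accuracy: float) -> float:
    """Effective cost per 1,000 impressions that actually reach the intended audience."""
    return (cpm + data_fee_cpm) / accuracy

base_cpm = 2.00  # hypothetical media cost per 1,000 impressions

scenarios = {
    "spray and pray (50% random)":       (0.00, 0.50),
    "purchased gender data (42%)":       (1.00, 0.42),
    "purchased gender + age data (12%)": (2.00, 0.12),
}

for name, (data_fee, accuracy) in scenarios.items():
    effective = cost_per_on_target(base_cpm, data_fee, accuracy)
    print(f"{name}: ${effective:.2f} per 1,000 on-target impressions")
# spray and pray:        $4.00
# gender data:           $7.14
# gender + age data:    $33.33
```

Under these assumed prices, paying extra for less accurate targeting data makes each correctly targeted impression several times more expensive than buying untargeted inventory.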
Debbie Reynolds 18:01
Wow, that's great. Thank you. I want to talk with you a little bit about the third-party cookie; you touched on it a bit. I actually had a gentleman on my show who had deep roots at Navteq, and he referred to the third-party cookie as the third rail of advertising. And we know that a lot of laws and regulations in the privacy realm are really trying to target this kind of third-party data sharing, and definitely the third-party cookie. So we are seeing big companies like Apple and Google try to move away from third-party cookies and do different things. But I guess I'm going to talk about two things. One is the third-party cookie issue. And the other is my belief that the third-party cookie is one issue, but the tracking of individuals without their consent is kind of a bigger issue. So even if we get rid of third-party cookies, I don't think that that solves the issue. What are your thoughts?
Augustine Fou 19:07
That's true. And the cookies have just been a convenient metaphor for third-party tracking of individuals, right? So like I said earlier, I differentiate third-party from first-party because the New York Times is the first party that you're interacting with; if they set a cookie, that's fine because the humans know they're interacting with the New York Times, right? So they would expect that, and if asked, they would give consent. What they don't know is all those third-party ad tech companies, right? A lot of the names are things like Krux Digital, or LiveRamp, or, you know, GumGum, or whatever. So they would never have even heard of these ad tech companies before. So even if those ad tech companies ask them for consent, the humans are probably not going to give them consent because they don't know who they are. And they don't actually know where the data goes after it gets collected. Okay. So cookies are just a convenient way of talking about third-party data collection that is privacy-invasive. There are other things being done that consumers don't know about, right? So if you download an app onto your mobile device, not only do they get your device ID, but you might also grant them GPS access, right? So most people, if they're using a sports tracking or fitness tracking app, would give them GPS access, because they want to record their runs. Right. So, in that case, the app has access to your precise location through GPS. And most people may not have thought through the implications, the privacy implications of that. But you might remember a few years back, the location of private or secret military bases was leaked because people were using a fitness app while working out on base, right? And also the identities of a lot of soldiers, you know, military personnel, were leaked because of their location and the activities that they were doing. So in those cases, you can think of that as unintended consequences or privacy leakage; they didn't know it was happening, right? They thought they were just using a fitness app. So that's why we need to be more sensitive about it. And more people need to be educated about the privacy implications of tracking by third parties, right, whether it's the app itself, all the other trackers that are added into the apps themselves, or into the ads that run in the apps, right? So there's layers upon layers, right? Those third-party trackers can be added into the ads themselves; those are called ad tags. And some can serve a legitimate purpose, like detecting the viewability of the ad or detecting fraud or IVT, invalid traffic. But yet, that data is being collected, and no one knows where that data goes. Or at least there are no regulatory controls over that just yet. So again, it's important for these regulations to be enforced so that we can actually start stemming the tide of all that data going to other places, right? And just imagine if a foreign entity, or a nation-state, or a terrorist group got access to that kind of data, right? There's a lot of other bad stuff that can happen. So one of the moves that Apple is making is really cracking down on not only third-party cookies, right? They've been cracking down on third-party cookies using their Safari browser for many years now. But they're now also cracking down on the availability of the IP address of the device as well as the device ID. Because once the bad guys get access to those things, they could also use them for even more nefarious purposes.
Right, so whether it's Apple doing that, or Google planning to do that with the Chrome browser, where Chrome may follow in the footsteps of Safari by deleting third-party cookies after each session and/or making the IP address less available to third parties, all of those are moves in the right direction when it comes to protecting the privacy of users who really don't understand this well enough to protect their own privacy.
Debbie Reynolds 23:15
Very good. I would love your thoughts on fingerprinting. This has come up as an issue, especially as we're seeing proposals in laws or regulations where they want companies to make individuals more anonymized in some way as they're handling data. But as you know, the things you're doing on the Internet, and even your settings or your computer, that information can be cobbled together and possibly re-identify you. Tell me a little bit about the fingerprinting issue.
Augustine Fou 23:50
Yep. So fingerprinting basically means they have a way to uniquely identify your browser and your session without asking you to provide any of that information yourself. So previously, people understood PII, personally identifiable information, to be things like your email address, your name, your phone number, and maybe your home address and things like that. So for a while now, people are saying, oh, we don't have PII. What they're not saying is that they do have fingerprinting. Okay, so in light of the privacy regulations being enforced in recent years, fingerprinting has become more and more prevalent because instead of setting a cookie in your browser, if they're no longer able to set a cookie, they now have to rely on an alternate method to uniquely identify you. So fingerprinting, for the audience, is basically this: if they collect enough JavaScript parameters, say, for example, your screen resolution, your browser version, maybe the list of plugins that you have and the list of fonts that you have, they can now hash those together, smash them together, and basically uniquely identify you. Because you will have a certain list of plugins or fonts on your particular computer that someone else does not. Right. So it becomes a way to uniquely identify you. Now the ad tech companies are going to argue that that is sufficiently private because it doesn't use things like your email address, your name, or your phone number. And they can say most of those are anonymous JavaScript parameters. So that is accurate. But like you said, the key issue is that they can use those things and then later re-identify the user by combining them with other datasets, right? So if they have a unique fingerprint on your browser, and they know that some person visited or looked up information about certain drugs, like some cancer drugs, for example, if they do a Google search and they look at some of those pages, they may actually be able to re-identify you and know private details about you that you may not want them to know. Okay, so there is still a danger in those kinds of things. So, on the one hand, people would argue that fingerprinting is privacy-friendly because it doesn't use outright PII. But yet, there are still ways that the data can be recombined with other datasets so that the user is uniquely identified.
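For readers who want to see the mechanism, here is a minimal sketch of the hashing idea described above: combine a handful of browser-visible attributes and hash them into a single identifier. The specific attributes, values, and hash choice are illustrative, not any particular vendor's method:

```python
# Minimal sketch of browser fingerprinting: hash a set of JavaScript-visible
# attributes into one stable identifier, no cookie required.
import hashlib

def fingerprint(attributes: dict) -> str:
    # Sort keys so the same attributes always produce the same hash.
    canonical = "|".join(f"{k}={attributes[k]}" for k in sorted(attributes))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]

# Hypothetical attribute set collected from one browser.
browser = {
    "screen": "2560x1440",
    "user_agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) ...",
    "timezone": "America/New_York",
    "fonts": "Arial,Helvetica,Verdana,Garamond",
    "plugins": "PDF Viewer,Chrome PDF Viewer",
}

print(fingerprint(browser))  # same browser -> same ID on every visit
```

None of the inputs is PII on its own, which is exactly the argument ad tech companies make; the identifier only becomes privacy-invasive when it is joined with other datasets, as described above.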
Debbie Reynolds 26:21
Let's talk about data brokers. Obviously, they, just like your Wall Street example, they buy and sell data about individuals to, I guess whoever wants to buy it. I know, as we're talking about marketing and ad campaigns, we also know that people want that data for other purposes. So whether that be maybe an insurance company wants to purchase that data to be able to find out more information about someone maybe that they think they want to insure, or maybe a financial institution wants to know more background information around someone, and they'll use maybe a data broker to buy information. Tell me about this data broker space and kind of the privacy issues there with people.
Augustine Fou 27:08
Yeah, the main challenge with data brokers is that very often, you don't know where the data came from, and you have no way of tracking back to where it came from. So a lot of times, and I won't mention any data brokers specifically, a lot of their customers, like the advertisers, would upload lists of their own customers, right, email lists or phone lists, to those data brokers for what is known as cookie matching. So they'll upload a list, and then the data broker or the platform would basically do a cookie match to try to find the cookies that are supposed to be associated with that audience. So that when those cookies show up again somewhere online, they can actually show an ad to them. Okay, so that's the process. Imagine if that process were repeated by hundreds of advertisers, right? Because these data brokers have hundreds of customers. That means that the data is actually coming from real customer lists from real advertisers, many, many of them. But once it gets commingled into that vast database, right, they're now calling it data lakes or something like that, you quickly lose track of where the data came from. And you also lose track of the accuracy of that data. So, on the one hand, you have people trying to buy that data for targeting purposes. On the other hand, they don't realize whether that data has consent or not. Right, so now that we're in the phase of enforcing GDPR and other privacy regulations, the collector of the data has to prove that they have consent before they can collect it. Now, they also have to prove that they have consent before they can sell it or share it; you see how there are many layers where they probably don't have proper consent on file or can't produce that evidence. Therefore, we might be looking at an entire layer of this industry called the data brokers that might be operating illegally under the new privacy regulations because they simply can't show where the data came from and can't provide proof that they have consent to sell that data. Right. Even if they claim it's aggregated, it's rolled up, it's anonymized, whatever, whatever. Those, to me, are still just kind of short-term workarounds. So just like fingerprinting, right? It's like, oh, we can't use cookies anymore, so we'll do a workaround and do fingerprinting instead. Okay, I think the law will still find that that's a way around it, as opposed to a genuine way of abiding by the law. Right. So I think that we're going to be in a transition period for a number of years, where the advertisers will have to transition to the new way of doing things, right, without the cookies and potentially without workarounds like fingerprinting.
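Here is a minimal sketch of the list-matching step described above: the advertiser's customer emails are hashed and joined against the broker's own table of hashed identifiers and cookie IDs. The field names, the SHA-256 hashing, and the example records are assumptions for illustration, not any specific broker's implementation:

```python
# Sketch of "cookie matching": advertiser uploads hashed customer identifiers,
# broker joins them against its own (hashed email -> cookie ID) mapping.
import hashlib

def hash_email(email: str) -> str:
    # Normalize before hashing so the same address always matches.
    return hashlib.sha256(email.strip().lower().encode("utf-8")).hexdigest()

# Advertiser side: hash the customer list before uploading.
advertiser_list = ["alice@example.com", "bob@example.com", "carol@example.com"]
uploaded = {hash_email(e) for e in advertiser_list}

# Broker side: pre-existing identity graph built from many other sources.
broker_graph = {
    hash_email("alice@example.com"): "cookie_93af",
    hash_email("dave@example.com"):  "cookie_1c02",
    hash_email("carol@example.com"): "cookie_77b9",
}

matched = [cookie for hashed, cookie in broker_graph.items() if hashed in uploaded]
print(matched)  # ['cookie_93af', 'cookie_77b9'] -> now targetable in ad exchanges
```

Once hundreds of advertisers repeat this against the same broker graph, the provenance and consent status of any individual record becomes effectively untraceable, which is the point being made above.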
Debbie Reynolds 30:01
Speaking of workarounds, in a way, I'd love your thoughts on clean rooms. That's kind of the new buzz I've been hearing a lot about, I want your thoughts on that.
Augustine Fou 30:12
Just like a whole bunch of stuff in ad tech, it's just a buzzword. And people would like to think that clean rooms are for the data that's being uploaded by the advertisers themselves. But again, that leads into other problems, like who has access to the clean room? Is that data now copied from the clean room to something that's not clean? Right? So there's a ton of other issues associated with that. A lot of people just say, oh, you know, fingerprinting is the solution to GDPR and privacy regulations, and clean rooms are the solution to, you know, advertisers now using their own first-party data. But again, there are many layers upon layers of assumptions that go into it. And other things, like who has access to that data; you can't assume that it's private, you can't assume that, you know, the rights of the users are being respected. So I think it's yet another buzzword, a shiny object that ad tech companies have come up with, and they've come up with a lot over the last decade. And I think advertisers should be just a little bit more careful before they wade into something like this when they don't really understand all the nuances and all the underlying technologies.
Debbie Reynolds 31:25
What are your thoughts about what is happening in the world right now that concerns you from a privacy perspective, like something that you look at? You see this happening? You're like, I don't like this?
Augustine Fou 31:41
Well, I mean, in addition to all the non-private things going on online, right, the places you're surfing, whatever. And I'll give you one more specific example, like Pornhub. Okay, humans go to watch porn, but Pornhub has a ton of other third-party trackers on there. So even while you think you're using a VPN while surfing Pornhub, other people are able to track what porn you're watching. Okay. So those are probably things that consumers don't know are happening. And they probably wouldn't like it if they found out. But those are just examples of online things. Now you pair it with offline things, like in London, for example, and it's pervasive in China: we have these traffic cameras, right, that take pictures of people, not only their vehicles but also the people. And the cameras are getting to high enough resolution that they can actually see your irises, right, and can scan your irises. So something that was contemplated in a movie, like Minority Report from literally two decades ago, is actually coming true. It's happening in real life. And on the one hand, you know, they even showed this in the movie Minority Report: when you're walking by a billboard, the billboard would change the ad to match the person that's looking at it, right, because they scan your eyeball and then target the ads based on what they think you like. Okay, so if they have the ability to do that, and now they literally do have that ability, would people think that kind of violation of their privacy is okay, all in service of showing them a more relevant ad? In most cases, we know that we haven't come across very relevant ads at all in our experiences online. So why would we think that this is going to be more accurate? Right, and kind of tying this back to a danger that I see: if you think about passwords, and you know how many big data breaches there have been, people's passwords have been stolen. And so now the fix is to change your password. Okay, so you've been getting notices that your email and your password have been compromised many times over, so change your password. But as we move into convenience, right, some of the phones will allow you to put your fingerprint in to log in, or some of them will allow you to scan your face to log in, right? In those cases, if your eyeball gets compromised, you can't change your eyeball. Okay, you can't change your fingerprint; you can't change your face. So when we see biometrics used for login purposes, that's an incorrect use of biometrics, right? It may be convenient, but what people don't realize is you can't change your eyeball. So if someone compromises that database, and you're using your eyeball to log into things, that's a very dangerous and untenable situation. So I always recommend, you know, using a YubiKey or some third-party authenticator kind of thing. But don't put in your own biometrics, because once those get compromised, and we can rest assured they will be compromised, just like anything else that's online or connected, you can't change them. So, you know, those are kind of extreme edge cases of privacy. Here we're just talking about cookies and fingerprinting your browser and whatever, but we're moving to a point where people's biometrics and faces are scanned and all of that is online everywhere. Remember those Facebook quizzes? Which celebrity do you look like? People uploaded their own face pictures. And they told you that this is their face, the one they want to be representative of them.
So all those were basically data harvesting operations by good actors and bad. All right, so bad actors now have your face. So over time, they can now use that, combine that with AI, and basically impersonate you without you knowing. So there's a lot of danger ahead; I would just warn people to be more cautious about what they do and be more suspicious of anything they see online. And I would also tell the advertisers to use less of this targeting data because it's actually not helping your campaigns, your digital campaigns; it's actually making them worse.
Debbie Reynolds 35:51
That's great advice. I want your thoughts about accuracy. So I guess I have a philosophical thing I want to chat with you about and get your thoughts. So my thought is about privacy laws that bring in the element where a company needs to make sure your data is accurate; that's kind of the EU way of doing it. I think that putting accuracy into privacy or data protection laws forces an element of transparency, because how could you know something was accurate unless you were in contact with the person, right? What are your thoughts about that?
Augustine Fou 36:35
Well, you know, if we go back to offline stuff like a mailing list, I can actually witness it; I've seen the accuracy of those datasets get worse and worse, right, to the point where now I get 15 mailings from Dell, each with a slight misspelling of my name, all coming through to the same apartment number. If they actually looked at the data, they could tell there aren't 15 different people with very similar names living in the same apartment, right? They could tell that the data is not accurate. But that's kind of an overt or publicly visible thing, right, where the mailing lists are totally crappy. I would say the stuff online is even more crappy because you don't even have their mailing address or their name. These are just anonymous cookies and, you know, just a visitation pattern of the websites they visit. And on top of that, in digital, we have a unique problem that we don't have in the offline world, like with mailing lists. In digital, we actually have bots actively visiting different sites in order to pretend to be certain audiences. So say, for example, in pharmaceutical advertising, the pharma companies are willing to pay much, much higher CPMs to target doctors, right? Because they want their ads to show to doctors, not to consumers. So what they're then doing is, oh, well, let's look for those cookies that visited certain websites like the Journal of Clinical Oncology, the New England Journal of Medicine, and various other, you know, health-related websites, right? So what the bots are doing is deliberately visiting a handful of those sites to collect cookies. So now, to the eyes of the data brokers, oh, this cookie must be a doctor because they visited these medical sites, medical content sites. So now the data broker is selling that cookie as if it were a doctor, even though they have no idea whether it's a doctor, right, because like you said, they never contacted them. And it's just a cookie, right? So now, big advertisers are paying extra premiums on the CPM, thinking that they're targeting doctors, but they're actually targeting a bot that visited a whole bunch of sites that doctors visit, okay, to pretend to be that audience. So on the one hand, it is important for the data to be accurate when we're dealing with consumers. But those are completely different things when we talk about the accuracy of the targeting parameters. So yeah, you're right, requiring the vendors to make sure the data is accurate dramatically raises the cost, meaning the amount of work they have to do to reach out, because once they reach out to the person, the person won't know who Krux Digital is, they won't know who K2 or LiveRamp is, and they're probably not going to give consent. They're gonna say, who the F are you, right? Why would I ever give you consent to collect my data and sell it? So those are the kinds of inconvenient truths that the ad tech companies don't want to face. So they're going to fight tooth and nail to say accuracy should not be a requirement in the data, because they've been able to sell the crappy data up till now, and they want to keep doing that. Okay, so I think there are some pros and cons to forcing vendors to have more accurate data and actually show that they have consent, right? Because under GDPR, the consents are very, very specific, right? It's for this browser, for this device, or for this person. And if you change the browser, you have to get consent over again.
And in most cases, I'll tell you from my data, the bots are the ones giving consent because they want to cause the ads to load. I don't know that many humans that have given consent. Now, when I look at the data, you have 250 parameters and 250 different vendors that you have to check off one at a time to give explicit consent to each vendor under GDPR. When I see all of them checked off and all consents given, that's not a human, right? A bot does that so it can do the thing that it's meant to do, which is to cause the ads to load. And in most cases, that's not proper consent, right? Because GDPR, to me, requires informed consent. And if you just check off everything, that's not informed consent; that means you didn't read it. Okay. So even if a human did that, that is not proper consent under GDPR. So that's what I mean by we're going to be in this transition period for at least a number of years going forward. Because even when the privacy regulations are enforced, so many of the vendors are going to just find a workaround and just say, well, we'll wait until we get sued. Because we're not going to change our business practices, because we're not going to go rebuild our technology, we're not going to go throw out our old data and get new data, because they're not going to get any new data. So they're just going to keep going on until GDPR enforcement finally has a large enough fine to put some of these companies out of business, right? Because when they fine Facebook or Meta a billion dollars, it does not matter to Meta, right? Because a billion dollars, they can just write a check and be done with it. But they'll just hold that up in court for the next five years. So again, we need GDPR enforcement vigorous enough to actually put some of these companies out of business. Otherwise, nobody is going to care, and nobody's going to do anything.
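The all-consents-given pattern described above can be expressed as a simple heuristic. This is only a sketch of the signal being described, with invented thresholds and field names, not an actual FouAnalytics detection rule:

```python
# Sketch of the consent-pattern signal: a record where every one of hundreds of
# vendors is granted, almost instantly, is more consistent with a scripted bot
# than with an informed human. Thresholds and field names are illustrative.

def looks_like_bot_consent(vendor_grants: list[bool], seconds_on_banner: float) -> bool:
    all_granted = all(vendor_grants) and len(vendor_grants) > 100
    too_fast = seconds_on_banner < 2.0  # no human reads ~250 vendor notices this fast
    return all_granted and too_fast

# Hypothetical records: (per-vendor grant flags, time spent on the consent banner)
human_record = ([True, False, True] + [False] * 247, 18.5)
bot_record = ([True] * 250, 0.4)

print(looks_like_bot_consent(*human_record))  # False
print(looks_like_bot_consent(*bot_record))    # True
```

The underlying point stands regardless of the exact rule: a consent string with every vendor granted is weak evidence of informed consent under GDPR.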
Debbie Reynolds 42:13
I think that's true. I want your thoughts about privacy and advertising. So I heard someone from the IAB a couple of months ago had a rant of some sort where they were talking about privacy being the enemy of marketing or ad tech. What are your thoughts about that?
Augustine Fou 42:35
First of all, nobody should listen to anything the IAB says. Second of all, that rant is because their source of revenue is the ad tech companies. So they're going to say whatever they need to say as a trade lobbying group to protect their own paying customers' interests. So I would not put any credibility or any weight into that rant. And of course, they're gonna say privacy is at odds with these ad tech companies making money.
Debbie Reynolds 43:05
Yeah. When I think about that, and when I heard about that, I'm like, well, the data, if I didn't exist, you wouldn't have data about me, right? So to be upset at the individual for wanting to be able to control the data about themselves is kind of crazy.
Augustine Fou 43:20
Yeah, I'm not even gonna name the name, but you know who it is. And they said it on the main stage. So it's like, yeah, they gotta take full responsibility for that. But again, like I said, nobody should put any weight or credibility into what the IAB says; they've been a trade lobbying group for the last 25 years. And all of the stuff they claimed before, where they say, oh, we'll do self-regulation, we'll do all this kind of stuff, has not worked out at all. So nothing that they say is credible.
Debbie Reynolds 43:49
Yeah, yeah. Now, we talked about bots, and they're becoming more popular now. Right now, we have things like ChatGPT and other types of bots, right? Is that going to impact ad tech or privacy in the future? Just your thoughts.
Augustine Fou 44:06
It won't impact privacy, and it won't impact ad tech that much, because bots have been prevalent before, right? So if this is the first time consumers are hearing about ChatGPT and bots, then they think it's something new. But bad guys have been using precursors of ChatGPT to create fake content for their fake websites by the tens of thousands for the last 20 years, so it is nothing new. And if you assume that 99% of the long-tail websites are fake and populated with fake content or plagiarized content anyway, then you'll understand that even with ChatGPT, it's not going to make it much worse than it is already. Most people don't agree, and most people don't understand that there's so much fraud already in digital advertising; they think it's going to lead to a bump up in the amount of fraud. Right? So yes, ChatGPT has made it easy to create content, but it's really now just spilling into what I'm going to call the public sector, meaning the average consumer can type in a prompt and say, write an essay, write a web page with this type of content, and it can do it. Right, that's been available and accessible to hackers and bad guys for 20 years and is nothing new. Okay, so in that case, the bad guys' process has been to set up tens of thousands of fake websites and add them into the ad exchanges, kind of like this brings us full circle to what we talked about at the beginning of this podcast, right? Because the rise of ad exchanges gave rise to more opportunities for fraud, because now the bad guys can scale their fraud operations as well by automatically creating tens of thousands of websites using WordPress templates and early precursors to ChatGPT, where they're basically plagiarizing content, remixing it slightly to defeat all the plagiarism checkers, right? If they just move a few words here and there, it's not going to match exactly. Therefore, it's not going to be shown as an identical match, right? So they get by all the plagiarism checkers, and all the images are stolen as well. Okay, so now they have all these websites that they operate. Even if some get taken down, they've got 10,000 more; it doesn't matter, they can keep making ad revenue. So those are the types of sites that populate those ad exchanges. And I'll kind of close with this thought, right? I use this exercise in class with my students. I ask them to name off 10 websites they use every single day. They can't even get to 10. Right? Just think about that yourself as well, right? And then I ask them, can you name 10 mobile apps that you use every single day? They can't even get to 10. So what are the 10 million other apps that are in the App Store? Who uses them? What are the 100 million other websites in the ad networks, right, that are selling ads, and who uses them to such a degree that they can actually generate hundreds of billions of ad impressions? It's not humans, it's bot activity; it's automated activity that makes it look like there's all this activity on those sites. And not only do the bots cause the ads to load, they're also clicking on them. So for the last 10 years, advertisers think, wow, the more I buy in programmatic, the better my campaigns perform because I'm getting even more clicks. Okay.
But if they understand that not only were the ads generated by the bots, but the bots were also clicking on them, they can now see this come full circle, which is that all of the stuff you were seeing, all those vanity metrics and quantity metrics, was caused by bot activity, not more humans seeing your ads and clicking on your ads because they liked them and they were more relevant. That brings us full circle. It does.
Debbie Reynolds 48:06
Absolutely, absolutely. If it were the world according to you, what would be your wish for privacy or data or ad tech anywhere in the world, whether it be human behavior, regulation, or technology? What are your thoughts?
Augustine Fou 48:23
It would be super simple. The way we do better digital marketing going forward is by doing digital marketing like we did in 1995. What I mean by that is advertisers who want their digital marketing campaigns to be effective should buy ads from legitimate publishers that they have heard of. Right. So if you haven't heard of a publisher, a website, or a mobile app before, it's likely that no other humans have either. Okay, so if you buy ads from legitimate publishers, and I'm just naming a few off the top of my head in the US: New York Times, Wall Street Journal, Washington Post, Hearst, Condé Nast, Meredith, you would have placed ads in front of humans, right? Because humans actually still go to those sites. When you're buying from the long tail of sites, you're probably showing your ads on hundreds of thousands of sites that don't have very many humans. So if we go back to just simplifying it, advertisers buying ads from legitimate publishers, those legitimate publishers have real human audiences. So now that your ads are actually shown to humans, they're going to have some kind of impact on your business. And on top of that, the real legitimate publishers now have a source of ad revenue, right? It's been in decline for the last 10 years because those ad revenues are flowing elsewhere, to the programmatic exchanges. But now, if we start to bring that ad revenue back to legitimate publishers, we're supporting journalism, supporting good content. And on top of that, the publishers are going to start reducing the number of third-party trackers that they put on the site. So overall, when you're interacting with the New York Times, you know you're interacting with them, and we're going to now reduce the number of third parties that are harvesting the person's information. So we're going to also do better in privacy, right? And some of those legitimate publishers are actually doing whatever they can to be compliant with the law and also to respect users' privacy, right? Whereas all the fake sites are certainly not going to do that. So do you see how, when I said we're doing digital marketing going forward kind of like we did it 20 years ago, it's almost like we're re-establishing or restoring the original contract of the internet, which is three parties: the consumer, the publisher, and the advertiser? In the last 20 years, we've seen that go awry because there's been a fourth leg added to that stool. Right, a three-legged stool is stable; a fourth leg added to that stool has thrown it out of whack, out of balance, for the last 20 years. And the fourth leg is the ad tech companies, because their sole purpose is to extract as much profit for themselves and their venture capitalists as possible. So when that leg is trying to pull as much value out of the ecosystem instead of adding anything back, that's what's thrown everything out of whack for the last two decades, right? Most egregiously in the last decade, the last 10 years of programmatic media buying. If we go back to the original contract of the internet, which is a three-legged stool, and we re-establish that balance, that's how we're going to make digital marketing and advertising better for the advertiser. We're going to now restore ad revenue for legitimate publishers, so they can actually publish legitimate content, not clickbait. And then, finally, the consumers, their privacy will be respected because we don't need to collect all this data about them.
Because that didn't add anything to the effectiveness of digital advertising.
Debbie Reynolds 51:58
That's mind-blowing. Thank you so much.
Augustine Fou 52:02
Thank you, Debbie.
Debbie Reynolds 52:02
It's been such a thrill to have you on the show. And I love your work. People should definitely connect with you on LinkedIn and look at the stuff that you publish. And you know, I read all your stuff. So I think you're doing such a good service.
Augustine Fou 52:16
Thank you very much.
Debbie Reynolds 52:17
People just don't understand how the Internet works. I feel like that's part and parcel of what you discuss.
Augustine Fou 52:23
Well, thank you very much for having me. It's been a pleasure talking to you.
Debbie Reynolds 52:27
All right. Talk to you soon.
Augustine Fou 52:29
Thank you, Debbie.