E133 - Vadym Honcharenko, Privacy and Data Protection at Grammarly

40:00

SUMMARY KEYWORDS

privacy, data, ai, user, companies, tools, requirements, grammarly, transparency, services, regulations, world, tendency, raised, related, aspects, specific, good, privacy policy, eu

SPEAKERS

Debbie Reynolds, Vadym Honcharenko

Hello. On this episode of “The Data Diva” Talks Privacy podcast, we have Vadym Honcharenko. He is the Data Privacy and Data Protection Manager at Grammarly in Ukraine. During this show, we talked a bit about the Italian Data Protection Authority, which a couple of days earlier had banned ChatGPT in Italy. Since the time of that recording, OpenAI has come to an agreement with Italy, so ChatGPT is now unbanned there, but during the conversation, we talk at a higher level about what these generative AI tools mean for data protection, not only in Europe but also in other jurisdictions. Enjoy the show. 

Debbie Reynolds  00:00

Personal views and opinions expressed by our podcast guests are their own and are not legal advice or official statements by their organizations.

Hello, my name is Debbie Reynolds. They call me "The Data Diva". This is "The Data Diva" Talks Privacy podcast, where we discuss Data Privacy issues with industry leaders around the world with information that businesses need to know now. I have a special guest on the show, Vadym Honcharenko. He handles privacy and data protection at Grammarly in Europe, and I know Grammarly very well. It's a Ukraine-originated company, and you guys are in Poland right now. I actually looked up my stats on Grammarly recently because I wanted to tell you about it. I think I've been using Grammarly for 270 weeks, so it's been a long time, and I really like the tool. But I would love to know your trajectory into privacy and what got you interested in this role and in Grammarly. I think you're in an interesting space. You're at a technology company, so you guys are considered leaders in the way that you're using AI and AI tools. But then you're also running a company, so you have internal things that you have to think about. You're in Europe, but your product is used all over the world, and you have offices in different countries. So tell me a little bit about that. How does that make your job more complex? More interesting? What are your thoughts?

Vadym Honcharenko  01:09

Yeah, thank you. First of all, thank you, Debbie, for inviting me. As you said, my role is related to privacy and data protection at Grammarly. Grammarly is about AI, as we all know, and it's a great intersection between technology, privacy, and my role in it. My role mostly covers the questions, both legal and technical, of evaluating different privacy risks for new features at Grammarly, for critical changes in the way we process data in our systems, for onboarding critical vendors, etc. So my role is about both the legal and the technical aspects of finding the right way to make our users' privacy the best it can be. And yeah, that's a really good question. For sure, when a business provides its services all over the world, and we of course need to collect data to provide those services, things get more and more complicated, because privacy regulations all over the world are getting more and more complicated as well. But first of all, in the framework that I personally use, I would say that the GDPR raised the bar, and different regulations all over the world are trying to find the right balance between local standards and the GDPR's approach to user rights, privacy rights, etc. I think the fact that other countries and regulations are getting closer to the GDPR as the higher standard makes things easier to some extent, because then we don't have to think only about what the GDPR says; we can understand it in terms of what privacy regulations all over the world say. Of course, there are some differences, no doubt about it, for example, between US privacy regulations and requirements and EU-based requirements. But generally speaking, we start with something related to the EU, cover everything there, and then get back to the US and ask whether we need to address some specific concerns under US laws in different States like California, and all the other States that are going through this process and making their laws more comprehensive and more specific about user rights. So yeah, we cover different regulations all over the world one by one and try to make this process less complicated. But that's not always how it can be done, of course.

Debbie Reynolds  05:17

Yeah, well, one of the tricks that I use to figure out or navigate a lot of privacy laws is that a lot of the laws and regulations passed between 1995 and 2018 very much follow the European Data Protection Directive, and then a lot of the ones passed after GDPR came out borrow heavily from GDPR. So that's how I do it. I know the date when a certain law was passed, and I know what was happening around that time. And as you were saying, I'm glad to see that the GDPR is definitely the gold standard because it's the most comprehensive one in the world. We've seen, since 2018, a lot of countries, and even States in the US, borrow from the GDPR. I think that definitely helps. In the US, it's sort of annoying because we have different States trying to pass different laws; in some ways, I think they want to put their own little spin on it. So it's, like, different enough to be annoying without being completely different. I think it just makes it more complex.

Vadym Honcharenko  06:32

Yeah, that's a good point. The example that it raises in my mind is personal data breach notification requirements in the US; every State has its own rules, right? So you need to keep up with the updates on this. But then you also hear about the new guidance on data breach notification from the EDPB, first of all related to the requirements for non-EU controllers and whoever has representatives in the EU in some countries. In a case where a breach affects users in a lot of countries, you would mostly need to inform every country specifically, taking into account the special requirements of each country. So probably that also makes things complicated, but we'll see how it plays out in business practice.

Debbie Reynolds  07:37

What in the world is happening right now that concerns you in privacy? Like something that you see happening, you're like, oh, my goodness, I don't like the way this is going. What are your thoughts?

Vadym Honcharenko  07:51

Well, that's an interesting one. Yeah, first of all, the thing everybody's chatting about is ChatGPT and generative AI, and the privacy aspects of all the things that the Italian DPA covered in the requirements they published just a couple of days ago, as far as I remember. I think that will be the trend: how EU DPAs, and actually countries all over the world, will look at this tendency, because we all know that Canada has also started an investigation, and in the US there have been concerns from different organizations, including calls for the FTC to start an investigation and look at how it all works from a privacy perspective. I wouldn't say that I don't like the way it's going; I would rather say how excited I am that these things are happening, because it could make things clearer for everybody, right, including for the companies that are building their business around AI, where privacy is a specific aspect of these kinds of services. Specifically, it would be the things related to EU law. When I was looking through the requirements and concerns that the Italian DPA raised for ChatGPT, the most important aspect for me was the legal basis for collecting and processing EU users' data for development purposes. The same concern was raised earlier; you remember the Meta enforcement around advertising, where they relied on contractual necessity as the legal basis for using data for advertising purposes, which was not valid. That's why they had issues around this. And now OpenAI is also reconsidering the practices and communication they use in their privacy policy, terms, etc., regarding what the right legal basis is for using data, developing AI features, etc. I think this is an interesting trend that, as I mentioned, started with the Meta enforcement and has now also been raised in the Italian case for ChatGPT. So now the question is whether that should be consent or legitimate interest. Let's say it's legitimate interest; what additional guidance might EU DPAs provide regarding balancing users' privacy rights and companies' interests? Of course, we know the guidelines that we already have from the DPAs, but I think this is the stage where there is a need for more guidance around this. And I think these two cases, especially the ChatGPT one (and we know that Spain also wants to investigate ChatGPT), should give us more clarity, because, as we all know, most of the companies that use AI, that use data for developing their features, rely on legitimate interest in their privacy policies.

Debbie Reynolds  11:57

Yeah, you bring up an interesting point. I think one of the things about AI, especially AI that uses data from the Internet to power its tools, and the issue I see coming up around privacy, and this is how jurisdictions are grappling with it, is that a lot of the laws are written around the idea that you have data, you give it to someone else, and then you try to give people control over what happens with the data that you gave them. But if data is just open on the Internet, in certain jurisdictions there's no law against companies sucking that data off the Internet, right, because they assume that because it was public in some way, you gave them consent in some way. Two big things come up here: I think AI or generative AI tools raise more privacy risks because they pull in more data from humans, and these AI tools aren't typically transparent. But then the undercurrent of this is that a lot of laws haven't really addressed, in my view, the situation where I didn't give ChatGPT data about me, right, but there's data about me on the Internet. So how do we manage that? Especially when, you know, I tell a lot of my European friends, because they are in Europe and they're under GDPR, that things like data scraping are not legal there, but in the US they are. So I think these different jurisdictions looking into this issue is a good thing, because this is the space where data brokers live, and this is where they thrive, in that gray space that hasn't really been defined so well. What are your thoughts there?

Vadym Honcharenko  14:01

That's a really good point. Yeah, you know, the thing that we see now in the Italian DPA case with ChatGPT, and I keep getting back to this example, but it's super interesting, is that it practically addresses the EU law principle that even though data is public, it is still personal data; it doesn't matter. Data about this or that user that is public is still personal data, and there are no differences in the requirements around it. So the risk level, the risk tolerance, for somebody's public personal data is not lower; the requirements are still the same. But it's interesting that in the ChatGPT case, this requirement becomes super practical. We also saw that the Italian DPA, in their requirements to be implemented by OpenAI by April 30th or the middle of May, also addressed these kinds of concerns for users and non-users, so that they can jump in, look through, and find the settings to opt out of their data being used. This is pretty simple, but nobody does it at this stage, and companies with the specific practices you mentioned, scraping data publicly, didn't think this through. I think another good example that was raised is Clearview AI, the company that uses public information, specifically public pictures from different social media profiles of people all over the world, to create some kind of metadata pattern so that if you want to find somebody, you can just do it with their database. There were a lot of concerns from different EU-based DPAs about Clearview AI, but it seems this specific requirement, that it is still somebody's personal data no matter that it is public, was not fully settled in that case. To some extent, I would say, we are now in round two for this specific requirement, making everybody aware that scraping somebody's personal data is not automatically legal and must meet the same kind of requirements.

Debbie Reynolds  16:51

Yeah, I hadn't thought about it like that, that this is like the second round of trying to address that issue. So Clearview AI got booted out of a couple of countries, some in Europe, and I think Australia as well. They had a couple of cases against them in the US, where I think they were barred in some way from selling their tool to law enforcement for a period of time, or something like that. I don't think the issue was data scraping alone; I think the issue was data scraping, putting it in a database, and then using it for law enforcement. So it had a commercialization aspect to it that could cause harm, right? And then in the US, we have this whole thing about being innocent until proven guilty. But if you're scraping data and putting everybody in a database, you're almost creating a situation where you're guilty until proven innocent. So I think that has a different connotation in the US than it does in other countries. It's just interesting to see, and I would love to see more talk around this. I think we're also on a collision course with transparency in AI. Transparency has been, not the biggest, but one of the major issues around privacy as it relates to AI. So it's like, okay, you have this cool tool, it does this cool thing, but how do I know, as a consumer, what it's doing? And what it's doing with my data? I think we're not even sure what all ChatGPT or these generative models are doing, right? So what these regulators in different jurisdictions are trying to do is create more transparency, and like you say, Canada is studying what's happening with ChatGPT, and part of that is around transparency. How do we create regulations for something that we don't truly understand? What are your thoughts?

Vadym Honcharenko  19:01

It's an interesting concept. Transparency was raised, and again I'm getting back to the Meta example because, to my understanding, it also raised transparency, fairness, and all the other aspects related to these kinds of requirements. I would say that, as in the GDPR, transparency is a basic thing that should be covered in any regulation, and even outside of specific regulatory requirements it should simply be business practice, because it's the first thing that creates user trust around the data that a service collects, and more so for these kinds of services. But yeah, as for the transparency requirements, including for AI-based and Gen AI tools being created all over the world by different services, I think we're now at a stage, getting back to the Italian DPA requirements specifically, but also as a global tendency, where it is about making these things more explicit to the user while they use the service. What I mean here is that it is time to think about explainability as transparency. The answer should not be, hey, please read our privacy policy, everything is in there. Well, it is not always the case that it is there, right? And even if the user went and read it, that doesn't mean they would get a lot of transparency, even if it is described well. Not all people want to do this or would be able to understand it. So I think the better way to satisfy transparency is not just saying certain things in the privacy policy, which is required by different regulations, GDPR, etc., and which is a good thing. We also need to think about the context: when the user provides some kind of data, a just-in-time notice for this or that action, plus controls and settings that give the user full control of their data, would also be a good way to add transparency, not just putting things in the privacy policy or something like that, right?

Debbie Reynolds  22:01

Yeah. It sounds simple. I did a video a while back about incremental consent, which is, as people move through these tools and decide on different features that they want to use, instead of having an 80-page privacy policy, which maybe you still have, you have situations where they're being informed during their journey within the tool about what's happening, as they choose different paths for different things they want to use. I think that's a great thing, because then it won't be, oh, well, too bad, you didn't read this 80-page privacy policy before you used this tool. And I think it's very important. I work a lot with people who are developing tech in the Metaverse, like augmented reality and so on, and having people make choices during those journeys is going to become even more important. So let's talk about AI in general. You know, you have a very sophisticated tool; I like it a lot, and I like that you all are always forward-thinking in how you develop new features, which I'm always excited about. But I spoke at a conference yesterday about legal innovation, and I was trying to tell people that regardless of what happens with ChatGPT, generative AI and large language models, that type of capability is being, and will be, incorporated into tools that they may use every day. So people need to get accustomed to learning how to use prompts and to understanding what AI is doing, so that they have more comfort there. So give me your thoughts about that.

Vadym Honcharenko  23:56

Yeah, that's an interesting point. Setting aside the specific tendencies around ChatGPT and the Italian case, I agree that this pattern of having these capabilities in different services, where you make a prompt and receive, I don't know, an email generated for some recipient, and you feel this is a really good tool because you don't have to spend a lot of time on the task, is a really cool tendency. And this pattern, this technical possibility, will be implemented into different services, that's for sure, whether it's called ChatGPT or something else, whether the provider is OpenAI or other companies like Microsoft, Google, etc. This is the tendency, and we, as humans, feel that this cool feature should be provided to us. As for the privacy aspects and tendencies around this, I think the best tools will be those that provide information about, first of all, what data about the user needs to be processed and stored, and for how long, etc., so the user has that context. The second thing is the practices the company uses to make users' privacy the highest priority when it builds and provides these kinds of features. First of all, that means data minimization: when you think through the task, first understanding whether we need all of these data elements to provide this kind of service. This is the most important part of every tool, including Gen AI tools, when it comes to privacy. The second part would be the requirements for the data processing environment and the security aspects of this. So I think a sufficient level of transparency, controls in privacy settings dashboards, just-in-time notices, and things around data minimization and data security will be the most important things for these kinds of services.

Debbie Reynolds  27:06

We're definitely in a new frontier here. And I think it's going to raise the profile of privacy even higher than it has been, because now we have tools that are using data in ways we didn't expect. It's not a straight line in terms of how data goes in; it comes from all different areas. So you have situations where maybe multiple companies are gathering data about people. You know, we had a guy, I don't know if you saw this in the news, this was in the US, where a guy said that someone asked ChatGPT about him, and it said he had committed some crime that he hadn't done, or something like that. And so he's suing them to figure out why it says that. I'd love your thoughts. People have been chatting about this on LinkedIn, about hallucinations, and I don't think the word hallucination really captures what happens in these tools. Because when you think of a hallucination, you think of something unexpected that just happens, whereas here you may ask the tool something and it gives you a very confident answer that sounds plausible in some way but may not exactly be true. What are your thoughts?

Vadym Honcharenko  28:27

That's a good one. I think the privacy aspects we discussed that are important for these kinds of services are one side, but of course bias, and things around accuracy, which are also related to privacy, are crucial. The ChatGPT case raised the privacy concerns, and users now understand what they might have missed in this specific case. But from the user's perspective, beyond not receiving a good level of transparency about what happens with their data, the next thing they will definitely notice is when the tool provides some kind of, as you mentioned, hallucination, or even incorrect data about a person. For example, we know the case of the tool saying that a specific person was deceased when he was alive, and other examples to your point. That is something that is really harmful, not just from a privacy perspective but from other perspectives as well: it discriminates against the user; it's harmful. These kinds of things should also be on the list for these companies, because otherwise nobody would use them, just like what unfortunately happened at one stage with OpenAI, when users saw other users' information in the history of their prompts. So yeah, privacy and security are important, but things around bias, things that might discriminate in general, things that might be perceived as harmful to the user, these are things we need to watch. Because when a company like OpenAI uses public data, it first needs to think about the filters, let's call them that, that it applies before providing this information in response to this or that prompt. It's not enough to say that this is all on the Internet, all these tendencies, all these biases, all these injustices that we scraped along with the data. This is not the Internet's fault, and probably not entirely the humans' fault, but it is the company's responsibility to implement these kinds of controls on the content, a kind of sensitivity control, so that the outputs are not harmful to users. And of course, this also applies to the companies that buy these services as a product and use them in their own services. Even though another company buys, let's say, ChatGPT as a service, it is also responsible for these kinds of sensitivity filters. But the company that collected the data in the first place, OpenAI, has the first responsibility to provide its services in this responsible way as well. So yeah.

Debbie Reynolds  32:32

Yeah, this is just a whole new world; I think we're all going to be very busy talking about AI and how it's going to change a lot of what we do and how it's going to really elevate the conversation around privacy. And as you said, I had a friend where ChatGPT said he passed away two years ago, and he's trying to figure out, like, where did you get that from? That's nowhere on the Internet, so how did you come up with that answer? And I think part of this, in my view, is that companies create these tools to give you an answer, and they want to give you an answer. So if you entered something into a chatbot and it didn't give you an answer, you would probably be like, oh, this thing sucks, I don't want to use it, right? So maybe what they're trying to do is generate answers even when they don't have them, and make things up. Even this guy, when he prompted it about this person, like, show me the obituary, it came up with a fake link, made up a link on the fly that was supposed to go to a Guardian newspaper obituary, which wasn't real. So I mean, obviously, humans created these tools, so you can't blame the AI that it does this, you know?

So if it were the world according to you, Vadym, and we did everything that you said, what would be your wish for privacy, technology, or AI anywhere in the world, whether that be regulation, human behavior, or technology? What are your thoughts?

Vadym Honcharenko  34:13

At a high level, my wish is that companies all over the world, and first of all business leaders at the biggest companies like Google, Facebook, and Amazon, all the big companies that collect the most information about users all over the world, because we all use their tools, would use that data in a more responsible way from a privacy perspective. Because, you know, things related to profiling, to advertising, or to using the data for purposes that are unusual for the user, or even purposes that I would not be aware of, are a global trouble, I would say, and the Meta case was, unfortunately, a good example on this front regarding the use of users' personal data for advertising purposes. When I looked through that case, I kept asking myself, how would I even know about that? Even if I went to the privacy policy or Terms of Service, how would I know that all the data I generate while using Facebook would be used for advertising aimed at me? And that is not even the whole story; as we know, from a technical perspective, you don't even need to be a Facebook user for them to use your data, through things like Like buttons, etc. There is a lot of research around this. So I think that would be a good point, so that everybody has more trust in the big companies, and then that becomes a good business standard for smaller companies too: this responsible, clearly communicated use of personal data. It should not be just about satisfying regulatory requirements, which, unfortunately, is not even always the case right now, but about having the understanding that this is a business standard, that I need to do this because it is good practice, and then satisfying the different regulatory requirements on top of that. The regulatory requirements should not be the starting point; the attitude toward the user should come first. Their privacy should be the first point.

Debbie Reynolds  37:36

So I think what we're moving towards now is a situation where companies can't just think about their own business interests, especially as it relates to them touching the data of humans. And I have to tell people, the products and services that you create are for humans, so you should be concerned about their trust and also about protecting the data that they entrust you with. So that's a great point. Well, thank you so much for being on the show. I really appreciate it, all the way from Poland, and I know that other people will really love this. We need to get together, probably in a year or so, to see how things have played out with generative AI; I'm sure it'll have a lot more adoption as we see it seep into a lot more tools, for sure. I think it's totally going to change and elevate the privacy conversation.

Vadym Honcharenko  38:40

Thank you, Debbie. Yeah, and I agree that one year is a good timeframe because, you know, these things are getting more and more intense. So yeah, we'll see. We'll see new requirements; we'll see new business standards. And thank you, thank you again for this great work on privacy topics.

Debbie Reynolds  38:59

Yeah yeah, we'll talk soon. Thank you so much.
