Welcome to another edition of the SafeToNet Foundation’s safeguarding podcast with Neil Fairbrother, exploring the law, culture and technology of safeguarding children online.
In this Safeguarding Podcast with Hany Farid, Professor at the University of California, Berkeley: PhotoDNA, what it is and how it works, what PhotoDNA doesn’t do, what are Hashes and do they work in an End-to-End Encrypted world, is Apple’s NeuralHash child safety proposal the incipient slippery slope as many claim, Apple’s Secret Sharing Threshold and why that’s a problem, and “WhatsApp’s hypocrisy”.
There’s a lightly edited transcript below for those that can’t use podcasts, or for those that simply prefer to read.
One of the key tools used to find illegal images of children online is called PhotoDNA. One of the key technologies that social media companies seem intent on implementing everywhere is end-to-end encryption. PhotoDNA relies on being able to read “digital fingerprints”, codes or “hashes” as they are known, in order to identify an image as being CSAM or Child Sexual Abuse Material. So is there a conflict here between the privacy offered by end-to-end encryption and the desire to rid social media and the internet in general of CSAM? And then there’s the whole recent Apple CSAM announcement.
To help guide us through this complex topic, I’m joined by Hany Farid, Associate Dean and Head of School for the School of Information and Professor at the University of California Berkeley, who worked with Microsoft to create a technology called PhotoDNA. Welcome to the podcast, Hany.
It’s good to be here, Neil, thank you.
Thank you, Hany. Could you provide our audience from around the world please with a brief resumé so that our audience understands and has an appreciation of your background and expertise?
Sure, I’m happy to. So, as Neil said, I’m a Professor here at the University of California, Berkeley. I’m pretty new to the campus here. I was at Dartmouth College for 20 years before coming here. I am by training an applied mathematician and computer scientist. I specialize in digital forensics, primarily concerned with issues of how you determine if digital media has been manipulated or altered and along the way, somewhere over the last 10 years or so, I got very interested and involved in a number of issues of internet scale content moderation.
So how do you deal with everything from child safety online when you’re talking about billions of users and billions of uploads and millions of hours of video uploaded every day. How do you deal with domestic and international terrorism online? How do you deal with the distribution of dangerous drugs and weapons? How do you deal with mis- and dis-information that is now leading to disruptions of societies and democracies? And in general, I think about these really interesting and complex and what I consider to be critical problems at the intersection of technology, society and law.
And the last thing I’ll say on this is my view over the last 20 years is we’ve all been running headfirst into this technology world without really thinking very carefully about the harms and how to mitigate the harms. And I think our adolescence is over. We’ve gotten through 20 years, we’ve all woken up with a little bit of a hangover and we’ve now got to start taking seriously how we can harness the power of technology while mitigating the very real harms that we are seeing to children, individuals, to societies and to democracies.
Okay, thanks for that Hany. Now I mentioned PhotoDNA and you’re widely credited as being the inventor of PhotoDNA and the three letters in there, DNA, may resonate with lots of folk because people are used to hearing about DNA evidence extracted from the scenes of crime. And it seems to me that there are some parallels between organic DNA and digital DNA as used by your technology and that all centres around what are known as “hashes”. Could you give us a brief overview for the man in the street of what PhotoDNA is, how it works and what on earth is meant by a hash?
Sure. Let’s first set the record straight. While I did work on that program, it was a collaboration with Microsoft and a number of very talented folks from the legal side and from the engineering side and I was part of a larger group that developed this technology back in 2008. Let me give you the backstory here because the backstory is really interesting, how we got there, and then I’ll talk a little bit about the technology.
So you have to rewind to early 2000 when the internet was nothing compared to what it is today. It was a very, very different landscape just at the turn of the millennium. And already at the very early days of the internet, we saw a troubling rise in the distribution of child sexual abuse material, what some people call child pornography, but what we call CSAM in the industry to be more descriptive and more honest in fact about what the content is.
And we saw this troubling rise of digital technology being weaponized against children. And for years, the early tech companies, the then giants of the tech industry, were being asked to do more, to make their platforms safer, the same way we’re asking the current giants of the technology sector to do the same thing.
And their response was, this is just too hard of a problem. We simply don’t have the computing ability to deal with determining if there’s a child in the image, determining if the content is sexually explicit, determining if the child is underage, and this, again with early 2000 technology, was probably true. And by the way, that was at a time when the internet was much, much smaller than it is now and even then the technology companies said, this is too hard of a problem. We can’t solve it.
And I was invited to a meeting in Washington, DC by the technology companies, by the National Center for Missing and Exploited Children and a few other organizations to talk to them because my expertise at the time was in digital forensics and image analysis. And what struck me there was something… I think they were right by the way that they didn’t have the technology to solve the problem I just enumerated… but where they were wrong was they were trying to solve the wrong problem.
Okay. Now, just before we go on to what the right problem should have been perhaps, some of the criteria that I think they were talking about back then were really tough. I think there was a requirement to analyze an image in under two milliseconds, you couldn’t misclassify an image as CSAM at a rate of more than one in 50 billion, you had to correctly classify an image at a rate of no less than 99% and you couldn’t extract or share any identifiable image content. Those terms were set up to fail.
Yeah, they were set up to fail and any one of them would have been impossible with the technology then, and even today. But here’s the thing again, it was the wrong problem. So here’s the only insight I had honestly. At the end of a long day of meetings, Ernie Allen, who was the then head of NCMEC, the National Centre for Missing and Exploited Children, told us the following two interesting facts.
One is that the National Centre at the time was home to millions, it’s now tens of millions, of previously manually identified images and video of child abuse. That is, examiners at the Centre had verified that this is a child, typically by the way prepubescent, so under the age of 12, no issue about age verification, and the content was sexually explicit, millions and millions of images. And the other interesting fact that he shared with us is that the same pieces of content keep getting shared day in and day out, month in and month out, year in and year out. In fact, decade in and decade out the National Centre was seeing the same images circulated for years and years and years.
And I got to the end of that meeting and I said, well, guys, I think you’re right with the criteria Neil that you enumerated, you’re never going to solve this problem. Not today, not tomorrow. But here’s maybe the problem you want to think about, which is what do we want to do? We want to stop the distribution of all CSAM. It’s awful. It’s awful. It’s illegal. It’s harmful to the kids. It creates a marketplace which incentivizes future abuse. It’s terrible. But you can’t solve that problem today. So let’s solve a more modest problem, which is to stop the redistribution of previously identified content.
Once somebody has put eyes on the content saying this is an eight-year-old, this is sexually explicit, I should be able to stop the redistribution of that particular image. And maybe I don’t catch everything, but let’s start where we can actually solve the problem. This is by the way, what academics do well, there’s all kinds of problems we want to solve, but we’re also realistic about what we can solve and we go after those problems and I’ll solve tomorrow’s problems tomorrow.
So here’s how we thought about the problem. So now I’ve changed the problem definition from “stop all CSAM” to “stop a subset of CSAM”. Okay? So this problem surely should be easier intuitively because now I’m not asking you to identify a child. I’m not asking to identify a sexual act. I’m not asking you to identify age. I’m just simply saying, show me if this image exists in your upload.
And so this is where robust hashing or perceptual hashing or fuzzy hashing as it’s known, came in. And here’s the core idea. Here’s a nugget of it. And this is where your DNA question comes in. What we’re going to do is reach into a piece of content, whether that’s an image or a video, potentially even an audio and extract a distinct digital signature, a hash, and that hash has to have many important properties. One of them is that it’s distinct. So if I have two distinct images, an image of you and an image of me, they should have different signatures.
The signature should be stable over the lifetime of the content. So if the image gets recompressed, if it gets resized, if the colour is changed, if somebody adds a logo, if somebody modifies it just a little bit, the signature should be about the same. And I should be able to extract that signature very, very fast: about one signature in about two milliseconds, two one-thousandths of a second. And that’s a tall order.
And let me just say, the important part is to tell you where the tension is here. The tension is between the robustness that I just described, where you should be able to identify the same piece of content even if it’s been modified a little bit, and the distinctive property of the signature. So it’s very, very easy to create what we call a “hard hash” that is highly, highly distinct, provably, mathematically distinct, but it’s not robust to little modifications.
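The tension between a “hard hash” and a robust hash can be illustrated with a small sketch. It assumes a toy “average hash” (one bit per pixel, set when the pixel is brighter than the image mean) as a stand-in for a perceptual hash; this is not the PhotoDNA algorithm, whose internals are not described in this conversation. The tiny 4x4 greyscale image and the function names are hypothetical.

```python
import hashlib


def hard_hash(pixels):
    """Cryptographic ("hard") hash of the raw pixel bytes."""
    flat = bytes(v for row in pixels for v in row)
    return hashlib.sha256(flat).hexdigest()


def average_hash(pixels):
    """Toy perceptual hash (an assumption, not PhotoDNA):
    one bit per pixel, 1 when the pixel is brighter than the mean."""
    flat = [v for row in pixels for v in row]
    mean = sum(flat) / len(flat)
    return [1 if v > mean else 0 for v in flat]


def hamming(a, b):
    """Number of positions at which two bit lists differ."""
    return sum(x != y for x, y in zip(a, b))


# Hypothetical 4x4 greyscale image: dark left half, bright right half.
original = [
    [10, 20, 200, 210],
    [15, 25, 205, 215],
    [12, 22, 202, 212],
    [18, 28, 208, 218],
]

# A slightly brightened copy, standing in for a small modification.
modified = [[min(255, v + 3) for v in row] for row in original]

# The hard hash changes completely; the toy perceptual hash does not.
print(hard_hash(original) == hard_hash(modified))               # False
print(hamming(average_hash(original), average_hash(modified)))  # 0
```

A hash this simple is robust to a brightness change but would collide far too often on real images, which is the other half of the trade-off being described.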
Yeah, exactly. And this is one of the evasive tactics that people who share and store this stuff use, isn’t it? They will edit, crop, rotate, swirl images…
Exactly. And you see this in the copyright infringement space. YouTube for years has had Content ID that allows you to identify copyright infringement. And you’ll see that people do these really extreme crops, or they flip something upside down to try to modify it. And so what we needed to do is to develop a technology, this is what we call PhotoDNA, this robust perceptual hashing, that found a compromise between distinctiveness, so we are not distinct to one in every possible image in the universe, but we are distinct on the order of one in approximately 50 billion, and robustness, so we are robust to some modifications, but not all modifications. And so we found a compromise between making sure that we don’t misidentify two images, but that we catch as much as we can.
And here’s what’s interesting by the way and I didn’t quite appreciate this in the early days. I started this off by saying, look, we’re going to solve a subset of the problem and that’s true, but it turned out it was even better than that because of what we know about people who traffic these materials, they don’t traffic in one or two images. They traffic in hundreds and thousands and tens of thousands of images.
And if you have one image, one that I’ve seen before, well, guess what? I get it all. Because if you upload a thousand images to a service that uses PhotoDNA, and one of them is in my database of previously identified material, well then I catch that one image. I get a warrant to come search the rest of your files. I get it all. And then that can be reviewed. And if there’s new material that can be added to the database. And here’s what I really like about this technology, it’s extremely specific and it’s efficient, it’s highly accurate. And it piggybacks off of existing cybersecurity technologies.
This basic technology is how we find viruses. It’s how we find malware. It’s how we find other harmful content that is attached to an email. We’ve all received an email where an attachment has been ripped out because it has been deemed to be a threat to us. How is that done? Every single email you send, whether it’s through Google or Microsoft or pick your favorite mail server, gets scanned. It gets scanned for spam, for malware, viruses and all forms of phishing, ransomware attacks because we have deemed that without doing that, the threat to individuals’ safety is too high. And this is exactly what PhotoDNA does. It does exactly the same thing except now we’re protecting children.
Okay. Now a couple of follow-on questions then Hany if I may. There are some people who are vociferous against the online services using products like PhotoDNA to search for these hashes. But they are not vociferous about the very same services using similar technologies to search for viruses and other damaging content. So there seems to be a little bit of a contradiction there.
I would call it hypocrisy, but go ahead.
Well, you did in an op-ed, which we’ll come onto shortly, I think. But we’ve talked about PhotoDNA and I think we’ve got a reasonable understanding of that, a visual image of all these photos being analyzed. So just to clarify, is PhotoDNA itself doing the analysis of the image? Is it an image classifier, or is it relying on those images being classified by, in this case, NCMEC, who then put the PhotoDNA stamp on the image?
Yeah. Good. So there’s two parts to the technology. There’s the core underlying algorithm, which takes as input an image and returns a signature. And the signature is just a bunch of numbers. You can just think about it as just a bunch of numbers catenated together that describe in a nutshell, in this very compact way, what the image is. Okay. And that’s the core technology.
Now that technology is useless unless you have, for example, an NCMEC or a Canadian Centre for Child Protection, that will tell you what the signatures are of previously identified harmful content. So there are these two parts to it. NCMEC provides the database. That’s what we search for. And in the original database that we released back in 2009, the kids in that database were under the age of 12. They were all prepubescent. Every single image had an explicit sexual act and all the images had been triple vetted by human moderators.
So you have to be of course very careful what goes into that database to make sure in fact that it has been identified as CSAM. And then in a very targeted way, you just look at every single image that gets uploaded, you extract a signature and you compare it to the database. And here’s the nice thing. If it’s not in the database, you don’t glean any information.
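The two parts described here, a signature and a database of known signatures, can be sketched as a lookup. This is a minimal illustration under stated assumptions: the four-number signatures, the squared-distance comparison, the threshold value and the labels are all hypothetical, since the real signature format and matching rule are not given in this conversation.

```python
def distance(sig_a, sig_b):
    """Squared Euclidean distance between two signatures (assumed metric)."""
    return sum((a - b) ** 2 for a, b in zip(sig_a, sig_b))


def find_match(signature, database, threshold):
    """Return the label of the closest known signature within `threshold`,
    or None when nothing in the database is close enough."""
    best_label, best_dist = None, None
    for label, known in database.items():
        d = distance(signature, known)
        if d <= threshold and (best_dist is None or d < best_dist):
            best_label, best_dist = label, d
    return best_label


# Hypothetical database of previously vetted signatures.
database = {
    "known-001": (12, 200, 33, 90),
    "known-002": (250, 3, 77, 140),
}

near_copy = (13, 198, 33, 91)      # slightly altered version of known-001
unrelated = (100, 100, 100, 100)   # an image that is not in the database

print(find_match(near_copy, database, threshold=25))  # known-001
print(find_match(unrelated, database, threshold=25))  # None
```

The second result reflects the point made above: for an upload that is not in the database, the lookup returns nothing and reveals nothing further about the image.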
And by the way, for everybody saying, “Wow, the platforms shouldn’t look at my images”. I don’t know how to tell you this, but they already are. Every single image that gets uploaded gets opened. You check it for a virus, because viruses can be attached to JPEG images. The metadata gets stripped out, and saved by the way, and it re-saves the image. Every single piece of content that you upload is being touched by the services to deal with cybersecurity and we are simply piggybacking off of that because we have said that the global distribution of tens to hundreds of millions of children with an average age of eight being sexually abused is not acceptable in our society.
Okay. Now I think that the original NCMEC database that PhotoDNA was trained on, if that’s the right term, was approximately 80,000 images, which to the man in the street would seem like a very large number. But there are a couple of things here. One is the available pool of images must be growing. And the rate of image identifying must be limited by the number of people within NCMEC, the Canadian Centre, the IWF and similar organizations that do the classifying. So can they ever keep up?
The answer is yes and no. So you’re right about the original database, although I wouldn’t use the word training, by the way, I think that’s a dangerous word. There’s no training of the algorithm, right? All you’re doing is saying “Identify this image”. This isn’t AI, we’re not inferring anything from the images. But the original database we released when we were testing the software had 12 signatures in it. And it has grown to 80,000 and continues to grow.
No, we can’t keep up and there’s something deeply disturbing and sad about that. That literally the content is being created faster than we can vet it. Having said that, PhotoDNA has been in use since 2009. It is in use in most major platforms. And last year alone, one year, NCMEC’s cyber tip line received over 20 million reports of CSAM from all of the platforms, and 99%, in the high 90s percent, of those were from PhotoDNA. It has been highly effective.
And here’s the thing I think everybody needs to understand. This was never a law enforcement issue. This was not about putting people in jail. Don’t get me wrong. I don’t have a problem putting people in jail who abuse children. I think they should go to jail, but that’s not what this technology was designed for. This technology is victim centric, because if you talk to the young kids who are victims of child sexual abuse, you know what they’ll tell you? The single worst day of their lives was when they were physically abused. The second worst day of their lives, which continues every single day, is knowing that that image and that video of their rape is being distributed online to the tune of millions of people around the world. And so this was always about preventing the revictimization of these children. And to that end, it has been effective. It hasn’t solved all of our problems, but it has been a highly effective technology.
Okay. Now with the inception of Apple’s smartphone and other smartphones which now these days have got 4k video on them. I’m sure that these beautiful quality videos that are captured by these devices are very attractive to the collectors, sharers and purveyors of CSAM. How does PhotoDNA address the issue of video, if it does? If it doesn’t, can the issue of video be addressed because there’s a lot of images within a video, particularly when it’s running in Slow Mo mode at 120 frames a second.
Sure. So when we developed PhotoDNA back in 2008, you know, we knew that video was up and coming, but there wasn’t enough bandwidth, and the video recorders weren’t that good, and we knew we were going to have to deal with it, but images dominated the landscape according to the child protective services.
Today I believe it’s roughly half, maybe slightly more than half, of reports to NCMEC that are in fact video. And at its core PhotoDNA is not designed to deal with video, but it can be modified to deal with video. So here’s the simplest way you could do it. What’s a standard video? Let’s forget about slow-mo for a minute. Standard video is 24 to 30 frames, images, for every second of video. That’s a lot of images by the way. Even for a three minute video, you’re talking about thousands and thousands of still images.
So what you can do, here’s the simplest way you can modify PhotoDNA is in the database you extract key frames of a video that are particularly representative of the violence and the crime associated with the content. And then when the video is uploaded, you extract every other frame, every 10th frame, every 30th frame, you sample the video in time, you extract individual images and you compare them to the database. And because the PhotoDNA algorithm is so efficient, and because the database lookup is so efficient, even for relatively long videos, you could do this.
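The sampling idea described above can be sketched as follows. This is an illustration only: frames are modelled as short lists of pixel values, `toy_signature` is a hypothetical stand-in for a real perceptual hash, and the lookup uses exact set membership for brevity where a real system would use a near-match comparison.

```python
def sample_frames(frames, step):
    """Yield (index, frame) for every `step`-th frame of the video."""
    for i in range(0, len(frames), step):
        yield i, frames[i]


def toy_signature(frame):
    """Hypothetical stand-in for a perceptual hash:
    one bit per pixel, 1 when the pixel is brighter than the frame mean."""
    mean = sum(frame) / len(frame)
    return tuple(1 if v > mean else 0 for v in frame)


def scan_video(frames, known_signatures, step=10):
    """Return the indices of sampled frames whose signature is in the database."""
    return [
        i for i, frame in sample_frames(frames, step)
        if toy_signature(frame) in known_signatures
    ]


# A three minute video at 30 frames per second is thousands of stills.
print(3 * 60 * 30)  # 5400

# Hypothetical 60-frame video: frames 25 to 49 show a known key frame.
benign = [10, 10, 200, 200]
flagged = [200, 10, 200, 10]
video = [benign] * 25 + [flagged] * 25 + [benign] * 10

known_signatures = {toy_signature(flagged)}
print(scan_video(video, known_signatures, step=10))  # [30, 40]
```

A larger `step` means fewer hashes to compute per video, at the cost of possibly stepping over a short matching segment.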
Now, there are more sophisticated ways to do full blown video analysis, and much to my frustration, even though when we released PhotoDNA we said to the industry, you must, must start working today on a video version of this, here we are 13 years later, and there’s still no video standard, the way there is a standard for images. The technology sector continues to drag their feet despite the fact that there are solutions out there, we have proposed them. We have given them to the companies. We have said, let’s just set a standard and the companies continue to drag their feet much to my frustration. And despite knowing that more than half of the content being distributed is in fact video, this is a solvable problem, we’re now just choosing not to solve it.
What is their motivation Hany do you think for not implementing what seems to be a perfectly reasonable suggestion?
Look, you need to ask them. I’m not going to answer that question, but here’s what I will point out is that in 2003, the then giants of the tech industry were told about this problem and they dragged their feet for five years. Do not, do not, let them rewrite history and tell you how they were all over this in the early days, they were not.
They came at this reluctantly and dragging their feet. And here’s why. I don’t think they’re bad people. I don’t think they like child predators. I don’t even think they want this stuff on their network, but here’s the problem they have is once you go after CSAM and you show that you can do it, it opens the gate to content moderation that they have to deal with now, terrorism and extremism. They have to deal with illegal drugs. They have to deal with illegal weapons. They have to deal with mis-information. They don’t want to do content moderation because they’re in the business of monetizing user generated content.
And as soon as you crack that door open, it complicates life for them. Despite the fact, by the way, that before PhotoDNA, YouTube had released Content ID to find copyright-infringing material. Why? Because they were being threatened with billions of dollars in lawsuits. And by the way, would you please tell me what it tells us about a society that protects the financial interest of the movie and the music industry before it protects children?
And by the way, when Facebook tells you that they don’t want to be the arbiter of truth, they don’t want to tell people what they can and cannot do, we protect speech, would you please ask them why they ban legal, protected, adult pornography? YouTube, they have always been very good at taking down content that they don’t want if it’s bad for business, but when it comes to protecting children and dealing with real world harms, they somehow take this very idealistic approach and I’m not buying the story.
Well, the tide might possibly be about to turn on that particular point. I’m sure you’re well aware of the John Doe versus Twitter case. Previously, Section 230 of the CDA has prevented such legal actions being taken against these companies. We do have a podcast on this, by the way, if you care to download it for more details. But the brief summary is that there is now a case being permitted by the courts which previously Section 230 would have chucked out and the young lad, the 16 year old in question, his legal team now have the go ahead to sue Twitter. So do you think that might be a turning point where Section 230 loses its legal effectiveness in this regard?
Just for your listeners, let me very briefly explain 230. So Section 230 of the Communications Decency Act is a US law that was written in the mid-nineties that said the tech companies cannot be held liable for either over-moderating or under-moderating. That’s the rule in a nutshell, and the companies have used this rule, which by the way was called, literally it was called, the Good Samaritan law.
It was written to encourage content moderation, and you wouldn’t be penalized if you made mistakes. And the tech companies have done some judo on this law that allows them to hide behind that. Despite the fact that their platforms have been used to traffic young women, traffic children, traffic illegal drugs and illegal weapons and all forms of harm, and in the most recent ruling in Texas, I believe, the judge said this is not what 230 was designed for. This is not the interpretation, we’re not protecting speech. We are going after a faulty product that caused harm to a young child.
I do think, to your question, Neil, there is a tipping point. I think my view of the landscape is post Cambridge Analytica everybody woke up and started realizing some really bad things are happening on the internet. I think people were too slow in coming to that realization, but I think there is an absolute realization that the tech companies are running absolutely afoul of every sense of human decency, and they need to be reined in, and that needs to be done from a regulatory perspective, because we are now talking about global trillion dollar companies that are virtual monopolies, and there is no pressure that can be brought upon them.
Look, the FTC hit Facebook with a $5 billion fine, and they shrugged it off and their stock price rose more than 5 billion the next day. When you can’t fine a company in the billions of dollars and create change, we’ve got to start looking at more effective solutions.
Now talking about multi-billion, if not trillion-dollar companies, we mentioned Apple in the introduction, and I don’t think we can have a discussion about PhotoDNA and CSAM now without mentioning Apple. You wrote an op-ed, which appeared in Newsweek I think recently, and you said that “…child rights advocates cheered and privacy rights advocates jeered, however both are putting too much stock in Apple’s announcement, which is neither cause for celebration nor denunciation”. So can you, in a very brief, short nutshell, explain what Apple’s proposal is and why it doesn’t appear to be pleasing anyone at all?
Yeah, good. So Apple announced after what I would consider years and years of dragging their feet, that they will scan images uploaded into iCloud for CSAM the same way by the way, the vast majority of platforms do including Facebook. And what they also said is that, you know, because this has been a very privacy focused company, is that the actual algorithm that will do the equivalent of PhotoDNA, the hashing, will run on your device. And the reason they did that is that it’s more privacy preserving this way, that you give less information to Apple for that.
But the idea is, anytime you push a photo up to iCloud on your device, it will do the hashing. And then in the cloud, it will make the determination if it’s CSAM or not. And the privacy people went absolutely crazy.
And they went crazy in a very interesting way, because what they said, what the headlines were, is that Apple is scanning your photos, which first of all is not true. Scanning is a very dangerous word here. What they are doing is what Apple has been doing in email for years, by the way, which is looking at things that are being pushed into their cloud service, whether it’s an email or iCloud or whatever, for very specific known bad content. It’s the same technology that we have been using for over a decade. It’s a variant of PhotoDNA, a slightly newer version of it developed at Apple.
And here’s why I think the privacy people got it wrong. And let me explain why I don’t think there’s too much room for celebration. So for example, the developer of WhatsApp came out very publicly condemning this in very strong words and saying WhatsApp will never, ever run this technology. And here’s what he either failed to remember or conveniently didn’t tell you, is that his own app, WhatsApp, does something very similar.
So when you send a message on WhatsApp, fully end-to-end encrypted, they sell privacy as their main selling point, [yet] any link, any URL that you attach in a text message is scanned on your device by WhatsApp, and what does it scan for? For spam and malware.
And so here’s a guy who says, well, look, I want to make sure that our app is safe for users and so we’re going to scan on your device for harmful content. And then he comes out and says, well, if you do the same thing, but for child sexual abuse, where the average age of a child being raped is eight years old, well, now I take issue with that. And I think that is the height of hypocrisy. I don’t buy the privacy argument.
Mr. Cathcart, I think his name is, apologies if I mispronounced it, in the same press release I think he said that WhatsApp themselves had reported thousands of CSAM images to NCMEC.
Right. But they are not running PhotoDNA. And so they are doing that using other types of technologies. And by the way, thousands is a very small number because Facebook, for example, reports tens of millions. So he is extremely under-reporting this.
But here’s the issue I take, is the same people who are jumping up and down on the heads of Apple for deploying a technology seem to celebrate and in fact are happy to use and deploy exactly the same technology when it comes to other cybersecurity threats like spam and malware and viruses. And I don’t think you can have it both ways.
And here’s the other argument that you will hear and we’ve been hearing this for over a decade is if you do this, you will do that. If you can scan my photos for CSAM, well, you’re eventually going to stop the distribution of political dissent.
And I, again, reject this because again, the same technique, you can say exactly the same thing about spam filters, about malware filters, about virus filters. Once we have the ability to search for URLs to determine if they’re spam or malware, well, sure. I can use that to stop you from sending a link to the New York Times or to the BBC or to NPR. And so let’s stop with the slippery slope arguments. PhotoDNA has been used for over 10 years, and it has not led to the type of doomsday predictions that people predicted 10 years ago. And if you keep making predictions and you keep being wrong, at some point, you need to sit down and shut up.
So with Apple’s technology, as you rightly say, they have a reputation for privacy, for focusing on privacy and for defending privacy. They have fought off requests from certainly the US government, as far as I’m aware, and other administrations around the world to provide some kind of backdoor entry into even known captured terrorists’ iPhones.
When it comes to their recent CSAM proposition, they say about one of the key pieces of the technology, which is called NeuralHash, that it has a built-in safety margin. That is to say a safety margin against what I think is known as a false positive. And what they say is that “…by building in an additional safety margin Apple would assume that every iCloud photo library is larger than the actual largest one, and that Apple expect to choose an initial match threshold of 30 images”.
This seems to imply that Apple’s NeuralHash technology that’s looking for the hashes of CSAM material won’t be triggered unless you have 30 or more matches that you’ve tried to upload to iCloud. Now Apple, I think, justify this, as I say, as a safety margin, but here’s the thing. One image is the digital image of an offline crime scene and is illegal, and if these images have already been hashed, and it’s the hashes that they’re looking for, why is Apple saying that you’ve got to have at least 30 files to give less than a one in a trillion error rate? It makes no sense.
Yeah. I found this quite offensive. So let’s talk about what the concern is. The concern is no technology is perfect. Legitimate emails sometimes get caught by spam filters, attachments sometimes get ripped out. Nothing’s perfect. And so you need to put safeguards in place to make sure we’re not making mistakes. And the way you do that is put humans in the loop. So with PhotoDNA every single report is eventually seen by a human. People are saying, well, what happens when somebody goes to jail for this technology? That is just the most patently absurd argument, because the number of human beings that will have to look at this content before, in fact, anybody is contacted are the safeguards we have in place.
Now here’s my guess, and I have no internal knowledge of how Apple did this: they built yet another hashing algorithm, NeuralHash. It’s a variant, I suspect, of PhotoDNA, or at least in principle. And it’s, you know, it’s not perfect. It has a one in pick-a-number between 1 billion and a hundred billion false alarm rate. And if you say, well, if I get one, you know, one in a billion times, that might be a mistake, but if I get two and those two images are different, well, maybe that’s now, you know, one in a billion times a billion, right? So that’s a really small number. And what they did is they set this really high threshold of 30 images, which I think is preposterously high. My guess is they did that to appease some of the privacy folks who are jumping up and down saying you are invading privacy. And this was one of the safeguards.
They didn’t have to do that, because what they can do is simply have a human review the image every time there’s a hit, or maybe make it two images. And if in fact that image is CSAM then you make the report to NCMEC, as you are obligated by law. But that does mean that a human being has to look at an image and if their mistakes are on the order of one in a thousand, that means there’s a lot of humans putting eyes on your photos and some people will find that invasive. If the error rate is one in a thousand, you shouldn’t be using the technology. So I would argue that if the error rate is one in a billion or 10 billion, and occasionally somebody at Apple, or NCMEC, has to look at an image, that is a fair price to pay to protect children online.
So I think they went overboard in their caution to say one in a trillion, 30 images, I think they could have done better than that. And this is part of the reason why I don’t think we should celebrate Apple. First of all, we could have deployed this technology a decade ago. Second of all, it only deals with images and not video, so it’s only half the problem. And not for nothing, you can just circumvent it by turning off iCloud syncing. So it’s not like this thing is going to… if you don’t like Apple running this, well then just turn off iCloud syncing.
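As an editorial aside, the false-alarm arithmetic in this answer can be sketched in a few lines of Python. The rates used are the illustrative figures from the conversation (one in a thousand, one in a billion), not numbers published by Apple, and the function names are hypothetical. The sketch also assumes that false alarms on different images are independent, which is a simplification.

```python
from fractions import Fraction


def chance_all_matches_false(per_image_rate: Fraction, matches: int) -> Fraction:
    """Chance that `matches` different images are ALL false alarms.

    Assumes each false alarm is independent of the others (a simplification).
    """
    return per_image_rate ** matches


def expected_false_alarms(per_image_rate: Fraction, photos_scanned: int) -> Fraction:
    """Average number of false alarms a human would end up reviewing."""
    return per_image_rate * photos_scanned


# Illustrative rates taken from the conversation, not from any vendor.
one_in_a_thousand = Fraction(1, 1000)
one_in_a_billion = Fraction(1, 10**9)

# "One in a billion times a billion": two different images both matching by mistake.
print(chance_all_matches_false(one_in_a_billion, 2))    # 1/1000000000000000000

# Human reviews caused by mistakes when a billion photos are scanned.
print(expected_false_alarms(one_in_a_thousand, 10**9))  # 1000000
print(expected_false_alarms(one_in_a_billion, 10**9))   # 1
```

Under these assumptions, a one-in-a-thousand error rate sends about a million innocent photos to human reviewers for every billion scanned, while a one-in-a-billion rate sends about one, which is the contrast being drawn above.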
Yes well that does beg the question, what is Apple’s motivation here? Is it simply to, and quite justifiably some people may say, to ensure that this illegal CSAM content doesn’t appear on their commercial services, or is their motivation to try to eradicate CSAM from the internet and social media services in general? I suspect it’s the former, rather than the latter.
I can’t speak for what’s going on in Tim Cook’s head, but here’s what I can tell you is Apple has consistently been one of the most stubborn industry partners in dealing with this problem. They have consistently said that they will not do even the bare minimum.
My read of it is that they’re reading the legislative landscape. There are any number of bills coming down from Capitol Hill, in the UK, and in the EU that are going to force the companies’ hands. They probably saw the writing on the wall and wanted to get ahead of it. I don’t think they wanted to do this. I think if the regulatory landscape was different, they wouldn’t have done it because they have consistently said that, you know, they don’t want to get involved. Despite the fact, by the way, they do all forms of cybersecurity protection, the way we were talking about earlier.
Part of the process as you’ve described is that once they reached their threshold of matches humans will moderate the image. So someone will inspect what’s been reported. Some people are saying that Apple don’t have the legal right to do that because the laws as they pertain in the States say that these reviews must be carried out by NCMEC.
That’s incorrect. Under the Terms of Service, they can absolutely look at that material. The law says that if you identify what is apparently child sexual abuse material, you must, as a matter of federal law in the United States report it to NCMEC. It absolutely does not say that Apple cannot review it.
Right. And the other point that might be pertinent to this, which you described, is obviously every view of an image is a re-abuse of the victim. So even though these human moderators might be doing the right thing, it’s nonetheless another pair of eyes…
And by the way, they, Apple doesn’t have to review the images. So I will tell you when we deployed PhotoDNA some companies said, you know what? We trust the technology, it’s been vetted. We know what the error rate is. We’re simply going to send the reports to NCMEC. Nobody inside the company is going to look at this material and we’re going to let the experts deal with this. Other companies said, no we’re going to look at the content. That’s a choice they are making. They don’t have to do that.
And by the way, we should absolutely worry about accusing somebody inappropriately, but NCMEC will fix that for us, right? There will eventually be eyes on before any law enforcement contacts are made. So that was a choice they made, and it wasn’t a necessary one. They chose to make that decision.
But you’re absolutely right. Every time somebody looks at that image, that is a revictimization of the child. I would argue that having a relatively small number of analysts, looking at those images to verify them is a small price to pay as opposed to allowing for the free distribution of tens of millions of images around the world with no safeguards in place.
Okay. Now obviously I’m based in the UK, in Europe, and Apple is a US company, and indeed you’re based in the States as well and the States is famous throughout the world for its Constitution. And the First Amendment in particular is used a lot to justify people saying all sorts of things. So is there any Constitutional basis at all that allows anybody to possess, share, or even create in the first place child sexual abuse material? Is there any First Amendment Freedom of Speech issue at play here?
So let’s get a few things straight. I’m not a constitutional lawyer, but this I know for a fact. There is no protection for child sexual abuse material. There is no, “I was doing research”, “This is for entertainment purposes” [excuse]. The material is absolutely 100% illegal. You have no rights to hold this material.
And in fact laws outside the US are even stricter. On the other side of the pond in your neck of the woods, not just actual children being sexually abused, but depictions of children being abused in cartoon characters in the UK are illegal. So the US has a little bit less conservative laws, but there are no protections afforded by the law from the First Amendment down that allow you to possess this material. And not for nothing, the terms of service of every single online provider also say, you can’t do these things on our service. So you have a double net here.
I read through for the first time ever actually last week the iCloud terms of service, which I’ve agreed to dozens of times and it quite clearly says that “Thou shalt not upload illegal content to this service”. So what Apple…
It says “thou”, by the way, but I get what you’re saying!
What Apple seem to be doing is simply putting in place a mechanism to enforce, in one particular aspect, their own terms of service, which all iCloud users have already agreed to.
PhotoDNA was always voluntary, right? We developed the technology with Microsoft. We gave it to the companies for free, and they did not deploy the technology for law enforcement purposes. They deployed it for terms of service purposes. They don’t want this stuff, they have the right to say we don’t want this… the same way, by the way that YouTube and Facebook can say, we don’t want legal speech, adult pornography, on our platform because it’s bad for business. The same way you can ban all forms of content that are protected, because you don’t want it. You have the right to enforce your rules.
Okay. Now we started the podcast talking about encryption and end-to-end encryption and I’ve seen lots of comments on various forums, Apple forums, all sorts of things saying Apple are obviously not encrypting everything, because if they can see even these hashes, then it’s not true end-to-end encryption. And as I said at the outset, end-to-end encryption has child protection people worried because it might hide the very information we’ve been talking about such as the hashes and it might stop these companies from looking for this kind of content. So what’s going on here with end-to-end encryption in the context of finding these hashes?
So let’s go back to PhotoDNA for a minute. The way PhotoDNA works is you upload an image, say to Facebook, and when that image hits Facebook’s servers they open it up, they check it for viruses. They rip out the metadata, they recompress it, they make it smaller. And then they run PhotoDNA, and then it eventually gets hosted. And to do all of that, they need to be able to see the image. They need to have access to the pixels. Now suppose you have what is called a fully end-to-end encrypted system, where when an image leaves my device, it is encrypted. As it makes its way through the server nobody can read it, including Facebook, including law enforcement, including me and you. And when it reaches its intended recipient, it is decrypted, so-called end-to-end encrypted.
And in that scenario even if that content goes through a central server, you can’t run things like PhotoDNA, you can’t run anything, which is why, by the way, on WhatsApp, fully end-to-end encrypted, the spam and malware detection is done on the client, on the device when you receive it to make sure you’re not clicking on something that’s harmful.
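As an editorial aside, the difference between matching on the server and matching on the device can be sketched in Python. Everything here is a simplified stand-in: SHA-256 replaces a perceptual hash such as PhotoDNA or NeuralHash (which, unlike SHA-256, are designed to survive recompression and resizing), `toy_encrypt` is a hypothetical XOR cipher that is not secure, and the names are invented for illustration. Apple’s actual design is considerably more involved and is not modelled here.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    # Stand-in for a perceptual hash; SHA-256 only matches byte-identical files.
    return hashlib.sha256(data).hexdigest()


def toy_encrypt(data: bytes, key: bytes) -> bytes:
    # Toy XOR stream cipher for illustration only; NOT secure.
    # XOR is symmetric, so calling it again with the same key decrypts.
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def matches_blocklist(data: bytes, blocklist: set) -> bool:
    return fingerprint(data) in blocklist


# Hypothetical data: one previously identified image and its fingerprint.
known_image = b"bytes of a previously identified image"
blocklist = {fingerprint(known_image)}
key = b"known only to sender and recipient"

# 1. No end-to-end encryption: the server sees the image and can match it.
server_sees_plaintext = matches_blocklist(known_image, blocklist)

# 2. End-to-end encryption: the server only sees ciphertext, so matching fails.
ciphertext = toy_encrypt(known_image, key)
server_sees_ciphertext = matches_blocklist(ciphertext, blocklist)

# 3. Matching on the device, before encryption, still works.
device_check_before_encrypting = matches_blocklist(known_image, blocklist)

print(server_sees_plaintext, server_sees_ciphertext, device_check_before_encrypting)
```

The same check succeeds or fails depending only on where it runs: a server relaying ciphertext has nothing recognisable to fingerprint, whereas the sending or receiving device still has the original image.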
And it is why Apple is running NeuralHash, the hashing algorithm on the device, so in fact they can allow you to still have fully end-to-end encrypted text messaging but we can ensure some safety here. And here’s where I come down on this and I will say, I think reasonable people can disagree on this, but here’s where I come down. There is nothing in the offline world that is immune to a lawful warrant. There is nothing that is absolute in terms of your privacy.
If there is a lawful warrant, you can search my body, you can search my home, you can search my car, you can search my place of business, you can search my bank, you get access. And we have deemed that appropriate in liberal democracies because we have this trade-off between law enforcement, security and individual privacy.
I would argue that I don’t think we should treat the online world differently. If you take the position that I have a right to bulletproof end-to-end encryption, and nobody should ever be able to get access to that, well, then you are creating an environment where criminals, child predators, terrorists, people committing financial fraud and any number of harmful things, are going to thrive. If you take that purist view, then there’s no more discussion here, but I take a different view, which is [that] we have to find a balance between individual rights and individual privacy and individual freedoms and the benefits of securing our society and protecting children.
And I think with the types of technologies that we’ve developed, that we’re talking about here, that are highly targeted, highly specific and can chip away and make these devices a little bit safer, we pay a very, very small price for that. And I will say, we’re already doing that, as I’ve said earlier, with all the cybersecurity, spam and malware and virus protections that we all want on our devices anyway.
So I think we should have a reasoned, an honest and serious debate about how we balance individual rights and privacy with public security and protecting children and other vulnerable groups online. But if you take the position that your privacy and your rights are the single and only thing that are important, well, then I think it’s hard to have a conversation about these things.
Okay. Now we are sadly running out of time, but a couple of other questions, if I may. Hashing as we’ve described it and the whole process involving NCMEC and PhotoDNA, and even PhotoDNA for video, Apple’s NeuralHash, they provide a retrospective analysis of an event that’s already happened. And as you said, some of these events happened some years ago, and these images of offline crime scenes have been circulating for years.
Now, most products and services have a product life cycle and we’ve got with Apple’s announcement, I guess, version one of their anti-CSAM proposition. What do you think we might reasonably expect in version two at perhaps next year’s Worldwide Developers Conference? So Craig Federighi, for example, has already hinted at some improvements, particularly around that 30 file threshold.
Well, here’s what I would like to see. I don’t know what we’re going to see. I’d like to see a video version because if we know that half the content out there is video, we need to start dealing with video. And so I would very much, and I continue to push the industry to move more aggressively on video.
Here’s the other thing I would like to see in the child safety space and this isn’t really an Apple issue, but it’s more in the gaming industry, is dealing with grooming and sextortion that is happening on social media and in gaming platforms. We’ve seen a troubling rise in so-called self-generated CSAM, where kids are being groomed and extorted and tricked and coerced and blackmailed into creating abuse images of themselves and then that gets distributed online. And I think the gaming platforms, social media platforms and a lot of particularly the new social media platforms have got to start getting a handle on these technologies that are being used in very harmful ways for children.
The data from certainly the Internet Watch Foundation indicates that the single most dangerous place for a child to be with a smartphone is the locked family bathroom. That privacy lock on the bathroom door gives the illusion of security, but behind which children, as you rightly say, are inveigled into doing all sorts of things. But Apple may have a part of a solution here because the other part of their recent announcement dealt with iMessage and they’re using AI in there, I believe, to identify explicit content into, or out of, a child’s iMessage account. Is that potentially a hint of things that may come?
Absolutely. We haven’t talked about this. This is an opt-in feature, so it’s not quite as controversial. And what it does is, it is an AI system that tries to identify sexually explicit material. And if your device belongs to a child, and if the parental controls have been set up properly, the parents can be notified that the child either received or is about to send a sexually explicit piece of content and then the parents can set the controls on what to do there. That is absolutely, I think, an encouraging technology, and I think we should enable parents to have the technology to do this.
But I think we also have to be very careful here not to put all the burden on the parents, because what we know first of all, is that in most cases, the kids are more tech savvy than the parents. We know that the kids find workarounds of parental controls, we know this. And I think it’s a little bit of a cop out to say, well, we’ve given parents, you know, all these tools that they need to figure out how to use and make sure their kids don’t circumvent. So our hands are clean. I think it has to be a more collaborative enterprise.
Okay. Now for all of Apple’s success particularly with the iPhone, they are just one smartphone manufacturer. There are others mostly using the Google Android operating system. So Samsung for example, is one. Would you expect these other device manufacturers to be following Apple’s lead here and be introducing similar kinds of technologies?
Here in the US Apple is about 50% of the market share. Android is the other half, but worldwide, among almost 8 billion people, Android dominates and Apple is largely irrelevant in vast parts of the world. I don’t know how this is going to unfold, but if history is any lesson, what happened with PhotoDNA is it started out with Microsoft and then Facebook came on board. Everybody else was still whining and complaining about it, but eventually the public pressure kept increasing. And now most major tech companies use PhotoDNA or a version of PhotoDNA.
I suspect, you know, once the technology exists and once it’s been verified that it works, once we get through all the sort of the headaches and the hurdles of releasing these technologies and it gets normalized and we understand it, and we realize that it’s not the Doomsday that is being predicted, there will be pressure on other companies to deploy technology to make their devices more safe. I think that’s a matter of time, but that will take years not months.
Okay, Hany, thank you so much for your contribution to this really vital debate. A fascinating insight into all things related to CSAM technology.
Thank you so much. It was great talking to you as always.