Technology is the proverbial double-edged sword. And an experimental European research project is ensuring this axiom cuts very close to the industry’s bone indeed by applying machine learning technology to critically sift big tech’s privacy policies — to see whether AI can automatically identify violations of data protection law.
The researchers behind the tool, dubbed Claudette, have now got support from European consumer organization BEUC for a ‘Claudette meets GDPR’ project, which specifically applies the tool to evaluate compliance with the EU’s General Data Protection Regulation.
Early results from this project have been released today, with BEUC saying the AI was able to automatically flag a range of problems with the language being used in tech T&Cs.
The researchers set Claudette to work analyzing the privacy policies of 14 companies in all — namely: Google, Facebook (and Instagram), Amazon, Apple, Microsoft, WhatsApp, Twitter, Uber, AirBnB, Booking, Skyscanner, Netflix, Steam and Epic Games — saying this group was selected to cover a range of online services and sectors.
The AI analysis of the policies was carried out in June, after the update to the EU’s data protection rules had come into force. The regulation tightens requirements on obtaining consent for processing citizens’ personal data by, for example, increasing transparency requirements — basically requiring that privacy policies be written in clear and intelligible language, explaining exactly how the data will be used, in order that people can make a genuine, informed choice to consent (or not consent).
In theory, all 14 parsed privacy policies should have been compliant with GDPR by June, as it came into force on May 25. However, some tech giants are already facing legal challenges to their interpretation of ‘consent’. And it’s fair to say the law has not vanquished the tech industry’s fuzzy language and logic overnight. Where user privacy is concerned, old, ugly habits die hard, clearly.
But that’s where BEUC is hoping AI technology can help.
It says that out of a combined 3,659 sentences (80,398 words) Claudette marked 401 sentences (11.0%) as containing unclear language, and 1,240 (33.9%) containing “potentially problematic” clauses or clauses providing “insufficient” information.
BEUC says identified problems include:
- Not providing all the information which is required under the GDPR’s transparency obligations. “For example companies do not always inform users properly regarding the third parties with whom they share or get data from”
- Policies are formulated using vague and unclear language (i.e. using language qualifiers that really bring the fuzz — such as “may”, “might”, “some”, “often”, and “possible”) — “which makes it very hard for consumers to understand the actual content of the policy and how their data is used in practice”
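To give a rough sense of what flagging fuzzy qualifiers involves, here is a minimal Python sketch. It is emphatically not how Claudette works: the project applies machine learning, whereas this is a naive keyword match. The qualifier list comes from the words BEUC quotes above; the sample policy text and the function names (`split_sentences`, `flag_vague`, `share`) are hypothetical, invented purely for illustration.

```python
import re

# Qualifiers quoted in the BEUC summary. The list and the matching rule are
# illustrative assumptions, not Claudette's actual trained classifier.
VAGUE_QUALIFIERS = ("may", "might", "some", "often", "possible")
_PATTERN = re.compile(r"\b(" + "|".join(VAGUE_QUALIFIERS) + r")\b", re.IGNORECASE)


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def flag_vague(sentences):
    """Return (sentence, matched qualifiers) pairs for sentences with a hit."""
    flagged = []
    for sentence in sentences:
        hits = [m.lower() for m in _PATTERN.findall(sentence)]
        if hits:
            flagged.append((sentence, hits))
    return flagged


def share(part, total):
    """Percentage of `part` in `total`, rounded to one decimal place."""
    return round(100 * part / total, 1)


# Hypothetical policy text, made up for this example.
sample = (
    "We may share some of your data with partners. "
    "We store your email address to send receipts. "
    "It is possible that we might combine data across services."
)
sentences = split_sentences(sample)
flagged = flag_vague(sentences)
for sentence, hits in flagged:
    print(hits, "->", sentence)
print(share(len(flagged), len(sentences)), "% of sample sentences flagged")

# The same arithmetic applied to the counts BEUC reported.
print(share(401, 3659), share(1240, 3659))
```

A keyword match like this would misfire constantly on real policies (it would flag the month in “May 25”, for instance), which is presumably part of why the researchers reached for a trained classifier rather than a word list.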
The bolstering of the EU’s privacy rules, with GDPR tightening the consent screw and supersizing penalties for violations, was exactly intended to prevent this kind of stuff. So it’s pretty depressing — though hardly surprising — to see the same, ugly T&C tricks continuing to be used to try to sneak consent by keeping users in the dark.
At the time of writing Facebook had not responded to our request for comment.
Commenting in a statement, Monique Goyens, BEUC’s director general, said: “A little over a month after the GDPR became applicable, many privacy policies may not meet the standard of the law. This is very concerning. It is key that enforcement authorities take a close look at this.”
The group says it will be sharing the research with EU data protection authorities, including the European Data Protection Board, and is not itself ruling out bringing legal actions against law benders.
But it’s also hopeful that automation will — over the longer term — help civil society keep big tech in legal check.
Although, where this project is concerned, it also notes that the training data-set was small — conceding that Claudette’s results were not 100% accurate — and says more privacy policies would need to be manually analyzed before policy analysis can be fully conducted by machines alone.
So file this one under ‘promising research’.
“This innovative research demonstrates that just as Artificial Intelligence and automated decision-making will be the future for companies from all kinds of sectors, AI can also be used to keep companies in check and ensure people’s rights are respected,” adds Goyens. “We are confident AI will be an asset for consumer groups to monitor the market and ensure infringements do not go unnoticed.
“We expect companies to respect consumers’ privacy and the new data protection rights. In the future, Artificial Intelligence will help identify infringements quickly and on a massive scale, making it easier to start legal actions as a result.”