Algorithmic Oversight

Social media news algorithms represent the future of news dissemination. However, the need for personalization, driven by advertising revenue, creates ‘filter bubbles’ that limit our exposure to a diversity of information sources. We need software oversight: open-source algorithms that audit newsfeeds for bias, preserving the secrecy of the ranking algorithms while ensuring their objectivity.

This article was first published in Mint.


Some time back, I wrote about social media and the new face of journalism. I argued that social media platforms should be entitled to the privileges of the Fourth Estate, to ensure that their ability to deliver news to an algorithmically relevant audience is not stifled by the absence of liability protection. Over the past few weeks, however, the integrity of social media news algorithms has been called into question, denting, somewhat, my thesis that these algorithm-driven newsfeeds represent the future of news dissemination.

First, there was the Gizmodo report that Facebook used human curators to manipulate the Trending Stories section of its newsfeed, suppressing news stories with a conservative slant and artificially promoting the #BlackLivesMatter campaign. Then Facebook was in the news again when it blanked out the iconic image of the Napalm Girl on the grounds that it did not meet its community standards. Both instances revealed that, at least at Facebook, humans are still an essential part of the ranking process, and that their very human biases sometimes creep in.

Eli Pariser’s TED talk on filter bubbles showed us that, thanks to customized search, no two people will ever get the same search results on a given subject, forcing us to question whether the sources of information we have come to rely on are limiting our access to information. E-commerce companies such as Amazon face a similar challenge with their recommendation engines, as do news sites such as Huffington Post with the articles they suggest.

The problem is that the way things are set up today, there is no escaping the increasing personalization of the web.

Most widely consumed web services provide their core offerings for free and earn revenue through advertising. Since Internet advertising works best when narrowcast, the more accurately companies can target their advertisements, the greater their revenue. Consequently, almost all Internet companies want to learn as much as possible about their users, so that they can serve up advertisements most accurately tailored to each user's needs.

To do this, they observe every aspect of your behaviour online, tracking the things you like and the things you don't, and reinforcing their understanding of you by serving up more and more of what you want to see, until eventually you are surrounded by a bubble of information that their machines have determined is most suited to you. They do this for no reason other than to perfect their understanding of which advertisements you are most likely to click on. For as long as advertising remains the revenue focus of Internet businesses, we will have to contend with these bubbles, living with a rose-tinted, algorithmically determined view of the world.
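To make this feedback loop concrete, here is a toy sketch in Python (every topic, weight, and click model below is invented for illustration, not a description of any real platform): each click nudges the inferred profile toward the clicked topic, and the ranker in turn surfaces more of what the profile already favours, so exposure narrows over time.

```python
import random
from collections import Counter

TOPICS = ["politics-left", "politics-right", "sport", "science", "celebrity"]

def rank(stories, profile):
    """Order stories by how well they match the user's inferred interests."""
    return sorted(stories, key=lambda story: profile[story], reverse=True)

profile = Counter({topic: 1.0 for topic in TOPICS})  # start with a neutral profile

for day in range(30):
    feed = rank(TOPICS, profile)[:3]   # only the top-ranked topics are shown
    clicked = random.choice(feed)      # the user can only click what they see
    profile[clicked] += 1.0            # each click reinforces the profile

print(profile.most_common())  # a few topics now dominate: the bubble
```

Because the user can only ever click on what the ranker chose to show, the loop is self-reinforcing; nothing in it requires malice, only the incentive to maximize clicks.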

Lawmakers have begun to take note of the issue. The commerce committee of the US Senate is opening a formal inquiry into Facebook's curation process, asking whether it did, in fact, manipulate the Trending Stories section of the page. If this investigation gathers steam, it could pave the way for other governments to initiate similar inquiries into the secret sauce that goes into news algorithms.

This will not be a good thing. It is critical that the algorithms that determine what news is displayed, whether EdgeRank on Facebook or PageRank on Google, remain secret. The moment the elements of these algorithms become public knowledge, we will be in a race to the bottom as news sites start to tailor their content to the algorithm in order to artificially improve their rank.
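PageRank itself illustrates the point: its basic formula has been public since Brin and Page's 1998 paper, and an entire industry of link farms grew up to game precisely those elements. The sketch below is the textbook power-iteration version run over a made-up three-page web, not whatever Google actually runs in production today.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook PageRank: power iteration over a dict of page -> outbound links."""
    pages = list(links)
    rank = {page: 1.0 / len(pages) for page in pages}
    for _ in range(iterations):
        new = {page: (1.0 - damping) / len(pages) for page in pages}
        for page, outbound in links.items():
            if outbound:  # each page shares its rank among the pages it links to
                share = damping * rank[page] / len(outbound)
                for target in outbound:
                    new[target] += share
        rank = new
    return rank

# A made-up three-page web: the more inbound links a page has, the higher its rank.
print(pagerank({"A": ["B", "C"], "B": ["C"], "C": ["A"]}))
```

The gaming opportunity is obvious from the code: every inbound link adds to a page's score, so manufacturing inbound links manufactures rank. That is exactly the dynamic secrecy is meant to prevent for the parts of these algorithms that are not public.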

That said, as more and more people rely on these websites as their primary source of news, it is important to ensure that the information provided is presented in a balanced and comprehensive manner that eliminates, as far as possible, human or organizational bias.

Software oversight

I have, for some time now, been evaluating the concept of software oversight: trying to find ways to use technology, wherever possible, as a substitute for human regulators. This problem of online newsfeed regulation seems like the ideal place to put the concept to the test.

It should be possible to develop algorithms designed to oversee the ranking techniques used by Internet companies and ascertain whether they present the news in an unbiased manner. These oversight algorithms could be released as open-source code that anyone can inspect. Done right, this form of software-based review could effectively substitute for the role that regulators play, flagging instances where the filter bubble distorts reality unacceptably and testing tweaks to the ranking algorithm for conformity with established standards.
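As a sketch of what one such oversight algorithm might look like (the viewpoint labels, the threshold, and the `audit_feed` interface below are all hypothetical choices for illustration): compare the distribution of viewpoints in a user's personalized feed against a balanced reference corpus, and flag feeds that have drifted too far from it.

```python
import math
from collections import Counter

def viewpoint_mix(stories):
    """Fraction of stories carrying each viewpoint label."""
    counts = Counter(story["viewpoint"] for story in stories)
    total = sum(counts.values())
    return {view: count / total for view, count in counts.items()}

def drift(feed_mix, reference_mix, epsilon=1e-9):
    """KL divergence: how far the feed's mix has drifted from the reference."""
    return sum(p * math.log((p + epsilon) / reference_mix.get(view, epsilon))
               for view, p in feed_mix.items())

def audit_feed(personalized, reference, threshold=0.5):
    """Flag a personalized feed whose viewpoint mix diverges beyond the threshold."""
    d = drift(viewpoint_mix(personalized), viewpoint_mix(reference))
    return {"drift": round(d, 3), "flagged": d > threshold}

# Hypothetical data: a one-sided feed audited against a balanced corpus.
feed = [{"viewpoint": "left"}] * 10
corpus = [{"viewpoint": "left"}] * 5 + [{"viewpoint": "right"}] * 5
print(audit_feed(feed, corpus))  # {'drift': 0.693, 'flagged': True}
```

Crucially, an audit of this kind needs only the ranked output and labelled content, not the ranking code itself, which is what would let the underlying algorithm stay secret while its results are certified. The hard part in practice, of course, is producing the viewpoint labels in a way that is itself free of bias.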

Arguably, it would be in the interests of Internet companies to submit themselves to this sort of algorithmic oversight, secure in the knowledge that, while their algorithms remain secret, their results are being independently audited and certified as objective.