Big Data, Financial Inclusion and Privacy for the Poor

Dr. Katherine Kemp, Research Fellow, UNSW Digital Financial Services Regulation Project
30 Aug 2017

Financial inclusion is not good in itself.

We value financial inclusion as a means to an end. We value financial inclusion because we believe it will increase the well-being, dignity and freedom of poor people and people living in remote areas, who have never had access to savings, insurance, credit and payment services.

It is therefore important to ensure that the way in which financial services are delivered to these people does not ultimately diminish their well-being, dignity and freedom. We already do this in a number of ways – for example, by ensuring providers do not make misrepresentations to consumers, or charge exploitative or hidden rates or fees. Consumers should also be protected from harms that result from data practices tied to the provision of financial services.

Benefits of Big Data and Data-Driven Innovations for Financial Inclusion

“Big data” has become a fixture in any future-focused discussion. It refers to data captured in very large quantities, very rapidly, from numerous sources, where that data is of sufficient quality to be useful. The collected data is analysed, using increasingly sophisticated algorithms, in the hope of revealing new correlations and insights.

There is no doubt that big data analytics and other data-driven innovations can be a critical means of improving the health, prosperity and security of our societies. In financial services, new data practices have allowed providers to serve customers who are poor and those living in remote areas in new and better ways, including by permitting providers to:

  • extend credit to consumers who previously had to rely on expensive and sometimes exploitative informal credit, if any, because they had no formal credit history;
  • identify customers who lack formal identification documents;
  • design new products to fit the actual needs and realities of consumers, based on their behaviour and demographic information; and
  • enter new markets, increasing competition on price, quality and innovation.

But the collection, analysis and use of enormous pools of consumer data have also given rise to concerns for the protection of financial consumers’ data and privacy rights.

Potential Harms from Data-Driven Innovations

Providers now not only collect more information directly from customers, but may also track customers physically (using geo-location data from their mobile phones); track customers’ online browsing and purchases; and engage third parties to combine the provider’s detailed information on each customer with aggregated data from other sources about that customer, including their employment history, income, lifestyle, online and offline purchases, and social media activities.

Data-driven innovations create the risk of serious harms both for individuals and for society as a whole. At the individual level, these risks increase as more data is collected, linked, shared, and kept for longer periods, including the risk of:

  • inaccurate and discriminatory conclusions about a person’s creditworthiness based on insufficiently tested or inappropriate algorithms;
  • unanticipated aggregation of a person’s data from various sources to draw conclusions which may be used to manipulate that person’s behaviour, or adversely affect their prospects of obtaining employment or credit;
  • identity theft and other fraudulent use of biometric data and other personal information;
  • disclosure of personal and sensitive information to governments without transparent process and/or to governments which act without regard to the rule of law; and
  • harassment and public humiliation through the publication of loan defaults and other personal information.

Many of these harms are known to have occurred in various jurisdictions. The reality is that data practices can sometimes lead to the erosion of trust in new financial services and the exclusion of vulnerable consumers.

Even relatively well-meaning and law-abiding providers can cause harm. Firms may “segment” customers and “personalise” the prices or interest rates a particular consumer is charged, based on their location, movements, purchase history, friends and online habits. A person could, for example, be charged higher prices or rates based on the behaviour of their friends on social media.
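
To make the mechanism concrete, here is a deliberately simplified sketch of how such a “personalised” rate might be computed. Every feature name and weight below is invented for illustration; no actual provider’s model is being described.

```python
# Hypothetical sketch: how a lender's pricing engine might "personalise" an
# interest rate from behavioural signals, including a customer's social graph.
# All feature names and weights are invented for illustration.

def personalised_rate(base_rate, customer):
    """Adjust a base interest rate (percent) using behavioural segments."""
    rate = base_rate
    # Penalise the customer for defaults among their social-media contacts,
    # even though the customer themselves may have a spotless record.
    friend_default_share = (customer["friends_in_default"]
                            / max(customer["friend_count"], 1))
    rate += 4.0 * friend_default_share          # up to +4 percentage points
    # Opaque behavioural signals: the customer has no way to know which
    # of their actions move the price, or by how much.
    if customer["visited_payday_lender_sites"]:
        rate += 1.5
    return round(rate, 2)

customer = {"friend_count": 200, "friends_in_default": 10,
            "visited_payday_lender_sites": True}
print(personalised_rate(12.0, customer))  # 13.7
```

The point of the sketch is not the arithmetic but the opacity: none of these inputs would be visible to the consumer being priced.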

Data practices may also increase the risk of harm to society as a whole. Decisions may be made to the detriment of entire groups or segments of people based on inferences drawn from big data, without the knowledge or consent of these groups. Pervasive surveillance, and even the mere awareness of surveillance, is known to pose threats to freedom of thought, political activity and democracy itself, as individuals are denied the space to create, test and experiment unobserved.

These risks highlight the need for perspective and caution in the adoption of data-driven innovations, and the need for appropriate data protection regulation.

The Prevailing “Informed Consent” Approach to Data Privacy

Internationally, many data privacy standards and regulations are based, at least in part, on the “informed consent” – or “notice” and “choice” – approach to informational privacy. This approach can be seen in the Fair Information Practice Principles that originated in the US in the 1970s; the 1980 OECD Privacy Guidelines; the 1995 EU Data Protection Directive; and the Council of Europe Convention 108.

Each of these instruments recognises consumer consent as a justification for the collection, use, processing and sharing of personal data. The underlying rationale for this approach is based on principles of individual freedom and autonomy. Each individual should be free to decide how much or how little of their information they wish to share in exchange for a given “price” or benefit. The data collector gives notice of how an individual’s data will be treated and the individual chooses whether to consent to that treatment.

This approach has been increasingly criticised as artificial and ineffectual. The central criticisms are that, for consumers, there is no real notice and there is no real choice.

In today’s world of invisible and pervasive data collection and surveillance capabilities, data aggregation, complex data analytics and indefinite storage, consumers no longer know or understand when data is collected, what data is collected, by whom and for what purposes, let alone how it is then linked and shared. Consumers do not read the dense and opaque privacy notices that supposedly explain these matters, and could not read them, given the hundreds of hours this would take. Nor can they understand, compare or negotiate these privacy terms.

These problems are exacerbated for poor consumers who often have more limited literacy, even less experience with modern uses of data, and less ability to negotiate, object or seek redress. Yet we still rely on firms to give notice to consumers of their broad, and often open-ended, plans for the use of consumer data and on the fact that consumers supposedly consented, either by ticking “I agree” or proceeding with a certain product.

The premises of existing regulation are therefore doubtful. At the same time, some commentators question the relevance and priority of data privacy in developing countries and emerging markets.

Is data privacy regulation a “Western” concept that has less relevance in developing countries and emerging markets?

Some have argued that the individualistic philosophy inherent in concepts of privacy has less relevance in countries that favour a “communitarian” philosophy of life. For example, in a number of African countries, “ubuntu” is a guiding philosophy. According to ubuntu, “a person is a person through other persons”. This philosophy values openness, sharing, group identity and solidarity. Is privacy relevant in the context of such a worldview?

Privacy, and data privacy, serve values beyond individual autonomy and control. Data privacy serves values which are at the very heart of “communitarian” philosophies, including compassion, inclusion, face-saving, dignity, and the humane treatment of family and neighbours. The protection of financial consumers’ personal data is entirely consistent with, and frequently critical to, upholding values such as these, particularly in light of the alternative risks and harms.

Should consumer data protection be given a low priority in light of the more pressing need for financial inclusion?

Some have argued that, while consumer data protection is the ideal, this protection should not have priority over more pressing goals, such as financial inclusion. Providers should not be overburdened with data protection compliance costs that might dissuade them from introducing innovative products to unserved and under-served consumers.

Here it is important to remember how we began: financial inclusion is not an end in itself but a means to other ends, including permitting poor people and those living in remote areas to support their families, prosper, gain control over their financial destinies, and feel a sense of pride and belonging in their broader communities. The harms caused by unregulated data practices work against each of these goals.

If we are in fact permanently jeopardising these goals by permitting providers to collect personal data at will, financial inclusion is not serving its purpose.


There will be no panacea, no simple answer to the question of how to regulate for data protection. A good starting place is recognising that consumers’ “informed consent” is most often fictional. Sensible solutions will need to draw on the full “toolkit” of privacy governance tools (Bennett and Raab, 2006), such as appropriate regulators, advocacy groups, self-regulation and regulation (including substantive rules and privacy by design). The solution in any given jurisdiction will require a combination of tools best suited to the context of that jurisdiction and the values at stake in that society.

Contrary to the approach advocated by some, it will not be sufficient to regulate only the use and sharing of data. Limitations on the collection of data must be a key focus, especially in light of new data storage capabilities, the likelihood that de-identified data will be re-identified, and the growing opportunities for harmful and unauthorised access the more data is collected and the longer it is kept.

Big data offers undoubted and important benefits in serving those who have never had access to financial services. But it is not a harmless curiosity to be mined and manipulated at the will of those who collect and share it. Personal information should be treated with restraint and respect, and protected, in keeping with the fundamental values of the relevant society.

This post was authored by Dr. Katherine Kemp, Research Fellow at UNSW Digital Financial Services Regulation Project. She presented as an expert speaker at the Responsible Finance Forum in Berlin this year.

Dr. Kemp’s post originally appeared on IFMR Trust’s site in August 2017.



References

Colin J Bennett and Charles Raab, The Governance of Privacy (MIT Press, 2006)

Gordon Hull, “Successful Failure: What Foucault Can Teach Us About Privacy Self-Management in a World of Facebook and Big Data” (2015) 17 Ethics and Information Technology Journal 89

Debbie VS Kasper, “Privacy as a Social Good” (2007) 28 Social Thought & Research 165

Katharine Kemp and Ross P Buckley, “Protecting Financial Consumer Data in Developing Countries: An Alternative to the Flawed Consent Model” (2017) Georgetown Journal of International Affairs (forthcoming)

Alex B Makulilo, “The Context of Data Privacy in Africa,” in Alex B Makulilo (ed), African Data Privacy Laws (Springer International Publishing, 2016)

David Medine, “Making the Case for Privacy for the Poor” (CGAP Blog, 15 November 2016)

Lokke Moerel and Corien Prins, “Privacy for the Homo Digitalis: Proposal for a New Regulatory Framework for Data Protection in the Light of Big Data and the Internet of Things” (25 May 2016)

Office of the Privacy Commissioner of Canada, Consent and Privacy: A Discussion Paper Exploring Potential Enhancements to Consent Under the Personal Information Protection and Electronic Documents Act (2016)

Omri Ben-Shahar and Carl E Schneider, More Than You Wanted to Know: The Failure of Mandated Disclosure (Princeton University Press, 2016)

Productivity Commission, Australian Government, “Data Availability and Use” (Productivity Commission Inquiry Report No 82, 31 March 2017)

Bruce Schneier, Data and Goliath: The Hidden Battles to Collect Your Data and Control Your World (WW Norton & Co, 2015)

Daniel J Solove, “Introduction: Privacy Self-Management and the Consent Dilemma” (2013) 126 Harvard Law Review 1880


Regulatory Sandboxes: Potential for Financial Inclusion?

Ivo Jenik
30 Aug 2017

Many regulators need to address innovations that could advance financial inclusion without incurring major risks. Regulatory sandboxes have emerged as a tool that has potential. A regulatory sandbox is a framework set up by a regulator that allows FinTech startups and other innovators to conduct live experiments in a controlled environment under a regulator’s supervision. Regulatory sandboxes are gaining popularity, mostly in developed financial markets. With a few exceptions, the countries with regulatory sandboxes designed them to accommodate or even spur FinTech innovations; typically, they are not designed to focus explicitly on financial inclusion. This raises the question: Could regulatory sandboxes be useful in emerging markets and developing economies (EMDEs) to advance FinTech innovations designed to benefit unserved and underserved customers?

This question has piqued the interest of the financial inclusion community. For instance, a report that complements the G20 High-Level Principles for Digital Financial Inclusion refers to regulatory sandboxes as a means to balance innovation and risk in favor of financial inclusion. For now, evidence for the effectiveness of regulatory sandboxes is weak. The newness, variability and lack of performance data on sandboxes make it difficult (if not impossible) to measure their impact on financial markets, let alone on financial inclusion. However, our working hypothesis is that regulatory sandboxes can enable innovations that are likely to benefit excluded customers, regardless of whether inclusion is a key objective. FinTech innovations can lead to more affordable products and services, new distribution channels that reach excluded groups, operational efficiencies that make it possible to serve low-margin customers profitably, and compliance and risk-management approaches (e.g., simplified customer due diligence and alternative credit scoring).
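
As a purely illustrative sketch of that last point, a toy “alternative credit score” might be built from mobile-money usage rather than a formal credit history. The features and weights below are invented, not drawn from any actual provider or sandbox.

```python
# Illustrative only: a toy "alternative credit score" of the kind mentioned
# above, derived from mobile-money behaviour instead of a formal credit
# history. Feature names, caps and weights are invented.

def alternative_score(txns_per_month, avg_balance, months_active):
    """Score 0-100 from mobile-money usage; higher suggests lower risk."""
    score = 0.0
    score += min(txns_per_month, 50) * 0.8      # regular usage, capped
    score += min(avg_balance / 10.0, 30)        # ability to hold a balance
    score += min(months_active, 30)             # length of track record
    return round(min(score, 100), 1)

print(alternative_score(txns_per_month=25, avg_balance=120.0, months_active=18))  # 50.0
```

Even a toy like this shows why sandboxes matter: the scoring logic is cheap to build but its fairness and accuracy are exactly what a supervised live test would need to probe.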

Three of the 18 countries where regulatory sandboxes have been or are being established — Bahrain, India and Malaysia — have explicitly listed financial inclusion among their key objectives. Other countries may follow suit depending on their policy goals, mandates and priorities. Policy makers who decide to make financial inclusion an integral part of their sandboxes could do so in several ways. For instance, they could favor pro-inclusion innovators with a more streamlined admissions process, licensing fee waivers, or performance indicators that measure innovations’ impact on financial inclusion. By favoring pro-inclusion innovators, regulators could use sandboxes to measure innovations’ potential impact on financial inclusion and tailor policy interventions to increase the benefits and mitigate the risks.

While there are good reasons to explore regulatory sandboxes, policy makers should be prepared to face challenges. Most importantly, operating a regulatory sandbox requires adequate human and financial resources to select proposals, provide guidance, oversee experiments and evaluate innovations. Regulators may lack these resources in many EMDE countries. Therefore, policy makers need to pay attention to details and carefully consider their options. These may include various sandbox designs and other pro-innovation approaches that have been used successfully. For example, the test-and-learn approach enables a regulator to craft an ad hoc framework within which an innovator tests a new idea in a live environment, with safeguards and key performance indicators in place. A wait-and-see approach allows a regulator to observe how an innovation evolves before intervening (e.g., person-to-person lending in China).

Regulatory sandboxes are too new to be fully understood and evaluated. In the absence of hard, long-term data on successful testing, their risks and benefits are speculative, but they deserve further attention. CGAP has conducted a comprehensive mapping of regulatory sandboxes to gain insights into their actual and potential role in EMDEs, particularly regarding financial inclusion. With our findings, to be released next month, we will offer a compass for policy makers to navigate through this complex new landscape. Stay tuned to learn more soon.

This post was authored by Ivo Jenik at CGAP and originally appeared on the CGAP website on August 17, 2017.

The Rise Of Machine Learning And The Risks Of AI-Powered Algorithms

30 Aug 2017

This post originally appeared on The Financial Brand website on August 23, 2017.

Back in the Old Days, you used to have to hire a bunch of mathematicians to crunch numbers if you wanted to extrapolate insights from your data. Not anymore. These days, computers are so smart, they can figure everything out for themselves. But the uncensored power of “self-driving” AI presents financial institutions with a whole new set of regulatory, compliance and privacy challenges.

More and more financial institutions are using algorithms to power their decisions, from detecting fraud and money laundering patterns to product and service recommendations for consumers. For the most part, banks and credit unions have a good handle on how these traditional algorithms function and can mitigate the risks in using them.

But new cognitive technologies and the accessibility of big data have led to a new breed of algorithms. Unlike traditional, static algorithms that were coded by programmers, these algorithms can learn without being explicitly programmed by a human being; they change and evolve based on the data that’s input into the algorithms. In other words, true artificial intelligence.

And this is one area where financial institutions plan on investing heavily. In 2016, almost $8 billion was spent on cognitive systems and artificial intelligence — led by the financial services industry — and that amount will explode to over $47 billion by 2020, a compound annual growth rate of more than 55%, according to IDC.
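
The figures above imply a compound annual growth rate, which is easy to verify:

```python
# Quick check of the growth figures: $8.0B (2016) growing to $47.0B (2020),
# i.e. over 4 years.
start, end, years = 8.0, 47.0, 4
cagr = (end / start) ** (1 / years) - 1
print(f"{cagr:.1%}")  # 55.7% — consistent with "more than 55%"
```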

There are certainly many benefits to using these AI-powered, machine learning algorithms, particularly with respect to marketing strategy. That’s why money is pouring into data sciences. But there are also risks.

Dilip Krishna and Nancy Albinson, Managing Directors with Deloitte’s Risk and Financial Advisory, explain some of these risks and what financial institutions can do to manage them.

The Financial Brand (TFB): Can you give an example of how financial institutions can use machine learning algorithms?

Dilip Krishna, Managing Director with Deloitte’s Risk and Financial Advisory: One financial institution is using machine learning in the investment space. They collect data from multiple news and social media sources and mine that data. As soon as a news event occurs, they use machine learning to predict which stocks will be affected both positively and negatively and then apply those insights in their sales and marketing process.

TFB: With AI and machine learning, algorithms can build themselves. But isn’t this dangerous?

Nancy Albinson, Managing Director with Deloitte’s Risk and Financial Advisory: Certainly the complexity of these AI-powered algorithms and how they are designed increases the risks. Sophisticated technology such as sensors and predictive analytics and the volume of data that is readily available makes the algorithms inherently more complex. What’s more, the design of the algorithms is not as transparent. They can be created “inside the black box”, and this can open the algorithm up to intentional or unintentional biases. If the design is not apparent, monitoring is more difficult.

And as machine learning algorithms become more powerful — and more pervasive — financial institutions will assign more and more responsibility to these algorithms, compounding the risks even further.

TFB: Are regulators aware of the risks AI and machine learning pose to financial institutions?

Dilip Krishna: Governance of these algorithms is not as strong as it needs to be. For example, while rules such as SR11-7 Guidance on Model Risk Management describe how models should be validated, these rules do not cover machine learning algorithms. With predictive models, you build the model, test it, and it’s done. You don’t test to see if the algorithm changes based on the data you feed it. In machine learning, the algorithms change, evolve and grow; new biases could potentially be added.

We just don’t see regulators talking about the risks of machine learning models, and they really should be paying more attention. For example, in loan decisioning, the data could introduce an unconscious bias against minorities that could expose the bank to regulatory scrutiny.
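
Krishna’s point that a learning model shifts with the data it is fed is why one-off validation falls short. One common monitoring technique (not mandated by SR11-7 itself) is the Population Stability Index (PSI), which flags when the live input distribution drifts away from the distribution the model was trained on. The bucket values below are illustrative:

```python
# Population Stability Index: a standard drift check comparing the
# distribution of a model input at training time vs. in production.
import math

def psi(expected, actual):
    """PSI over matching histogram buckets (proportions summing to 1)."""
    return sum((a - e) * math.log(a / e)
               for e, a in zip(expected, actual) if e > 0 and a > 0)

train_dist = [0.25, 0.25, 0.25, 0.25]   # e.g. income buckets at training time
live_dist  = [0.10, 0.20, 0.30, 0.40]   # same buckets seen in production
drift = psi(train_dist, live_dist)
print(round(drift, 3))  # 0.228
```

By convention, a PSI below 0.1 is usually treated as stable and values above roughly 0.25 as significant drift, so a result like this would prompt a review of the model.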

TFB: Do financial institutions really have the technological expertise to pull this off?

Dilip Krishna: Some of this technology — like deep learning algorithms using neural networks — is on the cutting edge of science. Even advanced technology companies struggle with understanding and explaining how these algorithms work. Neural networks can have thousands of nodes and many layers leading to billions of connections. Determining which connections actually have predictive value is difficult.

At most financial institutions, the number of models to manage is still small enough that they can use ad hoc mechanisms or external parties to test their algorithms. The challenge is that machine learning is embedded in business processes so institutions may not recognize that they need to address not just the models but the business processes as well.
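
The “billions of connections” figure above is straightforward arithmetic: in a fully connected network, each pair of adjacent layers contributes (nodes in × nodes out) weights. The layer sizes below are illustrative:

```python
# Rough arithmetic behind the "billions of connections" claim for a fully
# connected network. Layer sizes are illustrative, not from any real model.
layer_sizes = [30_000, 30_000, 30_000, 30_000]
connections = sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))
print(f"{connections:,}")  # 2,700,000,000
```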

TFB: What should financial institutions consider when developing a risk management program around AI and machine learning algorithms?

Dilip Krishna: Financial institutions need to respect algorithms from a risk perspective, and have functions responsible for addressing the risks. Risk management isn’t necessarily difficult, but it’s definitely different for machine learning algorithms. Rather than studying the actual programming code of the algorithm, you have to pay attention to the outcomes and actual data sets. Financial institutions do this a lot less than they should.

Nancy Albinson: Really understand those algorithms you rely on and that have a high impact or a high risk to your business if something goes awry. I agree that it’s about putting a program in place that monitors not just the design but also the data input. Is there a possibility that someone could manipulate the data along the way to make the results a bit different?

Recognize that risk management of these algorithms is a continuous process and financial institutions need to be proactive. There is a huge competitive advantage to using algorithms and it’s possible to entrust more and more decision-making to these complex algorithms. We’ve seen things go wrong with algorithms so financial institutions need to be ready to manage the risk. Those institutions that are able to manage the risk while leveraging machine learning algorithms will have a competitive advantage in the market.
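
One concrete way to “pay attention to the outcomes” rather than the code, as Krishna suggests, is to compare approval rates across customer groups. The 80% (“four-fifths”) rule of thumb, borrowed from US employment law, is often used as a first-pass disparate-impact screen; the groups and decisions below are illustrative:

```python
# First-pass outcome monitoring: compare approval rates between two groups
# and apply the four-fifths rule of thumb. Data is illustrative.

def approval_rate(decisions):
    """Share of approvals in a list of 1 (approve) / 0 (decline) decisions."""
    return sum(decisions) / len(decisions)

def disparate_impact_ratio(group_a, group_b):
    """Ratio of the lower approval rate to the higher one."""
    ra, rb = approval_rate(group_a), approval_rate(group_b)
    return min(ra, rb) / max(ra, rb)

group_a = [1, 1, 1, 0, 1, 1, 0, 1]   # 75% approved
group_b = [1, 0, 0, 1, 0, 0, 1, 0]   # 37.5% approved
ratio = disparate_impact_ratio(group_a, group_b)
print(round(ratio, 2))  # 0.5 — below the 0.8 rule-of-thumb threshold
```

A ratio this far below 0.8 would not prove bias on its own, but it is exactly the kind of outcome-level signal that should trigger a closer look at the model and its input data.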

Calculating Your Algorithmic Risk

Deloitte recommends that financial institutions assess their maturity in managing the risk of machine learning algorithms by asking the following questions:

  • Do you have a good handle on where algorithms are deployed?
  • Have you evaluated the potential impact should these algorithms function improperly?
  • Does senior management understand the need to manage algorithmic risks?
  • Do you have a clearly established governance structure for overseeing the risks emanating from algorithms?
  • Do you have a program in place to manage these risks? If so, are you continuously enhancing the program over time as technologies and requirements evolve?