Ethical implications of using browsing data for assessing credit worthiness

This post doesn’t try to answer how browsing history can be used to assess credit worthiness, or how effective such a method would be. I assume it is possible and accurate enough to be useful for lenders. Instead of the technical questions, I explore the ethical ones, by constructing hypothetical scenarios and highlighting the important questions they raise. Throughout, I assume that browsing history is categorised as personal data (e.g., a person’s phone number or product preferences) rather than non-personal data (e.g., weather records).

It might not be okay to sacrifice privacy to decrease the price of a loan.

Imagine a woman approaching a lender. She is employed and has a good credit history (she has paid off her previous loans on time, she has never been unemployed since graduation, etc.). The lender agrees to give her a loan and makes a proposal: if she grants them regular access to her browsing history, they will significantly reduce the interest rate. What should the woman do? The reduced interest rate is not the loan’s full price: if sharing her data has negative unintended consequences, the real price goes up. So how should she weigh the pros and cons of trading privacy for a lower interest rate? Quantifying this “price” is not easy.

This sacrifice could end up hurting a borrower because of who they are rather than what they do.

For example, what if her browsing history suggests that she is pregnant? What if her current employer is known to be sexist and has a history of firing pregnant women? If the lender gains access to this data before signing the loan’s terms and conditions, how should they respond? The lender may recognise the unfairness of the situation, but they are paid to weigh the company’s goals above the woman’s. If the woman loses her job, the chances that she can make her monthly payments on time go down. When every stakeholder looks out for themselves, the emergent system can be biased and unfair.

It’s hard to quantify pros and cons.

The previous scenario assumes the lender will participate in the unfair system by rejecting the loan. But what if they don’t? What if they embrace the inherent uncertainty present in any scenario? They might conclude that this hard-working woman can find another job or start her own business. They might even empower her by giving her access to resources that increase her chances of doing so. Such a lender monitors her browsing history to understand what resources she needs. But are the lender’s intentions at one point in time enough to conclude that the loan’s “price” is lower in this scenario (decreased privacy) than in the baseline scenario (the lender has no access to the borrower’s browsing history)?

Short-term advantages might not outweigh possible long-term disadvantages.

What if this lender employs lawyers who can use cleverly designed loopholes to sell the borrower’s data to third parties when the company faces severe financial difficulty? What is the “price” of the loan now? Intentions change over time, and a minor decision can snowball into major consequences. It might turn out that the woman wasn’t pregnant, or she might decide to terminate her pregnancy; but that is not the point. The point is that there are unforeseen system-wide effects that are not accounted for when calculating this price. The previous scenario explores only a limited set of consequences for the woman herself. Now imagine that she is part of a large and active network of other people.

There are potential system-wide effects you could be ignoring.

She is a densely connected node with the ability to influence a large number of people. What if others see her example and agree to sacrifice their privacy too? Not all lenders will be willing to empathise with their borrowers, and not all will be bound by laws that protect borrowers’ security and privacy. How long will they store the data? How often will they collect it? What if browsing patterns reveal signs of mental illness? A lender might empathise with a struggling pregnant woman, but can the same lender empathise with other forms of struggle? Regardless, the previous scenarios assume that borrowers won’t game the system or unintentionally compromise the quality of the data.

It’s hard to decide whose needs, the borrower’s or the lender’s, should be prioritised.

Most of the questions above explore a borrower’s worldview. But not all lenders will want to evaluate the use of browsing history for assessing credit worthiness from a borrower-first perspective. Machine learning models are trained, tested, and validated on data, and larger datasets generally yield more accurate predictions. Highly accurate predictions can give a lender a competitive advantage over other lenders, and that advantage can increase profits and benefit shareholders. Moreover, if lender X is competing with other lenders that have data-driven business models, lender X might not have the luxury of prioritising a borrower’s long-term welfare.

Even if the pros do outweigh the cons, a lender might not be able to ensure the quality of browsing data.

There might be ways to leverage browsing data without hurting borrowers, by regulating and anonymising it. But the data itself might not be good enough for a machine learning algorithm to mine. For example, what if two or more people use the same laptop or phone to browse the Internet? What if the borrower uses a virtual private network or browses in incognito mode? FinTech companies, especially those that target credit markets, often need to implement explainable or interpretable AI models. If borrowers understand which browsing patterns lead to lower interest rates or other rewards, they can exploit that knowledge, defeating the purpose of the model.
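To make the gaming risk concrete, here is a toy sketch of how an interpretable scoring model exposes the patterns a borrower could imitate. Every feature name and weight below is hypothetical, invented purely for illustration; no real lender’s model is implied:

```python
import math

# Hypothetical, hand-set weights for an interpretable credit score built on
# browsing-derived features. In a real system these would be learned from data,
# but in an interpretable model they are inspectable either way.
WEIGHTS = {
    "visits_financial_news": 0.8,   # assumed to signal financial literacy
    "visits_gambling_sites": -1.2,  # assumed to signal risk
    "late_night_browsing": -0.3,
}
BIAS = 0.1

def approval_probability(features: dict) -> float:
    """Logistic score: a higher value means a lower offered interest rate."""
    z = BIAS + sum(WEIGHTS[k] * features.get(k, 0.0) for k in WEIGHTS)
    return 1 / (1 + math.exp(-z))

honest = {"visits_financial_news": 0.2, "visits_gambling_sites": 0.5}
# A borrower who can see (or infer) the weights simply imitates the rewarded
# pattern, raising their score without any real change in underlying risk:
gamed = {"visits_financial_news": 1.0, "visits_gambling_sites": 0.0}

print(approval_probability(honest))  # lower score
print(approval_probability(gamed))   # higher score
```

Because the model is linear and its weights are visible, the cheapest way to a better rate is to change browsing behaviour, not financial behaviour, which is exactly the exploit the paragraph above describes.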


Today, data is often called the new oil. And like fossil fuels, data can be misused: surveillance capitalism is to data what climate change is to fossil fuels. This post looks at only one source of data, and it leaves many questions unanswered for both lenders and borrowers. Each party can answer each question in multiple ways depending on their goals, and the consequences of their actions are not limited to themselves: actions today will affect people tomorrow, and actions taken in one place will affect people elsewhere. Technology is not inherently good or evil; what we do with it makes it so. Seemingly good things can turn out to be harmful later, and vice versa. And we can unintentionally design positive feedback loops, vicious cycles that lead to tipping points we can’t come back from.

This post was originally published on Medium by Drasti Shah.