The Questions For AI

Interview by Joseph Wakelee-Lynch

MacArthur Fellow Safiya Umoja Noble says artificial intelligence, like other algorithmic technologies, is best approached critically.

Safiya Umoja Noble, an internet studies scholar, is highly regarded for her work on the algorithmic harms of commercial search engines. She is the author of “Algorithms of Oppression: How Search Engines Reinforce Racism” and was named a MacArthur Fellow in 2021. Noble holds the David O. Sears Presidential Endowed Chair of Social Sciences and is a professor at the University of California, Los Angeles. She is also the director of the Center on Resilience & Digital Justice at UCLA. Noble is the keynote speaker at “AI: Discernment, Made Here,” an academic symposium on Dec. 1, 2025, that is part of the celebration of the inauguration of Thomas Poon, Ph.D., as the 17th president of LMU. She was interviewed by Joseph Wakelee-Lynch, editor of LMU Magazine.

In your book “Algorithms of Oppression: How Search Engines Reinforce Racism,” which looked closely at Google Search, you make the point that search engines do not produce unbiased results when queried. That point seems to be broadly accepted since the book’s appearance in 2018. Do you have the same concerns about AI technology?

On the heels of all that we have studied and learned about search technology, and the way search technologies have been deeply influenced by advertising tech, advertisers, and those who are able to optimize search results, I certainly have deep concerns about large language models. With knowledge and information, copyrighted materials, hate speech, and more being scraped from the web to train large language models, what concerns me is that there isn’t really a rank-ordered presentation of information that comes back through a chatbot. Instead, you have an anthropomorphized, or human-like, interface that presents information as if it were the best insight, information, or knowledge stemming from all of humanity’s production, if you will. People treat that as incredibly reliable because, as with a search engine, when you are asking questions about things that are banal, fairly uncomplicated, and pretty straightforward, you are going to get a lot of reliable information. That socializes us to believe that when we bring these systems more sophisticated questions, more consequential asks, the results will be just as stable and reliable. But, in fact, nothing could be further from the truth. The latest research shows that large language models generally are only a little more than 50 percent reliable. Many of us who have “kicked the tires” on these models find patently false information. And of course we know that at their absolute worst they are sycophantic, they encourage children and vulnerable people to harm themselves, and they contribute to the deterioration of social relationships and connection. I think these are some of the most dangerous products in the marketplace right now, and they need an incredible amount of regulation and attention.


I would’ve expected that because large language models draw on such a massive amount of information, they’d be less likely to repeat the errors of search engines that you found in “Algorithms of Oppression,” such as reifying racist responses to queries about Black girls. It sounds as if one shouldn’t assume that.

I think that’s right. There was an early chatbot called Tay, a Microsoft chatbot. Within about 24 hours of its release, Nazis online had trained it to be racist. Part of what is training these models is the people who are interacting with them. So you have users, through their interactions with the models, training the models. It’s not unlike other kinds of algorithmically driven systems, like search or social media: The more you look for particular types of things, the more the system will give you those kinds of things. Large language models also now include more disclaimers, so a model will give qualifiers. It won’t necessarily say your query appears to be racist, but it may indicate that you’re asking a question that leads to stereotyping. There’s a kind of training now that clearly lets the product designers off the hook for being responsible for whether the model is trained on racism, sexism, or discrimination. It’s almost as if the product is not discriminating; you’re just asking a discriminatory question.


You’ve made an interesting point about life in our “data-fied” society by referring to W.E.B. Du Bois’ notion of double consciousness. To paraphrase your explanation, Du Bois said Black people live knowing who they are while living in a society that perceives them, and tells them they are, other than how they know themselves to be. In the case of search engines, algorithms were returning results that portrayed Black people, especially Black girls, in ways that they knew themselves not to be. How does that apply to the ways AI treats information related to Black people?

One of the things that I’ve recently tried to write about, using Du Bois’ concept of double consciousness, is that we live, work, play, and engage in systems that “data-fy” us, that compartmentalize us. They mark us, our bodies, and our movements in a variety of ways. Everybody gets marked in these surveillance technologies, and then we are fed back results, or we have opportunities constrained, or we are discriminated against, or we are evaluated in ways that are completely different from the way we know ourselves. So the data comes to represent us more than our actual selves represent us. As Du Bois remarked, we live with the consciousness of who we are as Black people, and we also know we live in a society that reads us as incomplete, reads us with ideas that are not true, denies us opportunities based on limited, distorted ways of understanding Blackness and Black people. Data systems are similar: They read us and interpret us. So we’re in a double bind, always moving between how a data set reads us and how we read ourselves.

That’s very useful to understand: If we are all reduced to disparate data points, our banking data, GPS location, and social media likes, connected by Palantir Technologies into a profile about us, that profile is distorted and doesn’t tell the whole truth of who we are. I think we want to contend with that, and contend with what it means to be a 5-year-old for whom a system has already predetermined whether they have the capability to go to college. What this ultimately does is reduce our humanity and human agency. We are always working in systems that are based on our past, with no space for a different kind of future. To me, the challenge of living in a hyper-datafied society is that we lose redemption and the sense of infinite possibility, that anything can happen, that we can become anyone we want to be, that we can change, learn, grow, and transform. If you’re always just the sum of the data that’s been collected on you and your past, many dimensions of who you might become are foreclosed.


But does that analogy break down? With his notion of double consciousness, Du Bois was describing two different and opposed ways of seeing a person. He witnessed the Reconstruction era, when Black people were told by America’s white society that they were less than human, and prior to that they had been told they were not only less than human but slaves. So, his concept has to do with personhood. But today the makers of AI technologies might say, “We’re not concerned with identity. We’re just collecting data about things people do. This is about behavior; it’s not about defining you as a person.”

Du Bois was taking on science: the way in which science and scientific systems misrepresented Black people and used scientific measurement and quantification of Black people’s lives to narrate who they were. He wasn’t just taking on cultural misrepresentations, such as minstrelsy, and the ways of culturally dehumanizing people. He was interested in how things as seemingly benign as photographs of Black people, or as clinical as cranial measurements, were much more opaque ways of determining who was fully human and who was not: the interpretation of data. I find that really relevant today, because many of these companies also flatten the human experience into data. But, of course, these flattened misrepresentations of people and communities as just data are a way to obfuscate the specifics of people and their lived experiences.

For example, the 2008 mortgage crisis was, in one way, about the gamification of the financial markets and the ability to statistically identify people for subprime mortgages. On one hand, you could say that Wall Street wasn’t specifically looking for African Americans; they were running computational models on American society. But in the process of data-fying American consumers and potential borrowers, the long histories of discrimination built into our financial markets — redlining, African Americans not being eligible for the G.I. Bill, being precluded through real estate covenants from buying in certain areas, the use of the zip code system as a racial classifier — [came into play] during the mortgage crisis. Most people read that as being about data and consumer behavior. But that consumer behavior and data are embedded in the long history of discrimination. It’s really important for those of us who work with data not to reduce those histories and inequalities in society to just being about data. There are so many systems that appear not to be discriminatory but in fact reflect discrimination and inequality in our society. That’s my response to people who think they’re building neutral systems: They mostly don’t have any understanding of the politics of the data, of what the data really represents. What you have in the mortgage crisis is the largest wipeout of Black wealth in the history of the United States; all of the gains of Reconstruction and the Civil Rights movement were wiped out. I would say that turning people into data makes it easier to manipulate people and to avoid responsibility for discrimination.


The use of AI in university classrooms seems to vary. Some professors prohibit its use; others permit it as long as students are open about how they’ve used it. “Show your work,” in other words. How do you deal with AI in your classroom?

I like to teach my students how to think critically about AI: “What is a data model? What does it represent? What data is missing? What aspects of making data are impossible to capture and quantify?” We also interrogate who makes these systems. We talk about Kenyan data cleaners, coltan miners, and the people who have to make the hardware. We talk about copyright infringement, the environment, and the incredible amounts of energy these systems take. Rather than talking about these products as simply tools, we understand them as a set of economic, social, and political choices that involve millions of people, whether they are beneficiaries of using large language models or cogs in the wheel of exploitation that makes them. Those are very rich, fascinating, important ideas to talk about with students, because they are being fed the same propaganda as everyone else in the marketing materials that come out of these big companies, with all the promises and none of the consequences or responsibilities that come along with them. Then they can decide whether to burn down a tree to use ChatGPT to tell them where to have dinner tonight. Rather than making a moral judgment about whether AI is good or bad, I want students to be able to ask, “Is it worth it for me to use it? When is it necessary? Do I understand the logic of how to arrive at a solution if we don’t have AI?” Those critical thinking skills are the best way to engage with these products.


To modify a common analogy, it seems in many ways as if the horse has already gotten out of the barn. But we can’t give up on building fences — the guardrails. What do you think the guardrails for AI technology look like, and will they get built by popular political action, in voting booths, legislatures, and Congress?

I began “Algorithms of Oppression” by saying that artificial intelligence will be a major human rights issue of the 21st century, because it’s more than just a tool. It’s about international labor relations, economies, the environment, the decline of our cognitive capacities, and our mental health and well-being and that of our children. We are already forced to contend with these issues, but we’re seeing responses. I heard recently about nine new lawsuits against OpenAI and Character.AI. There will always be the risk that investors face. The truth is, all of our pensions are propping up these dangerous, capricious investments coming out of venture capital; our pension funds are being used in speculative ways. I think consumers and workers will demand divestment from surveillance, anti-democratic, and authoritarian technologies. That’s, in theory, another powerful front. More and more people are not letting their kids on screens. More and more kids are turning their backs on social media. So there will also be protections that people try to put in place for their families and communities. Everyone has something to say and has a stake in the kind of world we want to live in. Many, many more people are involved in these conversations than ever before. That’s very exciting, because it means that if we don’t want some of these worst, most dangerous kinds of products, then what do we want? What can we make? What else is possible? I think there’s a building appetite for that.