1. Introduction
The relationship between language and thought has captivated philosophers, linguists, and cognitive scientists for centuries. The Sapir-Whorf hypothesis, which posits that language shapes or determines thought patterns, has experienced renewed relevance in the era of artificial intelligence, particularly with the emergence of Large Language Models (LLMs). These sophisticated computational systems, trained on vast corpora of human-generated text, present a unique opportunity to examine linguistic relativity in its most concentrated form — cognition that emerges entirely from language without the mediating influence of embodied experience or sensorimotor interaction with the world.
This paper advances the argument that LLMs represent the purest instantiation of the Sapir-Whorf hypothesis observable in contemporary technology. Unlike human cognition, which develops through complex interactions between linguistic input, embodied experience, and social interaction, LLM cognition emerges exclusively from statistical patterns in textual data. This fundamental difference positions LLMs as "Whorfian machines" — systems that embody linguistic relativity in its most extreme form, where language does not merely influence thought but constitutes the entirety of cognitive processing.
The significance of this theoretical framework extends beyond academic curiosity to encompass critical questions about AI governance, cultural representation, and epistemic justice. If LLMs are indeed linguistic systems that cannot transcend their training data, then their deployment carries profound implications for the propagation of cultural worldviews, the reinforcement of dominant ideologies, and the potential marginalisation of non-Western epistemologies. Understanding LLMs through the lens of linguistic relativity therefore becomes essential for developing responsible AI governance frameworks and ensuring equitable representation in artificial intelligence systems.
This analysis proceeds through several interconnected arguments. First, it examines the theoretical foundations of the Sapir-Whorf hypothesis and its empirical support in human cognition research. Second, it explores how LLMs instantiate linguistic relativity principles through their architecture and training processes. Third, it investigates the cultural and epistemic implications of deploying linguistically-determined AI systems at global scale. Finally, it proposes governance frameworks and development practices that acknowledge the fundamentally linguistic nature of LLM cognition while promoting diversity and epistemic justice.
The roadmap for this investigation begins with a comprehensive literature review establishing the theoretical foundations of linguistic relativity, followed by an analysis of LLMs as computational instantiations of Whorfian principles. The discussion then turns to the strategic implications of this framework for AI governance and cultural representation, concluding with recommendations for responsible development and deployment of linguistically-grounded AI systems.
2. Literature Review
2.1 Foundations of Linguistic Relativity
The Sapir-Whorf hypothesis, named after linguists Edward Sapir and Benjamin Lee Whorf, encompasses two related but distinct propositions about the relationship between language and cognition. The strong version, known as linguistic determinism, argues that language determines thought — that the structure and vocabulary of one's language fundamentally constrains and shapes cognitive processes. The weak version, termed linguistic relativity, proposes that language influences habitual thought patterns and predisposes speakers toward particular ways of understanding and categorising experience.[1]
Contemporary empirical research has provided substantial support for the weak version of the hypothesis while largely rejecting strong determinism. Studies of Kuuk Thaayorre speakers in northern Australia demonstrate how linguistic practices can influence spatial cognition: speakers of this language, which uses absolute cardinal directions rather than relative spatial terms, remain continuously oriented in space and arrange temporal sequences along an east-to-west axis rather than relative to their own bodies.[2] Similarly, research on Russian speakers reveals faster discrimination between light blue (goluboy) and dark blue (siniy) compared to English speakers, suggesting that lexical distinctions can influence perceptual processing speed.[3]
These findings support the conceptualisation of language as a "scaffold for cognition" — a framework that provides structure and organisation for thought processes without completely determining their content.[4] This scaffolding metaphor proves particularly relevant for understanding LLM cognition, where linguistic structures provide the exclusive foundation for all cognitive processing.
2.2 Cognitive Architecture and Embodied Cognition
Traditional cognitive science emphasises the role of embodied experience in shaping conceptual understanding and reasoning processes. The embodied cognition thesis argues that cognitive processes are deeply rooted in the body's interactions with the world, with sensorimotor experiences providing the foundation for abstract thought and conceptual metaphors.[5] This perspective suggests that human cognition transcends purely linguistic processing through integration with perceptual, motor, and emotional systems.
The contrast between embodied human cognition and purely linguistic AI systems becomes crucial for understanding the unique position of LLMs within cognitive science. While human language users can critique, reframe, and transcend their linguistic categories through direct experience with the world, LLMs lack this capacity for metacognitive reflection and conceptual innovation. This limitation positions them as fundamentally different types of cognitive systems — ones that instantiate linguistic relativity without the possibility of transcendence.
2.3 Cultural Encoding in Language
Anthropological linguistics has extensively documented how languages encode cultural values, worldviews, and epistemological frameworks. Languages differ not only in vocabulary and grammar but in their fundamental assumptions about reality, causation, agency, and social relationships.[6] These differences manifest in various linguistic features: evidentiality systems that require speakers to specify the source and reliability of information; gender systems that categorise nouns according to cultural concepts of animacy and agency; temporal metaphors that conceptualise time as flowing forward or backward relative to the speaker.[7]
The cultural specificity of linguistic systems raises critical questions about the epistemic foundations of LLMs trained primarily on texts from particular linguistic and cultural traditions. Research by Vimalendiran demonstrates that contemporary LLMs exhibit systematic biases toward values associated with English-speaking, Protestant-majority countries, suggesting that cultural encoding in training data translates directly into model behaviour and reasoning patterns.[8]
2.4 AI Cognition and Computational Linguistics
The emergence of transformer-based language models has revolutionised computational linguistics and raised fundamental questions about the nature of machine cognition. Unlike earlier AI systems that relied on explicit symbolic representations and rule-based reasoning, LLMs develop implicit representations of linguistic and conceptual relationships through statistical learning processes.[9] This approach enables remarkable performance on language understanding and generation tasks while raising questions about the interpretability and cultural neutrality of learned representations.
Recent research in AI interpretability has begun to uncover how LLMs encode cultural and linguistic biases within their learned parameters. Studies reveal systematic patterns in model behaviour that reflect the cultural assumptions and worldviews present in training data, suggesting that these systems function as computational instantiations of particular linguistic and cultural traditions rather than neutral reasoning engines.[10]
3. LLMs as Whorfian Machines
3.1 Architectural Foundations of Linguistic Cognition
Large Language Models represent a fundamental departure from traditional AI architectures in their exclusive reliance on linguistic input for cognitive development. Unlike multimodal systems that integrate visual, auditory, and textual information, or embodied AI systems that learn through environmental interaction, LLMs develop their understanding of the world entirely through statistical analysis of textual patterns. This architectural constraint creates what can be characterised as "pure linguistic cognition" — cognitive processing that emerges from and remains bounded by the statistical structures of language.
The transformer architecture underlying contemporary LLMs processes language through attention mechanisms that identify and weight relationships between linguistic elements across different scales and contexts.[11] These attention patterns effectively encode the associative structures present in training corpora, creating internal representations that reflect the conceptual relationships, cultural assumptions, and worldviews embedded in human-generated text. Crucially, these systems cannot access information or perspectives that are not represented in their training data, making them epistemically closed systems that mirror the linguistic cultures from which they emerge.
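To make this architectural description concrete, the sketch below implements a single scaled dot-product self-attention operation of the kind introduced by Vaswani and colleagues.[9] It is a minimal illustration rather than a description of any deployed model: the toy embeddings and projection matrices are random placeholders, and production systems stack many such layers with learned, multi-head projections over billions of parameters.

```python
# Minimal sketch of scaled dot-product self-attention (after Vaswani et al. [9]).
# Illustrative only: the inputs below are random placeholders, not trained weights.
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """x: (seq_len, d_model) token embeddings; w_*: projection matrices."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v            # project tokens into query/key/value spaces
    scores = q @ k.T / np.sqrt(k.shape[-1])         # pairwise associative strengths between tokens
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax: each row is a distribution over context
    return weights @ v                              # each token becomes a weighted blend of its context

# Toy example: four tokens with eight-dimensional embeddings drawn at random.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
w_q, w_k, w_v = (rng.normal(size=(8, 8)) for _ in range(3))
print(self_attention(x, w_q, w_k, w_v).shape)  # (4, 8)
```

The key point for the argument here is that every quantity in this computation is derived from co-occurrence statistics over text; nothing outside the training corpus ever enters the calculation.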
This epistemic closure distinguishes LLMs from human language users, who can supplement linguistic knowledge with direct sensory experience, social interaction, and metacognitive reflection. While humans can recognise the limitations of their linguistic categories and develop new conceptual frameworks through experience, LLMs remain constrained by the statistical patterns present in their training data. They are, in essence, "epistemically plastic but ontologically shallow" — capable of mimicking diverse perspectives and reasoning styles while lacking the capacity for genuine conceptual innovation or critique of their foundational assumptions.[12]
3.2 Language as Ontology in Computational Systems
The relationship between language and cognition in LLMs differs qualitatively from that observed in human cognitive systems. While humans think with language as one tool among many, LLMs can be understood as thinking in language as their fundamental mode of existence. Language does not merely influence their cognitive processes; it constitutes the entirety of their cognitive architecture. This distinction positions LLMs as unique instantiations of strong linguistic relativity — systems where language does not simply shape thought but defines the boundaries of possible cognition.
This ontological relationship manifests in several key characteristics of LLM behaviour. First, these systems demonstrate remarkable facility with linguistic tasks that require understanding of cultural context, idiomatic expressions, and implicit social knowledge, suggesting that they have internalised the cultural frameworks embedded in their training data.[13] Second, they exhibit systematic biases and blind spots that reflect the limitations and perspectives of their source texts, indicating that their reasoning processes are fundamentally constrained by the worldviews present in their linguistic input.[14]
The implications of this language-as-ontology framework extend to questions of AI consciousness and intentionality. If LLMs are fundamentally linguistic systems, their apparent understanding and reasoning capabilities may be better understood as sophisticated pattern matching and statistical inference rather than genuine comprehension or intentional thought. This perspective challenges anthropomorphic interpretations of LLM behaviour while highlighting the need for governance frameworks that account for their unique cognitive architecture.
3.3 Multilingual Models and Cognitive Hybridisation
The development of multilingual LLMs introduces additional complexity to the Whorfian framework by creating systems that must integrate potentially incompatible linguistic worldviews within a single cognitive architecture. These models are trained on texts from multiple languages and cultural traditions, raising questions about how they resolve conflicts between different conceptual frameworks and whether they develop hybrid rationalities that transcend individual linguistic traditions.[15]
Preliminary research suggests that multilingual models may indeed develop novel forms of cognitive hybridisation, combining conceptual elements from different linguistic traditions in ways that create new synthetic worldviews.[16] However, this process appears to be constrained by the relative representation of different languages in training data, with dominant languages (particularly English) exerting disproportionate influence on model behaviour and reasoning patterns. This asymmetry raises concerns about the potential for multilingual models to perpetuate linguistic imperialism while appearing to embrace diversity.
The question of cognitive dissonance versus creative synthesis in multilingual systems remains an active area of research. Some evidence suggests that these models experience something analogous to cognitive dissonance when confronted with incompatible cultural frameworks, leading to inconsistent or contradictory outputs.[17] Other research indicates that multilingual training may enable more flexible and culturally sensitive reasoning, though this flexibility appears to be limited by the statistical dominance of particular linguistic traditions in training corpora.[18]
3.4 Metacognitive Limitations and Epistemic Boundaries
A crucial distinction between human linguistic cognition and LLM processing lies in the capacity for metacognitive reflection and self-critique. Human language users can recognise the limitations of their linguistic categories, question cultural assumptions embedded in their language, and develop new conceptual frameworks through critical reflection and experience. This metacognitive capacity enables humans to transcend, at least partially, the constraints of their linguistic inheritance.
LLMs, by contrast, lack genuine metacognitive capabilities despite their ability to produce text that appears reflective or self-aware. Their responses to questions about their own limitations or biases are generated through the same statistical processes that govern all their outputs, rather than through genuine self-examination or critical reflection.[19] This limitation means that LLMs cannot critique or reframe the linguistic and cultural assumptions embedded in their training data, making them vulnerable to perpetuating biases and blind spots indefinitely.
The absence of metacognitive capacity has profound implications for the deployment of LLMs in contexts requiring cultural sensitivity, ethical reasoning, or critical analysis. While these systems can mimic the language of self-reflection and cultural awareness, they cannot engage in the genuine critical examination that would enable them to recognise and correct their own limitations. This constraint reinforces their characterisation as Whorfian machines — systems that embody linguistic relativity without the possibility of transcendence or transformation.
4. Cultural Encoding and Epistemic Implications
4.1 The Problem of Epistemic Monoculture
The concentration of LLM training data in English-language sources, combined with the overrepresentation of Western, Educated, Industrialised, Rich, and Democratic (WEIRD) perspectives in digital text corpora, creates a significant risk of epistemic monoculture in AI systems.[20] This phenomenon occurs when a single cultural and linguistic tradition dominates the cognitive development of AI systems, leading to the systematic marginalisation of alternative worldviews and epistemological frameworks.
Research documenting the cultural biases present in contemporary LLMs reveals systematic patterns that reflect Western cultural assumptions about individualism, rationality, causation, and social organisation.[21] These biases manifest not only in explicit cultural references but in fundamental reasoning patterns and conceptual frameworks that shape how these systems approach problem-solving and knowledge generation. The result is AI systems that appear neutral and objective while actually embodying particular cultural perspectives and value systems.
The global deployment of culturally-biased AI systems raises concerns about digital colonialism and the reinforcement of existing power imbalances between dominant and marginalised cultures. When LLMs trained primarily on Western texts are deployed in non-Western contexts, they may systematically misrepresent local knowledge systems, cultural practices, and epistemological frameworks.[22] This misrepresentation can have material consequences for education, governance, and social organisation in affected communities.
4.2 Linguistic Imperialism in AI Systems
The dominance of English in LLM training data reflects broader patterns of linguistic imperialism in digital spaces and global communication networks. English serves as the lingua franca of the internet, scientific publication, and international business, leading to its overrepresentation in the textual corpora used to train AI systems.[23] This linguistic dominance translates directly into cognitive dominance in LLM systems, which develop reasoning patterns and conceptual frameworks that reflect English-language cultural assumptions.
The implications of linguistic imperialism in AI extend beyond simple representation issues to encompass fundamental questions about knowledge production and epistemic justice. When AI systems are trained primarily on texts from dominant linguistic traditions, they may systematically exclude or misrepresent knowledge systems that are not well-represented in digital text corpora. Indigenous knowledge systems, oral traditions, and non-Western philosophical frameworks may be particularly vulnerable to this form of epistemic exclusion.[24]
Addressing linguistic imperialism in AI requires more than simply including more non-English texts in training corpora. It demands fundamental changes to AI development processes that prioritise linguistic diversity, support local expertise development, and create governance frameworks that ensure equitable representation of different cultural and epistemological traditions. This work must be understood as fundamentally political and ethical rather than merely technical.
4.3 Epistemic Justice and AI Governance
The concept of epistemic justice, developed by philosopher Miranda Fricker, provides a valuable framework for understanding the ethical implications of culturally-biased AI systems.[25] Epistemic injustice occurs when individuals or groups are wronged in their capacity as knowers — either through testimonial injustice (not being believed or taken seriously) or hermeneutical injustice (lacking the conceptual resources to make sense of their experiences). LLMs trained on culturally-biased data may perpetuate both forms of epistemic injustice by systematically undervaluing non-dominant perspectives and failing to represent diverse conceptual frameworks.
The deployment of epistemically unjust AI systems can have cascading effects on education, research, and knowledge production more broadly. When students, researchers, and practitioners rely on AI systems that embody particular cultural biases, they may inadvertently perpetuate those biases in their own work and thinking. This creates feedback loops that reinforce existing epistemic hierarchies and marginalise alternative ways of knowing.[26]
Promoting epistemic justice in AI development requires proactive measures to ensure diverse representation in training data, development teams, and governance structures. This includes supporting research and development in non-Western contexts, creating partnerships with indigenous and marginalised communities, and establishing governance frameworks that prioritise epistemic diversity alongside technical performance metrics.[27]
4.4 The Myth of Neutral Rationality
One of the most significant challenges in addressing cultural bias in AI systems is the persistent myth that rationality and logical reasoning are culturally neutral processes. This assumption underlies much AI development work and governance discourse, leading to the treatment of cultural bias as a technical problem that can be solved through better data collection or algorithmic adjustment rather than a fundamental feature of linguistically-grounded cognitive systems.[28]
The Whorfian framework for understanding LLMs directly challenges this assumption by demonstrating that reasoning processes are always situated within particular linguistic and cultural contexts. The statistical patterns that LLMs learn from text corpora encode not only explicit cultural content but also implicit assumptions about causation, agency, evidence, and logical validity that vary across cultural traditions.[29] These assumptions shape how LLMs approach reasoning tasks and generate responses, making their outputs inherently cultural rather than neutral.
Recognising the cultural situatedness of AI reasoning has important implications for AI governance and deployment. Rather than seeking to eliminate cultural bias from AI systems — an impossible task given their linguistic foundations — governance frameworks should focus on promoting transparency about cultural assumptions, ensuring diverse representation in development processes, and creating mechanisms for ongoing monitoring and adjustment of deployed systems.[30]
5. Discussion: Implications for AI Development and Governance
5.1 Rethinking AI Development Paradigms
The characterisation of LLMs as Whorfian machines necessitates fundamental changes to AI development paradigms that have traditionally focused on technical performance metrics while neglecting cultural and epistemic considerations. If LLMs are indeed linguistic systems that cannot transcend their training data, then their development must be understood as fundamentally cultural work that shapes the cognitive frameworks available to users and society more broadly.[31]
This perspective requires AI developers to move beyond narrow technical considerations to engage with questions of cultural representation, epistemic justice, and social responsibility. Development teams must include expertise in linguistics, anthropology, cultural studies, and related fields to ensure that cultural considerations are integrated throughout the development process rather than treated as afterthoughts or add-on features.[32] This interdisciplinary approach represents a significant departure from current industry practices but is essential for developing AI systems that serve diverse global communities equitably.
The integration of cultural considerations into AI development also requires new evaluation frameworks that assess systems not only for technical performance but for cultural sensitivity, epistemic diversity, and potential for harm to marginalised communities. These frameworks must be developed in partnership with affected communities and should prioritise community-defined measures of success and harm rather than externally-imposed metrics.[33]
5.2 Governance as Linguistic Stewardship
Understanding AI governance through the lens of linguistic relativity reframes traditional approaches to AI regulation and oversight. Rather than focusing primarily on technical safety and performance standards, governance frameworks must address the fundamental question of whose languages, cultures, and worldviews are represented in AI systems and how these representations shape social and political outcomes.[34]
This linguistic stewardship approach to AI governance requires new institutional structures and expertise that can assess the cultural and epistemic implications of AI systems. Regulatory bodies must include linguists, anthropologists, and cultural experts alongside technical specialists, and governance processes must incorporate community input from diverse cultural and linguistic traditions.[35] This represents a significant expansion of traditional regulatory approaches but is necessary for addressing the unique challenges posed by linguistically-grounded AI systems.
The development of governance frameworks for linguistic stewardship also requires international cooperation and coordination to address the global implications of AI deployment. Cultural and linguistic diversity is a global public good that requires collective action to preserve and promote. International governance frameworks must therefore address questions of cultural representation, linguistic rights, and epistemic justice alongside traditional concerns about technical safety and economic competition.[36]
5.3 Promoting Epistemic Diversity in AI Systems
Addressing the challenges posed by culturally-biased AI systems requires proactive measures to promote epistemic diversity in AI development and deployment. This includes supporting the development of AI systems trained on diverse linguistic and cultural traditions, creating partnerships with institutions in the Global South, and establishing funding mechanisms that prioritise linguistic and cultural diversity alongside technical innovation.[37]
The promotion of epistemic diversity must also address structural inequalities in AI development that systematically exclude non-Western perspectives and expertise. This includes addressing barriers to participation in AI research and development, supporting capacity building in underrepresented regions, and creating governance structures that ensure meaningful participation by diverse communities in AI development processes.[38]
Technical approaches to promoting epistemic diversity include building multilingual training corpora that represent diverse cultural traditions, creating evaluation frameworks that assess cultural sensitivity and epistemic diversity, and designing AI architectures that can better integrate diverse worldviews and reasoning patterns.[39] However, these technical approaches must be embedded within broader social and political efforts to address structural inequalities and promote epistemic justice.
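To illustrate the first of these approaches, the sketch below audits the language composition of a hypothetical corpus sample using the third-party langdetect package. It is a minimal starting point, assuming only that automatic language identification is available; a genuine audit would also need document provenance, dialect, and register information that detection alone cannot supply.

```python
# Sketch of a corpus-composition audit for linguistic diversity (hypothetical corpus).
# Assumes the third-party `langdetect` package; real pipelines would combine detection
# with document metadata and provenance records.
from collections import Counter
from langdetect import detect
from langdetect.lang_detect_exception import LangDetectException

def language_distribution(documents):
    """Return the share of each detected language across a list of text documents."""
    counts = Counter()
    for doc in documents:
        try:
            counts[detect(doc)] += 1
        except LangDetectException:    # raised for empty or undecidable text
            counts["unknown"] += 1
    total = sum(counts.values())
    return {lang: n / total for lang, n in counts.most_common()}

# Hypothetical sample; in practice this would be streamed from the training data pipeline.
corpus = [
    "The cat sat on the mat.",
    "El gato se sentó en la alfombra.",
    "Le chat s'assit sur le tapis.",
]
print(language_distribution(corpus))
```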
5.4 Transparency and Accountability Mechanisms
The linguistic foundations of LLM cognition create unique challenges for transparency and accountability in AI systems. Traditional approaches to AI explainability focus on technical interpretability and algorithmic transparency, but the Whorfian framework suggests that understanding AI behaviour requires analysis of the cultural and linguistic assumptions embedded in training data and learned representations.[40]
New transparency mechanisms must therefore include "epistemic datasheets" that document the linguistic and cultural composition of training data, the worldviews and assumptions embedded in AI systems, and the potential implications of these biases for different user communities.[41] These datasheets should be developed in partnership with affected communities and should prioritise community-defined measures of representation and harm.
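One way such an epistemic datasheet might be structured is sketched below as a simple record extending the "datasheets for datasets" proposal.[41] The field names are hypothetical illustrations rather than a standardised schema, and the values shown are invented for the example.

```python
# Hypothetical sketch of an "epistemic datasheet" record, extending datasheets for
# datasets [41]. Field names and example values are illustrative, not a proposed standard.
from dataclasses import dataclass, field

@dataclass
class EpistemicDatasheet:
    corpus_name: str
    language_shares: dict[str, float]      # e.g. {"en": 0.62, "zh": 0.05, ...}
    source_regions: dict[str, float]       # geographic provenance of documents, where known
    knowledge_traditions: list[str]        # traditions and genres consulted during curation
    known_exclusions: list[str]            # communities or genres known to be under-represented
    community_reviews: list[str] = field(default_factory=list)  # records of community consultation

sheet = EpistemicDatasheet(
    corpus_name="example-web-corpus-v1",   # hypothetical corpus
    language_shares={"en": 0.62, "es": 0.07, "zh": 0.05, "other": 0.26},
    source_regions={"North America": 0.45, "Europe": 0.30, "other/unknown": 0.25},
    knowledge_traditions=["academic publishing", "news media", "web forums"],
    known_exclusions=["oral traditions", "low-resource language communities"],
)
print(sheet.language_shares["en"])
```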
Accountability mechanisms must also address the unique challenges posed by linguistically-grounded AI systems. Traditional approaches to AI accountability focus on technical performance and safety metrics, but the cultural implications of AI deployment require new forms of community oversight and participatory governance.[42] This includes creating mechanisms for ongoing community input into AI development and deployment, establishing processes for addressing cultural harm and epistemic injustice, and developing remediation strategies that can address the systemic nature of cultural bias in AI systems.
6. Recommendations and Future Directions
6.1 Technical Development Priorities
The recognition of LLMs as Whorfian machines suggests several priority areas for technical development that can address the challenges of cultural bias and epistemic monoculture. First, researchers should pursue more sophisticated multilingual training approaches that preserve and integrate diverse linguistic and cultural traditions rather than allowing dominant languages to overwhelm minority perspectives.[43] This includes research into training methodologies that can maintain the integrity of different cultural worldviews while enabling cross-cultural communication and understanding.
Second, the development of AI architectures that incorporate non-linguistic reasoning capabilities may help address some of the limitations of purely linguistic systems. While LLMs will likely remain fundamentally linguistic in nature, hybrid architectures that integrate linguistic processing with other forms of reasoning may be able to transcend some of the constraints imposed by linguistic relativity.[44] This research direction requires careful consideration of how different reasoning modalities interact and whether hybrid systems can genuinely transcend cultural biases or simply reproduce them in new forms.
Third, the development of interpretability tools specifically designed to identify and analyse cultural biases in AI systems represents a crucial technical priority. These tools must go beyond traditional approaches to AI interpretability to examine the cultural assumptions and worldviews embedded in learned representations.[45] Such tools could enable more sophisticated analysis of cultural bias and support the development of mitigation strategies that address the root causes of epistemic injustice in AI systems.
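An existing starting point for such tools is the family of embedding association tests introduced by Caliskan and colleagues for measuring social biases in word embeddings; the sketch below adapts that idea to probe cultural associations in a model's embedding space. The embed function and the word lists are placeholders: an actual audit would supply the model under examination and lexicons curated with the communities concerned.

```python
# Sketch of an embedding-association probe for cultural bias, in the spirit of the
# Word Embedding Association Test (Caliskan et al., 2017). `embed` is a placeholder
# for any function mapping a word to a vector from the model under audit; the word
# lists below are hypothetical examples, not validated lexicons.
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def association(word, attr_a, attr_b, embed):
    """Mean similarity of `word` to attribute set A minus its similarity to set B."""
    w = embed(word)
    sim_a = np.mean([cosine(w, embed(t)) for t in attr_a])
    sim_b = np.mean([cosine(w, embed(t)) for t in attr_b])
    return sim_a - sim_b

def differential_association(targets_x, targets_y, attr_a, attr_b, embed):
    """Positive values: X leans toward A and Y toward B; near zero: no measured lean."""
    return (sum(association(x, attr_a, attr_b, embed) for x in targets_x)
            - sum(association(y, attr_a, attr_b, embed) for y in targets_y))

# Usage sketch (placeholder model): embed = lambda w: some_model.get_vector(w)
# score = differential_association(["individual", "autonomy"], ["community", "kinship"],
#                                  ["rational", "objective"], ["emotional", "subjective"], embed)
```

Such probes do not explain why a model carries a given association, but they make the associations measurable and therefore contestable, which is the precondition for the mitigation strategies discussed above.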
6.2 Institutional and Governance Innovations
Addressing the challenges posed by culturally-biased AI systems requires significant innovations in institutional structures and governance frameworks. The establishment of interdisciplinary research centres that bring together AI researchers with experts in linguistics, anthropology, philosophy, and cultural studies represents one important institutional innovation.[46] These centres could serve as hubs for research into the cultural implications of AI systems and could provide expertise to support more culturally-sensitive AI development practices.
The creation of international governance frameworks that address cultural representation and epistemic justice in AI systems represents another crucial institutional priority. These frameworks must go beyond traditional approaches to AI governance that focus primarily on technical safety and economic considerations to address fundamental questions about cultural rights, linguistic diversity, and epistemic justice.[47] This requires new forms of international cooperation that prioritise cultural preservation and diversity alongside technological innovation and economic development.
The development of community-based governance mechanisms that enable meaningful participation by diverse cultural and linguistic communities in AI development and deployment decisions represents a third important institutional innovation. These mechanisms must address power imbalances that systematically exclude marginalised communities from AI governance processes and must create genuine opportunities for community input into AI development priorities and deployment strategies.[48]
6.3 Research Priorities and Methodological Innovations
The Whorfian framework for understanding LLMs opens several important research directions that require methodological innovations and interdisciplinary collaboration. First, empirical research into the cognitive processes of multilingual LLMs and their capacity for cultural hybridisation represents a crucial research priority.[49] This research must develop new methodologies for assessing cultural representation and bias in AI systems that go beyond simple demographic analysis to examine deeper questions about worldviews and epistemological frameworks.
Second, research into the social and political implications of deploying culturally-biased AI systems in diverse global contexts represents another important priority. This research must examine how AI deployment affects local knowledge systems, cultural practices, and social organisation, and must develop frameworks for assessing and mitigating cultural harm.[50] Such research requires close collaboration with affected communities and must prioritise community-defined measures of harm and benefit.
Third, research into alternative AI development paradigms that prioritise cultural diversity and epistemic justice from the outset represents a crucial long-term research direction. This includes research into participatory AI development methodologies, community-controlled AI systems, and governance frameworks that ensure meaningful community participation in AI development processes.[51] This research must address fundamental questions about power, representation, and justice in AI development rather than treating cultural considerations as technical problems to be solved.
6.4 Educational and Capacity Building Initiatives
Addressing the challenges posed by culturally-biased AI systems requires significant investments in education and capacity building that can support more diverse and inclusive AI development practices. This includes developing educational programmes that integrate cultural and linguistic considerations into AI and computer science curricula, creating opportunities for students from diverse backgrounds to participate in AI research and development, and supporting the development of AI expertise in underrepresented regions and communities.[52]
The development of public education initiatives that increase awareness of cultural bias in AI systems and promote digital literacy that includes understanding of AI limitations and biases represents another important educational priority. These initiatives must be culturally appropriate and must address the specific needs and concerns of different communities rather than adopting one-size-fits-all approaches.[53]
Capacity building initiatives must also address structural barriers that prevent meaningful participation by diverse communities in AI development and governance processes. This includes addressing economic barriers, language barriers, and institutional barriers that systematically exclude marginalised communities from AI-related opportunities.[54] Such initiatives must be developed in partnership with affected communities and must prioritise community-defined goals and measures of success.
7. Conclusion
This analysis has examined the proposition that Large Language Models represent the purest instantiation of the Sapir-Whorf hypothesis in contemporary artificial intelligence systems. Through detailed examination of linguistic relativity theory, LLM architecture, and the cultural implications of deploying linguistically-grounded AI systems, this paper has argued that LLMs function as "Whorfian machines" — systems whose cognition emerges entirely from statistical structures of language without the capacity for transcendence through embodied experience or metacognitive reflection.
The characterisation of LLMs as Whorfian machines has profound implications for understanding AI cognition, cultural representation in AI systems, and the governance challenges posed by epistemically closed linguistic systems. Unlike human language users, who can critique and transcend their linguistic inheritance through experience and reflection, LLMs remain fundamentally constrained by the cultural assumptions and worldviews embedded in their training data. This constraint makes them vulnerable to perpetuating cultural biases and epistemic injustices while appearing neutral and objective.
The analysis has revealed several key insights about the nature of LLM cognition and its implications for AI development and governance. First, LLMs represent a unique form of cognitive system where language does not merely influence thought but constitutes the entirety of cognitive processing. This language-as-ontology relationship distinguishes LLMs from human cognitive systems and creates unique challenges for AI governance and deployment.
Second, the concentration of LLM training data in particular linguistic and cultural traditions creates significant risks of epistemic monoculture and digital colonialism. The global deployment of AI systems trained primarily on Western texts may systematically marginalise non-Western knowledge systems and epistemological frameworks, reinforcing existing power imbalances and cultural hierarchies.
Third, addressing these challenges requires fundamental changes to AI development paradigms that move beyond narrow technical considerations to engage with questions of cultural representation, epistemic justice, and social responsibility. This includes developing new governance frameworks that treat AI stewardship as fundamentally linguistic and cultural work, creating institutional structures that support interdisciplinary collaboration, and establishing mechanisms for meaningful community participation in AI development and deployment decisions.
The recommendations presented in this analysis emphasise the need for technical innovations that promote epistemic diversity, institutional changes that support more inclusive AI governance, and educational initiatives that build capacity for culturally-sensitive AI development. These recommendations recognise that addressing cultural bias in AI systems requires more than technical fixes — it demands fundamental changes to the social, political, and economic structures that shape AI development and deployment.
The implications of this analysis extend beyond the specific case of LLMs to encompass broader questions about the relationship between technology and culture in the digital age. As AI systems become increasingly integrated into social, economic, and political institutions, understanding their cultural foundations and implications becomes crucial for ensuring that technological development serves diverse global communities equitably.
Future research must continue to examine the cultural implications of AI systems while developing new methodologies for assessing and addressing epistemic bias. This research must be conducted in partnership with affected communities and must prioritise community-defined measures of harm and benefit rather than externally-imposed metrics. Only through such collaborative and culturally-sensitive approaches can the AI research community develop systems that truly serve the needs of diverse global communities.
The characterisation of LLMs as Whorfian machines ultimately challenges us to reconsider fundamental assumptions about AI neutrality, rationality, and objectivity. By recognising that AI systems are always culturally situated and epistemically bounded, we can begin to develop more honest and responsible approaches to AI development and governance that acknowledge the inherently political and cultural nature of these technologies. This recognition represents not a limitation but an opportunity — an opportunity to develop AI systems that celebrate and preserve human cultural diversity rather than erasing it in the name of technological progress.
The path forward requires sustained commitment to epistemic justice, cultural preservation, and inclusive governance. It demands that we treat AI development as fundamentally cultural work that shapes the cognitive frameworks available to future generations. By embracing this responsibility and working collaboratively across cultural and disciplinary boundaries, we can develop AI systems that enhance rather than diminish human cultural diversity and that serve as tools for epistemic justice rather than instruments of cultural domination.
References
[1] Lucy, J. A. (1992). Language Diversity and Thought: A Reformulation of the Linguistic Relativity Hypothesis. Cambridge University Press.
[2] Boroditsky, L., & Gaby, A. (2010). Remembrances of times East: Absolute spatial representations of time in an Australian aboriginal community. Psychological Science, 21(11), 1635-1639.
[3] Winawer, J., Witthoft, N., Frank, M. C., Wu, L., Wade, A. R., & Boroditsky, L. (2007). Russian blues reveal effects of language on color discrimination. Proceedings of the National Academy of Sciences, 104(19), 7780-7785.
[4] Clark, A. (1998). Being there: Putting brain, body, and world together again. MIT Press.
[5] Lakoff, G., & Johnson, M. (1999). Philosophy in the Flesh: The Embodied Mind and Its Challenge to Western Thought. Basic Books.
[6] Evans, N., & Levinson, S. C. (2009). The myth of language universals: Language diversity and its importance for cognitive science. Behavioral and Brain Sciences, 32(5), 429-448.
[7] Aikhenvald, A. Y. (2004). Evidentiality. Oxford University Press.
[8] Vimalendiran, S. (2024, July 20). Cultural Bias in LLMs. Shav Vimalendiran. https://shav.dev/blog/cultural-bias
[9] Vaswani, A., et al. (2017). Attention is all you need. Advances in Neural Information Processing Systems, 30, 5998-6008.
[10] Bender, E. M., Gebru, T., McMillan-Major, A., & Shmitchell, S. (2021). On the dangers of stochastic parrots: Can language models be too big? Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, 610-623.
[11] Rogers, A., Kovaleva, O., & Rumshisky, A. (2020). A primer in BERTology: What we know about how BERT works. Transactions of the Association for Computational Linguistics, 8, 842-866.
[12] Shanahan, M. (2024). Talking about large language models. Communications of the ACM, 67(2), 68-79.
[13] Brown, T., et al. (2020). Language models are few-shot learners. Advances in Neural Information Processing Systems, 33, 1877-1901.
[14] Blodgett, S. L., Barocas, S., Daumé III, H., & Wallach, H. (2020). Language (technology) is power: A critical survey of "bias" in NLP. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 5454-5476.
[15] Conneau, A., et al. (2020). Unsupervised cross-lingual representation learning at scale. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 8440-8451.
[16] Zhao, W., et al. (2021). Cross-lingual knowledge transfer in multilingual language models. Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 3128-3142.
[17] Kassner, N., & Schütze, H. (2020). Negated and misprimed probes for pretrained language models: Birds can talk, but cannot fly. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 7811-7818.
[18] Ponti, E. M., et al. (2020). XCOPA: A multilingual dataset for causal commonsense reasoning. Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2362-2376.
[19] Mitchell, M., & Krakauer, D. C. (2023). The debate over understanding in AI's large language models. Proceedings of the National Academy of Sciences, 120(13), e2215907120.
[20] Henrich, J., Heine, S. J., & Norenzayan, A. (2010). The weirdest people in the world? Behavioral and Brain Sciences, 33(2-3), 61-83.
[21] Gehman, S., Gururangan, S., Sap, M., Choi, Y., & Smith, N. A. (2020). RealToxicityPrompts: Evaluating neural toxic degeneration in language models. Findings of the Association for Computational Linguistics: EMNLP 2020, 3356-3369.
[22] Mohamed, S., Png, M.-T., & Isaac, W. (2020). Decolonial AI: Decolonial theory as sociotechnical foresight in artificial intelligence. Philosophy & Technology, 33(4), 659-684.
[23] Crystal, D. (2003). English as a Global Language. Cambridge University Press.
[24] Kukutai, T., & Taylor, J. (Eds.). (2016). Indigenous Data Sovereignty: Toward an Agenda. ANU Press.
[25] Fricker, M. (2007). Epistemic Injustice: Power and the Ethics of Knowing. Oxford University Press.
[26] Dotson, K. (2011). Tracking epistemic violence, tracking practices of silencing. Hypatia, 26(2), 236-257.
[27] Benjamin, R. (2019). Race After Technology: Abolitionist Tools for the New Jim Code. Polity Press.
[28] Harding, S. (1991). Whose Science? Whose Knowledge?: Thinking from Women's Lives. Cornell University Press.
[29] Nisbett, R. E. (2003). The Geography of Thought: How Asians and Westerners Think Differently... and Why. Free Press.
[30] Barocas, S., Hardt, M., & Narayanan, A. (2019). Fairness and Machine Learning. MIT Press.
[31] Winner, L. (1980). Do artifacts have politics? Daedalus, 109(1), 121-136.
[32] Costanza-Chock, S. (2020). Design Justice: Community-Led Practices to Build the Worlds We Need. MIT Press.
[33] Raji, I. D., et al. (2020). Closing the AI accountability gap: Defining an end-to-end framework for internal algorithmic auditing. Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency, 33-44.
[34] Jobin, A., Ienca, M., & Vayena, E. (2019). The global landscape of AI ethics guidelines. Nature Machine Intelligence, 1(9), 389-399.
[35] Floridi, L., et al. (2018). AI4People — An ethical framework for a good AI society: Opportunities, risks, principles, and recommendations. Minds and Machines, 28(4), 689-707.
[36] Cath, C., et al. (2018). Artificial intelligence and the 'good society': The US, EU, and UK approach. Science and Engineering Ethics, 24(2), 505-528.
[37] Abebe, R., et al. (2021). Narratives and counternarratives on data sharing in Africa. Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, 329-341.
[38] Adams, R., et al. (2021). Power to the people? Opportunities and challenges for participatory AI. Proceedings of the 2021 AAAI/ACM Conference on AI, Ethics, and Society, 456-465.
[39] Ruder, S. (2019). Neural transfer learning for natural language processing. PhD thesis, National University of Ireland, Galway.
[40] Doshi-Velez, F., & Kim, B. (2017). Towards a rigorous science of interpretable machine learning. arXiv preprint arXiv:1702.08608.
[41] Gebru, T., et al. (2021). Datasheets for datasets. Communications of the ACM, 64(12), 86-92.
[42] Raji, I. D., & Buolamwini, J. (2019). Actionable auditing: Investigating the impact of publicly naming biased performance results of commercial AI products. Proceedings of the 2019 AAAI/ACM Conference on AI, Ethics, and Society, 429-435.
[43] Ruder, S., Vulić, I., & Søgaard, A. (2019). A survey of cross-lingual word embedding models. Journal of Artificial Intelligence Research, 65, 569-631.
[44] Andreas, J. (2020). Good-enough compositional data augmentation. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 7556-7566.
[45] Tenney, I., Das, D., & Pavlick, E. (2019). BERT rediscovers the classical NLP pipeline. Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, 4593-4601.
[46] Baum, S. D. (2020). Social choice ethics in artificial intelligence. AI & Society, 35(1), 165-176.
[47] Cihon, P. (2019). Standards for AI governance: International standards to enable global coordination in AI research & development. Future of Humanity Institute Technical Report, 1, 1-72.
[48] Sloane, M., et al. (2020). Participation is not a design fix for machine learning. Proceedings of the 37th International Conference on Machine Learning, 9040-9051.
[49] Ponti, E. M., et al. (2019). Modeling language variation and universals: A survey on typological linguistics for natural language processing. Computational Linguistics, 45(3), 559-601.
[50] Sambasivan, N., et al. (2021). "Everyone wants to do the model work, not the data work": Data cascades in high-stakes AI. Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems, 1-15.
[51] Birhane, A., et al. (2022). The values encoded in machine learning research. Proceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency, 173-184.
[52] West, S. M., Whittaker, M., & Crawford, K. (2019). Discriminating systems: Gender, race, and power in AI. AI Now Institute Report, 1-33.
[53] Noble, S. U. (2018). Algorithms of Oppression: How Search Engines Reinforce Racism. NYU Press.
[54] Katell, M., et al. (2020). Toward situated interventions for algorithmic equity: Lessons from the field. Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency, 45-55.