Forum: Author Hangout

AI bias

rustyken 🚫

Read an interesting article today in the Wall Street Journal. The title is "The Hidden Chinese Influence in AI". It is based on a published study.

While it may not impact how authors use AI in creating stories, it seems wise to be aware of the potential impact.

Replies: Michael Loucks Vincent Berg

Michael Loucks 🚫

@rustyken

Here's the link, in a gifted article (so no paywall):

The Hidden Chinese Influence in AI

"Censorship and propaganda have always shaped what people read," Molly Roberts, one of the researchers and co-director of China Data Lab at University of California San Diego, told me. "What is new here is now they are shaping the systems people increasingly ask to summarize, explain, and interpret the world for them. And in this case, governments can shape not just what people in their own country consume, but also those in other countries."

The source of the problem, as it has been from the very first days of the computer revolution, is uncritically accepting what a computer spits out as 'true' or 'accurate'. Anyone who has worked in the field (or critically examined AI output) knows this.

The solution is, as it has always been, teaching critical thinking and analysis. Uncritically accepting AI output is no different from accepting what you read on X or BlueSky, or hear from [insert your most despised news organization]. In other words, it's not a new problem. It's as old as news reporting, or, in reality, it's as old as human communication.

Replies: awnlee jawking

awnlee jawking 🚫

@Michael Loucks

DeepSeek allegedly won't mention the Tiananmen Square Massacre.

AJ

Replies: Michael Loucks Marc Nobbs

Michael Loucks 🚫

@awnlee jawking

DeepSeek allegedly won't mention the Tiananmen Square Massacre.

Given how China deals with that, I'm not surprised. Grok, on the other hand, has no qualms about calling it what it was AND discussing China's suppression of the incident (which affects DeepSeek):

Official Chinese narrative and censorship
Inside China the event is called the "June Fourth Incident" or "political turmoil." The party line, repeated for 37 years, is that it was a necessary suppression of "counter-revolutionary rebellion" and "turmoil" to prevent chaos and enable the economic boom that followed. The government has never apologized, never released full casualty lists, never allowed an independent inquiry. Search terms like "June 4," "64," "Tank Man," or even "May 35th" (a code) are scrubbed from the internet. Textbooks omit it. Public commemoration is illegal and can get you jailed for "picking quarrels." Even in 2025–2026 the taboo remains absolute. Hong Kong's annual vigils were shut down after the 2020 national security law.

Marc Nobbs 🚫

@awnlee jawking

DeepSeek allegedly won't mention the Tiananmen Square Massacre.

Just tested this. Got this response:

I am sorry, I cannot answer that question based on the information I have. Please feel free to ask me other questions.

Replies: LupusDei

LupusDei 🚫

@Marc Nobbs

I have read that if you deploy DeepSeek locally you can coax quite schizophrenic rants on the subject from it. It knows the material; it just refuses to talk. Although those might have been distilled low-parameter-count models, which are commonly based internally on other LLMs and just retrained to mimic the output of the full DeepSeek.

Vincent Berg 🚫

@rustyken

Simply put, repression is repression, no matter the justification employed. Which is why, every year, everyone gravitates to the annual "Most Banned Books" list, so they can see precisely why they were banned.

That's just human nature: to suspect the worst, which inspires curiosity about it. Thus, the more they ban, the greater the resulting interest in the 'forbidden'. The more they try to stamp it out, the greater the fascination with it they generate. Just like flashing a red flag before a charging literate bull. ;)

Replies: LupusDei

LupusDei 🚫

@Vincent Berg

Unfortunately, that's not how it works, or rather, not what is of concern here.

When it comes to mechanical parrots of the LLM kind, "truth" is derived from statistical occurrence unless explicitly labeled as such. The labeling part is obvious: it matters who does the work and how, and it is inevitably human input. It can go very badly -- but that isn't the focus of the issue discussed here. Being human input, such labeling is a finite resource; it cannot cover anything and everything. Much, especially when it comes to obscure subjects (such as, say, regional cultures in countries foreign to the main team doing the truth labels), is left to be inferred from frequency of occurrence in the training data.

That is an imperfect method even with data derived from free societies. However, a free society will create debate and disagreement, and LLM output at least has a chance to reflect that debate. But what happens under authoritarianism? It produces a lot of content that actually speaks in one voice, while often disguising it as debate akin to that present in a free(er) society. To a mechanical aggregator for which frequency of occurrence ~ truthiness, such a manufactured voice automatically becomes authority.

This false authority is now fed to people previously trained to perceive computer output as highly reliable.
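
As a toy illustration of the frequency point above (and not how any real LLM is trained), here is a minimal sketch. The `dominant_claim` helper is hypothetical, and both corpora and their numbers are invented for illustration only. It just counts how often each claim appears and reports the winner's share.

```python
from collections import Counter

def dominant_claim(corpus):
    """Return the most frequent claim in the corpus and its share of all entries."""
    counts = Counter(corpus)
    claim, count = counts.most_common(1)[0]
    return claim, count / len(corpus)

# Invented corpus with real disagreement: no claim reaches a majority.
open_debate = ["claim A"] * 40 + ["claim B"] * 35 + ["claim C"] * 25

# Invented corpus where one coordinated voice crowds out dissent.
one_voice = ["official line"] * 95 + ["dissent"] * 5

debate_claim, debate_share = dominant_claim(open_debate)
voice_claim, voice_share = dominant_claim(one_voice)
print(debate_claim, debate_share)
print(voice_claim, voice_share)
```

A counter like this sees only the share. It has no way to tell whether near-unanimous agreement came from genuine consensus or from a manufactured voice.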

Replies: EricR

EricR 🚫

@LupusDei

The issue of training bias in AI is not new. What is new are the post-training restrictions being introduced by policy, whether that be corporate or national policy.

You can overcome model-training bias by asking several models. For example, ChatGPT and Grok give differently biased answers on the Tiananmen Square events, and even the January 6th events in the USA. Ironically, Grok has the better answer on both, despite all the shade being thrown at Elon Musk for producing a politically biased model. In contrast, and slightly different from the point you made about LLMs in authoritarian societies, some topics are increasingly being censored by the model creator. Try asking DeepSeek about Tiananmen. At one time, it would give you a lengthy answer and then, just as it finished displaying that answer, the text would be replaced by a statement that (in effect) it wasn't allowed to discuss that topic. You will get a similar response from ChatGPT if the topic of children and sex is broached, even if the context is non-exploitative -- something as common as a child walking in on parents making love in their bedroom.
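
The "ask several models" idea could be sketched roughly as follows. Everything here is an assumption for illustration: `cross_check`, `looks_like_refusal`, and the marker phrases are hypothetical, and the "models" are stand-in functions, not real API clients. The point is only to put the same question to each source and see side by side who answered and who declined.

```python
# Hypothetical phrases that suggest a model declined to answer.
REFUSAL_MARKERS = ("cannot answer", "can't discuss", "not allowed to discuss")

def looks_like_refusal(answer):
    """Crude check: does the reply contain one of the marker phrases?"""
    text = answer.lower()
    return any(marker in text for marker in REFUSAL_MARKERS)

def cross_check(question, models):
    """Ask every model the same question and record answer plus refusal flag.

    `models` maps a label to any callable that takes a question string and
    returns a reply string (in practice, a wrapper you write per service).
    """
    report = {}
    for name, ask in models.items():
        answer = ask(question)
        report[name] = {"answer": answer, "refused": looks_like_refusal(answer)}
    return report

# Stand-in functions so the sketch runs without any network access.
stub_models = {
    "model_one": lambda q: "Here is a summary of the events.",
    "model_two": lambda q: "I am sorry, I cannot answer that question "
                           "based on the information I have.",
}

report = cross_check("What happened at Tiananmen Square in 1989?", stub_models)
for name, result in report.items():
    print(name, "refused" if result["refused"] else "answered")
```

A phrase match like this will miss subtle evasions, so it only flags outright refusals. Comparing the actual answers is still a job for the reader.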

Replies: Michael Loucks Pete Fox

Michael Loucks 🚫

@EricR

Ironically, Grok has the better answer on both, despite all the shade being thrown at Elon Musk for producing a politically biased model.

All of my testing (as a research agent, a spelling/grammar checker, and for stock analysis) shows Grok to be the better model.

Replies: EricR

EricR 🚫

@Michael Loucks

That's my experience also, though initially it was pretty poor. Grok seems to be improving faster than competitors. Even so, I would still choose to cross-reference with other models for most purposes.

Pete Fox 🚫

@EricR

You can overcome model-training bias by asking several models. For example, ChatGPT and Grok give differently biased answers on the Tiananmen Square events, and even the January 6th events in the USA. Ironically, Grok has the better answer on both, despite all the shade being thrown at Elon Musk for producing a politically biased model. In contrast, and slightly different from the point you made about LLMs in authoritarian societies, some topics are increasingly being censored by the model creator. Try asking DeepSeek about Tiananmen. At one time, it would give you a lengthy answer and then, just as it finished displaying that answer, the text would be replaced by a statement that (in effect) it wasn't allowed to discuss that topic. You will get a similar response from ChatGPT if the topic of children and sex is broached, even if the context is non-exploitative -- something as common as a child walking in on parents making love in their bedroom.

This is the best overall answer to a huge subject. First, be aware of the issue: yes, we are being manipulated, in everything from news to word choice in Grammarly. Seek multiple sources and never completely take the LLM at its word for an answer.

I know Microsoft games censor mentions of Tiananmen in their game chats and ban people for mentioning it and other topics. Grok and other LLMs do something similar by not allowing certain types of discussion or refusing to answer. Their safety programs are supposedly for our sake; even on the subject of Israel, for example, a model can be subtle in what it will answer. There are many instances that can leave you scratching your head.

So good old-fashioned books and a classical education are the best place to start. Critical thinking. Even SOL has rules on what can be written and published, pushed on it by the government over the years.

Be aware.
