Why do censorship systems have problems with context?

This one has hit me several times recently, here on Thurrott.com and also on Twitter, but the problem goes much deeper, and the way Big Tech in particular deals with it is also problematic.

We had the discussion about CSAM on another thread yesterday. In Germany, there have been a few cases of Microsoft and Google jumping the gun. They reported users to the police for alleged child abuse, but instead of freezing the accounts until the investigation was over, they simply deleted them, with no recourse, even once you have a signed legal document from the police confirming it was a false alarm and you are innocent. That means all your photos and documents are gone, but also all your purchases of music, films and games, plus your play history and achievements on Xbox, for example. All gone, because the algorithms made a mistake.

We all want to stop abuse, especially of children, but Big Tech can’t just declare people guilty, even when the police and the courts say they aren’t. They shouldn’t be able to destroy your online life just because they made a mistake.

On X/Twitter a week or so back, a group of us Brits, including Iain Thompson over at The Register, had our posts censored because we were talking about savoury meat balls cooked in gravy, commonly called f a g g o t s, not to be confused with the bundles of small bits of wood used to start a fire. I protested all of the sanctions against my tweets, and they were eventually all published, because they weren’t offensive (at least, only to vegetarians). But if they are going to use an algorithm to censor speech, why doesn’t it differentiate between the normal use of a word in a normal context and its co-opted, derogatory use?

(And don’t get me started on the English slang for asking someone for a cigarette: asking somebody to give you something is called bumming, and the slang for a cigarette is the first three letters of the name of the meat balls above. All totally innocent to an English person.)

Likewise, the forum software here took offence at my writing about waking up in the morning! It didn’t look at the context; it just decided that the past tense of the verb “to wake” is bad. Full stop, end of story.

Other words that have normal, civil and often nice uses have been co-opted over the years. For example, if you use the three-letter word that describes a child as cheerful and light-hearted in an excited way, you are likely to get your post banned.

I had a post about Platt German (a regional dialect) and how ChatGPT messed it up totally, but it got censored on this site, because the Platt word for “new” is similar to the “n” word without the “r” on the end. Did the filter look at the context? Nope, it just assumed I had misspelled the “n” word.

If we are going to rely on these tools and algorithms, whether for CSAM scanning or content moderation, shouldn’t we expect them to be accurate and nuanced? A word that is perfectly normal in one region can cause offence in another, if used in a certain way. But the algorithms don’t look at context; they just say “past tense of the verb to wake = bad”, “meat balls in gravy = bad”, “the word for new in a different language = bad”, “the word for happy and carefree = bad”, whether the words are being used innocently or as an insult.
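To make the failure mode concrete: from the outside, these filters behave like simple string matching against a blocklist. Here is a minimal sketch in Python (entirely hypothetical; the blocklist entry and function names are my own illustration, not any platform’s actual code) showing why such a filter flags “I woke up” just as readily as an insult:

```python
import re

# Hypothetical blocklist entry, purely for illustration.
BLOCKLIST = {"woke"}

def naive_filter(post: str) -> bool:
    """Flag a post if any blocklisted string appears anywhere in it."""
    text = post.lower()
    return any(bad in text for bad in BLOCKLIST)

def word_filter(post: str) -> bool:
    """Word-boundary matching avoids Scunthorpe-style substring hits,
    but still cannot tell an innocent sense from a derogatory one."""
    words = set(re.findall(r"[a-z']+", post.lower()))
    return bool(words & BLOCKLIST)

# Both filters flag this perfectly innocent sentence, because neither
# looks at meaning, only at the presence of a string.
print(naive_filter("I woke up at six this morning."))  # True (false positive)
print(word_filter("I woke up at six this morning."))   # True (false positive)
print(word_filter("Breakfast was lovely."))            # False
```

Fixing this properly would require the filter to work out which sense of the word a sentence is using, which is exactly the context step these systems skip.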
