&#$*@%$# Profanity Filtering
I've always known kids to be a lot smarter than the TV commercials try to portray them, and I find it highly implausible that any would fall for that kind of thing... even if such predators do exist, they aren't actually a threat...
Quote: Original post by Anonymous Poster
Aren't Internet Predators just an urban legend? A mass hysteria?
Well, I started doing a few searches to prove my point, since I believe this is a valid concern. And from what I see on US websites, this IS a mass hysteria! I mean, in my opinion, maybe a hundred children a year are "predated" via the internet, which is enough to raise concern when you run a kid-targeted website, but US media claim that 20% of children receive sexual propositions via the internet. Further debunking lowers this stat to 3%, which still seems like a hell of a lot to me.
So I would say: yes, there is a mass hysteria, but no, this is not an urban legend. It is just an issue that has been over-hyped by the media. Definitely an issue of concern if I were to run a kid-targeted website.
You could manually inspect lists of candidate bad words to blacklist or whitelist, coming from
a) user submissions
b) automatically searching chat transcripts for words that are similar to existing censored words.
Omae Wa Mou Shindeiru
Then, instead of removing the word for other people, simply replace it with a random unlikely-to-ever-be-inappropriate word. For added humor, select words like "fluffy", "shiny", "pretty", "clown", "flower", etc.
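A rough sketch of that random-replacement idea, in Python. The `BANNED` set holds mild placeholder words and `soften` is just a name I made up for illustration; none of this comes from an actual game's filter.

```python
import random
import re

# Placeholder lists for illustration only.
BANNED = {"darn", "heck"}
SILLY_WORDS = ["fluffy", "shiny", "pretty", "clown", "flower"]

# Treat runs of letters and apostrophes as words; everything else passes through.
WORD_RE = re.compile(r"[A-Za-z']+")

def soften(message, rng=random):
    """Replace each banned word with a randomly chosen harmless word."""
    def swap(match):
        word = match.group(0)
        if word.lower() in BANNED:
            return rng.choice(SILLY_WORDS)
        return word
    return WORD_RE.sub(swap, message)

print(soften("well darn, what the HECK"))
```

Passing your own `random.Random` instance as `rng` makes the output repeatable, which helps when testing.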
As for the method to use, I would use a three-table approach: First is a table of character transformations that turns "$" into "S", "@" into "a" etc; Second is a table that performs letter-group replacements in an attempt to account for typos, misspellings, etc; Third is a list of banned words.
Using the first two tables, you can generate a list of possible readings for each word, and you can then check whether any of them appears in the list of banned words.
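Here is one way the three tables could fit together, as a minimal Python sketch. All three tables are tiny made-up examples with mild placeholder words, and the function names (`normalize_chars`, `variants`, `is_banned`) are my own; a real filter would need much larger tables.

```python
# Table 1: look-alike characters mapped to plain letters (illustrative subset).
CHAR_TABLE = {"$": "s", "@": "a", "0": "o", "1": "i", "3": "e"}
# Table 2: letter-group replacements for typos and creative spellings (illustrative).
GROUP_TABLE = {"ph": ["f"], "kk": ["ck"], "z": ["s"]}
# Table 3: the banned words themselves (placeholders).
BANNED = {"heck", "darn"}

def normalize_chars(word):
    """Apply table 1 character by character, after lowercasing."""
    return "".join(CHAR_TABLE.get(ch, ch) for ch in word.lower())

def variants(word, max_variants=1000):
    """Collect spellings reachable by applying table 2 replacements anywhere in the word."""
    results = {word}
    frontier = [word]
    # max_variants is a rough safety cap in case a table entry makes words grow.
    while frontier and len(results) < max_variants:
        current = frontier.pop()
        for group, replacements in GROUP_TABLE.items():
            start = current.find(group)
            while start != -1:
                for rep in replacements:
                    candidate = current[:start] + rep + current[start + len(group):]
                    if candidate not in results:
                        results.add(candidate)
                        frontier.append(candidate)
                start = current.find(group, start + 1)
    return results

def is_banned(word):
    """True if any generated reading of the word is on the banned list."""
    return any(v in BANNED for v in variants(normalize_chars(word)))

print(is_banned("h3kk"), is_banned("D@rn"), is_banned("hello"))
```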
You might want to employ phonetic algorithms, such as Double Metaphone, as another dictionary to detect misspellings that allow the same or similar pronunciation.
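To show the phonetic-key idea without pulling in a library, here is a sketch that uses classic Soundex as a much simpler stand-in for Double Metaphone (which is considerably more involved and not in Python's standard library). The banned words are placeholders and `sounds_banned` is a hypothetical helper name.

```python
# Soundex digit for each consonant; vowels, h, w and y have no code.
CODES = {}
for digit, letters in {"1": "bfpv", "2": "cgjkqsxz", "3": "dt",
                       "4": "l", "5": "mn", "6": "r"}.items():
    for letter in letters:
        CODES[letter] = digit

def soundex(word):
    """Return the 4-character American Soundex key, or "" if there are no letters."""
    letters = [ch for ch in word.lower() if ch.isalpha()]
    if not letters:
        return ""
    first = letters[0]
    encoded = []
    prev = CODES.get(first, "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not break a run of identical codes
        code = CODES.get(ch, "")
        if code and code != prev:
            encoded.append(code)
        prev = code  # a vowel resets prev, so the same code may repeat after it
    return (first.upper() + "".join(encoded) + "000")[:4]

BANNED = {"heck", "darn"}  # placeholder words
PHONETIC_BANNED = {soundex(w) for w in BANNED}

def sounds_banned(word):
    """True if the word shares a phonetic key with any banned word."""
    return soundex(word) in PHONETIC_BANNED

print(soundex("heck"), soundex("hekk"), sounds_banned("hekk"))
```

Soundex keys are very coarse, so innocent words can collide with banned ones. It is probably better to treat a phonetic hit as a candidate for review rather than an automatic censor.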
A more foolproof method would be to simply have a 'whitelist' instead of a 'blacklist' - a filter based on allowing only specific words. If you can find a fairly complete wordlist, you can combine it with a name list based on census data and then the only thing missing is fictional names / places / etc that you can add manually as desired.
Start with a "hard" list. Every word on that list is censored. If a word is "close" to a word on the hard list, log it somewhere and review the log once a week to add new words to the list.
Don't go too nazi on the filter; people (especially kids) will always find a way to go around it anyway.
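A whitelist filter along those lines could be as small as this Python sketch. The three word sets are stand-ins I invented; in practice you would presumably load them from a dictionary file, a census-derived name list, and a hand-maintained file.

```python
import re

# Stand-in word sources for illustration only.
DICTIONARY = {"hello", "my", "name", "is", "nice", "to", "meet", "you"}
NAMES = {"alice", "bob"}
CUSTOM = {"gandalf"}  # fictional names and places added by hand
ALLOWED = DICTIONARY | NAMES | CUSTOM

TOKEN_RE = re.compile(r"\S+")
PUNCTUATION = ".,!?;:\"'()"

def filter_whitelist(message, mask="***"):
    """Mask every whitespace-separated token whose core word is not allowed."""
    def check(match):
        token = match.group(0)
        core = token.strip(PUNCTUATION).lower()
        # Pure punctuation tokens pass through; unknown words are masked.
        if core == "" or core in ALLOWED:
            return token
        return mask
    return TOKEN_RE.sub(check, message)

print(filter_whitelist("Hello, my name is Bob!"))
print(filter_whitelist("hello xyzzy"))
```

Note that anything not on the list gets masked, including numbers and brand-new slang, which is the whole trade-off of this approach.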
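The hard list plus a "close word" log might be sketched like this in Python, using the standard library's `difflib.get_close_matches` as the closeness test. The placeholder words, the 0.7 cutoff, and the in-memory `review_log` are all assumptions for illustration; a real game would tune the cutoff and write to a file or database.

```python
import difflib

HARD_LIST = {"heck", "darn"}  # placeholder words
review_log = []               # (seen word, nearest hard-list word) pairs for weekly review

def censor_word(word, cutoff=0.7):
    """Censor exact hard-list hits; log near misses but let them through."""
    lowered = word.lower()
    if lowered in HARD_LIST:
        return "*" * len(word)
    near = difflib.get_close_matches(lowered, sorted(HARD_LIST), n=1, cutoff=cutoff)
    if near:
        review_log.append((lowered, near[0]))
    return word

for w in ["heck", "hekk", "hello"]:
    print(w, "->", censor_word(w))
print(review_log)
```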
1. Moderators to approve everything
2. Canned chat (i.e. you have a combobox with canned phrases rather than an edit field).
Disney actually used canned chat to their advantage. In their online games, one of the rewards was unlocking more canned phrases for the user.
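For canned chat with unlockable phrases, the key point is that the client only ever sends a phrase ID, never free text. A minimal Python sketch, with a made-up phrase list and hypothetical `Player` and `say` names (this is not how Disney's games actually did it, just one way it could work):

```python
# Made-up phrase table; the ID is simply the index into this list.
CANNED_PHRASES = ["Hello!", "Good game!", "Want to trade?", "Follow me!", "Thanks!"]
STARTER_IDS = {0, 1}  # phrases every new player can use

class Player:
    def __init__(self, name):
        self.name = name
        self.unlocked = set(STARTER_IDS)

    def unlock(self, phrase_id):
        """Grant a phrase as a reward; reject IDs that do not exist."""
        if not 0 <= phrase_id < len(CANNED_PHRASES):
            raise ValueError("unknown phrase id")
        self.unlocked.add(phrase_id)

def say(player, phrase_id):
    """Return the chat line, or None if the player has not unlocked that phrase."""
    if phrase_id not in player.unlocked:
        return None
    return f"{player.name}: {CANNED_PHRASES[phrase_id]}"

ann = Player("Ann")
print(say(ann, 0))
print(say(ann, 3))
ann.unlock(3)
print(say(ann, 3))
```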
(my byline from the Gamedev Collection series, which I co-edited) John Hattan has been working steadily in the casual game-space since the TRS-80 days and professionally since 1990. After seeing his small-format games turned down for what turned out to be Tandy's last PC release, he took them independent, eventually releasing them as several discount game-packs through a couple of publishers. The packs are actually still available on store-shelves, although you'll need a keen eye to find them nowadays. He continues to work in the casual game-space as an independent developer, largely working on games in Flash for his website, The Code Zone (www.thecodezone.com). His current scheme is to distribute his games virally on various web-portals and widget platforms. In addition, John writes weekly product reviews and blogs (over ten years old) for www.gamedev.net from his home office where he lives with his wife and daughter in their home in the woods near Lake Grapevine in Texas.
Actually I think I will have a separate filter for leetspeak, which when triggered will send your character to the n00b server.
Time held me green and dying
Though I sang in my chains like the sea...
George Carlin's 2443 dirty words!
might be useful ;)
A game that I worked on (E rated) had a profanity filter that was a huge .txt file full of the "words you cannot say"... "dirtywords.txt". Open up this file and you'll learn every possible way to 1337speek the word "fuck".
Tiger Woods Golf, Grand Theft Auto: San Andreas and Elder Scrolls: Oblivion have all been reprimanded for data that was "hidden" on the disc that was not submitted to the ESRB.
The ESA have now said there is something like an $11,000 fine per-unit-sold if any hidden "graphic" content is found again. It's probably a good idea to encrypt these dirty-word files to make sure no kid puts the disc in their PC, finds the file, and then shows it to mommy.
That's all. Now back to our regularly scheduled censoring. :)
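One way to keep the plaintext list off the disc is sketched below in Python. Note this swaps encryption for one-way hashing: only SHA-256 digests of the words would ship. The placeholder words and the names `digest`, `SHIPPED_DIGESTS`, and `is_banned` are my own assumptions.

```python
import hashlib

def digest(word):
    """SHA-256 hex digest of the lowercased word."""
    return hashlib.sha256(word.lower().encode("utf-8")).hexdigest()

# Build step, run offline: only these digests would go into the shipped file.
SHIPPED_DIGESTS = {digest(w) for w in ["heck", "darn"]}  # placeholder words

def is_banned(word):
    """Exact-match check against the shipped digests."""
    return digest(word) in SHIPPED_DIGESTS

print(is_banned("HECK"), is_banned("hello"))
```

The catch is that hashing only supports exact matches after normalization, so fuzzy or substring matching would need a different scheme. A determined person could also recover short words by hashing a dictionary, so this only stops casual snooping.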
Check out my new game Smash and Dash at:
I think it would be interesting to email back the mother who complained, and claim: we have examined the chat logs and discovered that your child was the one who first used that word in our game, in fact another mother is complaining that her son has learned dirty language from speaking with yours. You have been banned, when you teach your children about good language you may submit a request to unban. Have a nice day.
Just to see what kind of response you get.