Clean TXT Robot (Alpha Test)
Due to heavy use, this robot is exceeding its free server quota and will stop working for most of each day.
Please donate so I can pay for the server time and keep it running.
Please DON'T add it to non-English waves or to waves with rapidly changing gadgets (such as Sudoku or counter/timer gadgets).

A Google Wave robot that cleans up blips and English text.
 
Wave address: cleantxt@appspot.com
 
Updated: 16 Dec 2009
 
Started: 17 Oct 2009
 
Author: cmdskp+ct@gmail.com
 
Google sandbox wave test here


Check out the FAQ for answers to common questions.

Please be aware that this robot only corrects mistakes that are safe to fix, not every possible error. More corrections are added and improved frequently.

I may put in a full dictionary, but because this is hosted on Google for free, there are quota limits, which means the robot has to remain very fast and efficient to help everyone.

I'd like to thank the people who have sent positive messages; they are what inspires me to keep CleanTXT alive and to keep working on it!


Before CleanTXT (please note Google Wave inserts extra returns and joins words at times on pasting this!):
if u want to c hwat i has done to ur text - see below. plssssss dont think i ahve a full a spell- hcecker currently. K? i kno abt some typos nd many gd shorts, idk everyhting, but for hte 76 K ima not bad. I can handle temp. 30 c c?








im still new, but i new you might want to try me out. i cant fix everything,but some thinggggs are better than nnone. so i get better but i try to take care with what ppl want (E.g. "i"),so plz help every1 by reporting when I go wwrong. kthx!
After CleanTXT:
If you want to see what I have done to your text - see below. Please don't think I have a full spell-checker currently. Okay? I know about some typos and many good shorts, I don't know everything, but for the 76 K I'm not bad. I can handle temp. 30 C see?



I'm still new, but I knew you might want to try me out. I can't fix everything, but some things are better than none. So I get better but I try to take care with what people want (E.g. "i"), so please help everyone by reporting when I go wrong. Okay, thanks!
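
The shorthand expansion in the example above can be pictured as a table lookup over whole words. The following is only a minimal Python sketch under stated assumptions: the `SHORTHAND` table is a tiny hypothetical sample, `expand_shorthand` is an illustrative name, and none of it is CleanTXT's actual code or word list. It skips any word that touches a double quote, so a quoted "i" is left alone, and it does not attempt sentence capitalisation or context checks (such as telling "30 c" from "c?").

```python
import re

# Hypothetical sample table; the robot's real word list is not shown on this page.
SHORTHAND = {
    "u": "you",
    "ur": "your",
    "plz": "please",
    "ppl": "people",
    "idk": "I don't know",
    "i": "I",
}

# A whole word that is not directly preceded or followed by a double quote.
WORD = re.compile(r'(?<!")\b([A-Za-z0-9]+)\b(?!")')


def expand_shorthand(text):
    """Replace known shorthand words, leaving unknown and quoted words as typed."""
    def swap(match):
        word = match.group(1)
        return SHORTHAND.get(word.lower(), word)

    return WORD.sub(swap, text)


if __name__ == "__main__":
    print(expand_shorthand('plz help ppl, say "i" if u want'))
```

A plain dictionary keeps each lookup cheap, which fits the quota concern mentioned earlier, but it cannot use context; rules that depend on neighbouring words would need more than this.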

Current features:

  • Robot accessible Annotations adapted
    • 'ur' becomes 'Your', colours scaled to fit replacement
    • 'FWIW' becomes 'For what it's worth', colours per word
    • (Note: Wave doesn't currently expose some annotations like bullet points, so they can't be adapted)
  • [CHANGE] Had to remove language detection since it generated too many false positives.
  • Converts some common mistypes in words (e.g. 'hgost', 'taht', etc...)
  • Converts most chat acronyms and shorts to full
  • Corrects certain repeated letters to single or double, depending on context
  • Corrects certain grammar or phonetic spelling errors (e.g. 'Your using', 'I no', etc...)
  • Corrects URL typos in the 'http://www' prefix (improved detection of www. URLs)
  • Capitalises sentences, separate clauses & month abbreviations
  • Corrects double capitals at the start of sentences, except for likely acronyms
  • Inserts a space after commas and semi-colons (except in numbers)
  • Removes empty blips (blank or single non-whitespace character)
    • Including pre-existing ones
    • Those never submitted are not deleted.
  • Reduces repeat empty lines to a maximum of three (Trims off trailing ones)
  • Reduces repeat words to a maximum of three
  • Reduces repeated characters to three (but allows non-text lines of '-', '=', etc.)
  • Leaves single-letter words alone when in double quotes (e.g. "i") or when preceded by certain words (e.g. Ctrl r, etc.)
  • Leaves email addresses alone
  • Leaves emoticons intact
  • Leaves bracketed text alone (supports nested brackets)
  • Leaves source code method calls alone (Java/C++ style)
  • Doesn't capitalise within first sentence in //comments
  • [Optional] 'Approver' gadget for rating blips, with [optional] automatic adding to new blips
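
Three of the rules listed above (repeated characters, spacing after commas and semi-colons, and repeated empty lines) are simple enough to sketch in a few lines of Python. This is an illustration under assumptions, not the robot's actual code: the function names are made up for the example, "non-text line" is taken to mean a line with no letters or digits, and the separator rule only fires before a letter, digit or underscore so that emoticons such as ;) are not split.

```python
import re


def squeeze_chars(line, limit=3):
    """Cut runs of one repeated character down to `limit` copies."""
    if not any(ch.isalnum() for ch in line):
        return line  # non-text line such as '-----' or '=====' stays as typed
    pattern = r"(\S)\1{%d,}" % limit
    return re.sub(pattern, lambda m: m.group(1) * limit, line)


def space_after_separators(text):
    """Insert a space after ',' or ';' unless it sits between two digits."""
    def fix(match):
        i = match.start()
        before = text[i - 1] if i > 0 else ""
        after = text[i + 1]  # always present: the lookahead matched a character
        if before.isdigit() and after.isdigit():
            return match.group(1)  # part of a number such as 1,000
        return match.group(1) + " "

    return re.sub(r"([,;])(?=\w)", fix, text)


def limit_blank_lines(text, limit=3):
    """Keep at most `limit` consecutive empty lines and drop trailing ones."""
    kept, blanks = [], 0
    for line in text.split("\n"):
        if line.strip():
            blanks = 0
            kept.append(line)
        else:
            blanks += 1
            if blanks <= limit:
                kept.append(line)
    while kept and not kept[-1].strip():
        kept.pop()
    return "\n".join(kept)


def clean_text(text):
    """Apply the three sketched rules in order."""
    lines = [space_after_separators(squeeze_chars(l)) for l in text.split("\n")]
    return limit_blank_lines("\n".join(lines))
```

The sketch ignores the exceptions in the list (email addresses, bracketed text, source code method calls), so those would need to be masked out before rules like these are applied.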

Commands (inside blips):


Note: The commands are deleted automatically from the blip text.


Future versions:

These would be optional settings:

  • Removal of phrases that are just random typings (and not likely anagrams)
  • Removing repeat (or identical) posts already present in a wave
  • Public versus Private wave modes
  • Emoticon text replacement with images (I may leave this to other bots...)
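
As a rough idea of how the repeat-post option might work, the Python sketch below flags posts whose text matches an earlier post once case and spacing are ignored. It is a guess at one possible approach, not a planned design: `find_repeat_posts` is a hypothetical name, and a wave is represented here simply as a list of strings rather than through any Wave API.

```python
def find_repeat_posts(posts):
    """Return the indexes of posts that repeat an earlier post.

    Two posts count as identical when they match after lower-casing
    and collapsing runs of whitespace. Empty posts are ignored.
    """
    first_seen = {}
    repeats = []
    for index, text in enumerate(posts):
        key = " ".join(text.lower().split())
        if not key:
            continue
        if key in first_seen:
            repeats.append(index)
        else:
            first_seen[key] = index
    return repeats


if __name__ == "__main__":
    wave = ["Hello  there", "Bye for now", "hello there"]
    print(find_repeat_posts(wave))
```

Returning indexes rather than deleting anything leaves the decision to the caller, which suits a feature that is meant to be optional.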