Jeff Kinz wrote:
| On Tue, Jan 17, 2006 at 10:23:17AM -0500, Christopher Schmidt wrote:
| > http://languid.cantbedone.org/
| > http://languid.cantbedone.org/Language-Guess.tgz
...
| Why I'm "wowed":
|
| This tool appears to use some form of statistical analysis based on
| how often certain three "character" strings appear. Also, whitespace is
| one of the characters. Very nice, and thanks again to Chris.
|
| Here's a few random lines of the English "strings" file:
| t t 45
| be 46
| ld 47
| e a 48
| rs 49
| wa 50
| ut 51
| ve 52
| ll 53

This works better than most people would believe. Some years back, I
had a bit of fun with it at a place where I worked. I wrote a little
program to collect these trigraph statistics and fed it a stack of
company email memos. Then I wrote another program that generated
pseudo-random text with the same statistics. This output got piped to
another program that added random punctuation and capitalization with
stats from the same source. Another program added email headers and
sent the results out to a mailing list.

The recipients really loved the results. I heard people reading them
to each other and breaking out laughing. Several ended up on bulletin
boards in the hallways.

I also tried it with 4-char sequences, and it was interesting that the
results weren't much funnier. More of the words were real English
words. But even with the 3-char case, almost all the words that came
out were pronounceable and looked like they could be English words.

I also generated a few man pages with the same programs, using the
online unix manuals for the statistics. With those, the 4-char
statistics worked better, because they picked up a lot of the unix
tech terms and phrases, and mixed them in pseudo-randomly among the
pseudo-English words.

The joke is only funny for a short time, though, and quickly becomes
rather repetitive. Part of the reason that Jabberwocky has been such a
success is that it's short. An epic poem in the same style would put
you to sleep after a while.
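
For anyone who wants to try the collect-then-generate idea described
above, here is a minimal sketch in Python. It is not the original
program (that isn't shown here); the function names collect_trigraphs
and generate are made up for illustration, and the punctuation,
capitalization and email-header stages are left out. It treats a space
as an ordinary character, as the quoted "strings" file appears to.

```python
import random
from collections import Counter, defaultdict


def collect_trigraphs(text):
    """Map each 2-char prefix to a Counter of the characters seen after it."""
    # Collapse runs of whitespace so a space counts as one ordinary character.
    text = " ".join(text.split())
    stats = defaultdict(Counter)
    for i in range(len(text) - 2):
        stats[text[i:i + 2]][text[i + 2]] += 1
    return stats


def generate(stats, length, rng=None):
    """Emit `length` characters whose trigraph frequencies follow `stats`."""
    rng = rng or random.Random()
    prefixes = list(stats)
    out = rng.choice(prefixes)
    while len(out) < length:
        followers = stats.get(out[-2:])
        if not followers:
            # Dead end: this pair only occurred at the very end of the
            # sample. Assumption of this sketch: restart from a random prefix.
            out += rng.choice(prefixes)
            continue
        chars = list(followers)
        weights = list(followers.values())
        # Pick the next character in proportion to how often it followed
        # the current two characters in the sample text.
        out += rng.choices(chars, weights=weights)[0]
    return out[:length]


if __name__ == "__main__":
    sample = "the cat sat on the mat and the rat ran to the hat"
    print(generate(collect_trigraphs(sample), 120))
```

Switching to the 4-char case would mean keying on 3-char prefixes
instead of 2-char ones; the rest stays the same.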