off by one

Categories: pedro | software:alphacount


Fri Sep 09 12:16:14 Now I know my E-S-I's, let's go out and eat some fries!:

So, as I was sitting in class last night, I had a question: "Why is the alphabet we know sorted in 'abc' order?" Or rather, why is the (English) alphabetic order 'abcdefghijklmnopqrstuvwxyz'? I figured that it must have something to do with usage; a is very common, while x, y, and z are not that common. Clearly, though, it is not a strict order by usage or commonality. I thought, why don't we learn the alphabet in order of usage? It would certainly be interesting to see. Then I figured that the answer must be on the Internet somewhere, or if it wasn't, it would be a pretty easy program to write.

Well, I couldn't find the answer on the 'net, so I wrote a quick Perl script to count the letters in a list of words. I used the Enable2K list, which is an "Official Scrabble Player's Dictionary"-like list of words. It's good for this purpose because it doesn't contain things like "dog, dogs, and dogs'", which lists intended for spell checking often do -- it contains each word once and lets you haggle about suffixes.
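For anyone who wants to try the same count without Perl, here is a minimal sketch in Python. It is not the script I wrote, just the same idea under a few assumptions: the word list is plain text with one word per line, only the letters a-z are counted, and ties are broken alphabetically. The function names and the file name in the comment are made up for illustration.

```python
from collections import Counter


def letters_by_frequency(words):
    """Return the letters a-z found in `words`, most common first."""
    counts = Counter()
    for word in words:
        # Fold to lowercase and count only a-z, so apostrophes, digits
        # and other punctuation are ignored.
        counts.update(ch for ch in word.lower() if "a" <= ch <= "z")
    # Sort by descending count; break ties alphabetically so the
    # result is the same on every run.
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [letter for letter, _ in ordered]


def load_words(path):
    """Read one word per line from a plain-text word list."""
    with open(path, encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]


# Tiny in-memory example. With a real list you would call something like
# letters_by_frequency(load_words("enable2k.txt")), where the file name
# is hypothetical.
print(" ".join(letters_by_frequency(["dog", "eel", "egg"])))
```

Note that this counts letters across a list of distinct words, not across running text, so a letter's rank reflects how many dictionary entries use it rather than how often it shows up in print.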

On closer inspection, usage really has nothing to do with the alphabetic order, and there were even a few surprises... here it is:

e s i a r n t o l c d u p m g h b y f v k w z x q j

I knew that e, s, i, a, r, n, and t would be way up there. Anyone who watched Wheel of Fortune knows that -- "RSTLN and E", right? But there were some surprises (for me). I didn't realize that m would be in the middle, and I certainly wouldn't have thought of b as being less common than m. I really wouldn't have guessed that j was less common than q.

Here is the software I wrote. It allows you to interchange or add lists easily... but be careful what lists you use.

A-ha! Just now, by searching for "the least common letter in english" I found a more scholarly (and slightly different) version of this information. Their list is:

e a r i o t n s l c u d p m h g b f y w k v x z j q (ask oxford)

e s i a r n t o l c d u p m g h b y f v k w z x q j (tastytronic)
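To make the comparison a little easier, here is a small Python sketch that measures how far each letter moves between the two lists above. The names rank_shifts, ASK_OXFORD and TASTYTRONIC are just labels chosen for this example; a negative shift means the letter appears earlier in the tastytronic list than in the ask oxford one.

```python
ASK_OXFORD = "e a r i o t n s l c u d p m h g b f y w k v x z j q".split()
TASTYTRONIC = "e s i a r n t o l c d u p m g h b y f v k w z x q j".split()


def rank_shifts(reference, other):
    """Map each letter to its position in `other` minus its position in `reference`."""
    other_rank = {letter: index for index, letter in enumerate(other)}
    return {letter: other_rank[letter] - index
            for index, letter in enumerate(reference)}


shifts = rank_shifts(ASK_OXFORD, TASTYTRONIC)

# Letters that sit in exactly the same slot in both lists.
unchanged = [letter for letter, shift in shifts.items() if shift == 0]

# The letter whose position differs the most between the two lists.
biggest = max(shifts, key=lambda letter: abs(shifts[letter]))

print("same position:", " ".join(unchanged))
print("biggest mover:", biggest, shifts[biggest])
```

Run against the two lists, this reports seven letters in the same position (e, l, c, p, m, b, k) and s as the biggest mover, six places earlier in my list. Every other letter is within three places of its spot in the other list.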

Discuss.



Unless otherwise noted, all content licensed by Peter A. H. Peterson
under a Creative Commons License.