The Irvine Phonotactic Online Dictionary, or iPhod, was developed in the Hickok Lab by my (now former) grad student Kenny Vaden. iPhod provides word frequency, phonotactic probability, neighborhood density, and related values for a large number of English words, as well as measurements for nonwords. The dictionary is publicly available for research use, either by downloading it or by using the online search that Kenny has recently set up. Check it out at: http://www.iphod.com
Kenny has also set up an iPhod blog to provide a forum for questions and future development of the database.
Here is Kenny's more detailed description of what iPhod does:
The Irvine Phonotactic Online Dictionary (IPhOD) is a resource that was developed at UC Irvine in 2003 for research on phonological processing of words and pseudowords. The database can be used to select words and pseudowords so that sublexical or lexical phonological aspects of stimuli can be controlled or manipulated. The IPhOD contains 33,432 words and 815,066 pseudowords with Kucera-Francis (1967) word frequencies, CMU Pronouncing Dictionary transcriptions (Weide, 1994), and several values that we derived: phonological neighborhood density, positional probabilities, and second- and third-order phoneme-sequence probabilities. The database is publicly available online to search or download, so other researchers may use it in their studies. If a word or pseudoword is not included in the database, some IPhOD values can be calculated online from phonological transcriptions that the user supplies. On the website, we describe the motivation for the database, the computations used, and examples of their use in experiments concerned with phonological processes in speech. There is also a blog so users can give us feedback, ask questions, and make suggestions for other interesting phonological measures. http://www.iphod.com
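For readers who have not worked with these measures, here is a rough Python sketch of two of them: phonological neighborhood density and positional probability. It is only an illustration under stated assumptions, not the IPhOD code. The toy lexicon, the function names, the one-phoneme (substitution, addition, or deletion) definition of a neighbor, and the unweighted counts are all assumptions made for the example; the computations IPhOD actually uses are described on the website.

```python
from collections import Counter

# Hypothetical toy lexicon: CMU-style (ARPAbet) phoneme tuples, stress omitted.
LEXICON = {
    "cat": ("K", "AE", "T"),
    "bat": ("B", "AE", "T"),
    "cab": ("K", "AE", "B"),
    "at": ("AE", "T"),
    "cast": ("K", "AE", "S", "T"),
    "dog": ("D", "AO", "G"),
}


def is_neighbor(a, b):
    """True if a and b differ by exactly one phoneme substitution,
    addition, or deletion (assumed neighbor definition)."""
    if a == b:
        return False
    if len(a) == len(b):
        # Same length: exactly one position may differ (substitution).
        return sum(x != y for x, y in zip(a, b)) == 1
    if abs(len(a) - len(b)) != 1:
        return False
    # Lengths differ by one: dropping one phoneme from the longer
    # sequence must yield the shorter one (addition or deletion).
    short, long_ = (a, b) if len(a) < len(b) else (b, a)
    return any(long_[:i] + long_[i + 1:] == short for i in range(len(long_)))


def neighborhood_density(target, lexicon):
    """Count lexicon entries that are neighbors of the target transcription."""
    return sum(is_neighbor(target, phones) for phones in lexicon.values())


def positional_probabilities(lexicon):
    """Unweighted probability of each phoneme at each position:
    count(phoneme at position i) / count(words with a position i)."""
    counts = Counter()
    totals = Counter()
    for phones in lexicon.values():
        for i, phoneme in enumerate(phones):
            counts[(i, phoneme)] += 1
            totals[i] += 1
    return {key: n / totals[key[0]] for key, n in counts.items()}


if __name__ == "__main__":
    # A word and a pseudoword, both given as phoneme tuples.
    print(neighborhood_density(("K", "AE", "T"), LEXICON))
    print(neighborhood_density(("G", "AE", "T"), LEXICON))
    print(positional_probabilities(LEXICON)[(0, "K")])
```

Because the functions take a transcription rather than a spelling, the same call works for a pseudoword that is not in the lexicon, which is the situation the online calculator is meant for. A frequency-weighted version would replace the counts of 1 with each word's frequency; that variant is not shown here.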