Sunday, March 12, 2006


It's been months since my last post, not for lack of energy or work; I just haven't felt like blogging. But here I am - back, and we'll see how long this goes. So, much progress to report on two electronic dictionary projects:

(1) Hebrew dictionary: my very competent undergraduate independent study student has helped create a couple of Perl scripts that pull useful data from my electronic Hebrew dictionary. First, we can now type in a word and get back all lexical neighbors of that word in the dictionary. This will be quite useful when fishing for experimental items if we want them to match in neighborhood density. Second, together with Andy W., he's written another Perl script that will calculate the uniqueness point for any word entered. What's nice about this script is that it accepts regular expressions, thus allowing me to calculate the average uniqueness point over any subset of the lexicon that can be described with a regular expression. Think of the possibilities! I have already calculated average uniqueness points for each of the seven binyanim (verbal classes) of Hebrew, and they all fall somewhere between the second and first segment from the right edge of the word. In other words, average uniqueness points are (as I have been predicting!) close to the end of the word. This has important consequences for lexical access, which we can now begin to explore.
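For readers who want to see the idea concretely, here is a rough sketch of both calculations. It is written in Python rather than Perl and is not the actual scripts. It assumes a "neighbor" is any word that differs by one substituted, added, or deleted segment, that each character counts as one segment, and that a word embedded at the start of a longer word gets a uniqueness point one past its last segment. The function names and the toy lexicon are made up for illustration and are not taken from the dictionary.

```python
import re


def is_neighbor(a, b):
    """True if a and b differ by exactly one substitution, addition, or deletion."""
    if a == b:
        return False
    if len(a) == len(b):
        # Same length: exactly one position may differ (substitution).
        return sum(x != y for x, y in zip(a, b)) == 1
    if abs(len(a) - len(b)) != 1:
        return False
    # Lengths differ by one: deleting one segment of the longer word
    # must yield the shorter word.
    short, long_ = (a, b) if len(a) < len(b) else (b, a)
    return any(long_[:i] + long_[i + 1:] == short for i in range(len(long_)))


def neighbors(word, lexicon):
    """All lexical neighbors of word; neighborhood density is the list length."""
    return sorted(w for w in lexicon if is_neighbor(word, w))


def uniqueness_point(word, lexicon):
    """1-based position at which word diverges from every other entry.

    Assumption: if word is the beginning of a longer entry, it never
    becomes unique, so len(word) + 1 is returned.
    """
    others = [w for w in lexicon if w != word]
    for i in range(1, len(word) + 1):
        prefix = word[:i]
        if not any(w.startswith(prefix) for w in others):
            return i
    return len(word) + 1


def mean_uniqueness_point(pattern, lexicon):
    """Average uniqueness point over entries that fully match a regex."""
    regex = re.compile(pattern)
    matched = [w for w in lexicon if regex.fullmatch(w)]
    if not matched:
        return None
    return sum(uniqueness_point(w, lexicon) for w in matched) / len(matched)


# Hypothetical toy lexicon, one character per segment.
TOY_LEXICON = ["katav", "kataf", "katan", "kara", "karat", "natan", "hitkatev"]

if __name__ == "__main__":
    print(neighbors("katav", TOY_LEXICON))
    print(uniqueness_point("katav", TOY_LEXICON))
    print(mean_uniqueness_point(r"kat.*", TOY_LEXICON))
```

Subtracting the uniqueness point from the word length gives the distance from the right edge, which is the number I report above for the binyanim. A real run would need a regular expression that picks out each binyan in whatever transcription the dictionary uses, which this sketch does not attempt.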

(2) Maltese dictionary: my hard-working graduate research assistant has been working on this for eight weeks now, and we now have a full text-editable Maltese-English dictionary! Nothing like this exists anywhere else for Maltese, so this feels very ground-breaking. Our next step is to turn the document into an XML database to be mined for all sorts of things, much like the Hebrew dictionary: we can calculate segment co-occurrence statistics, neighborhood densities, uniqueness points, etc. And since it's Maltese, we can also do things like examine the proportions of lexical sub-strata derived from different origins (most importantly, Semitic vs. non-Semitic). Also very exciting!

Meanwhile, a commentary on the somewhat larger picture: in many conversations lately with phonology colleagues worldwide, there seems to be a shared perception of an impending paradigm shift. "Watch out OT, your days seem to be numbered!" is what many of these colleagues are saying, and their voices are becoming louder and louder. It's been pretty clear for a while now that the branches of linguistics outside of phonology have been awaiting this moment, and are keen to start talking to phonologists once phonologists realize there's a bigger world out there than the world of formal constraints. In my own day-to-day life as a formally trained phonologist, it's been exciting to connect with colleagues in psycholinguistics, language documentation and revitalization, computational linguistics, and phonetics, but much of the excitement doesn't involve OT anymore. My advice to those phonologists who are unsure of what to do in the face of this shift: be scientists! Don't let your devotion to a particular approach or theory blind you to the reality that science involves progress, inevitably requiring the modification of extant theories, the creation of new theories (or - gasp! - looking back at old theories), or the exploration of data whose methodology and technology may be somewhat unfamiliar.

Some people ask me: Well, Adam, you can dish it out, but can you take it? Do I take my own advice? You bet! This semester, I have the extremely good fortune of not teaching (I could go on and on about how awesome that is!), and along with getting a bunch of papers finished and out there, I am auditing two classes, taught by two amazing colleagues here at the University of Arizona. Natasha Warner's "Statistics for Linguists" class rocks my world every Tuesday and Thursday - I can't believe I wasn't SPSS-literate before (not that I am quite the crack SPSS user yet, but I am working on it). The 2-factor within-subjects ANOVA is what's making me excited this week and next, and despite the stereotypical attitude many people have toward statistics, I find it very stimulating. The other class is Ken Forster's seminar in Lexical Access, which is giving me the opportunity to learn an almost overwhelming amount about different models of lexical access (including, of course, his own), as well as crucial information about experimental design and analysis (which we also cover in stats). So I am feeling a lot like I am being re-trained this semester, and it's terribly fun.

That's all for now, at least on the linguistics side of things. I may post something separately about various goings-on in the city.


