As explained in the preface, this site was motivated by my desire to improve my oral comprehension of Spanish, combined with the conviction that a large part of my difficulties with oral comprehension stemmed from an insufficient understanding of Latin American dialectology: how else could it be that I can understand the president of the republic perfectly, but not the stars of Venezuelan telenovelas? Very late in the project I discovered a quote that vindicates that conviction:

It is important for instructors to realize that most students who begin their study of Spanish after puberty will not achieve native-like pronunciation. Rather, the goal, in our view should be intelligibility. As Bongaerts (1999) notes in his studies, the only way for students to reach native-like pronunciation with an age of acquisition beyond puberty were those who had explicit phonetics training.

I must say that in general I find phonology to be greatly underestimated as a tool by teachers of second languages, who tend to focus on grammar and vocabulary after the first week of class.

I admit that my requirements for a methodology to learn Spanish phonology may not have been typical. I wanted:

I was frankly surprised not to discover anything that met those requirements.


Some of the resources I looked at were:

    Works on Latin American Dialectology

  1. Canfield’s Spanish Pronunciation in the Americas. Compact and authoritative, but informal and very difficult to use because of the non-IPA notation.
  2. Lipski’s El español de América. A marvelous work that covers much more than just the phonology of each country. Unfortunately, the phonology portions are informal, and are scattered throughout the book’s 400-plus pages, making their study for my purposes problematic. Even worse, like Canfield, Lipski uses a non-IPA notation, which means a laborious and error-prone translation process.
  3. The web site Catálogo De Voces Hispánicas from the Cervantes Institute. A very accessible and polished web site with videos of representative speakers as well as informal descriptions (using the IPA!) of the main phonetic features of each of the major dialects. Unfortunately I discovered this site midway through the project, and have not gone back to see if there is material there that I could use to improve this site.
  4. Piñeros’s Dialectoteca del Español. Of all the resources I ran across, this is the one closest to my vision. It has a systematic description of the phonological rules, along with a mapping from them to geographical areas. Although the rules in the Dialectoteca do not use a formal notation, its informal descriptions are systematic and were easy to incorporate into my treatment. The Dialectoteca also has a collection of almost 30 videos of speakers from those geographical areas, and it uses IPA. Best of all, it has recordings of Aesop’s North Wind fable, which the International Phonetic Association uses to illustrate its alphabet in the languages of the world; I had been surprised not to find the North Wind passage recorded for each dialect when I started my project, and had in fact started collecting my own recordings. I was very sorry that I stumbled on this site so late in the project.
  5. Resnick’s Phonological Variants and Dialect Identification in Latin-American Spanish is for me a nostalgic work; the enormous quantity of data produced on a line printer in all upper case brings back my early days in computing. I sought out this work early in my project after reading what Canfield says about it:
    Beset by these concerns [the scattered patchwork quilt of Spanish phonology] and by the tendency of some writers to try to create neat zones, Melvyn Resnick (1975) decided to cast aside taxonomic considerations and look at all the available data on phonological variants in the total area.
    Looking back, that should have been a tip-off to the complexity of the project I had undertaken. It would be a fun project to analyze this data set using modern methods if it were available in a machine-processable form.
    Works on Spanish Phonology

  7. Guitart’s Sonido y Sentido. This solid work covers the phonology of Spanish thoroughly, and includes many useful discussions of lectal variations, although they are not the book’s primary purpose. It uses the IPA, so is much more accessible than Canfield or Lipski. Unfortunately, Guitart is a verbose writer, and the book falls short on the concision front. There are no formal descriptions of the allophonic transformations, which meant that I had to assume the responsibility for generating the rules from Guitart’s unwieldy prose, a task for which I am not particularly qualified.
  8. Whitney’s Spanish-English Contrasts: A Course in Spanish Linguistics. I found this to be the most useful treatment of Spanish linguistics for my purposes, and a joy to read. Whitney’s discussion of phonological rules in Chapter 3 became my guide for writing the formal rules on this site.
  9. Hammond’s The Sounds of Spanish. This is a thorough exposition of the standard Spanish sound system, with a brief chapter on the dialectal variations in Latin America. Unfortunately I could make no use of it because it does not use the IPA and I was unwilling to memorize the 89 symbols that Hammond uses. I was willing to transliterate for Canfield and Lipski, because I had no choice, but the payoff for this book wasn’t worth it. Needless to say, I found his arguments against using the IPA unconvincing.
  10. Face’s Guide to the Phonetic Symbols of Spanish. This book is an essential reference for anyone forced to use a source that doesn’t use the IPA. My one quibble is that it should have cross-reference tables, so that one could look up, for instance, all the non-standard ways that have been used to write a given IPA symbol. I admit to being a standardization bigot, but I do believe that science isn’t possible without shared, common languages, and to me the fact that this book exists is a sad commentary on the immaturity of this field.
  11. Hayes’s Introductory Phonology. This book has another discussion of phonological rules that informed my project.
  12. The many Wikipedia articles on Spanish Phonology. The coverage of these articles leaves something to be desired, but many of the country writeups are quite serviceable.

The Allophonic Rules

I confess that the present site only partially integrates all the inputs discussed above. As I discovered new resources, I tended to adopt them and abandon the old ones in mid-stream. I am not sure that is a serious problem; understandably, there is a lot of overlap in the data, and I have only a mild form of James Murray’s obsession with finding every last nuance of every last piece of data. I am not a professional; I just want something I can use to better understand my telenovelas, and something that will hopefully be of use to others in my situation.

From an implementation perspective, the goal of this part of the project was to compile as large a collection as possible of the phonological rules of Spanish, in a uniform, formal notation. This seems like a straightforward goal, but the truth is that at many points I floundered, uncertain about how to achieve it. This was doubtless because of my relative inexperience in the matters of real linguistics. Perhaps I should have waited until I found some expert help, but my impatience compelled me to forge ahead, with the current web site as the outcome.

Some of the general issues I grappled with are:

  1. What types of rules to include. I started off with an informal notation that included a simple “a → b” shorthand for the transformations. Then I switched to the fuller formalism sketched by Hayes and Whitney. Finally, I went back and added informal rules that are just English prose. I tinkered with the idea of using a feature-set formalism, but couldn’t really make that work. I do however use feature sets as modifiers at several points. I toyed with the idea of teaching myself optimality theory, but decided that would be overkill for my purposes.
  2. How to gracefully merge multiple sources when the level of generalization varies wildly. When Canfield says that in Venezuela “/b/, /d/, /g/ may be pronounced [b], [d], [g] after any non-nasal consonant”, and Guitart says essentially the same thing about El Salvador but doesn’t mention the constraint of the non-nasal consonant, what should one do? Assume that there is a true geographic variation and treat them as two distinct rules? Assume that Guitart simply left out the constraint for brevity’s sake (unlikely though that may seem!), and therefore conclude that there is one rule and that the constraint applies to El Salvador as well? Run the risk of complicating the formalism by finding a way of including the constraint with a meta-level annotation that the status of the constraint in El Salvador is unclear? I did not do anything systematic here, but tried to err in favor of preserving the distinct rules.
  3. How to deal with unspecified contexts. Should they be treated as “some unknown specific context” or “all contexts”? When Piñeros writes
    La fricativa alveolar sorda se articula usando el ápice para formar un pasaje estrecho contra la zona anterior de la cresta alveolar [The voiceless alveolar fricative is articulated using the apex to form a narrow passage against the front part of the alveolar ridge]
    does it mean that this variation always occurs, or simply that for Piñeros’s purposes the exact context specification is too complex, or is irrelevant?

    Sometimes I would insert an empty context, “ / ___”, meaning “occurs systematically everywhere”, into every rule, for consistency’s sake. At other times I would delete all such empty contexts. This is one place where an expert may be able to help.

  4. What to do with sources that don’t use IPA. As explained above, I transliterated for Lipski and Canfield, and boycotted the rest.
  5. How to describe the transformations, in the informal rules, using English alone. I ended up using a systematic set of terms, some of them neologisms as far as I can tell. This set of terms can describe any vertical or horizontal movement within the IPA chart. For horizontal movement, there are bilabialization, labiodentalization, dentalization, alveolarization, postalveolarization, retroflexation, palatalization, velarization, uvularization, pharyngealization, glottalization, and for vertical movement there are defricativization, nasalization, vibrantization, fricativization, latero-fricativization, approximatization, and latero-approximatization. This set of terms may have its infelicities, but it is complete and regular.
  6. What data representation to use. The use of XML to capture the phonological rules was an easy choice to make. For ease of development I have freely mixed XML and non-XHTML HTML, shamelessly exploiting the fact that current browsers can handle this hodge-podge just fine. It means that I can use all the mature facilities of HTML for linking, graphics, etc., yet still have the advantages of semantic markup without resorting to '' kludges. This style of usage is supported by at least Firefox and Safari. For my purposes, using xmllint to check for well-formedness is enough; full validation would be much more trouble than it’s worth.

    A surprisingly agonizing issue was what to do about styling links. I am an easily distracted minimalist, and wish there were simply a special key to toggle the display of link styling, so that one could turn it off when trying to actually read the text. I understand the arguments in favor of using different colors for visited and unvisited links, and of underlining, but the truth is that underlining technical formulas looks awful. In the end I settled on using a subtle blue color without underlining.

  7. Whether to create the site in Spanish or English. Ideally there would be both an English and a Spanish version of the site, but instead there’s just one version with a mish-mash of Spanish and English.
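
The naming scheme in item 5 can be sketched as a small lookup: a move within the IPA chart is named after its destination column (place) or row (manner). This is only an illustration, not the code behind this site. The feature table covers a handful of symbols, the function name `describe_move` is hypothetical, and pairing "defricativization" with the plosive row and "vibrantization" with both trills and taps is an assumption about how the terms line up with the chart.

```python
# Place terms, in the left-to-right order given in the text.
PLACES = ["bilabial", "labiodental", "dental", "alveolar", "postalveolar",
          "retroflex", "palatal", "velar", "uvular", "pharyngeal", "glottal"]
PLACE_TERM = dict(zip(PLACES, [
    "bilabialization", "labiodentalization", "dentalization",
    "alveolarization", "postalveolarization", "retroflexation",
    "palatalization", "velarization", "uvularization",
    "pharyngealization", "glottalization"]))

# Manner terms. ASSUMPTION: the row each term belongs to is inferred here.
MANNER_TERM = {
    "plosive": "defricativization",
    "nasal": "nasalization",
    "trill": "vibrantization",
    "tap": "vibrantization",
    "fricative": "fricativization",
    "lateral fricative": "latero-fricativization",
    "approximant": "approximatization",
    "lateral approximant": "latero-approximatization",
}

# Tiny illustrative feature table: symbol -> (place, manner).
FEATURES = {
    "s": ("alveolar", "fricative"),
    "h": ("glottal", "fricative"),
    "n": ("alveolar", "nasal"),
    "ŋ": ("velar", "nasal"),
    "ʎ": ("palatal", "lateral approximant"),
    "ʝ": ("palatal", "fricative"),
}

def describe_move(src, dst):
    """Return the terms naming the move from symbol src to symbol dst."""
    src_place, src_manner = FEATURES[src]
    dst_place, dst_manner = FEATURES[dst]
    terms = []
    if src_place != dst_place:      # horizontal movement in the chart
        terms.append(PLACE_TERM[dst_place])
    if src_manner != dst_manner:    # vertical movement in the chart
        terms.append(MANNER_TERM[dst_manner])
    return terms
```

With this table, `describe_move("s", "h")` names a glottalization and `describe_move("n", "ŋ")` a velarization; a move that changes both place and manner would get two terms.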

The Rule Formalism

A number of issues arose specifically with respect to the formal rules, beginning with what the exact notation for formal rules is. Hayes, Hammond, and Whitney do a fine job of motivating the use of formal rules, and the basic “a → b / ___” format is obvious, but I found no grammar for more complex cases involving alternatives, constraints, usage, comments, etc. The need for handling such cases grew out of a design goal that stretches the notion of a formal rule. I wanted the set of “formal” rules to be able to stand alone, so that it is possible to turn off the display of the “informal” rules (using CSS) and still have a coherent document. That is, it should not be necessary to refer to the informal rules in order to make sense of the formal ones. In the extreme case, an observation such as

The alveolar fricative /s/ is tensely grooved and strongly sibilant (cf. highland Mexico and Andes)
must count as a “formal rule”, even though it is not in the basic “a → b / ___” form.

In the end I came to treat this type of problem as a markup issue that could be solved by introducing appropriate XML elements; this allows postponing decisions about the presentation syntax. Using this approach and defining a comment element, the example becomes simply

<formal><com>The alveolar fricative /s/ is tensely grooved and strongly sibilant (cf. highland Mexico and Andes)</com></formal>
This approach feels very natural to me, and solved many other issues such as how to capture the informal rule
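En sílaba acentuada, las vocales /e/ y /o/ se producen con mayor abertura, lo cual las convierte en [e̞] y [o̞], respectivamente (In a stressed syllable, the vowels /e/ and /o/ are produced with greater opening, which turns them into [e̞] and [o̞], respectively)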
in a formal rule. Writing
/e/, /o/ → [e̞], [o̞] /___Sílaba acentuada
is obviously wrong since the transformation is in the accented syllable, not before it. It seems pretty natural to introduce a constraint element:
/e/, /o/ → [e̞], [o̞] /___<constraint>Sílaba acentuada</constraint>
Currently the rules use three such elements: “constraint” for specifying aspects of the phonological environment, “freq” for information about the sociological applicability of the rules (“among the urban, well-educated class”), and “com” for all other comments. All three elements are displayed using angle brackets:
/e/, /o/ → [e̞], [o̞] /___<Sílaba acentuada>
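
As a rough illustration of the markup-then-presentation idea, here is a minimal Python sketch that turns a `formal` element into the angle-bracket display form. It is not the site's actual rendering (which is described as CSS-based); the function name `render_formal` is hypothetical, and only the three element names mentioned above are assumed.

```python
import xml.etree.ElementTree as ET

# The three annotation elements described in the text.
ANNOTATION_TAGS = {"constraint", "freq", "com"}

def render_formal(xml_text):
    """Render a <formal> rule, showing annotation elements in angle brackets."""
    root = ET.fromstring(xml_text)      # raises ParseError if not well-formed
    parts = [root.text or ""]
    for child in root:
        inner = "".join(child.itertext())
        if child.tag in ANNOTATION_TAGS:
            parts.append("<" + inner + ">")
        else:
            parts.append(inner)         # unknown elements: keep text only
        parts.append(child.tail or "")  # text between this child and the next
    return "".join(parts)

rule = ("<formal>/e/, /o/ → [e̞], [o̞] /___"
        "<constraint>Sílaba acentuada</constraint></formal>")
print(render_formal(rule))
```

Because the parser rejects malformed input, the same call doubles as a crude well-formedness check on individual rules.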

The need to handle rules that are not in the basic “a → b / ___” form led me to introduce a “NOT” operator that applies to rules:

<formal>NOT(/'ado/ → ['ao] / ___)</formal>
This syntax replaces an earlier notation using a “↛” operator that looks nicer but doesn’t handle the non-basic rules.
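
To make the basic format and the “NOT” wrapper concrete, here is a small parsing sketch. It assumes a much narrower grammar than the rules on this site actually use: one arrow, one slash before the context, and a context with no internal spaces. The names `Rule` and `parse_rule` are hypothetical.

```python
import re
from dataclasses import dataclass

@dataclass
class Rule:
    source: str    # left of the arrow, e.g. "/'ado/"
    target: str    # right of the arrow, e.g. "['ao]"
    context: str   # e.g. "___" (empty context) or "V___V"
    negated: bool  # True when the rule was wrapped in NOT(...)

NOT_RE = re.compile(r"^NOT\((.*)\)$")
# The context is the last slash-separated part and must contain "___".
RULE_RE = re.compile(
    r"^(?P<src>.+?)\s*→\s*(?P<tgt>.+?)\s*/\s*(?P<ctx>\S*___\S*)\s*$")

def parse_rule(text):
    """Parse 'a → b / context', optionally wrapped in NOT(...)."""
    text = text.strip()
    negated = False
    wrapped = NOT_RE.match(text)
    if wrapped:
        negated = True
        text = wrapped.group(1).strip()
    m = RULE_RE.match(text)
    if m is None:
        raise ValueError("not in basic 'a → b / ___' form: " + text)
    return Rule(m["src"], m["tgt"], m["ctx"], negated)

print(parse_rule("NOT(/'ado/ → ['ao] / ___)"))
```

Rules that are not rewrite rules, such as the comment-only example above, would raise `ValueError` here, which is one way to see why they need separate markup.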

I should perhaps explain that I introduced a “§” to represent the nucleus of a syllable. A regular “$” might have worked, but I wasn't sure enough of that to want to throw the distinction away.

One problem I simply gave up on was how to introduce anaphora into formal notation. In a rule like

/ɾ/ → CC / ___C
I would like to be able to capture the fact that the two C’s on the left-hand side of the rule are references to the C on the right-hand side of the rule. I’m sure this problem has been solved, but I haven’t taken the time to research what the solution is.
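
I do not know how phonologists write this co-reference, but the effect can at least be sketched computationally with a regular-expression backreference, where the replacement reuses whatever consonant was matched in the context. The consonant inventory below is an illustrative, incomplete assumption, and the function name `assimilate_r` is hypothetical; the sketch reads the rule as "/ɾ/ becomes a copy of the following consonant".

```python
import re

# ASSUMPTION: a small, incomplete consonant inventory for illustration.
CONSONANTS = "bdfgklmnpst"

# Capture the consonant after /ɾ/ so the replacement can refer back to it.
R_BEFORE_C = re.compile("ɾ([" + CONSONANTS + "])")

def assimilate_r(word):
    """Replace /ɾ/ with a copy of the consonant that follows it."""
    return R_BEFORE_C.sub(r"\1\1", word)

print(assimilate_r("pweɾta"))  # ɾ before t
print(assimilate_r("kaɾo"))    # ɾ before a vowel: unchanged
```

The `\1` in the replacement plays the role of the anaphor: both output consonants refer to the one matched in the context.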

The Country Samples

In addition to the formal descriptions of the allophonic rules, it seemed obvious to me that examples of their application in a controlled environment would be of great value in learning other dialects. Early in the project I looked for the Aesop’s fable that the IPA uses as a standard sample of the languages of the world, and was surprised to find only two for Spanish: one from Madrid, the other from Buenos Aires. I embarked on a sub-project to record the passage as spoken by my Latin-American acquaintances. Very late in the project I discovered that Piñeros has the fable as recorded by no fewer than 29 different speakers. Unfortunately he does not provide transcriptions for those 29 samples. I have provided transcriptions for most of my samples, but am unsure of their quality because they were produced by an amateur (me!). I think the use of a script to compare the samples so as to highlight the variations in the accents turned out quite well.
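
A comparison of this kind can be sketched with Python's standard `difflib`. This is not the script used on this site; it assumes each transcription is a string of space-separated words, and the function name `diff_transcriptions` and the two short sample strings are made up for illustration and are not taken from the recordings.

```python
from difflib import SequenceMatcher

def diff_transcriptions(a, b):
    """Align two transcriptions word by word and mark where they differ."""
    a_tokens, b_tokens = a.split(), b.split()
    matcher = SequenceMatcher(None, a_tokens, b_tokens, autojunk=False)
    out = []
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op == "equal":
            out.extend(a_tokens[i1:i2])
        else:
            # Show both variants side by side: {variant A | variant B}
            left = " ".join(a_tokens[i1:i2])
            right = " ".join(b_tokens[j1:j2])
            out.append("{" + left + " | " + right + "}")
    return " ".join(out)

# Hypothetical transcriptions, invented for this example only.
print(diff_transcriptions("el bjento noɾte", "el bjento nolte"))
```

Aligning at the word level keeps the output readable; aligning symbol by symbol within each differing word would pinpoint the exact segment that varies.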


I would like to take this opportunity to thank Marta Ortega-Llebaria for her assistance in a moment of need. She pointed me towards Face’s Guide and Lipski’s El español de América, without which I would not have been able to make such progress as I have.

	General: what do you do when contexts aren’t specified
	Formal: what if the rule is not a rewrite rule? posnuclear? algebra of rules?
	Notation: not using IPA is a pain
	Jargon unification: there should be a noun for every move up or down, left or right. Velarization, palatalization, etc. Fill in the table.

