Briefly Noted
An Introduction to Language Processing with Perl and Prolog
Pierre M. Nugues
(Lund University)
Springer (Cognitive technologies series, edited by Dov Gabbay and Jörg Siekmann), 2006, xx+513 pp; hardbound, ISBN 978-3-540-25031-9, $109.00, €90.90

This comprehensive NLP textbook is strongly algorithm-oriented and designed for talented computer programmers who might or might not be linguists. The book occupies a market niche in between that of Jurafsky and Martin (2008) and my own humble effort (Covington 1994); it resembles the latter in approach and the former in scope. Perhaps more than either of those, Nugues's book is also useful to working professionals as a handbook of techniques and algorithms.

Everything is here--everything, that is, except speech synthesis and recognition; phonetics receives only a four-page summary. Those wanting to start an NLP course by covering phonetics in some depth should consider Coleman (2005) as well as Jurafsky and Martin (2008).

After a brief overview, Nugues covers corpus linguistics, markup languages, text statistics, morphology, part-of-speech tagging (two ways), parsing (several ways), semantics, and discourse. "Neat" and "scruffy" approaches are deftly interleaved and compared. Unification-based grammar, event semantics, and tools such as WordNet and the Penn Treebank are covered in some detail. The syntax section includes dependency grammar and even the very recent work of Nivre (2006), as well as partial parsing and statistical approaches. Many important algorithms are presented ready to run, or nearly so, as Prolog or Perl code. If, for example, you want to build a Cocke–Kasami–Younger parser, this is the place to look for directions.

Explanations are lucid and to-the-point. Here is an example. Nugues is discussing the fact that, if you sample a corpus for n-grams, some will not occur in your sample at all, but it would be a mistake to consider the unseen ones to be infinitely rare (frequency 0). Thus the counts need to be adjusted:

    Good-Turing estimation . . . reestimates the counts of the n-grams observed in the corpus by discounting them, and it shifts the probability mass it has shaved to the unseen bigrams.

The formula follows, but the reader approaches it equipped with the unforgettable image of someone shaving excess material off the peaks and bulges of a histogram and spreading it into the valleys.

This is an ambitious textbook, and, in my opinion, too much for one semester; it must be used selectively. If nothing is skipped, then besides getting a thorough course in natural language processing (except phonetics), the student is expected to learn both Perl and Prolog along the way, aided by a 50-page Prolog handbook in Appendix A. My experience is that Prolog is very hard to learn on the fly; in fact, extensive experience with other languages may be a disadvantage because Prolog is so different. Nonetheless, all the information needed to learn Prolog is here. Perl is treated more casually because it lends itself much more easily to incremental learning.

Other minor quibbles are possible. On page 31, automata is used as singular. On page 69, the treatment of RTF, TeX, and LaTeX is so short that the student will come away largely unaware of what each of them is actually designed for. This is not pernicious, however, as the student will realize that he or she is inadequately informed.

Nonetheless, this is an unusually useful and well-written book, and I plan to recommend it to my students as well as using it myself as a handbook.--Michael A. Covington, The University of Georgia

References
Coleman, John. 2005. Introducing Speech and Language Processing. Cambridge, UK: Cambridge University Press.
Covington, Michael A. 1994. Natural Language Processing for Prolog Programmers. Englewood Cliffs, NJ: Prentice Hall.
Jurafsky, Daniel, and James H. Martin. 2008. Speech and Language Processing, 2nd edition. Upper Saddle River, NJ: Prentice Hall.
Nivre, Joakim. 2006. Inductive Dependency Parsing. Dordrecht, The Netherlands: Springer.
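
To make the histogram image in the Good-Turing passage of the Nugues review concrete, here is a minimal Python sketch of the basic Good-Turing adjustment. It is an illustration only and is not taken from the book, which presents its algorithms in Perl and Prolog. The function name `good_turing`, the toy sentence, and the choice to keep the raw count when no bigram has been seen c + 1 times are all assumptions made for this example.

```python
from collections import Counter


def good_turing(bigram_counts):
    """Return (adjusted_counts, unseen_mass) under the basic Good-Turing formula.

    Hypothetical helper for illustration: c* = (c + 1) * N_{c+1} / N_c,
    where N_c is the number of distinct bigrams observed exactly c times.
    """
    total = sum(bigram_counts.values())
    # freq_of_freq[c] = N_c, the "histogram" that gets shaved
    freq_of_freq = Counter(bigram_counts.values())
    adjusted = {}
    for bigram, c in bigram_counts.items():
        n_c = freq_of_freq[c]
        n_next = freq_of_freq.get(c + 1, 0)
        # Assumption: when N_{c+1} is 0 the formula would give 0,
        # so this sketch simply keeps the raw count instead.
        adjusted[bigram] = (c + 1) * n_next / n_c if n_next else float(c)
    # Probability mass reserved for bigrams never seen in the sample: N_1 / N
    unseen_mass = freq_of_freq.get(1, 0) / total
    return adjusted, unseen_mass


tokens = "the cat sat on the mat the cat ate the rat".split()
counts = Counter(zip(tokens, tokens[1:]))
adjusted, unseen_mass = good_turing(counts)
print(adjusted[("cat", "sat")], unseen_mass)
```

In this toy sample eight bigrams occur once and one occurs twice, so each singleton count is discounted from 1 to 0.25 and the mass set aside for unseen bigrams is 0.8. Because of the raw-count fallback, the resulting values do not sum to a proper distribution; a practical implementation would smooth the N_c values and renormalize.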
Computational Linguistics Volume 33, Number 4
Language and the Learning Curve: A New Theory of Syntactic Development
Anat Ninio
(Hebrew University of Jerusalem)
Oxford University Press, 2006, xiv+206 pp; hardbound, ISBN 0-19-929981-1/978-0-19-929981-2, £55.00; paperbound, ISBN 0-19-929982-X/978-0-19-929982-9, £24.95

Ninio proposes a provocative theory of early syntactic development. According to her theory, children do not create any abstract representation of language in the form of rules or schemas, nor do they develop any systematic linking rules between syntax and semantics. Instead, they learn a lexicalist syntax. Syntactic development consists of learning, for each individual word, the potential predicate–argument relations and their appropriate syntactic realizations. Semantic valency (the potential semantic relations of a word with other items) and syntactic valency (the ways to express those relations in sentences) are learned for each word separately. The syntactic structure of a sentence is projected from the lexical valency of words by recursively applying a single binary operation, Merge, which combines the Head (the predicate word) to its Dependent (the argument).

Ninio argues that the elimination of the abstract rules from the process of syntactic development is supported by evidence from child language research. Examination of the learning curves (i.e., performance vs. experience) for different syntactic patterns shows no sudden change in children's productivity with a particular syntactic pattern, or in their ability to generalize that pattern to novel items. At the same time, learning a syntactic pattern speeds up with experience. The learning curves have an accelerating nonlinear shape, which suggests that the acquired item-specific syntactic forms facilitate the acquisition of the same syntactic form for new items. Ninio suggests that, even though the knowledge of syntax is item-specific, lexical items are not isolated. Instead, they are interconnected via transfer, the ability to extend what has been learned in one context to new contexts by analogy. The notion of transfer is supposed to explain children's nonlinear learning curve, as well as their ability to generalize their lexical knowledge to novel items. However, the details of this central mechanism are left unspecified: it is not clear how and under which constraints the transfer between two items can take place. Moreover, one expects that the interplay between the item-specific knowledge and the transfer of learning be used to explain some well-studied stages of language learning in children, including imitation, (over)generalization, and recovery from making overgeneralization errors. (This so-called U-shaped learning curve is quite distinct from the nonlinear learning curve that gives the book its title.)

Another radical proposal by Ninio is that semantic similarity plays no role in early syntactic development. More specifically, transfer of learning, and therefore generalization, is based solely on a similarity of form. The developmental evidence provided for this claim is intriguing. For example, in the priming experiments where training on a large set of verbs that appear in transitive sentences helps to elicit transitive usages from children on novel nonce verbs, the semantic similarity between the training and the testing verbs does not affect the rate of generalization. However, there are many studies showing that both children and adults use some kind of mapping between form and meaning in comprehension tasks. For example, preferential-looking studies show that infants choose one novel action over another based on the form of the sentence introducing the action (e.g., Naigles 1990; Fisher 2002). Similarly, adult subjects predict certain semantic properties for the arguments of a novel verb based on the form of the sentence the verb appears in (e.g., Gleitman 1990; Kako 2006). In the absence of any abstract rules, these findings can be explained only through a generalization mechanism that takes into account both syntactic and semantic similarity.

Despite the vagueness of some of the suggested mechanisms and occasional insufficient analysis, Ninio's book presents a thought-provoking account of syntactic development that can be of especial interest to the computational linguistics community. Its proposed view of syntax could be validated through computational modeling. If it turns out that the proposed theory is indeed compatible with empirical data, the underlying ideas could be directly applied to the representation and use of syntactic knowledge in different computational linguistic applications.--Afra Alishahi, University of Toronto

References
Fisher, C. 2002. Structural limits on verb mapping: The role of abstract structure in 2.5-year-olds' interpretations of novel verbs. Developmental Science, 5(1):55–64.
Gleitman, L. 1990. The structural sources of verb meanings. Language Acquisition, 1:135–176.
Kako, E. 2006. Thematic role properties of subjects and objects. Cognition, 101:1–42.
Naigles, L. 1990. Children use syntax to learn verb meanings. Journal of Child Language, 17:357–374.

Advances in Natural Multimodal Dialogue Systems
Jan van Kuppevelt, Laila Dybkjær, and Niels Ole Bernsen (editors)
(University of Southern Denmark)
Springer (Text, speech, and language technology series, edited by Nancy Ide and Jean Véronis, volume 30), 2005, ix+373 pp; hardbound, ISBN 978-1-4020-3032-4, $179.00

"The chapters in this book jointly contribute to what we shall call the field of natural and multimodal interactive systems engineering. This is not yet a well-established field of research and commercial development but, rather, an emerging one in all aspects. It brings together, in a process that, arguably, was bound to happen, contributors from many different, and often far more established, fields of research and industrial development. To mention but a few, these include speech technology, computer graphics, and computer vision. The field's rapid expansion seems driven by a shared vision of the potential of new interactive modalities of information representation and exchange for radically transforming the world of computer systems, networks, devices, applications, and so on, from the GUI (graphical user interface) paradigm into something which will enable a far deeper and much more intuitive and natural integration of computer systems into people's work and lives.

"Jointly, the chapters present a broad and detailed picture of where natural and multimodal interactive systems engineering stands today. The book is based on selected presentations made at the International Workshop on Natural, Intelligent, and Effective Interaction in Multimodal Dialogue Systems held in Copenhagen, Denmark, in 2002 and sponsored by the European CLASS project. CLASS was initiated on the request of the European Commission with the purpose of supporting and stimulating collaboration among Human Language Technology (HLT) projects as well as between HLT projects and relevant projects outside Europe. The purpose of the workshop was to bring together researchers from academia and industry to discuss innovative approaches and challenges in natural and multimodal interactive systems engineering."--From the editors' preface

The contents of the volume are as follows:
"Natural and multimodal interactivity engineering--directions and needs" by Niels Ole Bernsen and Laila Dybkjær
"Social dialogue with embodied conversational agents" by Timothy Bickmore and Justine Cassell
"A first experiment in engagement for human–robot interaction in hosting activities" by Candace L. Sidner and Myroslava Dzikovska
"FORM" by Craig H. Martell
"On the relationships among speech, gestures, and object manipulation in virtual environments: Initial evidence" by Andrea Corradini and Philip R. Cohen
"Analysing multimodal communication" by Patrick G.T. Healey, Marcus Colman, and Mike Thirlwell
"Do oral messages help visual search?" by Noëlle Carbonell and Suzanne Kieffer
"Geometric and statistical approaches to audiovisual segmentation" by Trevor Darrell, John W. Fisher, III, Kevin W. Wilson, and Michael R. Siracusa
"The psychology and technology of talking heads: Applications in language learning" by Dominic W. Massaro
"Effective interaction with talking animated agents in dialogue systems" by Björn Granström and David House
"Controlling the gaze of conversational agents" by Dirk Heylen, Ivo van Es, Anton Nijholt, and Betsy van Dijk
"MIND: A context-based multimodal interpretation framework in conversational systems" by Joyce Y. Chai, Shimei Pan, and Michelle X. Zhou
"A general purpose architecture for intelligent tutoring systems" by Brady Clark, Oliver Lemon, Alexander Gruenstein, Elizabeth Owen Bratt, John Fry, Stanley Peters, Heather Pon-Barry, Karl Schultz, Zack Thomsen-Gray, and Pucktada Treeratpituk
"MIAMM--A multimodal dialogue system using haptics" by Norbert Reithinger, Dirk Fedeler, Ashwani Kumar, Christoph Lauer, Elsa Pecourt, and Laurent Romary
"Adaptive human–computer dialogue" by Sorin Dusan and James Flanagan
"Machine learning approaches to human dialogue modelling" by Yorick Wilks, Nick Webb, Andrea Setzer, Mark Hepple, and Roberta Catizone

Evaluation of Text and Speech Systems
Laila Dybkjær, Holmer Hemsen, and Wolfgang Minker (editors)
(University of Southern Denmark and University of Ulm)
Springer (Text, speech, and language technology series, edited by Nancy Ide and Jean Véronis, volume 37), 2007, xxiii+288 pp; hardbound, ISBN 978-1-4020-5815-4, $149.00

"This book has its point of departure in courses held at the Tenth European Language and Speech Network (ELSNET) Summer School on Language and Speech Communication which took place at NISLab in Odense, Denmark, in July 2002. The topic of the summer school was 'Evaluation and Assessment of Text and Speech Systems.'

"Nine (groups of) lecturers contributed to the summer school with courses on evaluation of a range of important aspects of text and speech systems, including speaker recognition, speech synthesis, talking animated interface agents, part-of-speech tagging and parsing technologies, machine translation, question-answering and information retrieval systems, spoken dialogue systems, language resources, and methods and formats for the representation and annotation of language resources. Eight of these (groups of) lecturers agreed to contribute a chapter to the present book. Since we wanted to keep all the aspects covered by the summer school, an additional author was invited to address the area of speaker recognition and to add speech recognition, which we felt was important to include in the book. Although the point of departure for the book was the ELSNET summer school held in 2002, the decision to make a book was made considerably later. Thus the work on the chapters was only initiated in 2004. First drafts were submitted and reviewed in 2005 and final versions were ready in 2006."--From the editors' preface

The contents of the volume are as follows:
"Speech and speaker recognition evaluation" by Sadaoki Furui
"Evaluation of speech synthesis" by Nick Campbell
"Modelling and evaluating verbal and non-verbal communication in talking animated interface agents" by Björn Granström and David House
"Evaluating part-of-speech tagging and parsing" by Patrick Paroubek
"General principles of user-oriented evaluation" by Margaret King
"An overview of evaluation methods in TREC ad hoc information retrieval and TREC question answering" by Simone Teufel
"Spoken dialogue systems evaluation" by Niels Ole Bernsen, Laila Dybkjær, and Wolfgang Minker
"Linguistic resources, development, and evaluation of text and speech systems" by Christopher Cieri
"Towards international standards for language resources" by Nancy Ide and Laurent Romary

Text Entry Systems: Mobility, Accessibility, Universality
I. Scott MacKenzie and Kumiko Tanaka-Ishii (editors)
(York University, Toronto and University of Tokyo)
San Francisco: Morgan Kaufmann Publishers, 2007, x+332 pp; paperbound, ISBN 978-0-12-373591-1, $49.95, £28.99, €41.95

"Advances in technology have helped create a connected world, and this is none more apparent than in the area of text entry. The growing popularity and success of textual communication has inspired a variety of text entry systems, modalities, and users. There is a text entry system that will meet the needs of all kinds of users--a teen sending an IM with her phone, a businessman checking e-mail on his blackberry, or a visually impaired woman using a Braille keyboard on her PDA. The capabilities and modalities of text entry are widespread and evolving.

"Text Entry Systems is a guidebook on the details and foundations of text entry methods, the effectiveness of current modes of text entry, advances in technology, and the creation of new systems. Authoritative researchers from the fields of HCI, handwriting recognition, speech recognition, computational linguistics, natural language processing, universal access, industrial design, cognitive science, and image processing all provide their expertise. They address the wide reach of this technology, including design for various languages and accommodations for those with special physical conditions, using specific examples and offering solutions."--From the publisher's announcement

Publications Received

Books listed below that are marked with a † have been selected for review in a future issue, and reviewers have been assigned to each.

Authors and publishers who wish their publications to be considered for review in Computational Linguistics should send a copy to the book review editor, Graeme Hirst, Department of Computer Science, University of Toronto, Toronto, Ontario, Canada M5S 3G4. All relevant books received will be listed, but not all can be reviewed. Technical reports (other than dissertations) will not be listed or reviewed. Authors should be aware that some publishers will not send books for review (even when instructed to do so); authors wishing to enquire as to whether their book has been received for review may contact the book review editor.

Readers who wish to be considered as book reviewers for the journal should contact the book review editor, outlining their qualifications, by e-mail at gh@cs.toronto.edu or at the address above.

Words and Intelligence II: Essays in Honor of Yorick Wilks
Khurshid Ahmad, Christopher Brewster, and Mark Stevenson (editors)
(Trinity College, Dublin and University of Sheffield)
Springer (Text, Speech and Language Technology series, edited by Nancy Ide and Jean Véronis, volume 36), 2007, xiv+279 pp; hardbound, ISBN 978-1-4020-5832-5, $139.00

The Categorization of Spatial Entities in Language and Cognition
Michel Aurnague, Maya Hickmann, and Laure Vieu
(CNRS–Université de Toulouse–Le Mirail, CNRS–Université de Paris VIII, and CNRS–Université Paul Sabatier, Toulouse III)
John Benjamins Publishing (Human Cognitive Processing series, edited by Marcelo Dascal et al., volume 20), 2007, viii+371 pp; hardbound, ISBN 978-90-272-2374-6, $144.00, €120.00

Semi-Supervised Learning
Olivier Chapelle, Bernhard Schölkopf, and Alexander Zien
(Max Planck Institute for Biological Cybernetics, Tübingen)
The MIT Press, 2007, xiii+506 pp; hardbound, ISBN 978-0-262-03358-9, $50.00, £32.95

Chomsky's Universal Grammar: An Introduction
V.J. Cook and Mark Newson
(University of Newcastle upon Tyne and Eötvös University)
Blackwell Publishing, 2007, vii+326 pp; hardbound, ISBN 978-1-4051-1186-7, $89.95; paperbound, ISBN 978-1-4051-1187-4, $39.95

Corpus Linguistics 25 Years on
Roberta Facchinetti (editor)
(University of Verona)
Rodopi (Language and Computers: Studies in practical linguistics, edited by Christian Mair et al., volume 62), 2007, v+385 pp; hardbound, ISBN 978-90-420-2195-2, €80.00

Errors and Intelligence in Computer-Assisted Language Learning: Parsers and Pedagogues
Trude Heift and Mathias Schulze
(Simon Fraser University and University of Waterloo)
Routledge (Routledge series in computer-assisted language learning, edited by Carol Chapelle), 2007, xviii+283 pp; hardbound, ISBN 978-0-415-36191-0, $115.00

Word Frequency and Lexical Diffusion
Betty S. Phillips
(Indiana State University)
Palgrave Macmillan (Palgrave studies in language history and language change, edited by Charles Jones), 2007, xiv+252 pp; hardbound, ISBN 978-1-4039-3232-7, $80.00

Prosodic Orientation in English Conversation
Beatrice Szczepek Reed
(University of Nottingham)
Palgrave Macmillan, 2006, xiv+331 pp; hardbound, ISBN 978-0-230-00872-4, $80.00

Language, Discourse, and Social Psychology
Ann Weatherall, Bernadette M. Watson, and Cindy Gallois
(Victoria University of Wellington and University of Queensland)
Palgrave Macmillan (Palgrave advances in linguistics, edited by Christopher N. Candlin), 2007, xvii+309 pp; hardbound, ISBN 978-1-4039-9594-0, $90.00; paperbound, ISBN 978-1-4039-9595-7, $34.95

Contrastive Linguistics: History, philosophy, and methodology
Pan Wenguo and Tham Wai Mun
(East China Normal University and Nanyang Technological University)
Continuum, 2007, xii+287 pp; hardbound, ISBN 978-0-8264-8634-9, £85.00, $150.00

On the Syntactic Composition of Manner and Motion
Maria Luisa Zubizarreta and Eunjeong Oh
(University of Southern California and Korea University)
The MIT Press, 2007, xi+228 pp; paperbound, ISBN 978-0-262-74029-6, $32.00, £19.95