NACLO 2008 announced: the North American Computational Linguistics Olympiad

Registration is open for the Second Annual North American
Computational Linguistics Olympiad

Please inform high school students in your area of the second annual
North American Computational Linguistics Olympiad Open competition, which
will be held on February 5, 2008. Students may participate at one of the host
sites listed below or in the Internet category. The contest targets high
school students, but middle school students may also participate.

Students can register at:

Top scorers in the Open competition will be eligible to compete in the NACLO
Invitational competition in March 2008. Top scorers in the Invitational
will be eligible to compete in the International Linguistics Olympiad in
Bulgaria in the summer of 2008. Two US teams competed in the International
Linguistics Olympiad in St. Petersburg in 2007 with great
results, achieving the top score in the individual competition and tying for
first place in the team competition.

Brandeis University
Carnegie Mellon University/University of Pittsburgh
Columbia University
Cornell University
Middle Tennessee State University
San Jose State University
University of Michigan
University of Oregon
University of Pennsylvania
University of Toronto
University of Wisconsin/Edgewood College

If you are not listed here, and you would like to host the contest at
your university, contact Lori Levin, lsl at-symbol

In addition, any student may participate in the Internet category by
finding a local high school or university teacher to facilitate the

About Linguistics Olympiads:

The North American Computational Linguistics Olympiad (NACLO) is the
direct descendant of the Olympiad in Linguistics and Mathematics
founded in 1965 in Moscow, Russia. High school students compete by
solving linguistics and logic problems based on natural
languages. This program is credited with introducing thousands of
Russian students to the field of linguistics, many of whom have gone
on to become prominent professional linguists. NACLO includes
traditional Olympiad problems as well as some computational problems.
This is not a competition that deals with computer technology, but
with all aspects of natural language structure and function, including
computational thinking as it relates to natural language processing.

Thank you very much for your help in raising the profile of our
discipline among secondary school students. Please contact any of the
executive team members below if you have any questions or would like
to be involved in some way, including possibly hosting a competition
in your area and/or submitting a problem for future competitions.

Lori Levin – Co-chair
Thomas E. Payne – Co-chair
Dragomir R. Radev – Program chair and team coach

I just won $0.00

I received this by email today. The offer is not valid outside of the US or Puerto Rico :) How many shares of Borders can I get for this much money?


Date: Fri, 2 Nov 2007 20:37:01 -0000
Subject: Dragomir, You’ve Earned $0.00 in Borders Bucks!

Redeem your Borders Bucks now you’ve earned it!

Just print this email and bring it to any Borders, Borders Express, or
Waldenbooks and redeem your $0.00 in Borders Bucks any
time before November 30, 2007.

Borders Rewards Card Number: ***

Cashier must validate Borders Bucks balance at register. Borders Bucks
may be redeemed for eligible purchases at participating stores through
November 30, 2007. Not valid online or outside the United States or Puerto
Rico. Borders Bucks have no cash value and are not applicable to prior

US teams win ILO 2007

Team USA Earns Laurels at International Linguistics Olympiad

American students have won high honors in an international linguistics
competition in St. Petersburg, Russia. The World Champion in the
individual competition is Adam Hesterberg, a 2007 graduate of Garfield
High School, Seattle, WA.

Eight high school students from the USA competed in the Fifth
International Linguistics Olympiad in St. Petersburg, Russia from
August 1 through 4, 2007. The top overall winner in the individual
competition was Adam Hesterberg, of Seattle, WA. Jeffrey Lim of
Arlington, MA received top prize for the best solution to one of the
problems. One US team of four students won the top prize in the team
competition in a tie with a Russian team.

The winners of the team competition were Rebecca Jacobs of Los
Angeles, CA; Joshua Falk of Pittsburgh, PA; Michael Gottlieb of Dobbs
Ferry, NY; and Anna Tchetchetkine of San Jose, CA.

Other American team members were Rachel Zax and Ryan Musa, both of
Ithaca, NY. Rachel Zax is also the top prize winner of the US National
Competition and Ryan Musa is the second prize winner. The US teams
were coached by Dr. Dragomir Radev, of the University of
Michigan. Dr. Lori Levin of Carnegie Mellon University, and Dr. Amy
Troyani of Taylor Allderdice High School, Pittsburgh, PA, also
provided leadership for the teams.

Altogether 16 teams of 4 students each competed, representing 9
different countries — Estonia, Latvia, Bulgaria, Russia, Spain, The
Netherlands, Sweden, Poland and the USA. This is the first time that
teams from the USA have competed in the International Linguistics
Olympiad.
The International Linguistics Olympiad is a yearly event originating
in Russia and Bulgaria in which secondary school students compete by
solving linguistics problems, mostly in languages and writing systems
they have never learned. This year there were problems in Braille,
Turkish, Tatar, Georgian, Movima (Bolivia), Hawaiian and Ndom (Papua
New Guinea). See for more information about the
International Linguistics Olympiad.

The US teams were selected from finalists of the North American
Computational Linguistics Olympiad (NACLO) that took place on March
29, 2007. The US participation was sponsored by the National Science
Foundation, the North American chapter of the Association for
Computational Linguistics, Google, and private contributions from
participants, families and individual contributors.

More information about NACLO can be found at
Contact: Thomas E. Payne.
Co-Chair, North American Computational Linguistics Olympiad

My favorite corpora

Here are my favorite corpora:

Enron email
CIA world factbook
DBLP: papers in CS
US congressional speeches
AOL queries
Netflix recommendations
PUBMED: biomedical paper abstracts
ACL Anthology
DOTGOV: download of .GOV
biocreative: biomedical papers
WT100G: 100GB download of the web
Google n-grams
SMS corpus
corpus of paraphrases
multilingual parallel parliamentary proceedings
textual entailment corpus
question answering corpus
summarization corpus
various text classification corpora (Reuters-21578, 20NG)

The smartest cities in the World (from Forbes)
America’s 10 smartest cities. Forbes
“ranked them based on the percentage of the population age 25 and over
with at least a bachelors degree”.
1. Boulder, CO
2. Bethesda, MD
3. Ann Arbor, MI
4. Cambridge, MA
5. San Francisco, CA
6. Durham, NC
7. Fort Collins-Loveland, CO
8. Washington, DC
9. Bridgeport, Stamford, and Norwalk, CT
10. San Jose, Sunnyvale, and Santa Clara, CA

I guess New York City and Seattle lose on this criterion.

What bibliometric tool will make my life better

I would like to see some tool that will allow me to manipulate paper references in the following way:

- add a paper in pdf format
- add a reference for which a pdf is not available
- search for papers/references
- manually tag papers by topic and importance
- extract custom bib entries for each topic and export to bibtex and html
- papers can belong to multiple categories
- allow manual and group annotations of papers
- unix based with batch mode capabilities
- incorporate access control
- retrieve papers

I have been unable to find something like this. Any hints?
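
One way to make this wish list concrete is a small sketch of the data model such a tool might use. Everything below is hypothetical: the `Paper` class, its fields, and the `search`, `to_bibtex`, and `export_topic` helpers are illustrative names, not part of any existing tool. It covers only tagging, search, and per-topic BibTeX export, and leaves out annotations, access control, and HTML output.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Paper:
    """One reference; pdf_path stays None when no PDF is available."""
    key: str
    title: str
    authors: List[str]
    year: int
    pdf_path: Optional[str] = None
    topics: Set[str] = field(default_factory=set)  # a paper may have several topics
    importance: int = 0  # manual rating; higher means more important


def search(papers, text):
    """Case-insensitive substring search over titles and author names."""
    needle = text.lower()
    return [p for p in papers
            if needle in p.title.lower()
            or any(needle in a.lower() for a in p.authors)]


def to_bibtex(paper):
    """Render one paper as a minimal BibTeX @misc entry."""
    authors = " and ".join(paper.authors)
    return (f"@misc{{{paper.key},\n"
            f"  author = {{{authors}}},\n"
            f"  title = {{{paper.title}}},\n"
            f"  year = {{{paper.year}}}\n"
            f"}}")


def export_topic(papers, topic):
    """BibTeX for every paper tagged with `topic`, most important first."""
    chosen = sorted((p for p in papers if topic in p.topics),
                    key=lambda p: -p.importance)
    return "\n\n".join(to_bibtex(p) for p in chosen)
```

Because the whole thing is a handful of plain functions over a list of records, it would be easy to drive from a Unix command line in batch mode, for example one call to `export_topic` per topic, each writing its own .bib file.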


Lost in Paris

Look at this map of Paris
printed in the Dec. 24 issue of the New York Times.

The Champs-Elysées is shown in the wrong place (what is labeled as
“Champs-Elysées” is actually Avenue de Friedland/Boulevard Haussmann).
The Champs-Elysées is the avenue that links Place de l’Etoile to Place
de la Concorde.

Compare with this map:

or this one:,+france&ll=48.873861,2.294898&spn=0.006007,0.020548&t=h&hl=en

One can only wonder about the reason for this blunder by the New
York Times – perhaps trying to foil the discovery of the Holy Grail :)

Update (Dec. 30) The NYT web site has been updated with a corrected map showing the Champs-Elysées in the right place.