Information Engineering

July 11, 2008

URLs

Filed under: Uncategorized — Administrator @ 1:12 pm

http://www.youtube.com/watch?v=y4smlQlWhHM&feature=related
http://www.youtube.com/watch?v=qQ3u3fTG70Q&feature=related
http://www.youtube.com/watch?v=ZKlxyoPNaFI&feature=related
http://www.youtube.com/watch?v=jbdPUiih020&feature=related
http://www.youtube.com/watch?v=PRb8KKyenSY&feature=related
Ennio Morricone clips

http://blog.mlive.com/annarbornews/2008/05/university_of_michigan_to_hire_1.html?goback=%2Ehom
University of Michigan to hire 25 junior faculty

http://www.salon.com/opinion/feature/2008/05/16/270/print.html
How will Barack Obama get to 270?

http://www.salon.com/mwt/feature/2008/05/14/mooney/index.html
Have we fallen behind our parents?

http://www.superliminal.com/cube/cube.htm
http://games.slashdot.org/article.pl?sid=08/05/13/1454243&from=rss
4-d Rubik cube

http://www.mefferts.com/
Puzzle store

http://googleblog.blogspot.com/2008/05/google-translate-adds-10-new-languages.html
http://translate.google.bg/translate_t
Google Translate Adds 10 new languages

http://feeds.feedburner.com/~r/DataMining/~3/288445391/powerset-launch.html
Powerset launches

http://www.timeshighereducation.co.uk/story.asp?sectioncode=26&storycode=401645&c=1
ISI to start indexing conference proceedings?

http://rss.slashdot.org/~r/Slashdot/slashdot/~3/291629842/article.pl
Bletchley Park Facing Financial Ruin

http://www.cnn.com/2008/SHOWBIZ/Movies/05/14/fahrenheit.sequel.ap/index.html
Moore plans ‘Fahrenheit 9/11’ sequel

http://www.nytimes.com/2008/05/06/health/research/06dise.html
Redefining Disease, Genes and All

http://www.hulu.com
http://www.joost.com
Online access to TV shows

http://videolectures.net
(link from Vlado)

http://en.wikipedia.org/wiki/B%C3%BCsingen
Buesingen
http://en.wikipedia.org/wiki/Campione_d%27Italia
Campione d’Italia
http://en.wikipedia.org/wiki/Liechtenstein
Liechtenstein

http://www.google.com/intl/en/press/annc/20080512_friend_connect.html
Google Friend Connect

http://apps.facebook.com/manyeyes
ManyEyes from IBM Research

http://nexus.ludios.net
Nexus

http://www.wikileaks.org/wiki/Wikileaks
Wikileaks

http://www.slate.com/id/2190284/entry/0/
So You Want To Be a Scientologist

May 16, 2008

Recent URLs

Filed under: Uncategorized — Administrator @ 12:14 pm

http://jetlagged.blogs.nytimes.com/2007/12/28/the-airport-security-follies/
The Airport Security Follies

http://www.nytimes.com/2007/12/27/obituaries/notable-obits-2007.html
Notable obits, 2007

http://rss.slashdot.org/~r/Slashdot/slashdot/~3/198541004/article.pl
Humans Evolving 100 Times Faster Than Ever

http://www.nytimes.com/2007/12/17/style/17facebook.html
On Facebook, Scholars Link Up With Data

http://www.cnn.com/2007/US/12/19/btsc.tuchman.roadsideprayer/index.html?iref=topnews
About I-35 :)

http://www.nytimes.com/2007/12/20/nyregion/20columbia.html
http://cityroom.blogs.nytimes.com/2007/12/19/city-council-approves-columbia-expansion-plan/
Columbia Expansion Gets Green Light

http://www.iq.harvard.edu/blog/netgov/2007/12/comments_on_computational_soci.html
Conference on Computational Social Science

http://www.cs.umd.edu/hcil/socialaction/
Social Action software

http://slashdot.org/article.pl?sid=07/12/10/1953206&from=rss
http://www.slate.com/id/2179393/
Yahoo! Answers, A Librarian’s Worst Nightmare

http://blogoscoped.com/archive/2008-01-03-n84.html
Google chains :)

http://fish.blogs.nytimes.com/2007/12/23/bound-for-academic-glory/
Bound For Academic Glory?

http://www.latimes.com/news/opinion/la-oe-taylor3jan03,0,2812372.story?coll=la-opinion-rightrail
A sequel with the same ending (about the writers strike)

http://www.nytimes.com/2008/01/06/books/06cohenintro.html
Borges and the Foreseeable Future

https://networkx.lanl.gov/wiki
High productivity software for complex networks

http://www.iacat.uiuc.edu/news/08/0103Illinois.html
Illinois advanced computing Institute funds first three projects:
Synergistic Research on Parallel Programming for Petascale Applications
Next-Generation Acceleration Systems for Advanced Science and Engineering Applications
Cultural Informatics

http://www.newyorker.com/reporting/2008/01/21/080121fa_fact_collins
Friend Game: Behind the online hoax that led to a girl’s suicide

http://www.cnn.com/2008/SHOWBIZ/Movies/01/21/film.razzies.ap/index.html
The Razzies

http://blog.washingtonpost.com/offbeat/2007/12/2007_idiot_of_the_year_nominee.html
Idiot of the year nominees

http://www.cnn.com/2008/SHOWBIZ/Movies/01/22/oscar.complete.list/index.html
Complete list of Academy Award nominees

http://www.queensbp.org/content_web/map_boundaries.htm
Map of Queens neighborhoods

http://www.tebreitenbach.com/proverbidioms.htm
Really cool!

http://www.economist.com/displaystory.cfm?story_id=10279823
Making a hash of it

http://www.pbs.org/wgbh/pages/frontline/kidsonline/
Growing up online

http://www.cs.purdue.edu/homes/dec/essay.topic.generator.html
Essay Topic Generator

http://www.theonion.com/content/news/man_braves_freezing_weather_to
Man Braves Freezing Weather To Cross Parking Lot

http://www.nytimes.com/2008/01/28/realestate/28comm.html
Extreme commuting

http://www.fastcompany.com/magazine/122/is-the-tipping-point-toast.html
Is the tipping point toast?

http://www.usatoday.com/tech/science/mathscience/2008-01-23-fractions_N.htm
Professor: Fractions should be scrapped

http://www.cnn.com/2008/HEALTH/02/01/double.dipping.ap/index.html
Beware the bowl: Double dipping spreads bacteria

http://www.sciencemag.org/cgi/content/abstract/313/5788/824
An Experimental Study of the Coloring Problem on Human Subject Networks

http://www.wired.com/special_multimedia/2008/ff_secretlife_1602
The secret life of a blog

http://www.askapatient.com/
Don’t ask the doctors; ask their patients

http://www.nytimes.com/2008/02/04/technology/04soft.html
Microsoft Adds Research Lab in East as Others Cut Back

http://www.nytimes.com/2008/01/25/education/25endowments.html
Senate Looking at Endowments as Tuition Rises

http://www.nytimes.com/2008/01/26/business/26prep.html
AGE OF RICHES; Elite Prep Schools, College-Size Endowments

http://www.nysun.com/article/70489
Brearley Tops Survey of Private Schools

http://belobog.si.umich.edu/clair/anthology/aclsearch.cgi
Movies in Development Hell

http://www.winterblast.com
Detroit Winterblast

http://www.stanford.edu/~kdevlin/
Keith Devlin’s home page

http://www.maa.org/news/columns.html
Cool math columns

http://www.maa.org/mathtourist/mathtourist_11_15_07.html
Random Walks to Football Rankings

http://www.cogito.org/Interviews/InterviewsDetail.aspx?ContentID=16901
http://www.cogito.org/Articles/ArticleDetail.aspx?ContentID=14461
Interviews related to ILO 2007

http://icpc.baylor.edu/icpc/finals/default.htm
ICPC 2008 finalists (100 teams)

http://www.nytimes.com/2008/02/12/books/12publ.html
At Harvard, a Proposal to Publish Free on Web

http://www.insidehighered.com/news/2008/02/13/openaccess
Harvard Opts In to ‘Opt Out’ Plan

http://chronicle.com/news/article/3943/harvard-faculty-adopts-open-access-requirement
Harvard Faculty Adopts Open-Access Requirement

http://cs.jhu.edu/~jason/fun/grammar-and-the-sentence
The parsing song

http://www.morphthing.com/random
Morphing images

http://www.brazilianartists.net/home/flags/
Art that makes an impact

http://www.nytimes.com/2008/02/14/books/14dumb.html
Dumb and Dumber: Are Americans Hostile to Knowledge?

http://www.youtube.com/watch?v=zSaYnQD7EpY
Shostakovich - Waltz 2 From Jazz Suite 2

http://www.nytimes.com/2008/02/19/books/19robbe-grillet.html
Alain Robbe-Grillet dies

http://www.lemonde.fr/web/infog/0,47-0@2-651865,54-999097@51-999297,0.html
Most popular social network sites per country and region

http://www.cnn.com/2008/LIVING/wayoflife/02/25/religion.survey.ap/index.html
Survey: Americans switching faiths, dropping out

http://www.cnn.com/2008/SHOWBIZ/Movies/02/24/oscar.complete.list/index.html
List of Oscar winners

http://www.technologyreview.com/Biztech/20223/
Between Friends
Sites like Facebook are proving the value of the “social graph.”

http://www.nytimes.com/2008/02/27/nyregion/27cnd-stonybrook.html
$60 Million Gift for Stony Brook

http://www.edge.org/q2008/q08_index.html
The Edge Annual Question 2008
WHAT HAVE YOU CHANGED YOUR MIND ABOUT? WHY?

http://www.forbes.com/2008/01/11/google-carr-computing-tech-enter-cx_ag_0111computing.html
When Google Grows Up

http://www.rollingstone.com/politics/story/18056504/truth_or_terrorism_the_real_story_behind_five_years_of_high_alerts/print
Truth or Terrorism? The Real Story Behind Five Years of High Alerts
http://www.rollingstone.com/politics/story/18137343/the_fear_factory
The Fear Factory

http://www.nature.com/nature/journal/v451/n7179/full/451639a.html
Computational science: A hard statistical view
Bart Selman

http://www.pcmag.com/article2/0,2704,2256955,00.asp
The state of machine translation

http://www.cnn.com/2008/TECH/03/05/ask.makeover/index.html
Ask.com gets a makeover, lays off 40

http://crookedtimber.org/2008/03/06/no-shirt-no-shoes-no-service/
http://s.wsj.net/article/SB120425031647901841.html?mod=most_viewed_leisure24
http://chronicle.com/review/brainstorm/bauerlein/stop-pushing-yourself
The Ivory Tower Leans Left, but Why?
(+ follow ups)

http://www.fly.faa.gov/flyfaa/usmap.jsp
Real-time flight delays (FAA site)

http://www.pnas.org/cgi/content/abstract/104/45/17599
Novelty and collective attention by Fang Wu and Bernardo A. Huberman

http://www.businessweek.com/table/08/0305_h1b.htm
The biggest users of H-1 visas

http://www.bloomberg.com/apps/news?pid=20601103&sid=apfnctfiPQPk&refer=us
Berkeley Raises $1.1 Billion to Keep Professors From Ivy League

http://www.darpa.mil/body/news/2008/hasc3-13-08.pdf
Future projects to be funded by DARPA

http://science.slashdot.org/article.pl?sid=08/03/14/1425247&from=rss
Physics Journal May Reconsider Wikipedia Ban

http://indiapost.com/article/immigration/2310/
Bill Gates slams H-1B visa cap

http://www.core.edu.au/rankings/Conference%20Ranking%20Main.html
FINAL 2007 Australian Ranking of ICT Conferences

http://spectrum.ieee.org/radio?id=2518
Interview with Arthur C. Clarke

http://www.cnn.com/2008/SHOWBIZ/books/03/19/obit.clarke.ap/index.html
Sci-fi guru Clarke to have secular funeral

http://www.chicagotribune.com/news/nationworld/sns-ap-israel-math-riddle,0,1509032,print.story
Math problem solved after 40 years.

http://www.scivee.tv/
Make your research known (Youtube-like web site for scientific research)

http://www.google.com/coop/cse?cx=017841009789079614384%3Akq2rufyow_0
Search engine for Computational linguistics

http://www.time.com/time/business/article/0,8599,1724522,00.html
Don’t text and walk

http://www.ted.com/index.php/talks/view/id/229
Neuroanatomist Jill Bolte Taylor
had an opportunity few brain
scientists would wish for: One morning, she realized she was having a
massive stroke. As it happened — as she felt her brain functions slip
away one by one, speech, movement, understanding — she studied and
remembered every moment. This is a powerful story about how our brains
define us and connect us to the world and to one another.

http://www.nytimes.com/2008/03/23/arts/design/23ouro.html
Nice Tower! Who’s Your Architect?

http://www.nytimes.com/2008/03/21/arts/design/21atla.html
What Will Be Left of Gehry’s Vision for Brooklyn?

http://rss.slashdot.org/~r/Slashdot/slashdot/~3/257767517/article.pl
http://www.wired.com/techbiz/it/magazine/16-04/bz_curator
Algorithms Are Terrific. But to Search Smarter, Find a Person.

http://www.brijit.com/

http://feeds.wired.com/~r/wired/index/~3/257341014/new_face_recognition

http://tech.slashdot.org/article.pl?sid=08/03/24/1959201&from=rss
http://www.theglobeandmail.com/servlet/story/RTGAM.20080324.wrgoogle24/BNStory/Technology/home
Google’s latest headache

http://yro.slashdot.org/article.pl?sid=08/03/22/1314253&from=rss
Google Patents Detecting, Tracking, Targeting Kids

http://medicine.plosjournals.org/perlserv/?request=get-document&doi=10.1371/journal.pmed.0050071
It’s the Network, Stupid: Why Everything in Medicine Is Connected

http://www.cnn.com/2008/TECH/03/28/cuba.cellphones.ap/index.html
Ordinary Cubans gain access to cell service

http://www.scienceblog.com/cms/who-wins-nsf-graduate-fellowships-15771.html
NSF fellowships

http://grad-schools.usnews.rankingsandreviews.com/grad/com/search/

http://cty.jhu.edu/imagine/PDFs/Linguistics.pdf

http://www.iht.com/articles/2008/03/31/opinion/edbakoy.php
The idiot of the week

http://www.washingtonpost.com/wp-dyn/content/story/2008/04/03/ST2008040303977.html
AP Language, Computer Courses Cut

http://www.cra.org/govaffairs/blog/archives/000668.html
Reports of AP CS’ Demise are Greatly Exaggerated

http://www.maa.org/mathhorizons/
Math Horizons

http://www.nytimes.com/2008/04/09/technology/techspecial/09store.html
In Storing 1s and 0s, the Question Is $

http://stp.clarku.edu/simulations/
Java Simulations for Statistical and Thermal Physics

http://www.google.com/intl/en/help/features.html
Google Web search features

http://spectrum.ieee.org/mar08/6019
People Who Read This Article Also Read…

http://insidehighered.com/news/2008/04/16/minerva
A Pentagon Olive Branch to Academe

http://www.nytimes.com/2008/04/14/business/media/14link.html
He Wrote 200,000 Books (but Computers Did Some of the Work)

http://www.cnn.com/2008/POLITICS/05/02/evangelicals.ap/index.html
‘An Evangelical Manifesto’ criticizes politics of faith

http://www.cs.princeton.edu/%7Echazelle/pubs/algorithm.html
The Algorithm: Idiom of Modern Science (highfalutin essay by Bernard Chazelle)

http://www.cnn.com/2008/TECH/04/24/close.call.ap/index.html
Humans nearly wiped out 70,000 years ago, study says

http://scienceblogs.com/principles/2008/04/advice_for_the_tenure_track.php
Advice for the tenure track

http://rss.slashdot.org/~r/Slashdot/slashdot/~3/277162889/article.pl
http://www.tiobe.com/index.php/content/paperinfo/tpci/index.html
Are C and C++ losing ground?

http://www.cnn.com/2008/US/04/26/atheist.soldier.ap/index.html
Atheist soldier claims harassment

http://news.cs.cmu.edu/Releases/demo/334.html
Carnegie Mellon Algorithm Identifies Top 100 Blogs for News

http://www.phdcomics.com/comics.php?f=821
Negation Field

http://zhongwen.com/
Chinese characters and culture

http://users.fmg.uva.nl/lleydesdorff/jcr06/centrality/index.htm
Centrality measures of 7611 journals

November 25, 2007

New URLs

Filed under: Uncategorized — Administrator @ 8:07 pm

http://HTDAW.livedigital.com/blog/100463
We paid for it already - why should we have to pay for it again?
A Step Forward for Open Access
link from the IP list

http://www.cs.ucsd.edu/users/braghava/systems-topic-generator.html
Easy thesis topics :)

http://www.cmu.edu/news/archive/2007/November/nov19_blogalgorithm.shtml
Carnegie Mellon Algorithm Identifies Top 100 Blogs for News

http://www.cogito.org/Interviews/InterviewsDetail.aspx?ContentID=16901
Adam Hesterberg - ILO 2007 individual winner

http://www.cogito.org/Articles/ArticleDetail.aspx?ContentID=17017
23AndMe Will Decode Your DNA for $1,000

http://edition.cnn.com/2007/TECH/11/14/netflix.prize.ap/
Re: the Netflix contest

http://www.newscientist.com/channel/opinion/careers/mg19626312.200-view-from-the-top-peter-norvig-googles-director-of-research.html
Peter Norvig (article from the New Scientist)

November 14, 2007

NACLO 2008 announced: the North American Computational Linguistics Olympiad

Filed under: Bulgaria, Uncategorized, computer science, language, language technologies — Administrator @ 8:07 pm

Registration is open for the Second Annual North American
Computational Linguistics Olympiad

Please inform high school students in your area of the second annual
North American Computational Linguistics Olympiad Open competition, which
will be held on February 5, 2008. Students may participate at one of the host
sites listed below or in the Internet category. The contest targets high
school students, but middle school students may also participate.

Students can register at: http://www.naclo.cs.cmu.edu.

Top scorers in the Open competition will be eligible to compete in the NACLO
Invitational competition in March, 2008. Top scorers in the Invitational
will be eligible to compete in the International Linguistics Olympiad in
Bulgaria in the summer of 2008. Two US teams competed in the International
Computational Linguistics Olympiad in St. Petersburg in 2007 with great
results, achieving the top score in the individual competition and tying for
first place in the team competition.

Brandeis University
Carnegie Mellon University/University of Pittsburgh
Columbia University
Cornell University
Middle Tennessee State University
San Jose State University
University of Michigan
University of Oregon
University of Pennsylvania
University of Toronto
University of Wisconsin/Edgewood College

If you are not listed here, and you would like to host the contest at
your university, contact Lori Levin, lsl at-symbol cs.cmu.edu.

In addition, any student may participate in the Internet category by
finding a local high school or university teacher to facilitate the
contest.

About Linguistics Olympiads:

The North American Computational Linguistics Olympiad (NACLO) is the
direct descendant of the Olympiad in Linguistics and Mathematics
founded in 1965 in Moscow, Russia. High school students compete by
solving linguistics and logic problems based on natural
languages. This program is credited with introducing thousands of
Russian students to the field of linguistics, many of whom have gone
on to become prominent professional linguists. NACLO includes
traditional Olympiad problems as well as some computational problems.
This is not a competition that deals with computer technology, but
with all aspects of natural language structure and function, including
computational thinking as it relates to natural language processing.

Thank you very much for your help in raising the profile of our
discipline among secondary school students. Please contact any of the
executive team members below if you have any questions or would like
to be involved in some way, including possibly hosting a competition
in your area and/or submitting a problem for future competitions.

Lori Levin - Co-chair
Thomas E. Payne - Co-chair
Dragomir R. Radev - Program chair and team coach

November 2, 2007

I just won $0.00

Filed under: Uncategorized — Administrator @ 6:39 pm

I received this by email today. The offer is not valid outside of the US or Puerto Rico :) How many shares of Borders can I get for this much money?

Drago

Date: Fri, 2 Nov 2007 20:37:01 -0000
Subject: Dragomir, You’ve Earned $0.00 in Borders Bucks!

Redeem your Borders Bucks now you’ve earned it!

Just print this email and bring it to any Borders, Borders Express, or
Waldenbooks and redeem your $0.00 in Borders Bucks any
time before November 30, 2007.

Borders Rewards Card Number: ***

Cashier must validate Borders Bucks balance at register. Borders Bucks
may be redeemed for eligible purchases at participating stores through
November 30, 2007. Not valid online or outside the United States or Puerto
Rico. Borders Bucks have no cash value and are not applicable to prior
purchases.

October 30, 2007

How many ways to say that the Red Sox won

Filed under: Uncategorized, language technologies, social networks — Administrator @ 4:01 pm

From Google News, 99 titles of news stories about the Red Sox winning the World Series for the second time. Here is a network drawn in Pajek using the IDF-weighted cosine similarity between each pair of titles. Two titles are connected if their similarity is above 0.7.

Network
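
The construction described in this post can be sketched in a few lines of Python. This is only an illustration under assumptions: the whitespace tokenizer, the log(N/df) form of IDF, and the function names are mine, and the four titles are made up rather than taken from the 99 Google News titles.

```python
import math
from collections import Counter
from itertools import combinations


def tokenize(title):
    """Lowercase a title and split it on whitespace (a deliberately crude tokenizer)."""
    return title.lower().split()


def idf_weights(docs):
    """Map each word to log(N / df), where df is the number of titles containing it."""
    n = len(docs)
    df = Counter(word for doc in docs for word in set(doc))
    return {word: math.log(n / count) for word, count in df.items()}


def cosine(a, b, idf):
    """IDF-weighted cosine between two token lists (term frequency times IDF)."""
    va = {w: tf * idf[w] for w, tf in Counter(a).items()}
    vb = {w: tf * idf[w] for w, tf in Counter(b).items()}
    dot = sum(weight * vb.get(w, 0.0) for w, weight in va.items())
    norm_a = math.sqrt(sum(x * x for x in va.values()))
    norm_b = math.sqrt(sum(x * x for x in vb.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def similarity_edges(titles, threshold=0.7):
    """Return (i, j, similarity) for every pair of titles above the threshold."""
    docs = [tokenize(t) for t in titles]
    idf = idf_weights(docs)
    edges = []
    for i, j in combinations(range(len(docs)), 2):
        sim = cosine(docs[i], docs[j], idf)
        if sim > threshold:
            edges.append((i, j, sim))
    return edges


# Made-up example titles, not the actual Google News data.
titles = [
    "Red Sox win the World Series",
    "Red Sox win World Series",
    "Rockies lose the World Series",
    "Boston celebrates the title",
]
edges = similarity_edges(titles)
for i, j, sim in edges:
    print(i, j, round(sim, 3))
```

With these four titles only the first two end up connected. The resulting edge list could then be written out in whatever format the drawing tool expects.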

August 7, 2007

US teams win ILO 2007

Filed under: Uncategorized — Administrator @ 11:25 am

Team USA Earns Laurels at International Linguistics Olympiad

American students have won high honors in an international linguistics
competition in St. Petersburg, Russia. The World Champion in the
individual competition is Adam Hesterberg, a 2007 graduate of Garfield
High School, Seattle, WA.

Eight high school students from the USA competed in the Fifth
International Linguistics Olympiad in St. Petersburg, Russia from
August 1 through 4, 2007. The top overall winner in the individual
competition was Adam Hesterberg, of Seattle, WA. Jeffrey Lim of
Arlington, MA received top prize for the best solution to one of the
problems. One US team of four students won the top prize in the team
competition in a tie with a Russian team.

The winners of the team competition were Rebecca Jacobs of Los
Angeles, CA, Joshua Falk of Pittsburgh, PA, Michael Gottlieb of Dobbs
Ferry, NY and Anna Tchetchetkine, of San Jose, CA.

Other American team members were Rachel Zax and Ryan Musa, both of
Ithaca, NY. Rachel Zax is also the top prize winner of the US National
Competition and Ryan Musa is the second prize winner. The US teams
were coached by Dr. Dragomir Radev, of the University of
Michigan. Dr. Lori Levin of Carnegie Mellon University, and Dr. Amy
Troyani of Taylor Allderdice High School, Pittsburgh, PA, also
provided leadership for the teams.

Altogether 16 teams of 4 students each competed, representing 9
different countries — Estonia, Latvia, Bulgaria, Russia, Spain, The
Netherlands, Sweden, Poland and the USA. This is the first time that
teams from the USA have competed in the International Linguistics
Olympiad.

The International Linguistics Olympiad is a yearly event originating
in Russia and Bulgaria in which secondary school students compete by
solving linguistics problems, mostly in languages and writing systems
they have never learned. This year there were problems in Braille,
Turkish, Tatar, Georgian, Movima (Bolivia), Hawaiian and Ndom (Papua
New Guinea). See www.ilolympiad.spb.ru/ for more information about the
International Linguistics Olympiad.

The US teams were selected from finalists of the North American
Computational Linguistics Olympiad (NACLO) that took place on March
29, 2007. The US participation was sponsored by the National Science
Foundation, the North American chapter of the Association for
Computational Linguistics, Google, and private contributions from
participants, families and individual contributors.

More information about NACLO can be found at www.namclo.org.
Contact: Thomas E. Payne. tpayne@uoregon.edu
541-342-6706
Co-Chair, North American Computational Linguistics Olympiad

July 27, 2007

The International Linguistics Olympiad

Filed under: computer science, language — Administrator @ 6:35 pm

The International Linguistics Olympiad starts on Tuesday. I am
leaving on Sunday. The two US teams consist of eight amazingly smart
students.

http://www.ilolympiad.spb.ru/part.html

Some other references:

NAMCLO 2007:
http://namclo.linguistlist.org/

ILO 2007:
http://www.ilolympiad.spb.ru/

May 6, 2007

The North American Linguistics Olympiad

Filed under: Uncategorized — Administrator @ 7:02 pm

Results and problem sets are here:
http://www.namclo.org.

My favorite corpora

Filed under: Uncategorized, computer science, language technologies — Administrator @ 7:01 pm

Here are my favorite corpora:

Enron email
CIA world factbook
DBLP: papers in CS
US congressional speeches
AOL queries
Netflix recommendations
IMDB
PUBMED: biomedical paper abstracts
Wikipedia
ACL Anthology
DOTGOV: download of .GOV
biocreative: biomedical papers
WT100G: 100GB download of the web
Google n-grams
webfreq
SMS corpus
Citeseer
DMOZ
corpus of paraphrases
multilingual parallel parliamentary proceedings
textual entailment corpus
question answering corpus
summarization corpus
various text classification corpora (Reuters-21578, 20NG)
Peekaboom

December 29, 2006

My favorite movies

Filed under: Uncategorized — Administrator @ 8:22 pm

No changes in 2006 to my top 30 list.

Activities of an associate prof.

Filed under: Uncategorized, computer science, higher education — Administrator @ 8:19 pm

I created a list of activities that occupy an associate professor’s 70-hour work week. I am sure that I missed many more items.

http://tangra.si.umich.edu/clair/clair/activities.txt

What bibliometric tool will make my life better

Filed under: Uncategorized — Administrator @ 8:19 pm

I would like to see some tool that will allow me to manipulate paper references in the following way:

- add a paper in pdf format
- add a reference for which a pdf is not available
- search for papers/references
- manually tag papers by topic and importance
- extract custom bib entries for each topic and export to bibtex and html
- papers can belong to multiple categories
- allow manual and group annotations of papers
- unix based with batch mode capabilities
- incorporate access control
- retrieve papers

I have been unable to find something like this. Any hints?

Drago

The smartest cities in the world (from Forbes)

Filed under: Uncategorized, higher education — Administrator @ 8:19 pm

http://www.forbes.com/entrepreneurs/2006/12/14/boulder-education-cities-ent_cx_ee_1215smartcities.html
http://www.forbes.com/entrepreneurs/2006/12/14/boulder-education-cities-ent_cx_ee_1215smartcities_slides.html
America’s 10 smartest cities,
“ranked them based on the percentage of the population age 25 and over
with at least a bachelors degree”.
1. Boulder, CO
2. Bethesda, MD
3. Ann Arbor, MI
4. Cambridge, MA
5. San Francisco, CA
6. Durham, NC
7. Fort Collins-Loveland, CO
8. Washington, DC
9. Bridgeport, Stamford, and Norwalk, CT
10. San Jose, Sunnyvale, and Santa Clara, CA

I guess New York City and Seattle lose on this criterion.

December 26, 2006

Lost in Paris

Filed under: Uncategorized — Administrator @ 11:52 am

Look at this map of Paris
http://travel.nytimes.com/2006/12/24/travel/24hours.html
printed in the Dec. 24 issue of the New York Times.

The Champs-Elysées is shown in the wrong place (what is labeled as “Champs-Elysées” is actually Avenue de Friedland/Boulevard Haussmann). The Champs-Elysées is the avenue that links Place de l’Etoile to Place de la Concorde.

http://www.nytimes.com/imagepages/2006/12/22/travel/escapes/22hour_map.html

Compare with this map:

http://www.frommers.com/images/destinations/maps/jpg-2006/62_thebestofparisin1day.jpg

or this one:

http://maps.google.com/maps?q=paris,+france&ll=48.873861,2.294898&spn=0.006007,0.020548&t=h&hl=en

One can only wonder about the reason for this blunder by the New York Times - perhaps trying to foil the discovery of the Holy Grail :)

Update (Dec. 30) The NYT web site has been updated with a corrected map showing the Champs-Elysées in the right place.

November 23, 2006

A message from the spelling police or “Riding the subway with Verlaine”

Filed under: language — Administrator @ 11:27 pm

NYC subway cars occasionally feature poetry excerpts on the inside walls. Some are great. I was very pleased to see the beginning of Verlaine’s “Autumn Song” (“Chanson d’Automne”). Unfortunately, the spelling police discovered a typo: “saglots” instead of “sanglots”. Here is the full text of this wonderful poem:

Chanson d’Automne

Les sanglots longs
Des violons
De l’automne
Blessent mon coeur
D’une langueur
Monotone.

Tout suffocant
Et blême, quand
Sonne l’heure,
Je me souviens
Des jours anciens
Et je pleure;

Et je m’en vais
Au vent mauvais
Qui m’emporte
Deçà, delà
Pareil à la
Feuille morte.

How to name email attachments

Filed under: higher education — Administrator @ 11:21 pm

Here is a suggestion. If you send a homework assignment or a resume as an attachment, please consider that the person receiving it (an instructor or potential employer) is likely to get such submissions from other people as well. If you name your submission “HW1.tar.gz” or “resume.pdf”, chances are that your recipient will have other files with the same name. It is much better to name your file with some identifying information about yourself, e.g., your name or user id, e.g., “HW1-CS499-johnson.tar.gz” or “Alice.Smith.resume.pdf”.
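
For anyone who packages submissions with a script, the naming suggestion can be automated. The helper below is a hypothetical sketch: the function name, its parameters, and the choice to replace unusual characters with underscores are my assumptions, not an established convention.

```python
import re


def attachment_name(label, sender, extension, course=None):
    """Build a file name such as 'HW1-CS499-johnson.tar.gz'."""
    parts = [label] + ([course] if course else []) + [sender]
    # Replace anything that is not a letter, digit, or dot, since spaces and
    # punctuation in file names can cause trouble for the recipient.
    safe = [re.sub(r"[^A-Za-z0-9.]+", "_", part.strip()) for part in parts]
    return "-".join(safe) + "." + extension.lstrip(".")


print(attachment_name("HW1", "johnson", "tar.gz", course="CS499"))
print(attachment_name("resume", "Alice Smith", ".pdf"))
```

The first call reproduces the homework example above. The second shows that a leading dot on the extension and a space in the sender's name are both handled.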

November 11, 2006

Text compression as proxy for AI

Filed under: computer science, language, language technologies — Administrator @ 1:40 pm

A very interesting challenge:

http://cs.fit.edu/~mmahoney/compression/rationale.html

The goal is to compress Wikipedia losslessly. Intuitively, a semantics-aware compressor would do really well here. The problem is that no one seems to know how to build one. The best entries so far are all string-based (e.g., http://www.compression.ru/ds/).
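
To get a feel for the baseline that any entry has to beat, one can compare the general-purpose compressors in the Python standard library. This is only a sketch: these are ordinary string-based compressors, not the challenge entries, and the sample text is a stand-in for the real Wikipedia data.

```python
import bz2
import lzma
import zlib


def compression_report(data):
    """Return {compressor name: compressed size in bytes} for the given bytes."""
    return {
        "zlib": len(zlib.compress(data, 9)),
        "bz2": len(bz2.compress(data, 9)),
        "lzma": len(lzma.compress(data)),
    }


def round_trips(data):
    """Check that each compressor is lossless on this input."""
    return (
        zlib.decompress(zlib.compress(data, 9)) == data
        and bz2.decompress(bz2.compress(data, 9)) == data
        and lzma.decompress(lzma.compress(data)) == data
    )


# Stand-in text; the challenge itself uses a large Wikipedia extract.
sample = ("The goal is to compress Wikipedia losslessly. " * 200).encode("utf-8")
sizes = compression_report(sample)
print(len(sample), sizes, round_trips(sample))
```

On real text the interesting number is the ratio of compressed to original size. The hope expressed above is that a compressor with some model of meaning could push that ratio well below what these byte-level methods reach.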

EU wants Bulgarians to change the way they speak

Filed under: Bulgaria, language — Administrator @ 1:32 pm

According to http://www.novinite.com/view_news.php?id=72473 and http://www.novinite.com/view_news.php?id=72419, the EU wants the pronunciation of EURO in Bulgarian to be made consistent with the latinized pronunciation (“euro”) instead of the currently adopted “evro”. What’s next? Change Sofia’s spelling to Sophia and Bulgaria’s pronunciation in Bulgarian to “bulgaria”?

October 29, 2006

The ACL wiki

Filed under: language technologies — Administrator @ 7:47 pm

The ACL wiki is now reality. A large portion of the existing ACL Universe will be folded into the Wiki and the “Universe” will likely disappear :)

October 6, 2006

The netflix challenge

Filed under: computer science, language technologies, social networks — Administrator @ 9:29 pm

According to CNET, Netflix is offering $1M if you manage to improve their movie recommendation system.

I hope that many other organizations announce such contests.

More links:

http://hunch.net/?p=231
http://rss.slashdot.org/~r/Slashdot/slashdot/~3/31168783/article.pl
http://feeds.feedburner.com/~r/oreilly/radar/rss10/~3/31208774/netflixs_personalization_conte_1.html

September 27, 2006

Information Extraction for DHS

Filed under: computer science, language technologies — Administrator @ 8:24 pm

Slashdot has a story about a new project for information extraction for homeland security:

http://it.slashdot.org/article.pl?sid=06/09/25/0111231&from=rss

It further links to these two sites:

http://www.eurekalert.org/pub_releases/2006-09/cuns-sfa092206.php
http://blogs.zdnet.com/emergingtech/?p=364

September 24, 2006

Microsoft wants to patent verb conjugation

Filed under: language, language technologies — Administrator @ 6:44 pm

From the corpora list and from Slashdot:

http://rss.slashdot.org/~r/Slashdot/slashdot/~3/19728499/article.pl

I know of papers on automatic verb conjugation that are 15+ years old. I am not sure what Microsoft is trying to accomplish here.

September 18, 2006

List of topics related to work in Clair

Filed under: Uncategorized, computer science, higher education, language technologies — Administrator @ 11:03 pm

I have tried to prepare a list of topics that members of CLAIR will find useful. Any comments are welcome.

September 2, 2006

List of skills for NLP/IR PhD students

Filed under: Uncategorized, computer science, higher education, language technologies — Administrator @ 5:58 pm

I decided to compile a list of skills that can be used to gauge progress in one’s research career in NLP/IR.

Here is what I figured out:

http://tangra.si.umich.edu/clair/PHD-LIST.

Any comments?

Drago

New large web corpora available

Filed under: Uncategorized, language technologies — Administrator @ 5:38 pm

Finally some useful corpora from the big search companies.

http://googleresearch.blogspot.com/2006/08/all-our-n-gram-are-belong-to-you.html
N-gram corpus from Google

http://www.aolsearchdatabase.com
query logs from AOL (controversial). See also http://www.ugcs.caltech.edu/~dangelo/aol-search-query-logs/

List of NLP evaluations

Filed under: Uncategorized, language technologies — Administrator @ 10:51 am

I had to quickly compile a list of existing NLP evaluations. Each of these includes a standardized task description, a corpus, and evaluation software.

NP bracketing http://www.cnts.ua.ac.be/conll99/npb/
Chunking http://www.cnts.ua.ac.be/conll2000/chunking/
Clause identification http://www.cnts.ua.ac.be/conll2001/clauses/
NER http://www.cnts.ua.ac.be/conll2002/ner/
Semantic roles http://www.lsi.upc.edu/~srlconll/st04/st04.html
Dependency parsing http://nextens.uvt.nl/~conll/
Summarization http://duc.nist.gov
PP attachment
Parsing
MT http://www.nist.gov/speech/tests/mt/
WSD http://nextens.uvt.nl/~conll/
IE in biology http://biocreative.sourceforge.net/
Entailment http://www.pascal-network.org/Challenges/RTE2/
QA http://trec.nist.gov

There are many other tasks, e.g., the KDD cup.

The new CSE building at the University of Michigan

Filed under: Uncategorized, administrative, computer science, higher education — Administrator @ 10:51 am

The new CSE building at UM:

http://www.mlive.com/news/aanews/index.ssf?/base/news-18/115425421627800.xml&coll=2

Graph-based methods for NLP (and IR)

Filed under: Uncategorized, computer science, language technologies — Administrator @ 10:51 am

Rada Mihalcea and I recently organized a tutorial and a workshop on Graph-based methods for NLP (and IR) at HLT-NAACL 2006 in Brooklyn.

Google to open lab in Ann Arbor

Filed under: Uncategorized, computer science — Administrator @ 10:34 am

Google has decided to open a lab in Ann Arbor. Two of the main directions of work will be targeted ads and library scanning.

http://www.nytimes.com/2006/07/11/technology/11google.html
http://www.mlive.com/newsflash/michigan/index.ssf?/base/news-35/115259956475940.xml&storylist=newsmichigan
http://www.mlive.com/newsflash/michigan/index.ssf?/base/business-9/1152621861309560.xml&storylist=newsmichigan
http://www.freep.com/apps/pbcs.dll/article?AID=/20060710/NEWS99/307100004/1122
http://www.mlive.com/news/aanews/index.ssf?/base/news-18/11527152409090.xml&coll=2
http://www.freep.com/apps/pbcs.dll/article?AID=2006607120342
http://www.mlive.com/news/aanews/index.ssf?/base/news-18/115444325633730.xml&coll=2

August 2, 2006

Web courses related to my research interests

Filed under: computer science, language technologies — Administrator @ 4:33 pm

I have collected a list of course web pages that are relevant to CLAIR.

My goal was to list courses that:

(1) are taught by some of the best people in the respective areas
(2) make their reading lists and notes publicly available
(3) cover the state of the art in topics relevant to CLAIR

Here is the result:

http://tangra.si.umich.edu/clair/clair/courses.html

Please send me suggestions for other sites to add. I am particularly
looking for more good courses on Machine Translation, Statistical NLP,
Text Mining, Information Retrieval, Biological NLP, and Graph/Network
Analysis.

April 28, 2006

Tenure discussions

Filed under: higher education — Administrator @ 7:04 pm

U. Michigan is planning to switch to a tenure cycle with a 10-year cap (instead of the current and more standard 6-7 year cycle).

http://www.provost.umich.edu/reports/flexible_tenure/contents.html
http://insidehighered.com/news/2006/02/28/michigan
http://www.detnews.com/apps/pbcs.dll/article?AID=/20060226/SCHOOLS/602260344/1026

Under the proposal, each department may end up having its own rules about the tenure process, including whether to allow a second try if the first one fails and whether to allow tenure-track faculty to go up early.

Nothing is decided yet. Watch this space for updates.

This reminds me to post the first entry in my list of “Ten most useful blogs”. It is a free and very informative site on higher education:

http://insidehighered.com/

Here is an interesting recent link about tenure from this site:

http://insidehighered.com/news/2006/04/25/tenure

More blogs from my top ten will follow (in no particular order) in future postings.

Resuming the blog

Filed under: administrative — Administrator @ 7:01 pm

After a long hiatus, I am planning to restart the blog.

I-LIST and DR-LIST have been very successful. It turns out that I sent out 235 messages to I-LIST and 239 messages to DR-LIST since the last posting in this blog. Sending email from elm is much easier for me than logging on to the blog server and editing html pages. At the same time, maintaining a blog presence is also quite important to reach out to new audiences.

D.

June 6, 2005

The Machine Translation evaluation

Filed under: computer science, language technologies — Administrator @ 7:05 pm

Wow! Less than a year after Google hired Franz Och, they seem to be doing really well in Machine Translation:
http://www.csmonitor.com/2005/0602/p13s02-stct.html

June 1, 2005

PhD institutions of faculty in the top 10 CS departments

Filed under: computer science — Administrator @ 2:47 pm

An interesting survey about the PhD institutions of CS faculty in the top 10 CS departments.

This particular page shows that out of 138 assistant and associate
profs. at the top 11 CS depts (with known PhDs in CS), 82 hail from
the top five: Berkeley, MIT, Stanford, CMU, and Cornell.

May 26, 2005

Physicists, sociologists, and linguists

Filed under: social networks — Administrator @ 8:44 pm

Eszter Hargittai recently wrote about “Isolated Social Networkers” in
her blog.

Her claim (inspired by some earlier discussions on the INSNA SOCNET
mailing list) is that physicists working on social networking problems
rarely cite the relevant prior work in sociology. She includes a
diagram by Lin Freeman that supports this claim in a graphical form.

I am personally of the opinion that both sides of the picture have
contributed significantly to the field and should not be calling each
other names, but that’s not the point of my posting. Reading Eszter’s
story, I couldn’t help remembering a discussion from a few years ago
between a group of physicists in Italy (Benedetto et al.) and Joshua
Goodman (a computer scientist at Microsoft Research).

Benedetto et al. had published a paper (“Language Trees and Zipping”) in a good Physics journal
(Physical Review Letters) in which they showed a compression-based
method for identifying patterns in text and other sequences.
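
To make the idea concrete, here is a minimal sketch of a compression-based comparison. It is not the procedure from the paper: it assumes zlib as the compressor, and the function names and the two tiny reference texts are made up for illustration. The intuition is that a sample compresses better when appended to a reference text that resembles it, so the label with the smallest extra cost wins.

```python
import zlib
from typing import Mapping


def compressed_size(text: str) -> int:
    """Number of bytes zlib needs to store `text`."""
    return len(zlib.compress(text.encode("utf-8"), 9))


def extra_cost(reference: str, sample: str) -> int:
    """Additional compressed bytes needed when `sample` is appended to `reference`."""
    return compressed_size(reference + " " + sample) - compressed_size(reference)


def closest_label(references: Mapping[str, str], sample: str) -> str:
    """Label of the reference text after which `sample` compresses best."""
    return min(references, key=lambda label: extra_cost(references[label], sample))


# Tiny made-up reference texts; a real experiment would use much larger ones.
REFERENCES = {
    "english": "the quick brown fox jumps over the lazy dog and the cat sits "
               "on the mat while the sun is shining over the hills",
    "italian": "il gatto salta sopra il cane pigro mentre il sole splende "
               "e la volpe veloce corre nel bosco vicino al fiume",
}

if __name__ == "__main__":
    sample = "the cat jumps over the lazy dog while the sun is shining"
    print(closest_label(REFERENCES, sample))
```

With references this short the result is only a toy demonstration; a comparison like Goodman's would need real corpora and a proper baseline.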

According to Goodman:

“I first point out the inappropriateness of publishing a Letter
unrelated to physics. Next, I give experimental results showing that
the technique used in the Letter is 3 times worse and 17 times
slower than a simple baseline, Naive Bayes. And finally, I review
the literature, showing that the ideas of the Letter are not
novel. I conclude by suggesting that Physical Review Letters should
not publish Letters unrelated to physics.”

Benedetto et al.’s rebuttal appeared on arXiv.org.
