Corpus Info

The ACL Anthology Network

The AAN corpus includes three networks: paper citation, author citation, and author collaboration.

The paper citation network (paper-citation-network.txt) is a directed network whose nodes are labeled with paper ids, each of which corresponds to an individual paper (acl-metadata.txt).

The author citation network (author-citation-network.txt), a directed network, is compiled from the paper network and the metadata file. For each citation in the paper network in which paper A cites paper B, an edge is created from each author of paper A to each author of paper B.

The author collaboration network (author-collaboration-network.txt), an undirected network, is composed of authors: for each paper in the paper citation network, an edge is created between every pair of collaborators on that paper.

A small README lists all the data included in this release. The release also includes a few scripts that are useful for creating the networks, computing network statistics, etc. Some of the scripts use Clairlib, which is available for download. The Clair library is a suite of open-source Perl modules intended to simplify a number of generic tasks in natural language processing (NLP), information retrieval (IR), and network analysis (NA).

Verification

For verification purposes, we require that you submit an email address to download. Your email address will not be sold or used for anything other than possible information about updates to the package you download.
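The way the author citation and author collaboration networks are derived from the paper citation network can be illustrated with a minimal Python sketch. It is not one of the scripts shipped with the release and does not read the release files. It assumes the paper citation edges and the paper-to-author mapping are already in memory. The function name build_author_networks, the paper ids, and the author names are hypothetical. Collapsing repeated edges into a set and keeping self-edges are also assumptions, since the description above does not say how the released networks treat them.

```python
from itertools import combinations


def build_author_networks(paper_citations, paper_authors):
    """Derive author-level networks from paper-level data.

    paper_citations: iterable of (citing_id, cited_id) pairs (directed).
    paper_authors:   dict mapping a paper id to a list of author names.
    Returns (author_citations, collaborations) as sets of edges.
    """
    # Directed author citation edges: every author of the citing paper
    # points to every author of the cited paper. Self-edges are kept
    # (assumption: the description does not exclude them).
    author_citations = set()
    for citing, cited in paper_citations:
        for a in paper_authors.get(citing, []):
            for b in paper_authors.get(cited, []):
                author_citations.add((a, b))

    # Undirected collaboration edges: one edge per pair of co-authors of
    # each paper that appears in the paper citation network. Each edge is
    # stored as a sorted tuple so (x, y) and (y, x) are the same edge.
    papers = {p for edge in paper_citations for p in edge}
    collaborations = set()
    for paper in papers:
        authors = sorted(set(paper_authors.get(paper, [])))
        for a, b in combinations(authors, 2):
            collaborations.add((a, b))

    return author_citations, collaborations


# Hypothetical toy data: P2 and P3 both cite P1.
example_citations = [("P2", "P1"), ("P3", "P1")]
example_authors = {
    "P1": ["Alice", "Bob"],
    "P2": ["Carol"],
    "P3": ["Bob", "Dave"],
}
example_author_citations, example_collaborations = build_author_networks(
    example_citations, example_authors
)
```

In the toy data, Carol's paper P2 cites P1, so Carol gains edges to Alice and Bob. Bob co-authored both P3 and P1, so the P3 to P1 citation yields a Bob to Bob self-edge under the assumptions stated above.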