Difference between revisions of "2015Q1 Reports: Info Officer"
Latest revision as of 17:35, 18 July 2016
[Link to 2014 Q3 Report] [Link to 2014 Q1 Report] [Link to 2013 Q3 Report] [Link to 2013 Q1 Report]
The Information Officer (IO) portfolio covers the integration of the different ACL-wide activities related to information dissemination, including the Anthology, website, wiki, portal and archive. Plans include integrating logins (through OpenID and OAuth; IN PROGRESS) and updating our information services so that they are current and professionally designed (PLANNED).
The Information Officer neither manages nor carries out the day-to-day operations of the Anthology, ACL Wiki or website, but has the purview to set policy for them. The Anthology is managed separately by Min-Yen Kan and the primary ACL Wiki is managed by Peter Turney. For other information services (other wikis, the website and the portal), the current webmaster is in charge, but needs strong direction from the Information Officer.
Long-term goals include having the costs of the information services sponsored, moving the aclweb.org infrastructure to a more modern webhost (DONE), hiring a new, responsive webmaster (DONE), ensuring the accessibility and long-term maintenance of aclweb.org and other sites, and becoming cost-neutral through sponsorship by corporate interests.
IO Overview
Budget. The IO has a budget to oversee part-time manpower allocated to help improve our association's websites, which includes maintenance, upgrading, migrating and backup. So far, we have incurred costs of 2,940, with less than 1000 additional in committed manpower for outstanding payments to our new webmaster and for our new web host. This is well in line with our projections.
DOIs/CrossRef. We will be assigning digital object identifiers (DOIs) on a trial basis to the conference proceedings of past 2014 workshops, as a run-up pilot to NAACL and ACL. We hope to have a standard operating procedure in place for DOI assignment for our conference proceedings by the end of the year.
Thomson Reuters Web of Knowledge / Elsevier Scopus Indexing: We know a good portion of our membership relies on impact assessment for promotion, ranking and tenure. We are trying to tackle this problem by investigating whether our materials can be indexed by major citation indices, namely Thomson Reuters' Web of Knowledge (WoK) and Elsevier's Scopus. We initiated discussions with WoK in Oct 2014 and have provided materials since then, but they have not been forthcoming with any status or updates in response to our queries. We continue to remind them on a monthly basis, but unfortunately we have not been able to get any response.
Elsevier requires that journals have an ISSN (largely reserved for serials / journals), an issue we are currently investigating.
Note that journal indexing (for both CL and TACL) is the responsibility of the respective journals. Currently, CL is indexed by Elsevier, but TACL is not indexed by either service.
We would appreciate help from members who have successfully approached either indexing service, to help get the agencies to index our conference proceedings.
Anthology
The ACL Anthology is a digital archive of research papers in computational linguistics, sponsored by the CL community, and freely available to all. We employ a Creative Commons Attribution-NonCommercial-ShareAlike license for materials published by ACL. This makes our content usable by the general public with attribution to the ACL (although users are not required to inform us of their use of our materials). Dual licensing for a fee is presumably possible (although not exercised currently). The Information Officer neither manages nor carries out the day-to-day operations of the Anthology, but has the purview to set policy for it.
The Anthology now contains over 34,000 papers (up from 31,500 in the last report in Q3). The new ACL Anthology (http://aclanthology.info) is now active and will become the primary Anthology site around the time of ACL this year, now that we have had some time to sort out problems with the site. However, we know a portion of our membership will still want to use the older version, so we are going to maintain both sites at least until the end of 2015.
One achievement for the Anthology is the semi-updated statistics on its most accessed papers and authors. We hope to automate these statistics and propagate them into the pages for papers and authors, so as to provide additional data that authors can use to argue for their impact. We have preserved the web log data for the new Anthology so that other analytics can be run when interested members of our community wish to use the logs to create better services for us all.
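As a rough illustration of how such access statistics could be automated, the sketch below counts successful requests per paper from web server log lines. It assumes the logs are in Common Log Format and that paper URLs end in an identifier such as `P14-1001`; both are assumptions made for illustration, and the function name and sample lines are hypothetical rather than the Anthology's actual tooling.

```python
import re
from collections import Counter

# Assumption: Common Log Format request field, e.g. "GET /P14-1001.pdf HTTP/1.1",
# followed by the HTTP status code. The paper ID pattern (letter, two-digit year,
# dash, four digits) is also an assumption; adjust it to the real URL scheme.
REQUEST_RE = re.compile(
    r'"GET /(?:[^ "]*/)?([A-Z]\d{2}-\d{4})(?:\.pdf)? HTTP/[\d.]+" (\d{3})'
)


def most_accessed(log_lines, top_n=10):
    """Return the top_n (paper_id, hits) pairs, counting only 200 responses."""
    counts = Counter()
    for line in log_lines:
        match = REQUEST_RE.search(line)
        if match and match.group(2) == "200":
            counts[match.group(1)] += 1
    return counts.most_common(top_n)


# Hypothetical sample lines, for illustration only.
sample_log = [
    '127.0.0.1 - - [01/Mar/2015:10:00:00 +0800] "GET /P14-1001.pdf HTTP/1.1" 200 1234',
    '127.0.0.1 - - [01/Mar/2015:10:00:05 +0800] "GET /P14-1001 HTTP/1.1" 200 512',
    '127.0.0.1 - - [01/Mar/2015:10:00:09 +0800] "GET /W14-0101.pdf HTTP/1.1" 200 999',
    '127.0.0.1 - - [01/Mar/2015:10:00:12 +0800] "GET /P14-9999.pdf HTTP/1.1" 404 0',
    '127.0.0.1 - - [01/Mar/2015:10:00:15 +0800] "GET /robots.txt HTTP/1.1" 200 20',
]

for paper_id, hits in most_accessed(sample_log):
    print(paper_id, hits)
```

A production version would probably also need to filter out crawler traffic and map paper identifiers to authors before any per-author figures are published.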
We hope to interest the ACL community in volunteering time to help the Anthology improve. We will likely discuss this through circulation to the Exec at a later date.
While the new Anthology is live, it is hosted on a university virtual machine in Singapore, which is unlikely to provide adequate bandwidth when faced with full access from the ACL membership and the general public. We are investigating which hosting service to move to; it will likely have to be a VPS account, as we need to install certain software and libraries that usually require root privileges. We hope to complete this migration before the Q3 report.
Finally, we recognize that the ACL Anthology has become a significant asset for the ACL, manifesting its central role in the NLP/CL research communities. It is of too much import to have a single editor be responsible for the policymaking of the Anthology. We hope the Exec will endorse the call for a steering committee to provide the necessary oversight for the Anthology.
Mailing List. The membership of the Anthology mailing list (http://groups.google.com/group/acl-anthology) has grown and now stands at 533 members (up from 469 a year ago, and from 479 at the last report 6 months ago). This is an announcement-only list, where we notify members of newly released materials listed online.
Plans. A key thrust this year will be to start assigning DOIs, as part of the ACL's initiative to take DOIs under our control.
A second thrust concerns other forms of scientific knowledge that we are interested in archiving. These include software, datasets and videos. The procedures for integrating these with START and the submission process need to be worked out, and the space requirements for these services assessed. For the time being, we will concentrate on videos.
A third thrust will be to incorporate the results of the R50 workshop into the Anthology, and to allow third-party applications to automatically annotate articles and papers in the Anthology with new metadata as it becomes available. Such an API will raise the visibility of the Anthology as an object of study, complementing our earlier work to make the Anthology's text a corpus.
We have long-term plans to work on the following issues, which are smaller in scope than the major thrusts above:
- A previous discussion (with Ken Church) proposed that we create a single BibTeX file for all Anthology materials. The beta Anthology can generate such information fairly easily with its database backing; we plan to have this file available soon (before the ACL 2014 conference).
- Creating an XML representation of all of the metadata that is used to create the Anthology site.
- Collaboration with START and aclpub (which may also involve the Conference Officer's work) to integrate users of their system and to obtain LaTeX and abstracts for indexing and preservation.
- Collaboration with ELRA with respect to use of the LRE Map and ISLRNs, and voluntarily helping them with scanning backlog archives into a digital form.
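The single BibTeX file and the XML metadata export mentioned in the list above could both be generated from the same metadata records. The following is a minimal sketch under stated assumptions: the record fields, the sample paper, and the XML element names (`volume`, `paper`, `author`) are all hypothetical and do not reflect the Anthology's actual database schema or XML format.

```python
import xml.etree.ElementTree as ET

# Hypothetical metadata records; the field names are illustrative only.
papers = [
    {
        "id": "X00-0001",
        "title": "An Example Paper",
        "authors": ["Ada Example", "Bob Sample"],
        "booktitle": "Proceedings of an Example Workshop",
        "year": "2000",
    },
]


def to_bibtex(paper):
    """Render one record as an @inproceedings entry (no special-character escaping)."""
    fields = [
        ("title", paper["title"]),
        ("author", " and ".join(paper["authors"])),
        ("booktitle", paper["booktitle"]),
        ("year", paper["year"]),
    ]
    body = ",\n".join(f"  {key} = {{{value}}}" for key, value in fields)
    return f"@inproceedings{{{paper['id']},\n{body}\n}}"


def to_bibtex_file(records):
    """Concatenate all entries into the text of a single .bib file."""
    return "\n\n".join(to_bibtex(p) for p in records) + "\n"


def to_xml(records):
    """Serialize the same records as XML, using assumed element names."""
    root = ET.Element("volume")
    for paper in records:
        node = ET.SubElement(root, "paper", id=paper["id"])
        ET.SubElement(node, "title").text = paper["title"]
        for name in paper["authors"]:
            ET.SubElement(node, "author").text = name
        ET.SubElement(node, "booktitle").text = paper["booktitle"]
        ET.SubElement(node, "year").text = paper["year"]
    return ET.tostring(root, encoding="unicode")


print(to_bibtex_file(papers))
print(to_xml(papers))
```

Generating both outputs from one set of records keeps the BibTeX and XML views consistent with each other; a real export would also need to escape characters that are special in BibTeX.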
Web Site and Hosting Provider
The ACL website continues to serve as the primary online resource for the organization. It contains the main ACL site; an ACL Wiki, which serves as a resource for the general computational linguistics community; an ACL Admin wiki, used to store and maintain ACL-specific resources such as reports, handbooks and policies; and an Exec wiki reserved for the use of ACL Exec officers. We also maintain mirrors of individual ACL conference websites, membership email lists for ACL announcements and a listing of resolutions of the ACL Exec Committee.
Our main website runs on Drupal 7, and our current hosting provider is Bluehost, on their shared Pro hosting (about $20 per month), which has given fairly good response times. We have migrated some mailing lists from Drago's U Michigan address since the last report.
Portal
The ACL Portal was created to provide a web-based platform to house facilities for the benefit of members. The Portal currently serves little function other than maintaining a list of current members and a payment gateway for membership. We are working towards integrating the Portal into the website's functionality, now that both systems run on a common platform (Drupal 7). Integration will involve upgrading the existing custom modules developed for the Portal by Ben Phelan (the previous developer) to Drupal 7; this is still ongoing work. Pranav, our current webmaster, has been working towards these goals since the last report; we are still waiting on him to finish the integration, which is currently stalled.
We still need to manage spam registrations on a weekly basis, as the Portal allows anyone to register an account (and get a webpage listing their profile, which spammers target to obtain an "endorsed" hyperlink from the Portal). An open issue is to lock down new registrations to the Portal in an effort to combat spam.
Agenda
We are now working in parallel on consolidating the ACL Website and the ACL Portal, which is our primary goal for our current webmaster. We also hope to resume our work on establishing a central login for ACL services (something akin to an "ACL Account", a la Google or Facebook). We are planning to use OpenID and OAuth, which would allow members to link their ACL account with other services (e.g., Google, LinkedIn, Twitter, Microsoft/Hotmail), so that one could use login credentials from those services for ACL use.
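To make the planned login linking more concrete, here is a minimal sketch of the first step of an OAuth 2.0 authorization-code flow: building the redirect URL that sends a member to an external provider, and checking the `state` value when the provider redirects back. It assumes the OAuth 2.0 flavour of OAuth; the endpoint, client identifier and callback URL are placeholders, and the function names are hypothetical, not part of any existing ACL service.

```python
import hmac
import secrets
from urllib.parse import parse_qs, urlencode, urlparse


def build_authorization_url(authorize_endpoint, client_id, redirect_uri, scope):
    """Return (url, state); the caller stores state in the member's session."""
    state = secrets.token_urlsafe(16)  # random value to guard against CSRF
    query = urlencode({
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": scope,
        "state": state,
    })
    return f"{authorize_endpoint}?{query}", state


def extract_code(callback_url, expected_state):
    """Validate the returned state and pull out the authorization code."""
    params = parse_qs(urlparse(callback_url).query)
    returned_state = params.get("state", [""])[0]
    if not hmac.compare_digest(returned_state.encode(), expected_state.encode()):
        raise ValueError("state mismatch: possible forged callback")
    codes = params.get("code")
    if not codes:
        raise ValueError("no authorization code in callback")
    return codes[0]


# Placeholder values, for illustration only.
url, state = build_authorization_url(
    "https://provider.example/oauth2/authorize",
    "acl-demo-client",
    "https://acl.example/callback",
    "openid email",
)
print(url)
```

The returned code would then be exchanged for tokens in a server-to-server request, a step omitted here because its details differ between providers.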
The Information Officer is a position linked with one of the At-Large positions on the ACL Executive board. It is up for election and candidates are welcome to contact me for details about the duties.