NLPL steering group meeting jf 2017-11-09
Time: 2017-11-09
- 9 - 12 CET
Place: Radisson Blu Airport Hotel, OSL, Gardermoen, Norway
Invited:
- Tomasz Malkiewicz, NeIC (PO)
- Joakim Nivre, Uppsala University
- Jörg Tiedemann, University of Helsinki
- Martin Matthiesen, CSC-IT Center for Science Ltd
- Stephan Oepen, University of Oslo
- Anders Søgaard, University of Copenhagen
- Filip Ginter, University of Turku
- Gunnar Bøe, UNINETT Sigma2 AS
- Bjørn Lindi, NeIC (PM)
Present:
- Tomasz Malkiewicz, NeIC (PO)
- Joakim Nivre, Uppsala University
- Jörg Tiedemann, University of Helsinki
- Martin Matthiesen, CSC-IT Center for Science Ltd
- Stephan Oepen, University of Oslo
- Anders Søgaard, University of Copenhagen
- Filip Ginter, University of Turku
- Bjørn Lindi, NeIC (PM)
Absent:
- Gunnar Bøe, UNINETT Sigma2 AS
NLPL-SG 17-35 The Agenda for the SG meeting
- 09:00 NLPL SG 17-35. Attendance and agenda (5’)
- 09:05 NLPL SG 17-36. SG responsibilities according (15')
- 09:20 NLPL SG 17-37. Status of the project - including status of NLPL task forces (30’)
- 09:50 NLPL SG 17-38. A new NLPL partner? Computer Science Department @ IT University Of Copenhagen (10')
- 10:05 NLPL SG 17-39. Review of personnel situation (15’)
- 10:15 NLPL SG 17-40. Status report to NeIC board (10')
- 10:25 Coffee break (5')
- 10:30 NLPL SG 17-41. Mid-term evaluation in 2H/3H 2018 (10')
- 10:40 NLPL SG 17-42. Priorities in 2018 (1h10')
- 11:50 NLPL SG 17-43. Next meetings (5’)
- 11:55 NLPL SG 17-44. AOB (5’)
Agenda items 17-38 and 17-39 were switched.
NLPL SG 17-36. SG responsibilities according
NeIC was presented. NeIC's funding comes from:
- 1/3 projects
- 1/3 Nordforsk
- 1/3 National e-Infrastructure providers
The Steering Group responsibilities according to the [1]
Currently the project is in the execution phase. In 6 months the Steering Group needs to discuss how to continue the project, either by formulating a follow-up project or by ending the project. If the project ends, the Steering Group needs to discuss how project results can be transferred; see item NLPL SG 17-41.
NLPL SG 17-37. Status of the project - including status of NLPL task forces
The table gives the status of the project. All milestones for M12 are in progress.
Milestone | Milestone Description | lead | month 6 | month 12 | month 14 | month 18 |
---|---|---|---|---|---|---|
A1.1 | Project Report Year one | PM | In progress | |||
A2.1 | Setup of collaboration infrastructure | PM | DONE(M2) | |||
A2.2 | Update of collaboration infrastructure | PM | In progress | |||
A3.1 | Trial environment for portable, modular installation | PM | In progress | |||
A3.2 | Survey needs and use of emerging technologies | PM | In progress | |||
A3.3 | Facilitate access to resources at Sigma2 and CSC | PM | In progress | |||
A3.4 | Cost-benefit Analysis of the laboratory | PM | In progress | |||
B1.1 | Install Moses Release 3.0 and support tools | UoH | DONE | |||
B1.2 | Moses Development Environment | UoH | In progress | |||
B1.3 | Moses Documentation and tutorials | UoH | In progress | |||
B2.2 | MT data sets and documentation | UoH | In progress | |||
B3.1 | Helsinki NMT system with documentation | UoH | In progress | |||
C1.1 | Dependency Parsing Data version 1 | UU | In progress | |||
C2.1 | Dependency Parsing Parses version 1 | UU | In progress | |||
C3.1 | Dependency Parsing Parsing tutorial | UU | In progress | |||
D1.1 | Clarification of applicable licensing schemes | UoT | DONE (M3) | |||
D1.2 | Relevant data sets installed with license management infrastructure | UoT | Will not be done | |||
E1 | Pre-trained embeddings for ENG,DAN,FIN,NNO,NOB,SWE | UiO | DONE | |||
F1.1 | Extrinsic Evaluation Data First Batch | UiO(UoC) | DONE | |||
F2.1 | Extrinsic Evaluation Code for First Batch | UiO(UoC) | DONE | |||
G1.1 | Running OPUS Server | UoH | DONE | |||
G1.2 | Mirror OPUS data | UoH | In progress | |||
H1 | Winter School | UiO | due M15 | |||
H2 | Web site | UiO | DONE (M3) | |||
H2.2 | Position paper on NoLaiDa | UiO | DONE (M6) | |||
We have established task forces for Infrastructure and for Outreach. A few more are needed to cover all milestones. The following task forces are suggested:
Task force | Area covered | Status |
---|---|---|
Infrastructure | A Technical Infrastructure | Established |
Parsing | C Dependency Parsing | |
Data | D Large Corpora, E Embeddings | |
Translation | B Machine Translation, G Parallel Corpora and OPUS | |
Outreach | H Community Building | Established |
The project will not use Slack anymore. Teams are free to still use it. Email and Google Hangout/Zoom video meetings are channels for communication.
NLPL SG 17-38. A new NLPL partner? Computer Science Department @ IT University Of Copenhagen
The Machine Learning Research Group at the IT University of Copenhagen is now a partner of the project.
The Person Months spreadsheet has been revised:
- The University of Copenhagen's part is reduced to a total of 0.6, 1.2, and 1.2 in 2017, 2018, and 2019, respectively;
- The IT University will have the following contributions (fractions of a PM): 0.4, 1.2, and 1.2 in 2017, 2018, and 2019, respectively;
- The University of Oslo's share of work in 2017 has been increased: 1.4.
NLPL SG 17-39. Review of personnel situation
The revised Person Month budget is available on the NeIC wiki: PM budget approved
The personnel list is available on the NeIC wiki: Project Personnel
NLPL SG 17-40. Status report to NeIC board
The PM gives a brief status of the project to the NeIC Board by setting "traffic lights" for the project's result goals, cost/resource goals, and time goals. The status is given at each NeIC Board meeting.
The following was reported to the NeIC Board meeting in September: The NLPL Cost/Resource goal is changed from green to yellow: The project partner University of Copenhagen (UoC) has reported that they currently do not have personnel who can contribute to the project. At the earliest, UoC can work on the project from next summer, one year behind schedule.
For the upcoming NeIC Board meeting in December, the PM will report: The NLPL Cost/Resource goal is changed from yellow to green: The IT University (ITU), Copenhagen, Denmark has been included in the project. ITU and UoC will split the work previously agreed upon by UoC. The personnel issue is resolved.
NLPL SG 17-41. Mid-term evaluation in 2H/3H 2018
There will be a mid-term evaluation of the project by the NeIC Board, in either June or September next year. The PM or PO will present the future direction of the project. The Steering Group must work beforehand on a proposal for how to proceed. The next Steering Group F2F should be held before the mid-term evaluation. This is very important since it will impact the continuation of the project; see NLPL SG 17-43.
NLPL SG 17-42. Priorities in 2018
Do a cost-benefit analysis of the NLPL for the current year
- What has the involvement in NLPL cost you?
- What are the immediate benefits, if any, and what are the benefits for 2018?
Group1:
- Stephan: ~2 PMs, experience with a different system (Taito)
- Filip: The project was bureaucracy-heavy at the beginning, but is getting better now. Plus: setting up infrastructure as an end in itself (and not as a side effect)
- Martin: More work than expected, but broader perspective on infrastructure issues as benefit
Group2:
- Access to Taito and Abel is a real benefit.
- Some of the things we do in the project we would have done anyway, but there is some extra work to comply with the project.
- It is a benefit that software/infrastructure are the same on Taito and Abel.
- There is mixed experience with moving from system to system, which has been the case previously.
- OPUS has a more permanent "home": the infrastructure for the OPUS service is better now than before (benefit).
- Having baselines available from a common place is useful.
- Increased visibility for NLP research when applying for compute time.
Do an emerging technology analysis
- What are the technologies you would like to see available/used in the project?
  - GPUs to be investigated further.
  - Exchange of experiences, CPU/GPU comparison (FG: factor 10)
- What are the consequences for the project if the identified technologies are incorporated in the project?
  - Enabling new technologies.
Elaborate the milestones for 2018
These are the milestones due in a year:
Milestone | Milestone description | lead | Usefulness | Risk |
---|---|---|---|---|
A.1.1 | Project Report Year two | PM | 5 | 1 |
B.1.1 | Moses Development environment update | UoH | 3 Useful as baseline, for teaching; Users from Oslo using Taito; but: MT moved on to neural networks | 1 |
B.2.2 | Updated MT data sets and documentation | UoH | ||
B.3.2 | Helsinki NMT system updates | UoH | 5 Important emerging technology | |
B.4.1 | Documented Helsinki NMT Baselines | UoH | 4-5 NMT is new technology | |
B.4.2 | Documented SMT Baselines | UoH | 3-4 | |
C.1.2 | Dependency Parsing version 2 | UU | 5 | 1 Software exists. |
C.2.2 | Dependency Data version 2 | UU | 5 | 1 Data exists. |
C.3.2 | Dependency Parsing tutorial version 2 | UU | 5 | 2 does not yet exist, but funds available. |
D.2.2 | Common-Crawl-derived corpora for at least five languages | UoT | 5 Used by dozens of teams in the CoNLL Shared Task 2017 (100+ registered users) | 1 |
E.2 | Updated Embeddings, including additional languages | UiO | 4 Downloadable, should be made available in Taito/Abel | |
G.3.1 | Web services and their documentation | UoH | 5 | 1 Exists. |
H.1.2 | Winter School ’19 | UiO | 3-4 We know more after WS 18 | 3 Funding, Organization, Visibility, Involvement |
One way to discuss the milestones for 2018 is to identify more specific targets under each milestone, keeping the benefits and the emerging technologies in mind. Each group member then characterises each target by usefulness and risk. Use a score from the set {1,2,3,4,5} for usefulness, where 1 = not so useful and 5 = very useful. Do the same for risk, where 1 = low risk and 5 = high risk. Summarise and average the results, and use them to sort the possible targets by usefulness and/or risk. Discuss the list.
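The scoring procedure described above could be carried out with a few lines of Python. This is only a minimal sketch: the function name `rank_targets`, the input layout (one `(usefulness, risk)` pair per group member), and the tie-breaking rule are assumptions made for illustration and were not agreed at the meeting.

```python
from statistics import mean


def rank_targets(ratings, by="usefulness"):
    """Average per-member scores and sort targets.

    ratings: dict mapping a target name to a list of (usefulness, risk)
    pairs, one pair per group member, each score in {1, 2, 3, 4, 5}.
    by: "usefulness" (highest first) or "risk" (lowest first).
    The layout and tie-breaking are illustrative assumptions.
    """
    summary = []
    for target, pairs in ratings.items():
        for usefulness, risk in pairs:
            # Reject scores outside the agreed 1..5 scale.
            if usefulness not in range(1, 6) or risk not in range(1, 6):
                raise ValueError(f"score out of range for {target!r}")
        summary.append({
            "target": target,
            "usefulness": mean(u for u, _ in pairs),
            "risk": mean(r for _, r in pairs),
        })

    if by == "usefulness":
        # Most useful first; lower risk breaks ties.
        key = lambda s: (-s["usefulness"], s["risk"])
    elif by == "risk":
        # Lowest risk first; higher usefulness breaks ties.
        key = lambda s: (s["risk"], -s["usefulness"])
    else:
        raise ValueError("by must be 'usefulness' or 'risk'")
    return sorted(summary, key=key)


if __name__ == "__main__":
    # Made-up scores from three hypothetical group members.
    example = {
        "target-a": [(5, 1), (4, 2), (5, 1)],
        "target-b": [(3, 3), (4, 3), (3, 4)],
    }
    for row in rank_targets(example, by="usefulness"):
        print(row["target"], round(row["usefulness"], 2), round(row["risk"], 2))
```

Sorting once by usefulness and once by risk gives two lists that can be compared side by side during the discussion.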
The overall conclusion is that the current work plan has milestones with high usefulness, but very low risk. Looking into '18, the project should reach all its targets for next year.
NLPL SG 17-43. Next meetings
- Tuesday, January 30, 13:00–14:30 at Skeikampen, Norway during the Winter School '18
- Tuesday, May 15, 9:00–12:00 @ Arlanda or Uppsala
NLPL SG 17-44. AOB
No items.