NeIC Conference 2013: Report on "Data Services and Technologies"
"What business are we in? The changing face of research, service requirements and national responses" (Andrew Treloar)
Jonas' report
Check cable connection... What business are we really in? Service requirements, infrastructure responses.
Quote from Theodore Levitt.
Networks do not exist for themselves. We are all in the data business. (Or: everything is information.)
100 TB: size of all journal literature, for context.
eResearch infrastructure requirements: create/capture, store, describe, identify, register, discover, access, exploit.
AU/Nordic similarities.
Australian National Data Service (ANDS): an Australian government initiative, part of the National Collaborative Research Infrastructure Strategy (NCRIS).
ANDS enables the transformation of data into structured collections:
- unmanaged to managed
- disconnected to connected
- invisible to findable
ANDS activities and services:
- plan: data management planning tools and resources
- create/capture: 69 data capture projects at 23 universities
- store: working closely with the national research data storage infrastructure
- describe: 23 institutional metadata store projects; national vocabulary service
- identify: DataCite DOIs; registering organisation for AU
- register: Registry Interchange Format - Collections and Services (RIF-CS), based on ISO 2146:2010
- access: enforced by the underlying data store
- exploit: 25 institution-focused projects to demonstrate the value of combining data
- advocate: be the voice of data; work with government and research funders to change settings in favour of data sharing
RDA, the Research Data Alliance, is a new international organisation forming to facilitate specific, short-term efforts. It has working groups and interest groups.
Conclusion
We are all in the data business; researchers need data services.
ands.org.au
rd-alliance.org
ANDS does not talk directly to users. Question about budget.
ANDS is changing the budget model. They are starting to go to the universities. They say: we do not have any money. We have some people who can help. We have some services that you can adapt.
Michael's report
We're all in the data business.
Arif Jinha 2010, Learned Publishing, doi 10.1087/20100308
eResearch infrastructure requirements:
- create/capture
- store
- describe
- identify
- register
- discover
- access
- exploit
ANDS: 30 staff, no storage
andrew.treloar.net
DataCite; Registry Interchange Format (RIF-CS), based on ISO 2146:2010
RDA
Researchers need data services from their infrastructure providers. A number of services can best be provided on a regional or national level.
"Future e-Infrastructure Requirements for the EISCAT facilities" (Ian McCrea)
Jonas' report
Europe's next-generation radar for upper atmosphere and geospace studies.
Replaces the dish-based radar system with multiple large phased-array antennas.
Network requirements
Archiving / data warehouse:
- all data at two independent sites at least
- continuous access even if one site is offline
- support both European ...
Data provisioning challenges:
- data products and formats need to be well defined
- much more metadata will be needed
Take-home messages:
- EISCAT_3D is very challenging in terms of the data rate coming from distributed sites
- much of our challenge is to get the data rate down to something we can transport, which needs hierarchical on-site processing
- we also need resilient multi-site high-volume storage with compute capacity for higher-level data products
Michael's report
presentation of EISCAT_3D
30 Tb/s 80 PB/day
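The notes do not say how the two figures above relate (for example, raw versus reduced rate). As a rough unit check only, and assuming "Tb" means terabits, "PB" means petabytes, and decimal SI prefixes (none of which the notes confirm), the conversion can be sketched as follows. The function names are illustrative, not from the talk.

```python
# Assumptions (not stated in the notes): Tb = terabits, PB = petabytes,
# decimal SI prefixes (1 PB = 1000 TB), and a rate sustained over 24 hours.
BITS_PER_BYTE = 8
SECONDS_PER_DAY = 24 * 60 * 60


def tbps_to_pb_per_day(tbit_per_s: float) -> float:
    """Convert a sustained rate in terabits/s to petabytes/day."""
    tbyte_per_s = tbit_per_s / BITS_PER_BYTE
    return tbyte_per_s * SECONDS_PER_DAY / 1000


def pb_per_day_to_tbps(pb_per_day: float) -> float:
    """Convert a daily volume in petabytes to an average rate in terabits/s."""
    tbyte_per_s = pb_per_day * 1000 / SECONDS_PER_DAY
    return tbyte_per_s * BITS_PER_BYTE


print(tbps_to_pb_per_day(30))            # 324.0
print(round(pb_per_day_to_tbps(80), 1))  # 7.4
```

Under these assumptions the two figures are not the same sustained rate, so they presumably describe different stages of the data flow; the notes do not say which.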
challenge of in-array data processing
take-home messages:
- E3D is challenging in terms of data rate
- much of the challenge is how to bring down the data rate so that it can be transported
-> on-site processing
- also need multi-site high-volume storage, with compute capacity for higher-level data products
"Nordic Storage Opportunities" (Gerd Behrmann)
dCache guy
Which one is best, dCache or iRODS? Similar to asking: which is better, Word or Excel?
iRODS: indexing, workflows
dCache: file sharing service
3 questions:
- is there a use case for cross-community shared-resource iRODS instances?
- does it make sense to layer iRODS on top of dCache?
drawback: dCache is distributed storage, but data would flow out of one iRODS node
- are there use cases for shared storage resources for big data?
Erik Lindahl: infrastructure frequently gets stuck as large development projects in the Nordic countries. Does this happen elsewhere?
"A national archive for digital research data" (Andreas Jaunsen)
Jonas' report
NorStore: link publications to the CRIStin service. National archive.
Project area:
- data storage allocated by application
- basic resources for processing data
- aims to provide tighter coupling between compute resources and data services
- offers an alternative point of access for non-traditional user groups
- sharing data using access control or public access
Archive:
- open access to publicly funded research
- long term: 10 years or more
- federated authentication (Feide)
- requires users to describe their data according to a metadata schema
- aims to use DOIs as identifiers
- will use the NLOD licence, based on CC, by default; data should be published without major restrictions
- web-based user interface
- non-interactive and community-specific access points will be developed after initial ...
Key messages:
- long-term provisioning of and access to data
- sustainable competence on data
- user-funded services and cost-effective, sustainable operation
Jacko: a single metadata scheme. Is it realistic to use the same one for many fields?
A: they do not delve into specifics.
Michael's report
NorStore (Notur is for computing)
largest user: earth science at 73%, life science 2nd at 13%
1.6 PB disk, 3 PB tape; aim to double capacity every 2 years
project areas + data archive area; some computing attached to the project area
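The doubling target implies exponential growth, capacity(t) = capacity(0) * 2^(t / 2) with t in years. A minimal sketch of what that would mean, assuming the doubling applies equally to disk and tape and starts from the figures noted above (the notes state neither assumption, and the function name is illustrative):

```python
def projected_capacity_pb(current_pb: float, years: float,
                          doubling_period_years: float = 2.0) -> float:
    """Capacity after `years`, assuming steady doubling every `doubling_period_years`."""
    return current_pb * 2 ** (years / doubling_period_years)


# Starting points from the notes: 1.6 PB disk, 3 PB tape.
for years in (0, 2, 4, 6):
    disk = projected_capacity_pb(1.6, years)
    tape = projected_capacity_pb(3.0, years)
    print(f"+{years} yr: disk {disk:.1f} PB, tape {tape:.1f} PB")
```

Under these assumptions, disk would reach 6.4 PB and tape 12 PB after four years.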
Project area:
- data storage allocation by application
- Feide-authenticated project management interface (MAS)
- basic resources for processing data
- sharing using access control or open access
Archive:
- public access to publicly funded research
- 10-year data service
- federated ID (Feide)
- requires metadata
- DOI on the roadmap
- web-based UI
- data cannot be modified or deleted (traceability)
- link data to publications
key messages, opportunities for Nordic collaboration:
- long-term provisioning and access to data
- sustainable competence on data
- user-funded services and cost-effective sustainable operation
Question: many initiatives at the national level, but research is international. Maybe collaborate with other countries?
Will not expose iRODS to users. Federated AAI, no local accounts.
Jacko: realistic to have the same metadata scheme for all disciplines? ...
"EUDAT - Towards a Collaborative Data Infrastructure - A Nordic Perspective?" (Damien Lecarpentier)
Jonas' report
Damien Lecarpentier: collaborative data infrastructure, a framework for the future
common data services?
EUDAT should investigate the question.
After 18 months:
- safe replication
- data curation and access optimization
- data staging
- simple store
- metadata catalogue
The Nordics have to join forces to create a critical mass.
What role for NeIC?
NeIC can act as a discussion platform, providing a forum for Nordic e-infrastructure stakeholders to meet.
NeIC can act as an integrator by sponsoring joint pilots between Nordic communities and e-infrastructures.
Michael's report
selected services after 18 months:
- safe replication (data curation and access optimization)
- data staging (dynamic replication to HPC)
- simple store
- metadata catalogue
- AAI
No direct involvement of Nordic researchers yet. Can NeIC foster the participation of Nordic representatives in EUDAT?
Research in the Nordic countries is efficient and well supported, but each Nordic country is small. Put together, the Nordics can compete with the largest countries.
What role for NeIC?
- collecting and analyzing RIs' e-infrastructure requirements
- fostering dialogue and cooperation between RIs
- can NeIC act as an integrator by sponsoring joint pilots?
- Nordics stronger together