NeIC Conference 2013: Report on "Data Services and Technologies"



"What business are we in? The changing face of research, service requirements and national responses" (Andrew Treloar)

Jonas' report

Check cable connection...

What business are we really in? Service requirements. Infrastructure responses.

Quote from Theodore Levitt.

Networks do not exist for themselves. We are all in the data business (or: everything is information).

100 TB: the size of all journal literature, for context.

eResearch infrastructure requirements: create/capture, store, describe, identify, register, discover, access, exploit.

Similarities between Australia and the Nordics.

Australian National Data Service (ANDS): an Australian government initiative, part of the national collaborative research infrastructure.

ANDS enables the transformation of data that are unmanaged, disconnected and invisible into structured collections that are managed, connected and findable.

ANDS activities and services:

  • plan: data management planning tools and resources
  • create/capture: 69 data capture projects at 23 universities
  • store: working closely with the national research data storage infrastructure
  • describe: 23 institutional metadata store projects; national vocabulary service
  • identify: DataCite DOIs; registering organisation for Australia
  • register: Registry Interchange Format - Collections and Services (RIF-CS), based on ISO 2146:2010
  • access: enforced by the underlying data store
  • exploit: 25 institutionally focused projects to demonstrate the value of combining data
  • advocate: be the voice of data; work with government and research funders to change settings in favour of data sharing


RDA: the Research Data Alliance is a new international organisation, forming to facilitate specific, short-term efforts in working groups and interest groups.

Conclusion

We are all in the data business. Researchers need data services.


ands.org.au rd-alliance.org

ANDS does not talk directly to users. Question about budget.

ANDS changed its budget model. They are starting to go to the universities and say: we do not have any money, but we have some people who can help and some services that you can adapt.


Michael's report

We're all in the data business.

Arif Jinha 2010, Learned Publishing, doi 10.1087/20100308


eResearch infrastructure requirements:

  • create/capture
  • store
  • describe
  • identify
  • register
  • discover
  • access
  • exploit

ANDS: 30 staff, no storage

andrew.treloar.net

DataCite; Registry Interchange Format - Collections and Services (RIF-CS), based on ISO 2146:2010

RDA

Researchers need data services from their infrastructure providers. A number of services can best be provided on a regional or national level.


"Future e-Infrastructure Requirements for the EISCAT facilities" (Ian McCrea)

Jonas' report

Europe's next-generation radar for upper-atmosphere and geospace studies.

Replaces the dish-based radar system with multiple large phased-array antennas.

Network requirements

Archiving:

  • data warehouse
  • all data at two independent sites at least
  • continuous access even if one site is offline
  • support both european

Data provisioning challenges: data products and formats need to be well defined; much more metadata will be needed.

Take-home messages:

  • EISCAT_3D is very challenging in terms of the data rate coming from distributed sites
  • much of our challenge is to get the data rate down to something we can transport, which needs hierarchical on-site processing
  • we also need resilient multi-site high-volume storage with compute capacity for higher-level data products

Michael's report

Presentation of EISCAT_3D

30 Tb/s 80 PB/day
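The two figures above use different units, so a small conversion helper makes them easier to compare. The sketch below is only illustrative: it assumes "Tb" means terabits and "PB" petabytes, both with decimal (SI) prefixes, which the notes do not state, and the function names are hypothetical. Under those assumptions the two figures do not convert into each other, so they presumably describe different quantities or different points in the processing chain; the notes do not say which.

```python
# Decimal (SI) prefixes; this is an assumption, the notes do not say.
TERA = 10**12
PETA = 10**15
SECONDS_PER_DAY = 24 * 60 * 60


def tbps_to_pb_per_day(rate_tbps: float) -> float:
    """Petabytes accumulated in one day at a sustained rate given in terabits/s."""
    bytes_per_second = rate_tbps * TERA / 8  # 8 bits per byte
    return bytes_per_second * SECONDS_PER_DAY / PETA


def pb_per_day_to_tbps(volume_pb: float) -> float:
    """Sustained rate in terabits/s needed to move a daily volume given in petabytes."""
    bits_per_day = volume_pb * PETA * 8
    return bits_per_day / SECONDS_PER_DAY / TERA


if __name__ == "__main__":
    print(f"{tbps_to_pb_per_day(30):.0f} PB/day at a sustained 30 Tb/s")
    print(f"{pb_per_day_to_tbps(80):.1f} Tb/s sustained for 80 PB/day")
```

Either way, the numbers illustrate why the talk stresses reducing the data rate on site before transport.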

Challenge of in-array data processing

take-home messages:

  • E3D is challenging in terms of data rate
  • much of the challenge is how to bring down the data rate so that it can be transported -> on-site processing
  • also need multi-site high-volume storage, with compute capacity for higher-level data products


"Nordic Storage Opportunities" (Gerd Behrmann)


The dCache guy.

Which one is best, dCache or iRODS? Similar to asking: which is better, Word or Excel?


  • iRODS: indexing, workflows
  • dCache: file sharing service

3 questions:

  • is there a use case for cross-community shared-resource iRODS instances?
  • does it make sense to layer iRODS on top of dCache? Drawback: dCache is distributed storage, but data flows out of one iRODS node
  • are there use cases for shared storage resources for big data?

Erik Lindahl: infrastructure frequently gets stuck as large development projects in the Nordic countries. Does this happen elsewhere?

"A national archive for digital research data" (Andreas Jaunsen)

Jonas' report


NorStore: links publications to the CRIStin service. National archive.

Project area:

  • data storage allocated by application
  • basic resources for processing data
  • aims to provide tighter coupling between compute resources and data services
  • offers an alternative point of access for non-traditional user groups
  • sharing data using access control or public access

Archive:

  • open access to publicly funded research
  • long term: 10 years or more
  • federated authentication (Feide)
  • requires users to describe their data according to a metadata schema
  • aims to use DOI identifiers
  • will use NLOD licence based on CC by default; data should be published without major restrictions
  • web-based user interface
  • non-interactive and community-specific access points will be developed after initial

  • long-term provisioning of and access to data
  • sustainable competence on data
  • user-funded services and cost-effective sustainable operation

Jacko: a single metadata scheme. Is it realistic to use the same one for many fields?

A: they do not delve into specifics.


Michael's report

NorStore (Notur is for computing)

largest user: earth science at 73%; life science second at 13%

1.6 PB disk, 3 PB tape; aim to double capacity every 2 years
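To get a feel for what doubling every two years implies, the sketch below projects the disk and tape figures forward. It assumes smooth exponential growth from the stated starting capacities and that the stated aim is actually met; the function name and the chosen horizons are hypothetical and not taken from the talk.

```python
def projected_capacity_pb(current_pb: float, years: float,
                          doubling_period_years: float = 2.0) -> float:
    """Capacity in PB after `years`, assuming it doubles every `doubling_period_years`."""
    return current_pb * 2 ** (years / doubling_period_years)


if __name__ == "__main__":
    # Starting capacities as given in the notes: 1.6 PB disk, 3 PB tape.
    for label, start_pb in (("disk", 1.6), ("tape", 3.0)):
        for years in (2, 4, 6):
            capacity = projected_capacity_pb(start_pb, years)
            print(f"{label}: {capacity:.1f} PB after {years} years")
```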

project areas + data archive area; some computing attached to project area

Project area:

  • data storage allocation by application
  • Feide-authenticated project management interface (MAS)
  • basic resources for processing data
  • sharing using access control or open access

Archive:

  • public access to publicly funded research
  • 10-year data service
  • federated ID (Feide)
  • requires metadata
  • DOI on the roadmap
  • web-based UI
  • data cannot be modified or deleted (traceability)
  • link data to publication

key messages, opportunities for Nordic collaboration:

  • long-term provisioning and access to data
  • sustainable competence on data
  • user-funded services and cost-effective sustainable operation

Question: many initiatives at national level, but research is international. Maybe collaborate with other countries?

Will not expose iRODS to users; federated AAI, no local accounts.

Jacko: is it realistic to have the same metadata scheme for all disciplines? ...

"EUDAT - Towards a Collaborative Data Infrastructure - A Nordic Perspective?" (Damien Lecarpentier)

Jonas' report

Collaborative data infrastructure: a framework for the future.

Common data services?

EUDAT should investigate the question.


After 18 months:

  • safe replication (data curation and access optimization)
  • data staging
  • simple store
  • metadata catalogue

The Nordics have to join forces to create critical mass.

What role for NeIC?

NeIC can act as a discussion platform, providing a forum for Nordic e-infrastructure stakeholders to meet.


NeIC can act as an integrator by sponsoring joint pilots between Nordic communities and e-infrastructures.

Michael's report


selected services after 18 months:

  • safe replication (data curation and access optimization)
  • data staging (dynamic replication to HPC)
  • simple store
  • metadata catalogue
  • AAI

No direct involvement of Nordic researchers yet. Can NeIC foster the participation of Nordic representatives in EUDAT?

Research in the Nordic countries is efficient and well supported, but each Nordic country is small. Put together, the Nordics can compete with the largest countries.

What role for NeIC?

  • collecting and analyzing RIs' e-infrastructure requirements
  • fostering dialogue and cooperation between RIs

  • can NeIC act as an integrator by sponsoring joint pilots?
  • Nordics stronger together


Panel Discussion

Session Summary

Lessons Learned

Future Directions

Opportunities