Best practices for big data handling in biobanking workshop 2015

Workshop name: Best practices for big data handling in biobanking workshop 2015
Organizer: Davit Bzhalava, NIASC, KI
Expected output: Recommendations on how to build efficient and sustainable e-infrastructures for biobanking data
When: 9-11 November 2015
Where: Hässelby slot, Stockholm, Sweden

This workshop will bring together Nordic national biobankers and e-infrastructure service providers for sensitive data processing to share their experiences of building and using efficient and sustainable e-infrastructures for biobank research, data management and storage, and of setting up and maintaining the associated ecosystem of workflows, pipelines, and bioinformatics software.

The event is funded by NeIC.

Agenda

Day 1 - From freezers and biospecimen storage to computation and data storage - Nordic biobank needs

(11:30 - 17:00)

  • 11:00-11:30 Arrival
  • 11:30-12:30 Lunch

Session 1: Nordic biobank infrastructures

  • 14:15-14:30 Coffee Break

Session 2: Genomic Data - Bioinformatics analysis pipelines/tools and resources needed (users' perspectives)

Session 3: Large scale computational infrastructures

  • 16:00-17:00 Discussions
  • 18:00 Dinner

Day 2 - Big data infrastructure, emerging technologies and Nordic biobank needs

(09:00 - 17:00)

  • 09:00-09:10 Davit Bzhalava - Welcome, summary of day 1 and goals of day 2
  • 09:10-09:25 Sigurd Gartmann - Introduction to some of today's technologies

Session 1: Emerging technologies

  • 10:25-10:50 Coffee Break
  • 10:50-11:05 Karan Singh - Ceph Storage
  • 11:05-11:30 Discussions - Configurations and deployments (Ansible/Chef/Puppet)
  • 11:30-12:30 Lunch

Session 2: Computer infrastructures and scalable big data infrastructure for biobanks

  • 13:50-14:10 Coffee Break
  • 16:00-17:00 Discussion (including Buying hardware, security, network, agreements)
  • 18:00 Dinner

Day 3 - Integration of biobanks and IT-infrastructure

(09:00 - 12:00)

  • 10:50-11:00 Coffee Break
  • 10:50-11:30 Discussions and planning for the larger workshop (place, dates, topics, participants, type: open/closed etc).
  • 11:30-12:30 Lunch and Departure

Report

The workshop titled "Best practices for big data handling in biobanking" took place in Stockholm on Nov 9-11, 2015. It consisted of presentations and discussion sessions and was attended by 23 people from the following countries:

Country Participants
Sweden 9
Finland 5
Norway 5
Denmark 3
Poland 1

Participants:

Name Institution Country
Ann-Charlotte Sonnhammer it.uu se
Antti Pursula csc fi
Anu Jalanko thl fi
Bartlomiej Wilkowski ssi dk
Bengt Persson icm.uu se
Einar Ryeng ntnu no
Fredrik Tingstedt ki se
Gard Thomassen usit.uio no
Jaakko Leinonen csc fi
Jim Dowling kth se
Joakim Dillner ki se
Joel Hedlund nsc.liu se
Karan Singh csc fi
Kimmo Pääkkönen helsinki fi
Kristian Hveem ntnu no
Mads Melbye ssi dk
Niclas Jareborg bils se
Oddgeir Lingaas Holmen ntnu no
Piotr Bała icm.edu pl
Sigurd Gartmann ntnu no
Suyesh Amatya ki se
Victor Yakimov ssi dk
Zurab Bzhalava ki se

Findings

The workshop found that:

  • This workshop was useful for giving the participants a good overview of needs and challenges for IT solutions for Nordic National Biobanks & National Biobank infrastructures.
  • The community of Nordic national biobanks, national biobank infrastructures and e-infrastructure service providers has many common interests, a good spirit of collaboration and, most importantly, an understanding that there is much to gain from collaborating in these areas of common interest.
  • The challenges in this area vary in scope and complexity, and call for different levels of engagement. They can for example be addressed by:
    • Getting the necessary service directly from one of the Nordic national e-infrastructure providers. In this case, NeIC can assist either by simply putting the parties in contact or by more active collaboration, as necessary.
    • NeIC Tryggve taking them on as pilot use cases. Tryggve pilot use cases are limited in time and are of reasonable size. Costs for resource use in Tryggve pilot use cases are covered by Tryggve.
    • Setting up NeIC collaborative projects, under the NeIC co-funding principles.
  • To provide a road map for closer collaboration between Nordic national biobanks, national biobank infrastructures and e-infrastructure service providers, three working groups were established:
    • Group to formulate a joint use case on IT needs for the Nordic National Biobanks & National Biobank infrastructures. The coordinator of this group will be the leader of one of the BBMRI nodes in the Nordic countries.
    • Group for open source development to optimize the most commonly used genomic data analysis and biobanking software, so that it uses parallel computing hardware efficiently and delivers results in a reasonable time. This working group will be coordinated by Davit Bzhalava.
    • Group on Registromics, to develop a platform and IT solutions that make it possible to connect multiple outcome registries with different biomedical data sources and to conduct reversed data mining. This working group will be coordinated by Bartlomiej Wilkowski.
  • These working groups will be composed of researchers and IT staff, who will together ensure that the use cases are founded in real researcher needs and are technically feasible with the resources available.
  • The progress of these working groups will be followed up at the NIASC annual meeting on January 12 in Copenhagen.
  • NeIC will collaborate with NIASC to organize a larger, open workshop on these topics later in the spring.