BMS project proposal workshop minutes

A common theme among the ideas submitted to this workshop is the need to store, process and share sensitive data. Many suggestions also involve alleviating incompatibilities among tools and de facto data standards. Discoverability, searchability and authorized access are also frequently mentioned. Most proposed solutions suggest increasing harmonisation of systems and data (improved standards, increased adoption). Virtualized services are also frequently mentioned as a means to facilitate access to tools and data.

Infrastructure for interpretation of variation data from NGS data, Mauno Vihinen

  • Sequencing is evolving rapidly and we are closing in on the 1000-dollar genome at a fast pace (in fact, exomes can already be obtained well below this price).
  • The bottleneck in utilizing NGS data is currently the interpretation of the variants (i.e. which pathological conditions they cause, and how these can best be treated).
  • This is genomic information, i.e. sensitive data.
  • Computational tools are needed to interpret the effects of variations.
  • We need to create a platform that can produce the best possible prediction and keep pace with sequencing output.
  • This platform would likely be best implemented on a PaaS, with secure, authorized access to sensitive storage.

LIMSaaS, Roxana Merino

  • LIMS is what labs use to enter data about bio samples into a computer.
  • Keeps track of info, metainfo and where to get the actual samples.
  • There are many LIMS (in-house coded < open source < commercial).
  • All are expensive. All are incompatible. The more features you want, the more expensive the LIMS gets.
  • Researchers would benefit from harmonisation.
  • We should provide a LIMS free of charge.
  • Then people would use it, and harmonising and sharing would be enabled.
  • We should also have a GUID (globally unique ID) for bio-resources (sequencing machines, etc.). This would give resource providers recognition. This idea comes from BRIEF.
  • The researcher could do experiment design in the LIMSaaS, put data and results in the LIMS, share through LIMS.
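
The GUID idea above could be sketched as follows. This is only an illustration, not something presented at the workshop: the namespace URL, the function name resource_guid and the provider/resource naming scheme are all assumptions. It uses name-based UUIDs (version 5) so that the same provider and resource always yield the same ID.

```python
import uuid

# Hypothetical namespace for bio-resources; the URL is illustrative only.
BIO_RESOURCE_NAMESPACE = uuid.uuid5(
    uuid.NAMESPACE_URL, "https://example.org/bio-resources"
)


def resource_guid(provider: str, resource_name: str) -> str:
    """Return a deterministic GUID for a resource offered by a provider.

    Names are normalised (trimmed, lower-cased) so that trivial spelling
    differences do not produce different IDs.
    """
    key = f"{provider.strip().lower()}/{resource_name.strip().lower()}"
    return str(uuid.uuid5(BIO_RESOURCE_NAMESPACE, key))


if __name__ == "__main__":
    # Assumed example names, not real registered resources.
    print(resource_guid("Example Lab", "Sequencer 1"))
```

Because the ID is derived from the names, any site could compute it without a central registry; a registry would still be needed if providers or resources can be renamed.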

Q/A:

  • Q (eivind) How to do customization?
  • A Like normal. Install a LIMS and use it.
  • Q (gudmund) What is the Nordic added value?
  • A Standardisation in the Nordics. Availability for researchers. Reduced data loss. More openness.

LIMSaaS continued, Juha Muilo

  • We should develop a computing platform as a service, instantiatable in any Nordic country.
    • Secure: MFA, VPN / lightpath connections, security log review...
    • Scalable
  • Discoverability: expose non-sensitive metadata for public search.
  • SaaS such as LIMSaaS can be built on top of a generic PaaS for BMS.
  • Many services could be built on top of such a PaaS.
  • Step one would be requirements collection for a sensitive data PaaS.
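
The discoverability point above (expose only non-sensitive metadata for public search) might look roughly like this. It is a minimal sketch under stated assumptions: the field names in PUBLIC_FIELDS and the functions public_view and search_public are hypothetical, and a real platform would take its allow-list from the applicable data policy.

```python
from typing import Any

# Hypothetical allow-list of metadata fields considered non-sensitive.
PUBLIC_FIELDS = frozenset(
    {"dataset_id", "title", "data_type", "sample_count", "contact"}
)


def public_view(record: dict[str, Any]) -> dict[str, Any]:
    """Keep only allow-listed fields; everything else stays private."""
    return {key: value for key, value in record.items() if key in PUBLIC_FIELDS}


def search_public(records: list[dict[str, Any]], term: str) -> list[dict[str, Any]]:
    """Case-insensitive title search that only ever returns public views."""
    needle = term.lower()
    hits = []
    for record in records:
        view = public_view(record)
        if needle in str(view.get("title", "")).lower():
            hits.append(view)
    return hits
```

An allow-list is used rather than a block-list so that a newly added field is private by default until someone decides it is safe to publish.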

Tryggve, Tommi Nyrönen

  • A Nordic secure platform for storage and computation on sensitive data.
  • Already at the PB level.
  • Core component: computational storage. 100 TB storage with a software stack that changes every hour.
  • Examples: Galaxy, Chipster...
  • Need virtualisation to support this.
  • Root permissions could be given to trusted partners (DTU, USIT...)
  • Future challenges: AAI, big reference data, networking, tool standards infrastructure...

Q/A:

  • Q (ola) Collaborating with big reference data is a challenge.
  • A A 1 Gb/s lightpath is sufficient for handling the 1000 Genomes Project updates.
  • A (fredrik) Amazon provides this.
  • A (joel) Input file caching has been solved, for example in the ARC middleware (or with a simple HTTP cache).
  • Comment (eivind) There is overlap between proposals. The first step would be to elucidate what the overlaps are, translate them into jobs, and delegate these.

Supply compute power to companies, Emil Rydza

  • DTU is constructing a secure hardware/software solution that can be offered to third parties.
  • Idea is at an early stage.
  • The Nordic perspective is to offer this idea and experience to the other Nordic countries.

Q/A:

  • Q (gard) How can you sell compute funded by tax money to industry?
  • A Dunno just yet.

Certification for data and system maintainers, Dejan Vitlacil

  • Regarding sensitive data, trust is key.
  • Legal issues need to be investigated thoroughly.
  • Data centres should have certification that shows that they are trustworthy, follow certain security protocols, etc.

Q/A:

  • Q (fredrik) This is an obvious thing for NeIC to do. Boring but necessary work.
  • A Yes.
  • Q Is this EUDAT work?
  • A Too big; headache. Nordics is more manageable.

David Silverstein

  • We need more than imaging data to understand the brain; we need meta-analysis, simulation.
  • Different techniques give different types of data (fMRI: spatial resolution, MEG: temporal resolution).
  • Combining data from these sources would improve simulation.
  • There are interoperability issues.
  • Need to store data, results and models.

Q/A:

  • Q (gudmund) What is the Nordic added value?
  • A Increased access to data; models are generally underdetermined.
  • Q (gard) Is this considered sensitive data? In Norway: yes. fMRI images are high-resolution enough to recognize faces.
  • A (ann-charlotte) In Sweden: yes. This falls under Personuppgiftslagen (PUL).
  • A This may be anonymizable through aggregation.
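
As an illustration of the aggregation idea in the last answer, the sketch below publishes only per-group averages and suppresses groups that are too small. It is an assumption-laden example, not a method discussed at the workshop: the function name aggregate_groups and the threshold of 5 are hypothetical, and aggregation alone does not guarantee anonymity in a legal sense.

```python
from collections import defaultdict
from statistics import mean


def aggregate_groups(records, min_group_size=5):
    """Return the mean value per group, dropping groups below the threshold.

    records: iterable of (group_label, numeric_value) pairs.
    min_group_size: assumed cut-off; small groups are suppressed because
    an average over very few subjects can still point to an individual.
    """
    groups = defaultdict(list)
    for group, value in records:
        groups[group].append(value)
    return {
        group: mean(values)
        for group, values in groups.items()
        if len(values) >= min_group_size
    }
```

For example, five measurements in one group and two in another would yield a result for the first group only.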