NeIC Conference 2013: Report on "Workshop: Science Gateways"
- two objectives for bioportal 2.0: reproducibility and shareability (file sharing, workflow support, workflow sharing)
- Galaxy setup @ UiO: DB on "external" machine, auth schemes (FEIDE + local accounts), Apache server
- tip 1: don't change Galaxy's DB schema --> doing so might incur high effort to keep up to date with frequent Galaxy releases
- tip 2: adding modules/components is tricky --> do install them as system modules
- customization 1: SSL to DB
- customization 2: run apps on cluster using DRMAA interface to Slurm
- QA
- relation to hyperbrowser (Galaxy based) ? Hyperbrowser is running in a simpler (single system) setup.
- load / sizing of machine ? VM for the portal only lightly loaded; apps running on cluster
- interface to several execution resources ? not considered yet
- modularity ? not so good
- exploring Galaxy for NeIC ? sharing experience, but domain specific installations
- Galaxy community conference, June 30 - July 2, Oslo, http://wiki.galaxyproject.org/Events/GCC2013
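The DRMAA-to-Slurm customization noted above could look roughly like the following sketch. It is not the UiO configuration: the function names, the partition name and the resource values are hypothetical, and it assumes the third-party drmaa Python package together with a Slurm DRMAA library on the submitting host. The set of Slurm options accepted in the native specification depends on the installed DRMAA library, so the options shown are an assumption to be checked locally.

```python
import shlex


def build_native_spec(partition, cpus, mem_mb, minutes):
    """Compose a Slurm option string for DRMAA's nativeSpecification field.

    All parameters are illustrative; a real portal would derive them
    from the tool or workflow step being run.
    """
    if cpus < 1 or mem_mb < 1 or minutes < 1:
        raise ValueError("cpus, mem_mb and minutes must be positive")
    hours, mins = divmod(minutes, 60)
    return " ".join([
        "--partition=" + shlex.quote(partition),
        "--ntasks=1",
        "--cpus-per-task=%d" % cpus,
        "--mem=%d" % mem_mb,  # Slurm reads a bare number as megabytes
        "--time=%02d:%02d:00" % (hours, mins),
    ])


def submit(command, args, native_spec):
    """Submit one job through DRMAA and return its job id.

    The import is deferred because the drmaa package only loads on a
    host where a DRMAA shared library is available (assumption).
    """
    import drmaa

    session = drmaa.Session()
    session.initialize()
    try:
        template = session.createJobTemplate()
        template.remoteCommand = command
        template.args = list(args)
        template.nativeSpecification = native_spec
        try:
            return session.runJob(template)
        finally:
            session.deleteJobTemplate(template)
    finally:
        session.exit()
```

Building the option string separately from the submission keeps the resource policy testable on a machine without cluster access; only submit() needs the cluster.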
"Science Gateways in climate research"
- data gateways / portals: traceability, transparency, comprehensive metadata
- support access with a large variety of client tools (browser, cmd line, ...)
- Earth System Grid Federation: p2p paradigm, federation protocols
- the concepts seem well thought through (and implemented ?), but they were a bit difficult to grasp due to the unconventional presentation style
- QA: essentially no discussion
"Fido - Providing a secure and convenient gateway to packaged HPC jobs"
- design goal: SECURITY (all is evil)
- usability: KISS, constant customer contact, base functionality without login
- claim: Galaxy was not designed to run in a web-open environment
- QA: effort spent ? 6 PM (person-months); funding ? no answer
(Speaker comment: funding question unintentionally dodged due to amnesia, so sorry about that. Short answer is funding was by SRC through SNIC and NSC, but was allocated as the result of an application to SRC by SeRC, so there. Long answer can be pursued at own expense.)
"Science Gateways and their enabling technologies from EGI and SCI-BUS" (Robert Lovas)
- science gateway primer, EGI AppDB
- key attributes of gateways: types of applications, parallelization, workflow execution, processing on different DCIs, scheduling, error handling, process provenance, share data/knowledge
- SCI-BUS develops a Liferay-based framework for constructing science gateways
- QA: workflow language ? based on portlets (???)
Discussion Points
- security
- effort (to develop, to maintain)
- funding
- customizability / modularity
- joint services within NeIC
Session Summary
- different objectives lead to different approaches / emphasis of the implementations
- missing (my personal view ;):
- a precise analysis of requirements / use cases (scope, not in scope, maybe in scope)
- matching the analysis with constraints (environments, resources, timescale, ...)
- convincing selection of technology
- plan to scale (e.g. start simple, how to grow/scale in features and size)
Lessons Learned
- KISS
- Constant Customer Contact = CCC = Chaos Computer Club (sorry couldn't resist ;)
- share experience / benefit from expertise
- Design for scalability, prototype fully featured. There is no point prototyping something that does not do what you want.
- Do security Right, and do it already at the prototype stage, when it is cheap to rework it. You will not be able to do it "later".
Future Directions
- Galaxy community conference, June 30 - July 2, Oslo, http://wiki.galaxyproject.org/Events/GCC2013
- address missing information (see #Session_Summary)
- report on real usage, in terms of roles in actual audience, as well as raw traffic.
- do tutorials (on use, on installations, on customizations). This echoes the sentiments of conference speaker Stephan Oepen from Natural Language Processing in UiO.