Glenna2/Team-Meeting-2018-02-13


Meeting Feb 13th 2018 at 10:00-11:30 CET

Present: Raymond, Ingemar, Kalle, Jukka Nousiainen, Dan

Channels: Google Hangouts: https://plus.google.com/hangouts/_/g4g5nyl5glc66wqgbqk4yjvb4ua

1. Review of last meeting and News

  • Jukka is a new face.

2. Glenna2 IaaS discussion

  • Using Galera, with a separate location in Bergen just to make sure it stays up
  • Two regions
  • Should we have coordinated security response in the Nordics?

Spectre and Meltdown

  • Kalle: with Meltdown it was really hard to find out what the actual impact is and what the real mitigations are
  • Raymond is in the Glenna channel on NeIC Slack. Could use for

IF: most of the headache came from the fixes they've sent; problems with the latest Ubuntu kernel.

New fix on the way but only for Skylake

RK: problems with Dell firmware upgrades; no problems with the CentOS upgrade.

KH: a lot of work to boot all of our customer images.

RK: found it easier to wait

KH: it took five days to get a response that KVM is not affected

KH: lots of customer communication

IH: not a lot of questions from users (maybe because they are so few)

RK: most of our customers run on local storage. We do not do live migrations

IH: SSC runs Ceph as the backend, which makes it easy to live migrate

KH: have you had problems live migrating VMs?

IH: not often, but sometimes. It has happened that VMs go down due to live migration

KH: we seldom do live migration. We have most things local

IH: It's normal to have live migration but you can live migrate again

JN: how about the OpenStack evacuate command? In Newton live migration seems very stable

IH: for instances that are very active, live migration may be tricky

Nova evacuate on shared storage

RK: problems using OS command line tool.

cPouta object storage story: http://pouta.blog.csc.fi/2018/02/admin-stories-implement-object-storage.html

Live migration and patching in SSC:

http://snic-science-cloud-operators-documentation.readthedocs.io/en/latest/sysadmin_common/

SSC runs Newton clients

RK: we will launch object store

RK: we use adio and run Newton. Started

RK: Puppet 5 is too new, so we upgraded to Puppet 4.

  • most of the challenges did not have to do with Puppet 4.
  • Were using Calico for network

JN: are you using rackstack?

RK: the puppetmaster needs more resources but handles a bigger workload.

IH: SSC is on Puppet 5; not a lot of problems going from 4 to 5. SSC uses Ansible to deploy OpenStack

KH: CSC uses Ansible to configure the base layer and Puppet to set up OpenStack

JN: did you create a greenfield Puppet server?

RK: we nuked the Puppet server and started fresh. Used Foreman for installation.

RK: got Gnocchi running; doing this part first and after that Ceilometer

IH: Ceilometer is very chatty and can easily overload the storage backend

RK: discussing using a separate message queue. Using sensor for monitoring.

IH: upgrade to Queens in late March

Next meeting

  • 6th of March, same time, same place.