Copied the content of the minutes below for reference.
Can't recall if everyone is on the wg-â List. If not, please add
= Attendees =
Xen: Ian Jackson, Lars Kurth
Credativ: Felix Geyer, Michael Sprengel, Yogesh Patel
= Tasks in the Marlborough colo, by ticket =
CLOSED
ACTION: Ian to commit to our git and maybe fold into existing spreadsheet
ACTION: Ian to file a ticket with hard drive info, and then Yogesh can send a price
list
OPEN
ACTION: Ian Campbell to procure new rails
ACTION: Lars to order the HW
Ian: a couple of new tickets were created, but everything is in hand
= Other itty-bitty bits =
CLOSED
ACTION: Credativ to send report
OPEN / NEW
New ticket 65860: Password manager
ACTION: Ian J to enroll Ian Campbell and Birin Sanchez's PGP keys.
Credativ report that the ticket system web UI can only grant
web access to tickets by a particular submitter, which would
not be so useful. It might be worth moving the ticket queue
to a VM in Rackspace.
Nearly done, but not yet closed. The idea is that we have a generic e-mail address.
ACTION: credativ to create an email alias info@xxxxxxxxxxxxxx (add Lars, Ian Jackson).
Password will be sent to that email address
ACTION: Lars to review time report
Note from Lars: this looks sensible, but need to check how much we spend, such
that we can stay within budget
ACTION: Felix will ensure that this is sent at the beginning of each month
ACTION: Lars to send a note to the list that we are planning to kill that bugzilla
and see whether anyone needs the data. Could do a R/O archive view.
ACTION: Ian to follow up ticket 70055 (various VMs that we may not need, also see
below)
= Test colo network access =
OPEN
ACTION: Ian to file a ticket to look at the firewall for the two main hosts in
the COLO as Ian does not have a lot of confidence in it
= Monitoring =
Ian: asks whether he can access password monitoring
Felix: Password is in the password repository
Felix: The monitoring shows a list of services that are up and which ones have problems
ACTION: Ian to check whether this is set up as expected
DONE:
ACTION: Credativ to create sub-tickets for each machine that they've
been given access to and communicate when done to us.
ACTION: Credativ to raise tickets for VMs where it is not obvious what they
are used for and for Ian/Lars to comment. Please add Lars and Ian
to the ticket. - see 70055
ACTION: Credativ to investigate after Ian J has provided access
ACTION: Credativ to set up a new VM for the monitoring daemon and
cause it to email Credativ
ACTION: Install satellite agents into info VM on the COLO and alter
the connection, such that they will talk to the monitoring at
the relevant Rackspace VM.
ACTION: Credativ to consult Lars (CC Ian J) about communicating
downtime
OPEN:
We lack individual tracking of which Rackspace VMs are properly set up.
ACTION: Credativ to add root@xxxxxxxxxxxxxx
Several of the Rackspace VMs are squeeze. They need to be upgraded (the monitoring
agent is not available in squeeze).
This is now planned, but we have come across a number of unexpected issues.
Felix: suggested to delay the upgrades of the other VMs, as we will hit the same
issue as for xenbits.
ACTION: Credativ to contact RAX support to avoid reboot issues we have seen with
xenbits. Propose new schedule
ACTION: Lars to tell people we will delay the upgrades
The Rackspace VMs lack the Rackspace agent. This agent would provide
an improved view in the Rackspace control panel.
IN PROGRESS
ACTION: Credativ to install the Rackspace agent on the VMs.
Started testing it, but have to deploy across all VMs. Only the admin box has
the monitoring agent installed
= Backups =
OPEN - no significant progress. See minutes from last time.
From the last set of minutes (for tracking purposes): ...
We discussed a variety of possible approaches. Martin suggested that
we could perhaps back up the Rackspace VMs to the colo, and perhaps
vice versa.
The colo contains a number of service hosts (mostly VMs) most of whose
relevant state is configuration rather than data. But also a
PostgreSQL database, currently 6Gby, growing at ~3Gby/yr, which could
be streamed using the Postgres replication protocol (also providing a
read-only view for reporting etc.)
ACTION: Credativ to investigate after Ian J has provided access to
the colo, and make a proposal
Martin: not sure whether we have enough diskspace 1-2-1.
Ian: could probably add more diskspace if needed
Martin: it is not clear what the back-up is for
Ian: Suppose the COLO rack catches fire or a power spike
Martin: Not all of the COLO VMs need to be backed up as they look
like development VMs
Ian: At the moment our config management is very poor. The OSSTEST VM
contains a lot of stuff which does not need to be backed up
ACTION: Ian to set out some requirements by email and send to Martin
Ian: There are 4 x 1TB hard disks per machine in the COLO and we seem
to be not using them. There is no usage of DRBD.
ACTION: Credativ to double check