[Wg-test-framework] Minutes - Credativ/Xen 2015-12-17
Attendees:
  Xen: Ian Jackson
  Credativ: Martin Zobel-Helas, Felix Geyer, Yogesh Patel

I think I got everything but please let me know if not.

Tasks in the Marlborough colo, by ticket
----------------------------------------

65869  Cubietruck disks (in ARM crate)

  Now sorted out by Yogesh and Ian Campbell. It appears that simply
  reseating connectors has fixed the problems. Nodes handed back;
  Ian C is running recommissioning flights on some of them.

  ACTION: Credativ: close ticket

(none)  Rack rails for ARM crate

  (This was discussed on IRC, included in these minutes for
  completeness)

  One of the rails (left side, seen from the front) seems not to run
  properly, and Yogesh found a ball bearing ball on the machine below.
  The machine is stable right now (not at risk of collapsing).

  ACTION: Ian Campbell to procure new rails

65871  3 machines suffering from boot order problems

  Have been removed from the rack by Yogesh.

  ACTION: Yogesh to try to deliver to All-net

66150  Colo access list

  This is all sorted out.

  ACTION: Credativ: close ticket

67351  rimava0 failure

  (Also discussed on IRC)

  We discovered that the labels on rimava0 and rimava1 were not
  consistent with the documentation and software config; we swapped
  the labels to avoid changing the software. We also discovered that
  the layout document was not accurate. (More about this later.)

  Mysteriously rimava0 started working again, possibly due to the PSU
  cable being reseated (felt slightly loose, says Yogesh).

  ACTION: Credativ: close ticket

67602  Colo rack inventory

  Ian J asked Yogesh to inventory the physical contents of the rack,
  including the PDU connections, so that we can correct discrepancies
  with our documentation.

  Action (now done): Yogesh to email list to Ian J.
  ACTION: Credativ: close ticket
  ACTION: Ian to commit to our git and maybe fold into existing
          spreadsheet

(none)  Discussion of state of our rack

  Yogesh said he had seen better, but also seen worse.
  He advised that he didn't see the need to spend a lot of time
  redoing and neatening the wiring.

  The serial connectors on rimava[01] had not been screwed in (see
  above), which Yogesh corrected. The others probably aren't screwed
  in either, but we are not going to do that proactively as it
  probably risks more disruption.

(none)  Ticket workflow for colo tickets

  Yogesh asked if he should poll the Xen/Credativ ticket queue to look
  for relevant work. Martin said that Credativ staff in Germany would
  be looking at that queue, so there was no need for Yogesh to poll
  it: relevant tickets would be assigned to Yogesh as necessary.

After the discussion of the Marlborough colo was completed, we excused
Yogesh.

Other itty-bitty bits
---------------------

65860  Password manager

  This is now set up. From the Xen end, only Ian J is currently
  configured as an encryption recipient.

  ACTION: Credativ: close ticket
  ACTION: Ian J to enroll Ian Campbell and Birin Sanchez's PGP keys.

(none)  Next meeting

  14th of January at the same time.

  Action (now done): Martin/Felix to tell Yogesh

  Many people will be away over parts of the Christmas and New Year
  period.

(none)  Ticket system web access

  Credativ report that the ticket system web UI can only grant web
  access to tickets by a particular submitter, which would not be so
  useful. It might be worth moving the ticket queue to a VM in
  Rackspace.

  ACTION: Felix/Martin to investigate

(none)  Report of hours used in support contract

  Ian J has not received this report (but maybe wasn't supposed to, as
  Lars is the contract contact).

  ACTION: Martin to talk to David Brauner (CC'd on contract mails) to
          check the email was sent. If it was sent and Ian J wants a
          copy, or this needs chasing, Ian J can liaise directly with
          David.

(none)  Admin VM has no DNS name

  The primary DNS zone is xenproject.org, in the standard place (in
  /etc) in the VM mail.xenproject.org. The reverse DNS is controlled
  via the RS panel.
  ACTION: Felix/Martin to add a DNS name (and update the reverse DNS)

  We discussed revision control: currently the zonefile is in git by
  virtue of etckeeper. At some point we may want to move it to the
  gitolite in the admin VM. But not right now.

Test colo network access
------------------------

Credativ have not been properly introduced to the test colo service
machines, which ought to be subject to backup and monitoring.

ACTION: Ian J to make sure Credativ have appropriate access, and to
        send an introductory email

Monitoring
----------

We lack individual tracking of which Rackspace VMs are properly set
up.

ACTION: Credativ to create sub-tickets for each machine that they've
        been given access to

The wheezy+ VMs that Credativ have access to now have the monitoring
agent installed. There isn't anything talking to them yet, though.

ACTION: Credativ to set up a new VM for the monitoring daemon and
        cause it to email Credativ

Several of the Rackspace VMs are squeeze. They need to be upgraded
(the monitoring agent is not available in squeeze). We need to
coordinate the downtime with the community users. We mostly have
existing channels for that, which depend on the service and which
Lars (and perhaps Ian J) will be able to advise on.

ACTION: Credativ to consult Lars (CC Ian J) about communicating
        downtime
ACTION: Credativ to then make appropriate plans for upgrading

The Rackspace VMs lack the Rackspace agent. This agent would provide
an improved view in the Rackspace control panel.

ACTION: Credativ to install the Rackspace agent on the VMs.

Not all of the Rackspace VMs have been properly handed over to
Credativ.

ACTION: Ian J to check the machine list and previous emails,
        determine the state of all the remaining VMs, gain access as
        necessary, and hand them over to Credativ (or delete), as
        applicable

The test colo service machines (dom0s and VMs) ought to be subject to
monitoring too. There was discussion of whether this should happen in
the dom0, or the infrastructure VM.
Ian J preferred to use the infrastructure VM. Of course the new
monitoring VM at Rackspace would need to be able to notice if the
colo went dead.

ACTION: Credativ to investigate after Ian J has provided access

Backups
-------

We discussed a variety of possible approaches. Martin suggested that
we could perhaps back up the Rackspace VMs to the colo, and perhaps
vice versa.

The colo contains a number of service hosts (mostly VMs), most of
whose relevant state is configuration rather than data. It also
contains a PostgreSQL database, currently 6Gby, growing at ~3Gb/yr,
which could be streamed using the Postgres replication protocol (also
providing a read-only view for reporting etc.)

ACTION: Credativ to investigate after Ian J has provided access to
        the colo, and make a proposal

Ian.

_______________________________________________
Wg-test-framework mailing list
Wg-test-framework@xxxxxxxxxxxxxxxxxxxx
http://lists.xenproject.org/cgi-bin/mailman/listinfo/wg-test-framework