Monday, 29 August 2016

Flying visit to ISC 16

Earlier this year I made a flying visit to ISC in Frankfurt. And I mean a flying visit: I flew out of Birmingham at six-something, landing in Frankfurt before I'd normally be at work. For some mad reason (OK, family commitments), I needed to be back the same day, so I ended up getting home just before midnight.

Arriving at the conference centre, I was met by the Lenovo guys and collected my badge. There was a little officiousness getting in as I didn't have a badge, yet the collection point was inside the area I needed a badge for! They then didn't want to let me onto the show floor, but the Lenovo guys swiftly got past that and I spent some great time talking with a few of the Lenovo engineers. It was great to finally meet Vinod, who works on the design of the water cooled systems, and great to see the sample system with a production ConnectX-4 card (unlike our home-brew versions). We talked over changes for the Broadwell design and also kicked around some ideas for future developments. It's great to see companies like Lenovo sending their design engineers out to the shows and not just marketing people. The engineers really want to meet customers and listen to what they are doing with the systems.

My agenda was pretty packed with exec meetings, people like Michael Kagan, the Mellanox CTO, and one of the Lenovo VPs. I also popped into the Women in HPC event briefly before heading into a planning meeting for the Spectrum Scale user group to plan for the end of 2016 and into 2017.

I also caught up with Intel. John mentioned that by the end of the week they are seeing the same faces, so having only flown in for the day, it was great to see a fresh face. (Though I'm not sure how fresh-faced I was, having been up at silly o'clock!)

I also had a catch-up with Ross Keeping, whom I knew from the IBM Manchester labs but who is now with ARM. It was great to catch up and find out what ARM is doing to tool the ecosystem for HPC and enterprise computing. There's some interesting stuff coming down the line there!

I finally rounded the day off with a beer with the Mellanox and DDN guys before heading out for dinner with DDN. Overall the day was tiring, but worthwhile for making and maintaining some great contacts. My impression was that the show area was smaller than I expected, but maybe I'm just swayed by the sheer scale of SC in the USA!

It was certainly mad in travel terms! Thanks must go to OCF, Lenovo and DDN for supporting my flying visit out to ISC 16!

This is why we choose Mellanox InfiniBand

A short while back, Mellanox asked us to make a short video on why we use Mellanox InfiniBand. The process of making the video was interesting; it caused a little disruption in our data centre as we had to work out how to switch the lights out to get some of the shots (not easy when they are motion-sensor activated!). But it looks good, and I'm amazed how fast they turned the shoot around into a fully edited, subtitled and approved video. Just goes to show they don't just make fast networks!

If you were at ISC 2016 in Frankfurt, you may have seen the video playing on the Mellanox giant video wall about every 8 minutes!

When they asked, we were more than happy to make a video with them. I like their technology, and I like the ConnectX-4 card as it has a lot of acceleration features for the private cloud solutions we are delivering!

Projects projects, so many projects!

Things have been a bit quiet on my blog for the past few months, there's been a lot going on behind the scenes getting a number of projects up and running.

I'm hoping to post in more detail on these in the next few weeks, but they include implementing flash for Spectrum Scale metadata on FlashSystem 900 (which was easy BTW!), and deploying a couple of large IBM TS4500 libraries, and I do mean a couple - fitting them into one of our data centres was a challenge though. Being the first UK site to implement Lenovo's warm water cooled systems (phase 2 has just gone in, and I think we may well be the world's first direct cooled Broadwell system!) has been fun, challenging at times, and coupled with rolling out Switch-IB 2 from Mellanox has kept us on our toes! A couple of DDN storage arrays have gone in to support our life-sciences projects, and we've been building a new private cloud, designed from the ground up for research requirements.

I've also had a couple of film crews in for various bits and pieces!

But life would be dull if we didn't have challenges and interesting things to do, and it's great that we are able to work with some of the world's best technology providers and that they want to work with us! I think it shows how ambitious we are trying to be in Research Computing at Birmingham, pushing hard to provide outstanding facilities to support our research community!

Thursday, 10 March 2016

Connecting data intensive instruments ... enter the Brocade VDX fabric

I promise not to mention Spectrum Scale (much) in this post. We have a big chunk of Scale storage for research data, coupled to our compute resources for researchers, but we also have a lot of data intensive instruments which we need to get data off and onto the Scale storage.

The number of data intensive instruments we are seeing is increasing rapidly and we need to provide them with secure isolation whilst allowing them to get data out of their facilities and into our central storage.

Our Scale storage servers each have multiple 10GbE connections onto our research data network, but getting the data across the campus network has its issues. We've therefore implemented a research data network based on Brocade VDX fabric switches running a VCS fabric. Mostly we run VDX 6740 and 6740T switches at the edges of the research network, but we'll be adding some 6940-144 in the next few months (up to 144 10GbE ports in 2U!). One of the things we really don't want to do is push traffic from data intensive instruments via a firewall; instead we want to route it, ideally just switching it where we can, and our VDX fabric design allows us to do this, keeping speed up across the network. So much so that we are hoping researchers will move away from a local staging server in the lab and just stream data directly to our central storage.

We've moved to using the 6740T for edge switches as it presents copper ports. We also get the option to use the QSFP+ ports with Mellanox QSA adapters so we can plug in 10GbE optics, which is nice as it means we don't need to fork out for 40GbE optics to uplink to the 6740T, but can just use 10GbE LR optics. For now we are running edge ports at 1GbE, but we can license them up to 10GbE via Ports on Demand, though whether the building copper will sustain this is a whole different question!

The VDX fabric switches act to a certain extent like a giant stack which we can span right across campus (up to 48 switches), and they will even run in metro mode. They also have pretty low port-to-port latency. One of the features we really like is the ISL capability of the switches: we can pretty much add arbitrary ISLs between switches without worrying about loops or having to configure the ports as trunks. OK, there are a few considerations, like spanning port groups - not that this breaks anything, but an ISL spanning two port groups acts like two ISLs in a traditional trunk, whereas within a port group ISL traffic is sprayed at layer 1, meaning we won't see one half of an ISL full whilst the second half is underused. Adding new links is pretty much a case of plugging in optics and fibre.
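As a sketch of how low-touch this is (interface numbers invented, and the NOS syntax here is from memory, so treat it as an assumption rather than a recipe), bringing up a new ISL is little more than confirming the fabric settings on the port - ISL formation and Brocade trunking are on by default in VCS mode:

```text
! Hypothetical NOS snippet: confirm ISL formation and layer 1 trunking
! on a 40GbE port being used as a fabric link (both typically defaults).
interface FortyGigabitEthernet 1/0/49
 fabric isl enable
 fabric trunk enable
 no shutdown
```

Once the optic and fibre are in, the fabric discovers the neighbour and folds the new link into the trunk on its own.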

I mentioned we don't need to worry about ISLs looping between switches. The VCS fabric blows away the traditional tree structure of a network and pretty much any switch can be connected to any other in the fabric. There are a few limitations, like you can't build a ring of more than 6 switches, but the ability to have traffic flow pretty much anywhere between switches reduces our management and design headaches. If an ISL fails between two switches, and there is another path over the fabric, the switch will just kick over to using that instead.

Being enterprise switches, they give us all sorts of nice features, like edge loop detection across the whole fabric, which means we don't need nasty protocols like spanning tree. As a fabric, management is pretty simple: one of the switches assumes a VIP as the master of the cluster for configuration.

They are also layer 3 devices and can run VRRP-E etc. for HA routing. Add in the ability to use short path forwarding and we can pretty much prevent traffic tromboning: if a switch is a member of the VRRP group, even though it doesn't host the "default gateway IP", it can short-cut the routing. Pretty neat!
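A minimal sketch of what that looks like in NOS config (syntax from memory, and the rbridge ID, VE number and addresses are all invented - an illustration, not our actual config):

```text
! Hypothetical VRRP-E setup on one fabric member: with
! short-path-forwarding, this switch routes traffic locally for the
! virtual gateway even when another member holds the master role.
rbridge-id 10
 protocol vrrp-extended
 interface ve 100
  ip address 10.10.100.2/24
  vrrp-extended-group 1
   virtual-ip 10.10.100.1
   priority 100
   short-path-forwarding
```

The practical effect is that an edge switch never has to hairpin traffic across the fabric to whichever switch happens to be the VRRP-E master.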

There are also nice features like variable port buffers, allowing us to reallocate port buffers if needed. For example, if we have a few big storage servers and edge devices, we can steal buffers from the edge device links and allocate them to the storage server ports.

Throw in some "basic" functionality like Ports on Demand and 40GbE breakout, and all in all it's a pretty good switch for our data intensive applications!

In related work, we are developing a research compute cloud and will be hooking the VDX fabric into this. Whether we use the vTap functionality I'm not sure, but the OpenStack ML2 plugin is probably going in, and we'd really like to be able to connect compute VMs hosted in our cloud for research applications directly into research facilities over a low latency, high speed network. Can we encourage researchers to move away from lab based compute resource? Well, only time will tell!
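If we do go the ML2 route, the wiring-up is just Neutron configuration. A hypothetical sketch (section and option names as I understand the Brocade mechanism driver of this era, with the address, credentials and network names all invented):

```text
# Illustrative ml2_conf.ini fragment: the Brocade mechanism driver
# runs alongside the usual Open vSwitch driver and provisions VLANs
# on the VDX fabric as tenant networks come and go.
[ml2]
type_drivers = vlan
tenant_network_types = vlan
mechanism_drivers = openvswitch,brocade

[ml2_brocade]
address = 10.0.0.10        # fabric management IP (invented)
username = admin
password = changeme
physical_networks = physnet1
```

The attraction is that tenant VLANs would then follow the VMs onto the fabric automatically, rather than being hand-provisioned on the switches.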

If you are interested in how we are building the network and the design, get in touch and I'm happy to have a discussion on this!

Wednesday, 27 January 2016

Installing new technology to deliver HPC

January has seen a busy start to the year! And not from any of the major investments we had approved at the end of 2015!
No, this has come from our deployment of new technologies for HPC systems, in the past week or two, we've taken delivery of a direct water cooled HPC rack (well half a rack).
We're not afraid to try out new technology to help deliver our services, whether that is moving to the latest IBM Spectrum Scale release or building cloud services on OpenStack.
This has, however, probably been our longest new technology development, and it is entirely a hardware solution. The plan came following a visit to Lenovo's (then just opened) site in Raleigh, NC, where they demoed their direct cooled HPC technology. Wind on a few months and we were considering our options for new systems to replace our Sandy Bridge based iDataPlex. Our strategic framework with OCF and Lenovo meant the options were realistically NeXtScale, but a few calculations, an air cooled data centre with no ability to provide rear door heat exchangers, and a 7.5kW rack limit meant we were seriously looking at the WCT hardware. The spec of the standard WCT system wasn't what we really wanted, so we approached Lenovo about getting storage into the systems - we use full fat OS deployments and some of our workloads perform significantly better with local data storage. We eventually agreed on getting SSDs into the systems, and having gone through this process, we also wanted to add Mellanox 100Gb/s EDR InfiniBand. (Actually, we want some of the cool features on the ConnectX-4 ASIC.) This has proven a bit more difficult to actually get, but we have EDR switches and cables and will be adding water cooled EDR cards once they've finally been manufactured for us. Of course, in a system with no fans to cool other components, a lot of testing has gone into making sure the cards can be properly cooled.
Realistically, it's taken us 9 months from agreeing we were going to go WCT with this kit to getting our first tests on it. It's taken longer than we anticipated (we ordered it last summer!) and we've learnt a lot from the process. We always knew we'd have a 4-6 month build on the facility side to do planning permission and installation of dry air coolers, but we also had delays in getting the SSDs, EDR and silly things like the right valves and hoses!
The infrastructure we've installed is designed to be scalable and to integrate with other warm water cooled systems, which have been increasing in the market over the past year - looking round the SC15 conference centre, there were a number of options for direct cooled systems, significantly more than at SC14. So it's good to see that the infrastructure we designed back in July will integrate with hardware platforms from other vendors.
A little tinkering with xCAT and some updates to our deployment system, and today I finally got most of the compute nodes running an Intel LINPACK test on the hardware. Monitoring the load for about 30 minutes, I saw our supply temperature rise to ~25C with a return of 29C, so a 4C delta on 15 compute nodes running flat out. This is a lot less than we originally expected. Checking the kit, we were also getting sustained turbo on all cores to 3.0GHz on the 2x 12 core 2.6GHz SKU we have fitted per node.
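Those numbers sanity-check with a bit of basic physics: the heat carried by the loop is Q = ṁ × cp × ΔT. Assuming, purely illustratively (I haven't metered the nodes), something like 600W per node under LINPACK, the implied coolant flow works out as:

```python
# Back-of-envelope check on the cooling figures above. The per-node
# power draw is an assumption; the 15 nodes and 4C delta are from the
# LINPACK run described in the post.
CP_WATER = 4186.0  # specific heat of water, J/(kg*K)

def coolant_flow_kg_per_s(nodes, watts_per_node, delta_t_c):
    """Mass flow needed to carry Q = m_dot * cp * dT at a given delta-T."""
    q_watts = nodes * watts_per_node
    return q_watts / (CP_WATER * delta_t_c)

flow = coolant_flow_kg_per_s(nodes=15, watts_per_node=600, delta_t_c=4.0)
print(f"{flow:.2f} kg/s (~{flow * 60:.0f} litres/min)")
```

Roughly half a kilogram per second across the half rack, which is why a small ΔT at this node count isn't surprising: the loop simply has plenty of flow relative to the heat load.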
Really I want to run the water loops much warmer than this; depending on the CPU SKU, this could be up to 45C, but I think we'll aim for 40C. It's interesting that the thermal properties of water are funny, and the warmer it is, the better it is at carrying the heat load away with it...
Installation and final commissioning of the rack took place last week when we worked with the Lenovo engineers, our Estates team and their contractor team to balance and air-bleed the water loops. No leaks so far!
Looking to the future, we're talking to a few technology companies on getting early access to some new hardware to support our projects over the next 18 months. We're also always open to looking at what options we have available from a whole range of vendors.

Saturday, 2 January 2016

Looking back at 2015 - research computing

2015 has been a busy year in research computing at Birmingham. The team has gained two new members, and one has come back full-time into the team. These changes have been driven by the need to support new initiatives, for example supporting CCB and needs for research data management.

Getting major project developments out the door took up much of my time in 2015, rolling out our research data store and getting over the teething troubles with that. We got delayed a little by some unfortunate building work which cut fibres between our data centres and also an incident with the fire suppression system, but we worked through it all and got the solution out of the door along with our Sync'n'share solution based on PowerFolder.

With the research data store going live, we also needed to focus on how people move large volumes of data to it, and have been deploying a specialist fabric based network using Brocade VDX technology. This is pretty neat as it provides full layer 1 balancing of multiple links, which we can build up as we need additional bandwidth. Getting the final connections to our campus core took a little longer than anticipated, but we got there and now have the network in two buildings for data generating equipment, and we've already placed an order to deliver more locations in early 2016. As usual, things like getting access to wiring centres or faulty building fibre have given us a little trouble, but we're now progressing well with this. Some aspects of the firewall/router product we purchased have caused us a number of headaches and more work than we'd have liked, but we're working through this with the supplier, who seems engaged in getting us a fully working and reliable solution in early 2016.

I'll post a separate update on research clouds; this has also taken a chunk of time, but has brought a number of opportunities, for example speaking at events, helping to develop strategies, and being invited to join the RCUK working group on cloud to help identify and deliver cloud (public, private, hybrid) into supporting research in the UK.

Late 2015 has seen discussion come back about new data centres to help support the increasing demands of research computing, we'll hopefully hear in early 2016 if we'll get funding to develop this further. We did hear that the University is planning to support life-sciences with new storage and compute with a major investment, so we'll have a busy time designing and building this in 2016 as well as recruiting staff to support the investment!

Towards the end of the year, we took a major maintenance window on the HPC service. Although we weren't replacing anything, we will be adding new hardware in early 2016, and we needed to do some major upgrade work on the scheduler and resource manager in preparation, as well as replacing our core network switches to enable us to scale out more easily. We also generally take this opportunity to apply firmware updates to all research infrastructure, and of course we did the Spectrum Scale 4.2 upgrade on our GPFS clusters!

We've also been busy working out how we can support the CLIMB project better, and we've now secured funding for a member of staff to join the team to help with this, so watch out in early January as we'll be recruiting a cloud computing person to help deliver the private research cloud.

Looking back at 2015 - SC15, Austin

I was lucky enough to get out to the SC series of conferences again, this time SC15 in Austin, Texas. I was pretty late getting my act together to book to go, and struggled to find a hotel within budget, but eventually I managed to get one sorted and was all cleared to go.

I knew I was travelling on the same flight as someone I knew from NOC, so it was no surprise that I bumped into them at Manchester airport. We didn't expect to bump into a couple of the guys from OCF, who were travelling a different route. Connecting in Georgia was pretty tight, but we both just made the gate as it was boarding and got safely on to Austin!

Although it was pretty late UK time, we agreed to meet up for dinner in downtown Austin to help with the jet lag ... we found a couple of times in the week that there is clearly a strange attraction to the same places for UK folk - with over 12,000 attendees at the SC conference, we managed to pick a restaurant and a table next to the people I know at Cardiff working on the CLIMB project with me! (We did the same later when we went for a post-dinner pint and bumped into the Oxford e-Research Centre guys as well as the UK Lenovo people ... entirely coincidentally!)

Sunday morning didn't have anything planned in for the conference, so I again met Tom from NOC for morning coffee - as usual earlier than we wanted, thanks to the joys of jet lag! This was followed by a wander through downtown Austin and the Capitol area. I quite like exploring US cities early on Sunday mornings as they are usually pretty quiet, so it's a fairly relaxing time to investigate. Whilst I found Austin to be a fairly chilled and welcoming city, it was interesting to check out its Confederate history and the statues around the Capitol.

Sunday was also the day of the Spectrum Scale user group, and as chair of the UK group - this being the first SC event organised with the user group - I met up early with the US principal and co-principal (the first time we'd all met), along with a couple of the IBM people helping to run the event. The user group itself was really great, and it was good to see so many people there - we'd asked IBM for a bigger room than the one they ran their group in at New Orleans, and they delivered; it was pretty full as well! There were some great talks from both IBM and the user community; particularly interesting was the talk from Travelport on the monitoring tools they are using to keep an eye on their GPFS deployments.

Monday morning was the workshop on Parallel Data Storage, which opened with an interesting talk on holographic storage. The guys from Akonia Holographics are developing an interesting looking holographic WORM drive with a tape-like form factor, the theory being that it will eventually retrofit into existing tape libraries. An interesting idea, but it's a few years from being a product, and how it will fit into the storage market then ... well, that's anybody's guess; with the advent of software defined solutions and cloud storage, maybe it will be less relevant. I also bumped into Dean Hildebrand from IBM there and we had a discussion about various projects we've been working on (Dean and Bill from IBM have been a great help with getting CLIMB up and running on Spectrum Scale!); he also showed me a few things from the research he's been doing, which was interesting.

Monday evening was the gala opening of the conference trade show hall, which was as usual massive! I think the Austin convention centre was a little smaller than some of the other locations used by SC, and I'm not sure the two halls really worked well, with a lot of smaller and maybe quite specialised exhibitors in a small hall off the main hall.

On Tuesday I'd been invited by Lenovo to talk on their show floor stand about some of the projects I've been working on, so I had a 15 minute slot there to talk about CLIMB and the warm water cooling systems we've been trying to get installed (long story there ... for another blog post when we get it going!). I was a little worried no-one would be interested, but Lenovo had a great guy working the stand stage and getting people to sit down, even if his act was a little difficult to follow ;-)

Much of my remaining time was taken up with planned and ad-hoc meetings with vendors and technology companies, a lot of which is under NDA. Some of the companies I talked to include partners we've been working with (e.g. Lenovo, Mellanox, Dell, Adaptive Computing) as well as others who have interesting technology (e.g. Bull, HPE, SpectraLogic, Boston/Supermicro).

There were a number of stands this year focussing on cooling technology, with a lot looking at warm water cooled technology. There are now a lot more vendors focussing on this as a way of cooling systems either via their own solutions or after-market accredited options (which is essential for maintaining warranty!).

I was maybe a little cheeky on the Seagate stand: knowing a little bit about GPFS, I went to find out about their new ClusterStor appliance and had some quite complex questions. They did find me someone on the stand who actually seemed to know his stuff, so it all worked out in the end!

A few other things "launched" at SC15 are worthy of note. DDN had their SFA14K storage array. Spectra had their ArcticBlue (technically launched a few weeks before), a cold-storage shingled (SMR) array with disk spin-down. Bull announced Sequana, their exascale HPC solution. Adaptive Computing also had Torque 6, which finally has proper NUMA support; they spent some time going through its features with me, and if it works as they explained, it's pretty neat.

As well as vendor meetings, there were a number of ad-hoc meetings arranged out there, including meeting up with a couple of the CLIMB guys to work out a mini plan for how we are going to progress some of the issues on the project.

In addition to the trade show, there's a full programme of talks running alongside, with invited keynote speakers as well as sometimes quite esoteric papers on specific research projects. Alan Alda was one of the keynote speakers; he spoke on the importance of being able to explain science to non-scientific audiences, and he had a great video to demonstrate some of the work he'd been involved with on this. I guess the one workshop I went to that did disappoint was the one billed as Ceph and HPC ... which wasn't really people talking about Ceph and HPC, more how they were using Ceph alongside their HPC solutions (e.g. as a backup target).

I'd have liked to go to the session on building HPC applications, but it clashed with a meeting I had. As with many of these events, though, the slides get posted online, so I've managed to catch up on them - this is something we really need to deploy at Birmingham to help with our software management.

One thing I think the conference missed this year was the early evening keynote talks, of which I went to a few in New Orleans. But maybe they were just on topics I wasn't overly interested in. I do like to get to some talks on topics that aren't obviously my thing, as it helps with awareness of who we should be engaging with in our academic communities to see if we have services that can help empower their research.

The conference technical programme wraps up with a big social event on the Thursday evening, and this year it was held at the home of the Texas Longhorns. I'm always amazed at the size of US football stadiums, particularly bearing in mind this is a university football team!