ESXi-Arm for Robots!
How can we bring the power of VMware’s ESXi to robotic systems? For a long time now, ESXi has been the gold standard for production workloads in the datacenter, and now that ESXi-Arm is available (https://flings.vmware.com/esxi-arm-edition), it presents a unique opportunity to harness the features of ESXi and introduce them to the world of robotics.
A bit of background…
For about as long as ESX has been used in datacenters, I’ve been building robots – well, technically I’ve been building students but they build robots and I think that counts. I’m a mentor for a FIRST Robotics Competition (https://www.firstinspires.org) team known as The Zebracorns (https://team900.org). We’re based at the NC School of Science and Mathematics (https://www.ncssm.edu) in Durham, NC.
Every year, we design, build, and program a ~150 lb robot capable of playing a game given to us by FIRST. We’re also the first FRC team to use ROS (Robot Operating System; more on this below) in programming our robots. The picture below is of me, coping with the normal headaches of a problem, as one of my (now-graduated) students gets on with the actual work of solving it:
I’ve actually spoken a bit about my robotics volunteering work at VMworld (https://www.youtube.com/watch?v=kcCyXnelg_o), and my students have spoken about it at ROSCon (https://vimeo.com/293294796), among other places. Both are great talks if you want more background.
Suffice it to say, I’ve been at this for a while, and I’m very excited about the future of robotics, industrial automation technologies, and their intersection with VMware’s portfolio.
Let’s start with a short description of ROS. I’ll forgive you for thinking that ROS, which stands for Robot Operating System, is an actual operating system (though if you consider vSphere to be an operating system, then maybe ROS qualifies too). It’s really a framework for building robust robot software; in some ways, it’s more like middleware than anything else. It makes introspection, simulation, and modular software design achievable with minimal developer time, rather than forcing you to build those capabilities into your robot software from scratch.
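To make that a little more concrete, here’s roughly what a minimal ROS node looks like. This is just a sketch, not code from our robot: the node and topic names are hypothetical, but rospy, std_msgs, and the publish/subscribe pattern are standard ROS. Any other node can subscribe to this topic without knowing anything about the machine (or VM) this one runs on, and that decoupling is a big part of what makes ROS feel like middleware.

```python
#!/usr/bin/env python
# Minimal ROS (Melodic) node: publishes a heartbeat message once per second.
# The node and topic names here are made up for illustration.
import rospy
from std_msgs.msg import String

rospy.init_node('heartbeat_node')
pub = rospy.Publisher('/heartbeat', String, queue_size=10)
rate = rospy.Rate(1)  # 1 Hz

while not rospy.is_shutdown():
    pub.publish(String(data='still alive'))
    rate.sleep()
```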
You might also be forgiven for thinking ROS is purely an academic endeavor without industry adoption, but you’d be wrong there too. Recently, our friends over at Red Hat have been making an awesome documentary series about ROS, and I highly recommend giving it a watch:
If that’s not enough, I would be remiss not to mention that our friends over at Amazon are investing in ROS with their RoboMaker service, which is worth a look: https://aws.amazon.com/robomaker/ and of course Microsoft is all over it as well: https://microsoft.github.io/Win-RoS-Landing-Page/
The design of ROS makes it distributed and scalable, as ROS nodes can be brought up on a wide range of computer systems. The work I undertook to make ROS run on ESXi-Arm was largely based on a paper (https://team900.org/blog/ZebROSNano/) that my students and I wrote about porting our work with ROS to the Jetson Nano (https://developer.nvidia.com/embedded/jetson-nano-developer-kit).
Granted, the Jetson isn’t a Raspberry Pi, so the code from that project doesn’t work here without recompiling, but that’s not difficult to do: it was written as mostly portable C++ and Python, and ROS’s modular design makes code built on top of it portable as well.
Enter ESXi-Arm:
So where does ESXi fit into this picture?
It’s simple: ESXi provides a means of managing and orchestrating virtual machines, containers, and more. It means access to the core components of the VMware Software-Defined Data Center (SDDC). Features like High Availability and the Distributed Resource Scheduler are powerful tools when it comes to IT infrastructure and administration.
Often, when working with ROS, there is a fair amount of configuration that needs to take place, particularly when using ROS in a more distributed capacity, or when combining it with systems that don’t speak the native ROS ecosystem of standardized ROS messages.
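As a sketch of what that configuration boils down to in a distributed setup: a ROS node finds the master through a couple of environment variables. The addresses below are hypothetical placeholders, and in practice you’d usually export these in a shell profile or launch environment rather than setting them in code.

```python
#!/usr/bin/env python
# Sketch: pointing a ROS node at a master running on another machine (or VM).
# The IP addresses are hypothetical placeholders.
import os

# Where roscore (the ROS master) lives, e.g. inside a VM on an ESXi-Arm host.
os.environ['ROS_MASTER_URI'] = 'http://10.0.0.10:11311'
# The address this node advertises to its peers so topic connections resolve.
os.environ['ROS_IP'] = '10.0.0.20'

import rospy  # rospy picks up the environment above when the node starts

rospy.init_node('remote_client')
rospy.loginfo('Using ROS master at %s', os.environ['ROS_MASTER_URI'])
rospy.spin()
```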
Furthermore, as we enter a world where robots are becoming increasingly smarter, the number of complex onboard systems with spare processing power is also increasing. That inherently implies a need for better management, better automation, better orchestration, and intrinsic security. These are areas where the VMware SDDC platform excels.
Abstracting the core ROS components out into virtual machines creates more flexible and dynamic configurations, and it enables IT administrators to use the same VMware tools they already know and love to support our future robotic overlords (sorry, “workloads”).
The architectural plan:
At this point, I’ve given enough background, so let’s move on to the actual architecture for this project. I’ll start with a diagram:
Please note that this is the end state diagram and not where I started. I’ll explain more about how I ended up with this version of things and about possible future improvements later.
Moving from left to right across the diagram, you’ll notice that the robot is powered by a ~20V power pack from a cordless tool system. That voltage works great with the motor controllers I’m using (http://www.ctr-electronics.com/control-system/talon-srx.html), but it doesn’t work so well for powering the Raspberry Pis. For that, I’m using a buck voltage regulator that drops the voltage down to 5V, feeding USB-C connectors that power all of the Raspberry Pi devices.
The motor controllers are brushed DC motor controllers with support for quadrature encoder input. They communicate over a CAN bus (similar to what most modern cars use), and to interface that with the Raspberry Pi acting as my “physical edge”, I’ve got a USB-C to CAN adapter (https://canwork.dev) along with a Logitech F710 wireless gamepad.
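To give a sense of what the Linux side of that looks like, here’s a minimal sketch using the python-can library over SocketCAN. I’m assuming the adapter shows up as can0 and has already been brought up; actually decoding the motor controller frames is vendor-specific and not shown.

```python
#!/usr/bin/env python
# Sketch: watching traffic on the robot's CAN bus via SocketCAN.
# Assumes the USB-to-CAN adapter appears as "can0" and is already up
# (e.g. via: ip link set can0 up type can bitrate 1000000).
import binascii
import can

bus = can.interface.Bus(channel='can0', bustype='socketcan')
for msg in bus:  # iterating the bus yields received frames
    print('id=0x%X dlc=%d data=%s'
          % (msg.arbitration_id, msg.dlc, binascii.hexlify(msg.data)))
```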
From there, all of the Raspberry Pi devices are connected to an Ethernet switch, which is USB-powered from one of the Pis. Two of the Pis are running ESXi-Arm, and a fourth Pi on the robot runs OpenWRT, acting as a wireless bridge to the rest of my home network. That OpenWRT Pi also serves out a 100GB iSCSI volume that is mounted by both ESXi-Arm hosts. vCenter is running as a VM on my home network and is just used to manage the hosts.
The last piece of the puzzle is the VM itself, which runs Ubuntu 18.04 with ROS Melodic installed (Melodic Morenia is the name of the ROS release I’m using). The code for this particular implementation of ROS, along with all of its configuration, can be found here: https://github.com/FRC900/zebROS_Nano
I got 0.99 volt problems…
One of the first problems I ran into was powering all of the Raspberry Pis on the robot. The original voltage regulator I was using was too small and couldn’t supply enough power, so I solved that problem with a quick stop over at Amazon (https://www.amazon.com/gp/product/B01M03288J/ref=ppx_yo_dt_b_asin_title_o02_s00?ie=UTF8&psc=1) to purchase a regulator with a larger output.
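A quick back-of-the-envelope calculation shows why the regulator needs more headroom than you might guess. The numbers below are rough assumptions rather than measurements, but the shape of the math is the point:

```python
#!/usr/bin/env python
# Rough 5V power budget for the robot (assumed values, not measurements).
PI_COUNT = 4          # Pis fed from the 5V rail
AMPS_PER_PI = 3.0     # worst-case draw commonly quoted for a Pi 4
EXTRAS_WATTS = 5.0    # USB-powered switch, CAN adapter, etc.
EFFICIENCY = 0.90     # typical buck converter efficiency

load_watts = PI_COUNT * AMPS_PER_PI * 5.0 + EXTRAS_WATTS  # 65 W at 5 V
input_watts = load_watts / EFFICIENCY                     # ~72 W from the pack
input_amps = input_watts / 20.0                           # ~3.6 A at 20 V

print('5V load: %.0f W -> about %.1f A from the 20V pack'
      % (load_watts, input_amps))
```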
Once I had the power issues resolved, it was on to solving connectivity problems. There isn’t a wireless network adapter driver for ESXi-Arm (yet), so connectivity was going to need to go through a bridge device of some sort. That’s where installing OpenWRT on a Pi came in, and since I’d already have a common bridge device on the robot, it made sense to have it double as my shared storage; serving that over iSCSI was easy enough using the tgt commands (https://www.cyberciti.biz/tips/howto-setup-linux-iscsi-target-sanwith-tgt.html).
At this point, I actually had systems running on the robot and was using ESXi’s USB host device passthrough feature to pass the USB-to-CAN transceiver and the Logitech joystick through to the VM, and it was working:
“That’s cool! Time for a vMotion with motion!”… or at least that’s what I had hoped to try. It almost worked: I could actually migrate the VM, and the USB devices stayed connected thanks to the magic of the checkbox indicated in this screenshot, but it had some issues.
Support for vMotion with USB passthrough has been around in ESXi for a while, and it works well when ESXi is running on something more powerful than a Raspberry Pi. In this case, however, things weren’t snappy to respond, and there are clearly some issues to be worked out with USB performance on ESXi-Arm for the Raspberry Pi. But I’ve been building robots for a long time, and I’m not about to let some pesky performance bugs stop me from making this work.
I spent a couple of days thinking about this problem. Thanks to the magic of ROS, I don’t actually need the joystick: it’s just being used as a means to control the robot, and I can just as easily automate the same functionality using standard ROS Twist messages. That doesn’t mean I don’t want the joystick, but I can prove this without it if I have to. That’s one of the two required USB devices down.
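As an illustration, here’s a minimal sketch of replacing the joystick with Twist messages. /cmd_vel is the common ROS convention for velocity commands; the actual topic name in our configuration may differ.

```python
#!/usr/bin/env python
# Sketch: driving the robot without a joystick by publishing standard
# geometry_msgs/Twist messages (here, enough to drive gentle circles).
import rospy
from geometry_msgs.msg import Twist

rospy.init_node('drive_in_circles')
pub = rospy.Publisher('/cmd_vel', Twist, queue_size=1)
rate = rospy.Rate(10)  # 10 Hz command stream

cmd = Twist()
cmd.linear.x = 0.3   # forward velocity, m/s
cmd.angular.z = 0.5  # yaw rate, rad/s

while not rospy.is_shutdown():
    pub.publish(cmd)
    rate.sleep()
```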
The CAN-to-USB transceiver is what enables the robot to drive, so I started looking at CAN-to-Ethernet transceivers. There just aren’t many out there, and the ones I found were either expensive or didn’t explicitly support SocketCAN (the standardized way for Linux to talk CAN). However, I eventually found the Cannelloni project (https://github.com/mguentner/cannelloni).
I started with another Raspberry Pi I had lying around, a Pi 3 this time. I installed Raspbian on it, installed Cannelloni along with some basic tools to compile and run it, and set up a crontab entry to start it on boot. To my surprise, it worked!
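To show the idea (and only the idea: Cannelloni implements a proper bidirectional framing protocol, and this toy version is not wire-compatible with it), tunneling CAN over IP boils down to picking frames off the local bus and forwarding them to a peer:

```python
#!/usr/bin/env python
# Conceptual, one-way sketch of CAN-over-Ethernet tunneling. The peer
# address and the naive framing below are made up for illustration.
import socket
import struct
import can

PEER = ('10.0.0.20', 20000)  # hypothetical remote end of the tunnel
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
bus = can.interface.Bus(channel='can0', bustype='socketcan')

for msg in bus:
    # Naive framing: 4-byte CAN ID, 1-byte length, then the payload.
    packet = struct.pack('>IB', msg.arbitration_id, msg.dlc) + bytes(msg.data)
    sock.sendto(packet, PEER)
```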
Since I already had the other Raspberry Pi on the robot, I figured I might as well try to solve my joystick connection issues too (it was either that or ask one of my students to write an autonomous routine to drive the robot in circles). Another day of searching and I found the USBIP project, along with some basic directions to set it up (http://usbip.sourceforge.net and https://developer.ridgerun.com/wiki/index.php?title=How_to_setup_and_use_USB/IP).
Shockingly, this also worked. This meant that I could initiate a vMotion operation and keep control of the robot throughout the process (minus a momentary blip for the network cutover).
The final result, now with 100% more vMotion:
There you have it, vMotion while in motion.
Why didn’t you just…
I suspect one of the questions I’ll eventually have to answer is why I didn’t just do this with Intel platforms. I’m hoping my earlier commentary about ROS, and the fact that this is still ESXi, makes it clear to anyone asking that it is certainly possible. That being said, Arm is the architecture of the far edge for a multitude of reasons, and now that VMware has thrown down the gauntlet with the ESXi-Arm fling, this was a project I had been telling myself I would do for a while.
Another question is likely going to be why I couldn’t just do this natively on a Raspberry Pi, without ESXi and without all of the complexity I just added to make this work. I did do that… I even taught some high school students how to do it, and we wrote a paper. That box was checked and that challenge was solved.
The problem with that approach is that it doesn’t let me take advantage of the abstractions and capabilities a virtualization platform provides. I can now copy this VM and boot it up on a completely different platform, say a SolidRun HoneyComb LX2K, and it would work just the same. I could add more hosts, I could add more VMs and share resources. I can do everything that I could do with a normal VMware SDDC, and that’s what this is really showing. It’s possible to get the power of VMware’s software stack on a frickin’ robot, and that’s just cool.
What’s next?
There are obviously a few things to work out for future use of ESXi in the world of industrial automation and robotics. For starters, local protocols like CAN, RS-485, SPI, etc. need some mechanism for VMs in ESXi to communicate with them. I’m not sure my solution of a “physical edge” connection is the best one, but the concept of a lightweight edge device handling the lower-level protocols while the heavy lifting is done by VMs running elsewhere isn’t new, and it clearly works. Ideally, there would be passthrough functionality for a wide range of protocols.
The next obvious thing is that ESXi-Arm needs support for more system types, plus plenty of performance improvements. I suspect it is already possible to use ESXi’s USB passthrough effectively on a more powerful system, or at least one without the Raspberry Pi’s USB bottleneck. There is work to do here.
I’d like to get this running on one of the bigger robots that I help build with my students, but that’s going to need to wait until it is safe to do so. The good news is that, thanks to ROS, it won’t be difficult: the code stays the same, and it’s mostly just a few configuration files that need to be updated.
Additional sensor integration is also something I want to work on. This robot was never meant to be anything more than a small toy system with a joystick to control it, but ROS enables me to easily add a LIDAR sensor or stereo vision for some kind of mapping and navigation.
Lastly, ROS was built to be distributed, so deploying nodes across more than one VM is an obvious extension of this. It’s easy to imagine additional VMs, each with access to different accelerators, each running purpose-built software and right-sized for a particular task, all managed by VMware’s SDDC stack.
Stay tuned! I’ll be back with more as this is just the start of ESXi-Arm robotics.
Thanks!
I owe a great many thanks to the team working on ESXi-Arm at VMware, and to the OCTO organization and CTO Ambassador program for helping me get to know them better. I also wouldn’t be able to do this without my students and the myriad of people and businesses that have supported The Zebracorns robotics team over the years. Quite seriously, I learn a lot from mentoring my students; mentoring is a two-way street. If you can find a team in your own area to mentor, then do it (https://firstinspires.org).
In particular, I want to thank https://www.andymark.com for making such an awesome little robot kit to get this project off the ground more than a year ago and http://www.ctr-electronics.com for making motor controllers that work in a wide range of use cases and being supportive of our efforts with ROS… and of course, I have to thank the folks at OSRF and others who have made ROS (https://www.ros.org) for continuously transforming the way we build software for robots.