Pocobor.

RoboCop

Strictly speaking, this post is about a project that is only tangentially related to mechatronics. However, given Pocobor’s name and history (and my background as a native Michigander), I couldn’t resist passing along an update regarding a unique public art project coming to Detroit.

In 2011, there was a successfully funded Kickstarter project to create a “life-sized” monument in Detroit that would pay tribute to that city’s greatest half-man, half-machine crime-fighting hero. Although the pace has been slow, things are still moving forward and the makers have given us an update.

It’s nice to be reminded that sometimes, science fiction decides to look at the positive side of a future, more mechatronically advanced world. I can’t wait to see the statue.

Interface/Off (Part V): THRILLING CONCLUSION

(Context: this is the final installment in our look at the future of smart product interface technology. Part I set the stage, Part II looked at touch-based systems, Part III covered touch-free approaches, and Part IV dug into brain-computer links.)

Well, this is it – the day that you’ve no doubt been waiting breathlessly for since July, when the gorgeous half-faces of Nic Cage and John Travolta confusingly headlined a mechatronics blog post. We are (finally) concluding our Interface/Off series examining the state and future of smart product interface technology.

How Did We Get Here?
As a very brief refresher, we’ve structured our look so far in roughly increasing order of sophistication. We started with touch-based interface technology such as physical buttons, touch sensing, and touch screens – these systems are simple and well-understood (by both designers and users) but are hobbled by physical limitations such as proximity requirements and dimensionality (it’s hard to make a 3-D touch-based interface). Touch-free systems, such as gestural interfaces and voice control, eliminate some of these limitations and broaden the designer’s palette, enabling more intuitive user behavior. However, they have their own disadvantages, including “translation” issues (both gestures and spoken language) and scaling issues when multiple potential users are present. Finally, brain-computer interfaces (BCIs) have the potential to maximize the communication bandwidth between the user and the smart product but require the solution of thorny issues in fields as diverse as medicine, engineering, and ethics.

Judge’s Decision
So, what is the interface of the future? A realistic answer is inevitably something of a cop-out; I expect to see all of these interface types in applications that they are particularly well suited for throughout the foreseeable future. For instance, there will always be contexts in which the low price, simplicity and reliability of a physical button will make it the preferred solution.

Furthermore, as these technologies mature, I think we will see more blended systems. Let’s not forget – even touch screens have only become ubiquitous with the rise of smart phones over the past 5 years; nearly all of the technologies we’ve looked at have a tremendous amount of room left to grow and evolve. For example, imagine a product that you could communicate with via voice commands, gestures, or both (actually, this is strikingly similar to face-to-face human conversation, where a mix of verbal and body language is typically used).

However, it would be too anticlimactic to leave you with such a non-answer after all these months. So, if I have to choose one technology to rule them all, there’s no way I can choose anything other than direct brain-computer links.

Think of it this way: the whole history of communication is an effort, with increasing success, to convey thoughts and ideas with more precision and accuracy, in as close to real time as possible. Languages create more and more new words; technology has let us communicate from a distance, then actually talk to each other, then even see each other live (so to speak), all so we can communicate the most accurate possible representation of what we think and feel at a given time. However, even when talking with each other face to face, there is a middleman. Language (both spoken and body) is a double-edged sword, allowing us to better understand ourselves and our feelings by giving us a framework with which to think but also requiring that we translate thoughts and feelings into that framework, sometimes inelegantly or inaccurately. Have you ever struggled to find the word to express something or felt like your words can’t keep up with your thoughts? Imagine cutting out that middleman and communicating what you are thinking with no mouth-enforced bandwidth limitations or language-driven description restrictions. The possibilities for depth of communication, efficiency, and even privacy are staggering.

The above vision is ambitious, and we may even find that parts of it are physically or physiologically impossible (e.g. directly conveying a feeling from one person to another). And, unlike a lot of mechatronic technology we write about on this blog, it is not around the corner. However, despite those and a whole host of other issues currently associated with the technology, the potential is too great for humankind not to keep reaching for that future.

Inflatable Robotics

I recently saw a cool article and video (below) on a new project from Otherlab, one of the most interesting groups in the Bay Area robotics scene.

The video gets into some inflatable robotics work that they are doing, with some really interesting potential applications around human-safe robots and medical robotics. However, what I found most interesting were some thoughts from Otherlab co-founder Saul Griffith on the impact that engineers can and do have on the world around them. The topic of how and how meaningfully we as engineers can affect the world really resonates with me and I am happy to see it get discussed in a larger forum. I couldn’t agree more with Saul’s challenge to all of today’s (and tomorrow’s) engineers: keep dreaming and stretching your notions of what is possible. The world is a canvas with infinite possibility for improvement and beauty.

21st Century Pack Mule

Boston Dynamics, who many of you may know from BigDog and other autonomous quadruped robots they have developed, released a new video a few days ago featuring their Legged Squad Support System (LS3) robot (as well as Friend of Pocobor Alex Perkins, the thespian/engineer who co-stars in the video).

The LS3 is essentially a pack mule for the future: it is meant to carry supplies for US soldiers and Marines in the field, both to reduce their physical burden and to free up cognitive bandwidth for more important tasks. As such, it is designed to carry up to 400 pounds, walk 20 miles and operate for 24 hours without human intervention (other than voice commands to follow / stop / etc.).

From a mechatronics perspective, there are a number of interesting design challenges that an application like this poses. First of all, the dynamics and controls for a quadruped robot are considerably more complex than those for a wheeled or tracked equivalent. However, quadrupeds (or bipeds like humans, for that matter) can handle much rougher terrain than wheeled or tracked robots and this expanded mobility drastically increases their value. Ultimately the hope is that pack systems like the LS3 will be able to follow the soldier anywhere they are capable of going.

The second significant engineering obstacle centers on the sensing and control systems required to follow the soldier and choose the best path through the upcoming terrain at any given time. There are considerable hardware, software and algorithmic issues that have to be addressed to arrive at a prototype robot like the one in the video. There are also some interesting overlaps with the technology used for self-driving cars or any other mobile autonomous system.
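Just to make the path-selection piece a bit more concrete, here is a toy sketch (my own illustration, definitely not Boston Dynamics’ actual approach) of picking a route across a grid of terrain costs with Dijkstra’s algorithm. A real system would build that cost map from LIDAR or stereo vision data and re-plan constantly as the robot moves.

```python
# Toy illustration of terrain-aware path selection: Dijkstra's algorithm
# over a small grid of traversal costs. This is NOT Boston Dynamics' method,
# just a sketch of the general idea of preferring cheap terrain.
import heapq

def cheapest_path(cost_grid, start, goal):
    """Return the lowest-cost list of (row, col) cells from start to goal."""
    rows, cols = len(cost_grid), len(cost_grid[0])
    frontier = [(0, start, [start])]          # (accumulated cost, cell, path so far)
    visited = set()
    while frontier:
        cost, cell, path = heapq.heappop(frontier)
        if cell == goal:
            return path
        if cell in visited:
            continue
        visited.add(cell)
        r, c = cell
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and (nr, nc) not in visited:
                step = cost_grid[nr][nc]      # rougher terrain = higher cost
                heapq.heappush(frontier, (cost + step, (nr, nc), path + [(nr, nc)]))
    return None

# 9 = boulder field, 1 = easy ground; the robot should skirt the rough patch.
terrain = [
    [1, 1, 9, 1],
    [1, 9, 9, 1],
    [1, 1, 1, 1],
]
print(cheapest_path(terrain, (0, 0), (0, 3)))
```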

Based on the examples that Boston Dynamics and other groups pursuing similar research and development have publicized over the past few years, I have been deeply impressed by the pace of development of these types of systems. At this rate, even mountaineering porters might be feeling a little nervous about their job security soon…

Interface/Off (Part IV): Brain-Computer Interfaces

(Context: this is Part IV of a look at the future of smart product interface technology; Part I set the stage, Part II looked at touch-based systems and Part III delved into touch-free systems.)

This week’s no doubt eagerly awaited installment of our series on smart product interfaces looks at some even more novel technology. Brain-computer interfaces (BCIs) are direct communication pathways between a human brain and an external device (i.e. without using things like hands or voices to intermediate). Various types of sensor networks are used to directly measure brain activity, which is then converted to something recognizable to external systems.

The Technology
Without getting too deep into the nitty gritty of the technology, it’s worth a brief look at some of the common approaches used in these systems. Both invasive (implanted into the body) and non-invasive systems exist, with unsurprising trade-offs: invasive systems yield much cleaner signals but carry surgical and medical risks, while non-invasive systems are safer but often yield poor signal resolution because the skull dampens and blurs the signals. For both practical and psychological reasons, commercially plausible systems outside special circumstances are almost certain to be non-invasive for the foreseeable future.

On the sensing side of the system, electroencephalography (EEG) is the most common technique and typically involves an array of electrodes mounted to the scalp that measure the voltage fluctuations resulting from ionic current flows within the brain’s neurons. Functional magnetic resonance imaging (fMRI) has also been used effectively; it detects the changes in blood flow associated with neural activity. However, fMRI carries vastly greater hardware requirements in size, cost and complexity, and thus early commercial systems have typically utilized an EEG headset (such as the systems we looked at a few years ago).
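For a sense of what those voltage fluctuations look like to the engineer, a common first processing step is to band-pass filter each EEG channel and compute the power in standard frequency bands. Here is a minimal sketch; the sampling rate and band edges are typical assumed values rather than the specs of any particular headset.

```python
# Minimal sketch of a classic EEG preprocessing step: band-power features.
# Sampling rate and band edges are typical values, not from any specific headset.
import numpy as np
from scipy.signal import butter, filtfilt

FS = 256  # samples per second (assumed)

def band_power(raw, low_hz, high_hz, fs=FS):
    """Average power of one EEG channel within a frequency band."""
    b, a = butter(4, [low_hz / (fs / 2), high_hz / (fs / 2)], btype="band")
    filtered = filtfilt(b, a, raw)
    return float(np.mean(filtered ** 2))

# Fake one second of a single channel; a real system would read the headset's API.
channel = np.random.randn(FS)
features = {
    "alpha (8-12 Hz)": band_power(channel, 8, 12),
    "beta (13-30 Hz)": band_power(channel, 13, 30),
}
print(features)
```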

On the processing side, I am confident that progress over the next few years will be rapid, partly because the current state of the art is (relatively speaking) so crude. Historically, BCIs have had the most success translating physical commands (“Clench left fist”), but newer research has made strides in extracting more specific data (such as home addresses or ATM PINs). However, one of the most interesting parts of BCIs is that processing can be addressed from both sides of the problem. The traditional engineering approach would be to develop and refine algorithms that can understand what the subject intends; that is, to correlate certain brain activity measurements with physical commands or, ultimately, language. However, because of the remarkable neuroplasticity of the human brain, it is also possible to work in reverse and train the brain to think in certain ways that are more easily measured and understood by existing systems. It will be fascinating to watch the interplay between these two approaches as the technology matures.
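As an illustration of that traditional engineering approach, here is a toy version of the mapping from brain-activity features to a small set of physical commands. The feature vectors below are synthetic stand-ins; a real BCI would use calibrated measurements (e.g. band powers across many channels and time windows).

```python
# Toy sketch of mapping brain-activity features to discrete commands.
# The features and labels here are synthetic; a real BCI would train on
# calibrated EEG features recorded while the user performs each command.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)

# Pretend we recorded 100 trials of 8-dimensional features for each command.
left_fist = rng.normal(loc=0.0, scale=1.0, size=(100, 8))
right_fist = rng.normal(loc=1.0, scale=1.0, size=(100, 8))
X = np.vstack([left_fist, right_fist])
y = ["clench left fist"] * 100 + ["clench right fist"] * 100

clf = LinearDiscriminantAnalysis().fit(X, y)

# At run time, a new feature vector is translated into the most likely command.
new_trial = rng.normal(loc=1.0, scale=1.0, size=(1, 8))
print(clf.predict(new_trial)[0])
```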

The Why
Given the technical hurdles currently associated with BCIs, some people question whether they are worth the development headaches. Perhaps the most powerful potential application is for disability treatment – there is an already impressive body of research exploring the use of BCIs to control prosthetics and even, for instance, to restore sight to people with non-congenital blindness. The ability to give a person back the use of a body part is a profoundly valuable result and should justify development of these systems all by itself.

However, there are other applications with larger potential markets that are also pushing both research and commercialization. Two of the major drivers are laziness and bandwidth. Laziness seems like a pretty minor incentive but the galvanizing force of annoyance should not be underestimated. History is littered with product improvements that did nothing more than save users a few seconds but came to dominate the market anyway (does anyone under 30 even know what a rotary phone is anymore?).

The bandwidth issue is more powerful. In previous posts, I’ve described the progression of interface technology in terms of increasing technical sophistication, but another way to look at it is as a journey of increasing information transmission capacity. A button can convey only a single bit of information with each press; even a keyboard manages just a handful of bits per keystroke. Touch screens improve on this but are inherently 2D. Gestural interfaces or voice command systems enable communication that is richer yet, but the ultimate in bandwidth will occur when we can cut out the middleman (e.g. hands or mouth) altogether. Despite the sci-fi and touchy-feely connotations of a “mind-meld” (sarcasm intended), that concept embodies the pinnacle of possibility for human communication bandwidth and arguably also richness of communication, depending on how philosophical we want to get.
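Some rough, back-of-envelope numbers (every rate below is an assumption for illustration, not a measurement) make the point: both of our familiar output channels top out at a few tens of bits per second.

```python
# Back-of-envelope comparison of interface "bandwidth". All rates are rough
# assumptions for illustration, not measurements.
import math

button_bits_per_press = 1                       # on/off
keyboard_bits_per_keystroke = math.log2(80)     # ~80 distinguishable keys
keystrokes_per_second = 5                       # roughly a 60 words-per-minute typist
speech_words_per_second = 2.5                   # roughly 150 words per minute
bits_per_word = math.log2(20000)                # vocabulary of ~20,000 words

print(f"button:   {button_bits_per_press} bit per press")
print(f"keyboard: ~{keyboard_bits_per_keystroke * keystrokes_per_second:.0f} bits/s")
print(f"speech:   ~{bits_per_word * speech_words_per_second:.0f} bits/s")
```

The striking thing is how similar those numbers are: the bottleneck is the hands and the mouth, not the thoughts behind them.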

Such potential value does not come without potential costs, however. The idea of letting a person or system look directly into one’s thoughts will be considered creepy by many and there are significant technical obstacles between today’s state of the art and the kind of interface I posit above; it is even possible that the architecture of our brain does not lend itself to outputting the full information content of all our thoughts. Also and most importantly, we might have to wear dorky-looking helmets or headsets.

All in all, I expect brain-computer interfaces to become increasingly common in the world around us over the next few decades. In my next post, I will wrap up the series and summarize my best guess for where smart product interfaces are going – what would a competition be without a winner?

$300 Prosthetic Arm

I came across this uplifting video about a super affordable ($300!) prosthetic arm. For comparison, prosthetic arms typically range from $3,000 to $30,000, depending on how much articulation they allow. This one was developed by a team of engineers at the University of Illinois who went on to form their own company, Illini Prosthetic Technologies (IPT).

Village Tech Solutions

Pocobor recently started helping out Village Tech Solutions, an awesome non-profit organization dedicated to providing affordable and relevant technology solutions to the developing world. One of the first systems they developed was the WireBridge (shown below), a human-powered river crossing system that has facilitated over 3 million crossings of deep river gorges in Nepal to date.

One of their current projects is Looma, an affordable audio-visual device that provides an interactive window to the internet and access to educational content for village schools that don’t have electricity, computers, or even books. The portable, battery-powered system integrates a projector with a control wand (imagine a Wii controller) so it can act as an electronic whiteboard. It also provides an internet connection over WiFi or any mobile data network if either is available. If not, a wide variety of educational content (textbooks, lessons, etc.) is stored locally on the device itself to facilitate learning. The system can be recharged from solar panels in villages where the electric grid is not available or is unreliable.

A team of volunteer students from Stanford, Dartmouth, Columbia, Johns Hopkins, George Washington University and Menlo School spent this summer developing an amazing initial prototype. As fall rolled around and field testing in Nepal began, there were a few more design tasks that Pocobor started helping out with, including developing a custom power supply for the module and a circuit board for the IR camera used to track the control wand motion.
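For the curious, wand tracking of this kind typically boils down to finding a bright IR blob in each camera frame and using its centroid as the pointer position. The sketch below is my own illustration of that idea using OpenCV 4, not the Looma team’s actual code.

```python
# Illustrative wand-tracking loop (not Looma's actual implementation):
# in an IR camera image, the wand's IR source shows up as a bright blob,
# and its centroid gives the pointer position for the whiteboard.
import cv2
import numpy as np

def wand_position(ir_frame_gray):
    """Return (x, y) of the brightest blob in a grayscale IR frame, or None."""
    _, mask = cv2.threshold(ir_frame_gray, 200, 255, cv2.THRESH_BINARY)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    blob = max(contours, key=cv2.contourArea)
    m = cv2.moments(blob)
    if m["m00"] == 0:
        return None
    return (m["m10"] / m["m00"], m["m01"] / m["m00"])

# Fake 120x160 frame with one bright spot standing in for the wand's IR LED.
frame = np.zeros((120, 160), dtype=np.uint8)
frame[40:44, 100:104] = 255
print(wand_position(frame))
```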

It’s been a great collaboration and Pocobor is excited to continue helping out with such an interesting and useful project!

Interface/Off (Part III): Touch-Free Systems

(Context: this is Part III of a look at the future of smart product interface technology; Part I set the stage and Part II looked at touch-based systems.)

Continuing along the spectrum of increasing sophistication and novelty, we come to touch-free interface systems. In the last post, I talked about some of the limitations of touch-based systems (usage scenario rigidity, physical contact requirements, and inherent 2D-ness). Some of the interesting advantages of touch-free approaches revolve around their ability to address those same limitations.

Gestural
Gestural interfaces typically use a vision capture system such as a camera and image processing algorithms to recognize hand and arm gestures made by the user. If you have seen the movie Minority Report, it contains a scene with a good example of the concept. This type of approach enables very intuitive behavior from the user (e.g. grabbing or tapping an icon to interact with it) but, in contrast to touch screens, does not require the user to make physical contact with the interface and can be much more flexible in terms of user location. As an example, imagine a TV with a gestural interface that could be controlled equally well regardless of whether you were sitting on a sofa on one side of the room or a chair on the other side. There have also been some interesting developments in wearable systems, such as the SixthSense developed by Pranav Mistry and MIT’s Media Lab:

However, these systems have some inherent limitations as well as implementation challenges. First of all, the same gesture can look very different for different users, or even for the same user in different instances, so gestural systems have to be able to “translate” gestures that may look different but mean the same thing. Second of all, the hardware (a camera that can distinguish minute hand movements) and software (the algorithms to translate those subtle movements into the intent of the user) are non-trivial from an engineering standpoint.
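To make the “translation” problem concrete in the simplest possible terms, here is a toy template matcher that resamples and normalizes a 2-D hand path so the same gesture drawn at different sizes, positions, or speeds still lands on the same template. Real gestural systems work on full image or depth data and are far more sophisticated, but the normalization idea is similar.

```python
# Toy gesture "translation": normalize a 2-D hand path so the same gesture
# drawn at different sizes/positions/speeds maps to the same template.
import numpy as np

N_POINTS = 32

def normalize(path):
    """Resample to N_POINTS, center on the centroid, and scale to unit size."""
    path = np.asarray(path, dtype=float)
    # Resample by arc length so fast and slow performances look alike.
    seg = np.r_[0, np.cumsum(np.linalg.norm(np.diff(path, axis=0), axis=1))]
    t = np.linspace(0, seg[-1], N_POINTS)
    resampled = np.column_stack([np.interp(t, seg, path[:, i]) for i in range(2)])
    centered = resampled - resampled.mean(axis=0)
    scale = np.linalg.norm(centered, axis=1).max() or 1.0
    return centered / scale

def classify(path, templates):
    """Return the template whose normalized shape is closest to the path."""
    p = normalize(path)
    return min(templates, key=lambda name: np.mean(np.linalg.norm(p - templates[name], axis=1)))

templates = {
    "swipe right": normalize([(0, 0), (1, 0)]),
    "swipe up": normalize([(0, 0), (0, 1)]),
}
# A large, slightly sloppy rightward swipe should still read as "swipe right".
print(classify([(10, 5), (14, 5.2), (19, 4.8)], templates))
```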

Eye Tracking
Eye tracking could be considered a variant in the gestural interface family, albeit one that is better suited to close-in use. The idea is that instead of looking at hand and arm gestures, these systems look at tiny movements of the user’s eyes. For instance, imagine flicking your eyes down then back up to scroll this webpage down. The main advantage is ease of use, often for those with medical reasons preventing them from using traditional interfaces. A number of companies, such as Tobii, have made impressive progress in commercializing systems for a broader audience, but I question whether these systems will be able to match the performance of some alternatives in the long run. The size and range of motion of human eyes are sufficiently limited relative to the size and freedom of hands that I think it will be an uphill battle for eye tracking systems outside of applications where they are the only option.
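As a toy illustration, turning gaze data into that scroll command can be as simple as watching the vertical gaze position for a quick down-and-back-up flick. The numbers below are invented; real eye trackers stream much noisier data and need real filtering.

```python
# Toy detector for the "flick your eyes down then back up" scroll gesture.
# Gaze samples are assumed to be normalized vertical positions (0 = top of
# screen, 1 = bottom) arriving at a fixed rate from some eye tracker.
def detect_down_up_flick(gaze_y, threshold=0.25):
    """Return True if the trace dips downward by `threshold` and comes back up."""
    baseline = gaze_y[0]
    dipped = False
    for y in gaze_y:
        if not dipped and y > baseline + threshold:    # looked noticeably downward
            dipped = True
        elif dipped and y < baseline + threshold / 2:  # returned toward baseline
            return True
    return False

samples = [0.40, 0.42, 0.75, 0.80, 0.45, 0.41]  # a quick glance down and back
if detect_down_up_flick(samples):
    print("scroll down one page")
```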

Voice Command
In fundamental contrast to the eyes, which are typically used to receive data, and the arms and hands, which rarely convey large amounts of data, the human voice is anatomically and culturally evolved to transmit lots and lots of data. This makes it very well suited to telling your smart product what you want it to do. However, implementing an effective system is quite a technological challenge. The first step is recognizing words – imagine a voice control system used as a keyboard replacement. However, newer systems, such as Siri on recent iPhone models, attempt to make the jump from recognizing words to understanding meaning and responding appropriately. This is a profound shift and one that will likely require continued refinement to accommodate different accents, dialects, styles of pronunciation and enunciation, and other peccadillos of individual human speech. The human brain is an astounding processing tool, but I believe that speech recognition algorithms will continue to make significant strides over the next few years and will be able to match the ability of humans to understand each other within a decade.
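The jump from words to meaning is, of course, the hard part. Purely to illustrate the shape of the problem (this is nothing like how Siri actually works), here is a toy intent matcher that maps an already-transcribed utterance to a command using simple keyword rules.

```python
# Toy "meaning" layer: map an already-transcribed utterance to an intent.
# Real assistants use statistical language understanding; this keyword
# matching only illustrates the shape of the problem.
INTENTS = {
    "set_alarm": {"wake", "alarm"},
    "play_music": {"play", "music", "song"},
    "get_weather": {"weather", "rain", "forecast"},
}

def guess_intent(utterance):
    words = set(utterance.lower().split())
    scores = {intent: len(words & keywords) for intent, keywords in INTENTS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unknown"

print(guess_intent("Will it rain tomorrow"))   # get_weather
print(guess_intent("Wake me up at seven"))     # set_alarm
```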

One arena in which voice systems are limited, however, is in conveying information back to the user. There is a reason for the expression “a picture is worth a thousand words” – there are certain concepts that are easier to convey visually than verbally. There are also potential scaling issues with voice systems – imagine everyone on a subway or in an office trying to control their phone or computer simultaneously with voice commands. All in all, though, I think that when combined with displays, voice control systems will be a significant part of the technology landscape in the future.

In my next post, I will look at the most sci-fi-esque possibility – brain-computer interfaces (thought control again!).

Brain Hacking

I just saw this article about the potential to involuntarily extract information from someone’s brain using an off-the-shelf brain-computer interface (BCI) such as the systems that we’ve previously blogged about, and decided to take a quick side track from our Interface/Off series to address it. The idea is to use an electroencephalograph (aka EEG) headset and process the measurements taken while the subject thinks about various topics in order to extract meaningful data. In the study referenced in the article, which was performed by researchers from UC Berkeley, Oxford and the University of Geneva, the extracted data included ATM PINs and home addresses.
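My understanding is that the approach leans on the brain’s involuntary recognition response (the P300 wave): flash candidate digits or addresses at the subject, and the averaged EEG response is measurably larger for the items they recognize. The sketch below is a toy version of that averaging idea on synthetic signals, not the researchers’ actual analysis.

```python
# Toy sketch of the epoch-averaging idea behind P300-style probing:
# the averaged response to the "recognized" stimulus stands out from the rest.
# All signals here are synthetic; this is not the study's actual analysis.
import numpy as np

rng = np.random.default_rng(1)
FS, EPOCH = 256, 256            # 1-second epochs at an assumed 256 Hz

def fake_epoch(recognized):
    noise = rng.normal(0, 1.0, EPOCH)
    if recognized:               # add a bump ~300 ms after the stimulus
        noise[int(0.3 * FS):int(0.4 * FS)] += 1.5
    return noise

# 40 probe presentations of each candidate digit 0-9; in this fake data the
# subject's PIN digit is 7, so only those epochs carry the recognition bump.
avg_amplitude = {}
for digit in range(10):
    epochs = [fake_epoch(recognized=(digit == 7)) for _ in range(40)]
    mean_epoch = np.mean(epochs, axis=0)
    avg_amplitude[digit] = mean_epoch[int(0.3 * FS):int(0.4 * FS)].mean()

print("best guess for the PIN digit:", max(avg_amplitude, key=avg_amplitude.get))
```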

I find these results to be fascinating, exciting and a little bit disconcerting. Despite the imperfect success rate of this initial study (10-40% chance of obtaining useful information), it is clear that the potential exists to cross a threshold of human privacy that has never been violated – the sanctity of private thought. Obviously, this has the potential to change the world in fairly fundamental ways. I don’t actually think that we are on the cusp of a time in which your thoughts can be plucked out of your head right and left (if nothing else, I believe the necessity of the subject wearing an EEG headset is a limitation that is unlikely to be surmounted any time soon), but these results bring up a really interesting discussion about the ethics of progress.

This is not a new debate – people have been arguing over the benefits and drawbacks of scientific and technological development for centuries, in contexts from economic (robots will steal our jobs!) to medical (cloning, gene therapy, etc.) to apocalyptic (nuclear, biological and chemical weapons). However, a significant difference in this version of the debate is the ubiquity of this technology. I frequently write on this blog about how exciting and powerful I find it that the tools and materials to develop smart products and mechatronic systems are so accessible and inexpensive, but this can be a double-edged sword when the resulting technology has the potential for misuse or abuse. For example, the Emotiv and NeuroSky BCIs cost around $200-300, including access to their APIs.

I think this post is already long enough, so instead of getting into a detailed look at the philosophy and ethics of science and engineering, I’ll just give my two cents on the big picture and leave it there for now. I think that it is impossible and usually counter-productive to try to restrict the development of science or technology, all the more so when there are no natural barriers (such as enormous capital requirements). I also believe that there is inherent good in the pursuit and acquisition of knowledge. However, I think that as scientists, developers, and engineers, we have a responsibility to let our work be guided by our personal morals and ethics. Hopefully this is enough to ensure that none of us have to worry about stolen thoughts any time soon.

Interface/Off (Part II): Touch-Based Systems

In my last post, I set the stage for a multi-post look at the future of smart product interfaces. To recap briefly, I believe that interface technology is at a really interesting point, where older approaches such as the keyboard and mouse are being superseded by new possibilities. However, there are lots of options and I thought it would be worth looking at some of the major contenders to understand the pros and cons of each and see if any look particularly well-suited to become as ubiquitous as keyboards were. In today’s post, I look at the simplest class: touch-based systems.

Physical Buttons
The simplest touch-based interface is a physical button, whether used as an on/off switch, a part of a keyboard, or otherwise. They are simple, cheap, reliable and well-understood, both by users and system designers. However, they are static and immutable – you can’t make them disappear when you don’t need them and you can’t really change how they look or what they do. They offer very limited versatility and are often not the cleanest or most elegant solution.

Touch Sensors
At the next level up in sophistication are discrete touch sensors (usually capacitive) – picture a lamp with no switch that you can touch anywhere to actuate. The use of capacitive and other types of touch sensors can liberate designers from some button-placement restrictions but that is about the extent of the benefits. Most of these sensors are binary (either on or off; no other states) and again offer limited versatility as their function cannot easily be changed during operation.

Touch Screens
Continuing the trend towards increasing complexity brings us to touch screens, which are essentially just an array of touch sensors overlaid on a display. The monumental benefit of this approach is the versatility enabled by the screen – the same “button” (or area of the screen) can have an infinite number of purposes if the system merely changes what the user sees. However, touch screens have their limitations as well – their inherent 2D-ness requires a level of abstraction to map the real, 3D world and they have some proximity restrictions (the user has to be close enough to physically touch them).
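That versatility is easy to see if you think of the interface as data: the mapping from a touched region to an action is just a table that gets swapped out every time the display changes. A minimal sketch (the screens and actions below are hypothetical, of course):

```python
# Minimal sketch of why a touch screen is so versatile: the mapping from
# screen regions to actions is just data and can change with every redraw.
SCREENS = {
    "home":  {(0, 0, 100, 50): "open email", (0, 50, 100, 100): "open camera"},
    "email": {(0, 0, 100, 50): "compose",    (0, 50, 100, 100): "back to home"},
}

def handle_touch(screen, x, y):
    """Look up which action the touched point maps to on the current screen."""
    for (x0, y0, x1, y1), action in SCREENS[screen].items():
        if x0 <= x < x1 and y0 <= y < y1:
            return action
    return None

# The same physical spot does different things depending on what is displayed.
print(handle_touch("home", 50, 25))   # open email
print(handle_touch("email", 50, 25))  # compose
```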

The smart phone revolution of the last few years has given us a powerful illustration of how useful and popular touch screens can become and I think that they are by far the most powerful interface currently available at a production level of sophistication and usability. However, in my next post I will start getting into some new technologies that are quickly advancing and could supplant touch screens in the not-too-distant future.