(Context: this is the final installment in our look at the future of smart product interface technology. Part I set the stage, Part II looked at touch-based systems, Part III covered touch-free approaches, and Part IV dug into brain-computer links.)

Well, this is it – the day that you’ve no doubt been waiting breathlessly for since July, when the gorgeous half-faces of Nic Cage and John Travolta confusingly headlined a mechatronics blog post. We are (finally) concluding our Interface/Off series examining the state and future of smart product interface technology.

How Did We Get Here?
As a very brief refresher, we’ve structured our look so far in roughly increasing order of sophistication. We started with touch-based interface technology such as physical buttons, touch sensing, and touch screens – these systems are simple and well-understood (by both designers and users) but are hobbled by physical limitations such as proximity requirements and dimensionality (it’s hard to make a 3-D touch-based interface). Touch-free systems, such as gestural interfaces and voice control, eliminate some of these limitations and broaden the designer’s palette, enabling more intuitive user behavior. However, they have their own disadvantages, including “translation” issues (both gestures and spoken language) and scaling issues when multiple potential users are present. Finally, brain-computer interfaces (BCIs) have the potential to maximize the communication bandwidth between the user and the smart product but require the solution of thorny issues in fields as diverse as medicine, engineering, and ethics.

Judge’s Decision
So, what is the interface of the future? A realistic answer is inevitably something of a cop-out; I expect to see all of these interface types in applications that they are particularly well suited for throughout the foreseeable future. For instance, there will always be contexts in which the low price, simplicity and reliability of a physical button will make it the preferred solution.

Furthermore, as these technologies mature, I think we will see more blended systems. Let’s not forget – even touch screens have only become ubiquitous with the rise of smart phones over the past 5 years; nearly all of the technologies we’ve looked at have a tremendous amount of room left to grow and evolve. For example, imagine a product that offers the ability to communicate with it via voice commands and/or gestures (actually, this is strikingly similar to face-to-face human conversation, where a mix of verbal and body language is typically used).

However, it would be too anticlimactic to leave you with such a non-answer after all these months. So, if I have to choose one technology to rule them all, there’s no way I can choose anything other than direct brain-computer links.

Think of it this way: the whole history of communication is an effort, with increasing success, to convey thoughts and ideas with more precision and accuracy, in as close to real time as possible. Languages create more and more new words; technology has let us communicate from a distance, then actually talk to each other, then even see each other live (so to speak), all so we can communicate the most accurate possible representation of what we think and feel at a given time. However, even when talking with each other face to face, there is a middleman. Language (both spoken and body) is a double-edged sword, allowing us to better understand ourselves and our feelings by giving us a framework with which to think but also requiring that we translate thoughts and feelings into that framework, sometimes inelegantly or inaccurately. Have you ever struggled to find the word to express something or felt like your words can’t keep up with your thoughts? Imagine cutting out that middleman and communicating what you are thinking with no mouth-enforced bandwidth limitations or language-driven description restrictions. The possibilities in both depth of communication and also efficiency and even privacy are staggering.

The above vision is ambitious, and we may even find that parts of it are physically or physiologically impossible (e.g. directly conveying a feeling from one person to another). And, unlike a lot of mechatronic technology we write about on this blog, it is not around the corner. However, despite those and a whole host of other issues currently associated with the technology, the potential is too great for humankind not to keep reaching for that future.

21st Century Pack Mule

Boston Dynamics, which many of you may know from BigDog and other autonomous quadruped robots it has developed, released a new video a few days ago featuring its Legged Squad Support System (LS3) robot (as well as Friend of Pocobor Alex Perkins, the thespian/engineer who co-stars in the video).

The LS3 is essentially a pack mule for the future: it is meant to carry supplies for US soldiers and Marines in the field to both reduce their physical burden and also free up cognitive bandwidth for more important tasks. As such, it is designed to carry up to 400 pounds, walk 20 miles and operate for 24 hours without human intervention (other than voice commands to follow / stop / etc.).

From a mechatronics perspective, there are a number of interesting design challenges that an application like this poses. First of all, the dynamics and controls for a quadruped robot are considerably more complex than those for a wheeled or tracked equivalent. However, quadrupeds (or bipeds like humans, for that matter) can handle much rougher terrain than wheeled or tracked robots and this expanded mobility drastically increases their value. Ultimately the hope is that pack systems like the LS3 will be able to follow the soldier anywhere they are capable of going.

The second significant engineering obstacle centers on the sensing and control systems required to follow the soldier and choose the best path given the upcoming terrain at any given time. There are considerable hardware, software and algorithmic issues that have to be addressed to arrive at a prototype robot like the one in the video. There are also some interesting overlaps with the technology used for self-driving cars or any other mobile autonomous system.

Based on the examples that have been publicized over the past few years by Boston Dynamics and other groups pursuing similar research and development, I have been deeply impressed by the pace of development of these types of systems. At this rate, even mountaineering porters might be feeling a little nervous about their job security soon…

Interface/Off (Part IV): Brain-Computer Interfaces

(Context: this is Part IV of a look at the future of smart product interface technology; Part I set the stage, Part II looked at touch-based systems and Part III delved into touch-free systems.)

This week’s no doubt eagerly awaited installment of our series on smart product interfaces looks at some even more novel technology. Brain-computer interfaces (BCIs) are direct communication pathways between a human brain and an external device (i.e. without using things like hands or voices to intermediate). Various types of sensor networks are used to directly measure brain activity, which is then converted to something recognizable to external systems.

The Technology
Without getting too deep into the nitty gritty of the technology, it’s worth a brief look at some of the common approaches used in these systems. Both invasive (implanted into the body) and non-invasive systems exist, with unsurprising trade-offs: invasive systems increase medical risks and non-invasive systems often yield poor signal resolution because the skull dampens signals. For both practical and psychological reasons, commercially plausible systems outside special circumstances are almost certain to be non-invasive for the foreseeable future.

On the sensing side of the system, electroencephalography (EEG) is the most common technique and typically involves an array of electrodes mounted to the scalp that measure voltage fluctuations resulting from ionic current flows within neurons. Functional magnetic resonance imaging (fMRI) has also been used effectively; it detects the changes in blood flow that accompany neural activity. However, fMRI has vastly increased hardware requirements regarding size, cost and complexity, and thus early commercial systems have typically utilized an EEG headset (such as the systems we looked at a few years ago).

On the processing side, I am confident that progress over the next few years will be rapid, partly because the current state of the art is, relatively speaking, so crude. Historically, BCIs have had the most success translating physical commands (“Clench left fist”), but newer research has made strides in extracting more specific data (such as addresses or ATM PINs). However, one of the most interesting parts of BCIs is that processing can be addressed from both sides of the problem. The traditional engineering approach would be to work on developing and refining algorithms that can understand what the subject intends; to correlate certain brain activity measurements to physical commands or ultimately language. However, because of the remarkable neuroplasticity of the human brain, it is also possible to work in reverse and train the brain to think in certain ways that are more easily measured and understood by existing systems. It will be fascinating to watch the interplay between these two approaches as the technology matures.
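To make the traditional engineering approach a little more concrete, here is a toy sketch (in Python) of the classic pipeline: take a window of EEG samples, extract a band-power feature, and threshold it into a command. The sampling rate, frequency band, and threshold below are illustrative assumptions, not values from any real system.

```python
import math

def band_power(samples, fs, f_lo, f_hi):
    """Crude power estimate in the [f_lo, f_hi] Hz band via a direct DFT."""
    n = len(samples)
    power = 0.0
    for k in range(1, n // 2):
        freq = k * fs / n
        if f_lo <= freq <= f_hi:
            re = sum(s * math.cos(2 * math.pi * k * i / n) for i, s in enumerate(samples))
            im = sum(s * math.sin(2 * math.pi * k * i / n) for i, s in enumerate(samples))
            power += (re * re + im * im) / n
    return power

def classify(samples, fs=128, threshold=10.0):
    """Map alpha-band (8-12 Hz) power to a toy two-state command."""
    return "relax" if band_power(samples, fs, 8, 12) > threshold else "focus"
```

Real systems replace the hand-picked band and threshold with trained classifiers, but the shape of the problem (signal window in, discrete command out) is the same.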

The Why
Given the technical hurdles currently associated with BCIs, some people question whether they are worth the development headaches. Perhaps the most powerful potential application is for disability treatment – there is an already impressive body of research exploring the use of BCIs to control prosthetics and even, for instance, to restore non-congenital blindness. The ability to give a person back the use of a body part is a profoundly valuable result and should justify development of these systems all by itself.

However, there are other applications with larger potential markets that are also pushing both research and commercialization. Two of the major drivers are laziness and bandwidth. Laziness seems like a pretty minor incentive but the galvanizing force of annoyance should not be underestimated. History is littered with product improvements that did nothing more than save users a few seconds but came to dominate the market anyway (does anyone under 30 even know what a rotary phone is anymore?).

The bandwidth issue is more powerful. In previous posts, I’ve described the progression of interface technology in terms of increasing technical sophistication, but another way to look at it is as a journey of increasing information transmission capacity. A button conveys a single bit with each press, and even a keyboard keystroke only selects one of a hundred or so options. Touch screens improve on this but are inherently 2D. Gestural interfaces or voice command systems enable communication that is richer yet, but the ultimate in bandwidth will occur when we can cut out the middleman (e.g. hands or mouth) altogether. Despite the sci-fi and touchy-feely connotations of a “mind-meld” (sarcasm intended), that concept embodies the pinnacle of possibility for human communication bandwidth and arguably also richness of communication, depending on how philosophical we want to get.
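A back-of-envelope calculation makes the ordering concrete. All of the figures below (key counts, typing speed, vocabulary size, speaking rate) are rough assumptions chosen only to illustrate the progression, not measured values:

```python
import math

# Rough, illustrative channel-capacity estimates (all figures are assumptions)
button_bps = 1 * 2                    # 1 bit/press, ~2 presses per second
typing_bps = math.log2(64) * 3.3      # ~64 reachable keys, ~40 wpm (~3.3 keystrokes/s)
speech_bps = math.log2(20000) * 2.5   # ~20k-word vocabulary, ~150 wpm

# The progression: ~2 -> ~20 -> ~36 bits per second
assert button_bps < typing_bps < speech_bps
```

Even granting generous numbers to each channel, each step up the ladder buys roughly an order of magnitude, and thought operates faster still.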

Such potential value does not come without potential costs, however. The idea of letting a person or system look directly into one’s thoughts will be considered creepy by many, and there are significant technical obstacles between today’s state of the art and the kind of interface I posit above; it is even possible that the architecture of our brain does not lend itself to outputting the full information content of all our thoughts. Also, and most importantly, we might have to wear dorky-looking helmets or headsets.

All in all, I expect brain-computer interfaces to become increasingly common in the world around us over the next few decades. In my next post, I will wrap up the series and summarize my best guess for where smart product interfaces are going – what would a competition be without a winner?

Interface/Off (Part III): Touch-Free Systems

(Context: this is Part III of a look at the future of smart product interface technology; Part I set the stage and Part II looked at touch-based systems.)

Continuing along the spectrum of increasing sophistication and novelty, we come to touch-free interface systems. In the last post, I talked about some of the limitations of touch-based systems (usage scenario rigidity, physical contact requirements, and inherent 2D-ness). Some of the interesting advantages of touch-free approaches revolve around their ability to address those same limitations.

Gestural interfaces typically use a vision capture system such as a camera and image processing algorithms to recognize hand and arm gestures made by the user. If you have seen the movie Minority Report, it contains a scene with a good example of the concept. This type of approach enables the use of very intuitive behavior from the user (e.g. grabbing or tapping an icon to interact with it) but, in contrast to touch screens, does not require the user to make physical contact with the interface and can be much more flexible in terms of user location. As an example, imagine a TV with a gestural interface that could be controlled equally well regardless of whether you were sitting on a sofa on one side of the room or a chair on the other side. There have also been some interesting developments in wearable systems, such as the SixthSense developed by Pranav Mistry and MIT’s Media Lab:

However, there are also some inherent limitations to these types of systems as well as challenges with implementation. First, the same gesture can look very different for different users, or even for the same user in different instances. Gestural systems have to be able to “translate” gestures that may look different but mean the same thing. Second, the hardware (a camera that can distinguish minute hand movements) and software (the algorithms to translate those subtle movements into the intent of the user) are non-trivial from an engineering standpoint.
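One standard way to handle that “translation” problem is template matching with dynamic time warping (DTW), which scores two trajectories as similar even when one is performed faster or slower than the other. The sketch below (in Python, with invented toy gesture templates) shows the idea on 1-D traces; real systems work on full 2-D or 3-D joint trajectories:

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two 1-D trajectories.
    Allows one sequence to stretch or compress in time relative to the other."""
    inf = float("inf")
    n, m = len(a), len(b)
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]

def recognize(trace, templates):
    """Return the name of the stored template closest to the observed trace."""
    return min(templates, key=lambda name: dtw_distance(trace, templates[name]))
```

A swipe performed at half speed still lands on the right template, because DTW is free to align several observed samples against one template sample.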

Eye Tracking

Eye tracking could be considered a variant in the gestural interface family, albeit one that is better suited to close-in use. The idea is that instead of looking at hand and arm gestures, these systems look at tiny movements of the user’s eyes. For instance, imagine flicking your eyes down then back up to scroll this webpage down. The main advantage is ease of use, often for those with medical reasons preventing them from using traditional interfaces. There are a number of companies that have made impressive progress in commercializing systems for a broader audience, such as Tobii, but I question whether these systems will be able to match the performance of some alternatives in the long run. The size and range of motion of human eyes are sufficiently limited relative to the size and freedom of hands that I think it will be an uphill battle for eye tracking systems outside of applications where they are the only option.

Voice Command
In fundamental contrast to the eyes, which are typically used to receive data, and the arms and hands, which rarely convey large amounts of data, the human voice is anatomically and culturally evolved to transmit lots and lots of data. This makes it very well suited to telling your smart product what you want it to do. However, implementing an effective system is quite a technological challenge. The first step is recognizing words – imagine a voice control system used for keyboard replacement. However, newer systems, such as Siri for the later iPhone models, attempt to make the jump from recognizing words to understanding meaning and responding appropriately. This is a profound shift and one that will likely require continued refinement to accommodate different accents, dialects, styles of pronunciation and enunciation, and other peccadillos of individual human speech. The brain is an astounding processing tool, but I believe that speech recognition algorithms will continue to make significant strides over the next few years and will be able to match the ability of humans to understand each other within a decade.

One arena in which voice systems are limited, however, is in conveying information back to the user. There is a reason for the expression “a picture is worth a thousand words” – there are certain concepts that are easier to convey visually than verbally. There are also potential scaling issues with voice systems – imagine everyone on a subway or in an office trying to control their phone or computer simultaneously with voice commands. All in all, though, I think that when combined with displays, voice control systems will be a significant part of the technology landscape in the future.

In my next post, I will look at the most sci-fi-esque possibility – brain-computer interfaces (thought control again!).

Interface/Off (Part 2): Touch-Based Systems

In my last post, I set the stage for a multi-post look at the future of smart product interfaces. To recap briefly, I believe that interface technology is at a really interesting point, where older approaches such as the keyboard and mouse are being superseded by new possibilities. However, there are lots of options and I thought it would be worth looking at some of the major contenders to understand the pros and cons of each and see if any look particularly well-suited to become as ubiquitous as keyboards were. In today’s post, I look at the simplest class: touch-based systems.

Physical Buttons
The simplest touch-based interface is a physical button, whether used as an on/off switch, a part of a keyboard, or otherwise. Buttons are simple, cheap, reliable and well-understood, both by users and system designers. However, they are static and immutable – you can’t make them disappear when you don’t need them and you can’t really change how they look or what they do. They offer very limited versatility and are often not the cleanest or most elegant solution.

Touch Sensors
At the next level up in sophistication are discrete touch sensors (usually capacitive) – picture a lamp with no switch that you can touch anywhere to actuate. The use of capacitive and other types of touch sensors can liberate designers from some button-placement restrictions but that is about the extent of the benefits. Most of these sensors are binary (either on or off; no other states) and again offer limited versatility as their function cannot easily be changed during operation.
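Under the hood, reducing a raw capacitance reading to that binary on/off state is usually done with a pair of thresholds (hysteresis), so a finger hovering near the trip point doesn’t make the lamp flicker. Here is a minimal Python sketch of that firmware pattern; the count values and thresholds are made up for illustration:

```python
def touch_states(raw_counts, press_at=60, release_at=40):
    """Convert raw capacitance readings into binary touched/untouched states.
    Hysteresis: trip 'touched' at press_at, reset only below release_at."""
    touched = False
    states = []
    for count in raw_counts:
        if not touched and count >= press_at:
            touched = True
        elif touched and count <= release_at:
            touched = False
        states.append(touched)
    return states
```

Note how a reading of 50 keeps whatever state the sensor was already in – that dead band between the two thresholds is exactly what makes the sensor feel solidly binary.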

Touch Screens
Continuing the trend towards increasing complexity brings us to touch screens, which are essentially just an array of touch sensors overlaid on a display. The monumental benefit of this approach is the versatility enabled by the screen – the same “button” (or area of the screen) can have an infinite number of purposes if the system merely changes what the user sees. However, touch screens have their limitations as well – their inherent 2D-ness requires a level of abstraction to map the real, 3D world and they have some proximity restrictions (the user has to be close enough to physically touch them).

The smart phone revolution of the last few years has given us a powerful illustration of how useful and popular touch screens can become and I think that they are by far the most powerful interface currently available at a production level of sophistication and usability. However, in my next post I will start getting into some new technologies that are quickly advancing and could supplant touch screens in the not-too-distant future.

Interface/Off (Part 1)

As the pace of technological development continues to increase, it becomes more and more interesting to me to try to predict the future – I’m impatient and enjoy getting closure on my guesses sooner rather than later. Despite the difficulty of accurately forecasting the future, it’s worth taking a shot at because the upside is so high: if you can position yourself or your company as an expert on the next world-changing technology before it breaks big, you will be as popular as a PBR salesman in Dolores Park. Conversely, a poor understanding of where technology is going can be the death knell for even a dominant company – look at how quickly Nokia and RIM have fallen following their inability to anticipate and understand the smart phone revolution.

I would argue that we are in the midst of a pretty significant shift right now in the field of interface technology, specifically with regard to smart products. For a few decades after the advent of the personal computer, the keyboard and mouse were standard interface technology without much in the way of alternatives. However, the overlapping rises of mobile computing, smart products, and some new interface technologies have significantly opened the playing field as we look forward (nowadays Minority Report seems a lot less futuristic and a lot more presentistic, which is definitely not a word). Just look at the last 5 years – touchscreens and then voice control have dramatically expanded the realm of possibility for how we interact with our phones. It is possible that touch screens’ day in the sun will be shorter than one might think, though – new technologies such as gestural interfaces and brain-machine interfaces (BMIs) are showing significant promise and could become ubiquitous within a few years. So, the question is, how will you interact with the (man-made) world around you in 5, 10 or 25 years?

Because this topic deserves a little more space to breathe than I can give it in a single post, I decided to break up my thoughts into 4 follow-up sections, distributed on a spectrum of approximately increasing sophistication and system versatility:

  1. In the red corner are Touch-Based Systems: physical button systems such as keyboard and mouse, touch sensors, and touch screens
  2. In the blue corner are Touch-Free (Gestural) Systems: systems that are controlled physically but without contact, such as eye tracking, voice control and gestural interfaces
  3. In the yellow corner (it’s a triangular ring apparently) are Brain-Machine Systems: revisiting thought control, which I’ve touched on several times in previous posts
  4. Finally, the fight can’t end without a Judge’s Decision: what system will claim victory?

2011 Maker Faire Spotlight: The Drawing Machine

Well, I hope some of you got the chance to check out this year’s Maker Faire – as always, there were a bunch of really interesting projects and people to see.  This week, I wanted to follow up on my previous post and highlight my favorite exhibit from this year’s show.  It is a project that does a great job of illustrating some of the best things about the Faire and mechatronics in general: creativity, execution and the synthesis of fields like computer science and mechanical engineering.

The Drawing Machine is a project by Harvey Moon that makes awesome drawings based on photographs.  There are two parts of the system: (1) a mechanical drawing module and (2) image processing and control software.

The mechanical part of the system consists of a drawing module with a pen and several servo motors which hold the pen either against or away from the drawing surface.  This assembly is suspended in front of the drawing surface via two belts driven by stepper motors (see picture).  By driving the motors, the pen can move in both axes of the drawing (think of an Etch-a-Sketch where the knobs are computer controlled).
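For a pen suspended from two belts anchored at the top corners, converting a target (x, y) position into the two belt lengths is just trigonometry. A minimal Python sketch, assuming invented dimensions (belt anchors at the top corners of a 1000 mm wide surface, with y measured downward, and an illustrative steps-per-millimeter figure):

```python
import math

def belt_lengths(x, y, width=1000.0):
    """Belt lengths (mm) to place the pen at (x, y), with anchors at the
    top corners (0, 0) and (width, 0); y increases downward."""
    left = math.hypot(x, y)
    right = math.hypot(width - x, y)
    return left, right

def motor_steps(x, y, width=1000.0, steps_per_mm=5.0):
    """Stepper target positions for each motor (steps/mm is illustrative)."""
    left, right = belt_lengths(x, y, width)
    return round(left * steps_per_mm), round(right * steps_per_mm)
```

Driving the two steppers to those lengths puts the pen at the target, and interpolating many small targets along a path produces the strokes of the drawing.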

The software component of the system is responsible for processing the input photograph, deciding how to draw the sketch, and controlling the pen module.  The most interesting part of the system to me is that the software algorithm is non-deterministic – it makes decisions as it moves and creates a unique drawing each time, even if it gets the same input photograph multiple times.

Seeing the system in action at the Faire was really cool and the artwork that is created looks fantastic.  This is a great example of a sophisticated and well-designed mechatronic system being used to do something really creative and I’m glad I got to see it!

Mind Control

Controlling things with your mind is hard work.

I’ve been seeing more articles about a cool platform lately and wanted to write a bit about why I think it’s exciting that, like 3D printers or wearable electronics, this technology is becoming more accessible. The topic: mind control… seriously. Actually, that is overly sensationalist, but the ability to induce physical actions outside the body via use of the brain is becoming much more common. Long the province of well-funded research labs, the technology is finally being commercialized by both DIYers and companies.

I decided to write about this when I read about the Pyrokinesis for Alex project, where some members of the Site3 coLaboratory in Toronto rigged up a setup that uses a wireless EEG headset and control unit to allow the user to control a flamethrower with his/her mind (appealing to both the pyromania and the love of comic books of my inner 8-year-old).

The technology required is surprisingly simple and the headsets are pretty user-friendly from an interface perspective. For instance, the NeuroSky MindSet used for the PK4A project uses one dry sensor on the forehead and three reference sensors on the ear. Its output includes raw brainwave data, brainwave band values and two values for “attention” and “meditation” which are computed by a proprietary algorithm. The communication link is via Bluetooth serial with a simple protocol. Emotiv’s system is purported to be able to track eye motion, facial expressions, emotional state and even directional thoughts. The Emotiv EPOC system starts at $300 while the MindSet sells for $200.
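These headsets speak a simple framed serial protocol: sync bytes, a payload length, code/value rows, and a checksum. The Python sketch below parses one packet in the spirit of NeuroSky’s published format; treat the specific byte codes here as assumptions drawn from public docs rather than gospel:

```python
SYNC = 0xAA
# Assumed single-byte row codes (per public ThinkGear protocol docs)
CODES = {0x02: "poor_signal", 0x04: "attention", 0x05: "meditation"}

def parse_packet(data):
    """Parse one framed packet: AA AA <len> <payload> <checksum>.
    Returns a dict of recognized values, or None on a bad frame."""
    if len(data) < 4 or data[0] != SYNC or data[1] != SYNC:
        return None
    plen = data[2]
    if len(data) < 4 + plen:
        return None  # truncated frame
    payload = data[3:3 + plen]
    if (~sum(payload)) & 0xFF != data[3 + plen]:
        return None  # checksum failure
    out, i = {}, 0
    while i < plen:
        code = payload[i]
        if code >= 0x80:                 # multi-byte row: next byte is its length
            i += 2 + payload[i + 1]
        else:                            # single-byte row: code, then one value
            if code in CODES:
                out[CODES[code]] = payload[i + 1]
            i += 2
    return out
```

A flamethrower trigger like PK4A’s then reduces to something like “fire while attention stays above a threshold,” with the serial link doing the rest.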

But, here’s where it gets cooler. Both companies have made SDKs available for developers and have app stores up and running: just like smart phone app stores except the platform is the brain control headset instead of a phone. Just as we saw the effects of opening up app stores for smart phones on the breadth and variety of phone applications, I think this model will enable a mind-boggling (ha!) array of applications for brain interfaces, from the medical space to entertainment to the workplace.

The space is still in its infancy – I’m sure the next few years will bring significant improvements in the quality and type of data that the headsets can measure as well as the available applications. Even now though I think this is a great example of the kind of science-fiction-esque technology that gives us a glimpse of the future and shows why I am so passionate about mechatronics.

(HT: Techcrunch)

“Objectified” Movie Review

Some of us at Pocobor went to see Objectified on Sunday (playing at Yerba Buena Center for the Arts throughout most of July here in San Francisco). Objectified is a documentary on industrial and product design by Gary Hustwit (the same person who produced/directed the documentary Helvetica). I really enjoyed it. The film meanders through the different aspects of design (what is design, what makes good design, what are the goals of good design, and where is design heading). Along the way it interviews many of the heavy hitters in the design community, from the lead industrial designer at Apple to the founders of IDEO.

I found the film to be very inspirational because it reconfirmed my views of the importance of good design, and in a kind of self-congratulatory slap on the back, it made me feel proud that we here at Pocobor are part of this growing design movement.

Although there were many interesting points in the film, there was one in particular that stuck with me: most companies know the average life span of their products, whether it is a mobile phone that will probably be used for 15 months, or a toothbrush that will probably last 2 months. But despite this short lifespan, almost all products are made with materials that will last thousands of years. Karim Rashid, who brought up this point, mentioned that if he will only keep his cell phone for 11 months, it might as well be made of cardboard that can more easily be recycled.


Cardboard Mobile Phones: Coming to Apple Stores Soon?

While his statement was hyperbole, it got me thinking about the idea of making products with materials that will start to biodegrade almost immediately after their use is done. I don’t have any immediate thoughts or solutions, but I find it a very interesting concept.