CEO Alex Capecelatro discusses how Josh AI provides a level of voice control Google and Amazon can’t match—and without any of the snooping
by Dennis Burger
October 21, 2022
The explosion of voice control over the past few years has changed the way people interact with their lights and locks and entertainment systems. But all of this comes at a cost: mainly, legitimate privacy concerns. There is a luxury alternative to the Alexas and Siris and OK Googles of the world, however, known as Josh AI, or simply Josh, that allows you to talk to your Control4, Crestron, or Lutron control system as an alternative to using touchscreens, keypads, or wand-style remotes, and without the risk of violating your privacy. In the conversation below, CEO Alex Capecelatro talks about Josh’s approach to privacy, customization, and the unique needs of the custom luxury market.
Amazon has done a pretty good job of selling voice control to the masses but it doesn’t seem to have moved the needle as much with high-end custom homes. Why do you think that is?
Amazon doesn’t really care about the problems and situations we deal with in catering to the luxury market. They’re just trying to get millions upon millions of listening devices into people’s homes. So while it’s good that they’ve moved people towards acceptance of voice control, it also presents this opportunity where people are saying, “I see the benefits but I also see the concerns.” And that’s a dilemma we can speak to. You don’t have to give up your privacy just to have good voice control.
Also, Amazon’s approach relies more on mapping simple commands to simple actions. For example, Alexa can create a scene that controls your lights but if you want to be able to walk into any room and say “Turn it up,” something like that is room-dependent, device-dependent. Josh, by contrast, understands what’s going on with the state of the home so the homeowner can speak very naturally.
The Josh mobile interface lets you talk to your system from poolside or the other side of the world
This isn’t as much of a problem when you’re dealing with a single-bedroom apartment or a smaller-footprint home. But when you’re getting into 5,000-to-10,000-square-foot homes or larger, it’s going to make a difference because in homes like that, you can have hundreds of connected devices across dozens of rooms.
I assume data privacy is also a big part of the appeal of your system for a high-end clientele.
Exactly. We don’t upload your voice to the cloud unless we need to. We don’t believe it’s actually required, and it’s not the right thing to do except in very specific cases.
With Amazon, they do practically no processing on the device itself. They’re sending everything out to the cloud. When you do that, it’s very tempting to start using that information to serve up ads and other things. And we see it when Amazon files patents. They’re building passive listening devices that are monitoring what you say even if they’re not invoked, and specifically listening for words like “vacation” or “Florida” so then it knows to serve you ads for airlines and stuff.
I was always under the impression all of the processing for Josh was done locally, but looking at your FAQ, I see that Josh does require minimal access to the Josh Cloud. Is that a new development?
No, we’ve always had that. Reason being, if you want to be able to connect to cloud services (streaming music from Spotify, for example, or streaming video from Netflix), that has to go out to the cloud. If you want to be able to ask questions like, “What’s the weather forecast?” you’re hitting a weather API that’s going to be out in the cloud. The local processing is simply not going to be able to know or access all of that.
The Josh Micro voice-control module
That said, the way our hardware in the home communicates with the Josh Cloud is very similar to the way banking-app encryption works so it’s very secure. It’s just to a trusted endpoint; it’s not going out to any third parties that aren’t controlled by us.
You were talking earlier about what “Turn it up” might mean on a room-by-room basis. Is that adaptability—the ability to have a command mean something different in one room from another—based on programming done by the installer or is that machine learning?
That’s using a few different technologies. Basically, it’s looking at a mapping of the home, what devices are in the rooms, and what capabilities those devices have, in addition to what things have been recently asked for. So when you walk into the living room and say, “Turn it up,” Josh knows the living room has three devices capable of being turned up.
That could refer to the volume of music, the temperature on the thermostat, or the brightness of the lights. Josh says to itself, “Which of these devices are currently running and have the ability to be turned up?” So if there’s music playing and nothing else is active, “Turn it up” is almost certainly referring to the music volume. On the other hand, if there’s no music playing but you have a thermostat connected to an HVAC zone currently engaged in heating, “Turn it up” is likely going to refer to the temperature.
Josh is constantly looking at the context of the environment you’re in, which involves retaining the context of your recent commands. The system understands the context of the way we naturally speak.
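The reasoning described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Josh's actual implementation: the `Device` class, its fields, and the `resolve_turn_up` function are hypothetical names made up for the example.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Device:
    """Hypothetical record for one device in the home map."""
    name: str
    room: str
    kind: str          # e.g. "music", "thermostat", "lights"
    can_turn_up: bool  # does "turn it up" make sense for this device?
    active: bool       # is it currently playing, heating, lit, etc.?


def resolve_turn_up(room: str,
                    devices: Sequence[Device],
                    recent_kinds: Sequence[str] = ()) -> Optional[Device]:
    """Guess which device "Turn it up" refers to in the given room.

    recent_kinds lists the kinds of device the user recently asked about,
    most recent first (an assumed stand-in for command context).
    """
    # Step 1: devices in this room that can be turned up at all.
    candidates = [d for d in devices if d.room == room and d.can_turn_up]
    # Step 2: of those, the ones that are currently running.
    active = [d for d in candidates if d.active]
    if len(active) == 1:
        return active[0]
    # Step 3: still ambiguous, so fall back on recent commands.
    pool = active or candidates
    for kind in recent_kinds:
        for device in pool:
            if device.kind == kind:
                return device
    return None  # no confident guess; a real system might ask the user


living_room = [
    Device("speakers", "living room", "music", True, True),
    Device("thermostat", "living room", "thermostat", True, False),
    Device("lamps", "living room", "lights", True, False),
]
choice = resolve_turn_up("living room", living_room)
print(choice.name if choice else "unclear")
```

With music playing and nothing else active, the sketch picks the speakers. If the thermostat were the only active device, it would pick the thermostat instead.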
Do you have Josh users who are uncomfortable that the system analyzes how they use different devices and systems throughout the home and over time and retains that information?
Yes. There’s a lot of value to keeping a history of commands, in that you might want to know why the fireplace was on or why the music was playing in a certain room. Maybe it’s because the kids gave it a command. But some people would rather have the utmost privacy, where there’s no history or logging, and so we give the ability to put Josh into incognito mode where you give a command, the action happens, but it never gets written to a database, even on your local hardware.
We also thought about the middle ground. What about someone who wants to be able to see what the microphones heard last night that made their music start playing at bedtime but maybe they don’t care about a week ago because that’s old news? We allow the homeowner to set up a trigger that automatically deletes their history every day, week, or month. So that effectively allows you to say, “Hey, keep my command logs for as long as they’re useful to me, but don’t keep them forever.”
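As a rough illustration of the two privacy options described above, here is a minimal Python sketch. The `CommandLog` class and its parameters are hypothetical, and the sketch assumes a rolling retention window (entries older than a day, week, or month are dropped); the real trigger may instead wipe the whole history on a schedule.

```python
from datetime import datetime, timedelta

# Assumed retention windows; "month" is approximated as 30 days.
RETENTION = {
    "day": timedelta(days=1),
    "week": timedelta(weeks=1),
    "month": timedelta(days=30),
}


class CommandLog:
    """Hypothetical local command history with privacy controls."""

    def __init__(self, retention=None, incognito=False):
        self.retention = retention  # None, "day", "week", or "month"
        self.incognito = incognito
        self.entries = []           # list of (timestamp, command text)

    def record(self, text, when):
        # In incognito mode the action still happens elsewhere,
        # but nothing is ever written to the history.
        if self.incognito:
            return
        self.entries.append((when, text))

    def prune(self, now):
        # Drop entries older than the chosen retention window.
        if self.retention is None:
            return
        cutoff = now - RETENTION[self.retention]
        self.entries = [e for e in self.entries if e[0] >= cutoff]


now = datetime(2022, 10, 21, 8, 0)
log = CommandLog(retention="day")
log.record("play music in the bedroom", now - timedelta(hours=10))
log.record("turn on the fireplace", now - timedelta(days=7))
log.prune(now)
print([text for _, text in log.entries])
```

After pruning with a one-day window, only last night's command remains and the week-old one is gone.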
Do the settings that let a user delete their command history affect the system’s ability to adapt to their habits or preferences? Or is that just an irrelevant question?
It’s relevant, but it’s something that matters less when you have a professional installer because there are a lot of things you can program into the system. For example, an integrator can program it such that when the client says, “Play some music,” if it’s in the morning it plays classical and if it’s in the evening it plays jazz, or whatever genres might match the homeowner’s preferences throughout the day to set the right mood.
The Nano embedded in a Lutron wall plate with the privacy switch visible near the bottom of the microphone
That being said, if you don’t have your commands being erased and you haven’t specified what you want it to do, when you walk into a room and ask it to simply “Play music,” Josh has the ability to look at what music you’ve historically asked for at any given time of the day and pick something it thinks is appropriate. If you are deleting your logs, though, we’re not going to be able to do those types of things without some extra programming ahead of time.
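To make the dependence on history concrete, here is a small Python sketch of picking music from past requests by time of day. It is a simplification under stated assumptions: the `part_of_day` buckets, the `suggest_genre` function, and the shape of the history data are all hypothetical and are not taken from Josh.

```python
from collections import Counter


def part_of_day(hour):
    """Map an hour (0-23) to a coarse bucket; boundaries are assumptions."""
    if 5 <= hour < 12:
        return "morning"
    if 12 <= hour < 17:
        return "afternoon"
    if 17 <= hour < 22:
        return "evening"
    return "night"


def suggest_genre(history, hour, default=None):
    """Pick the genre most often requested at this time of day.

    history is a list of (hour, genre) pairs from past "play music"
    commands. If the log is empty, for instance because it was deleted,
    fall back to a default such as an installer-programmed preference.
    """
    bucket = part_of_day(hour)
    counts = Counter(g for h, g in history if part_of_day(h) == bucket)
    if not counts:
        return default
    return counts.most_common(1)[0][0]


history = [(8, "classical"), (9, "classical"), (8, "jazz"), (20, "jazz")]
print(suggest_genre(history, 7))                # morning request
print(suggest_genre([], 7, default="classical"))  # no history available
```

With an empty history the function can only return the default, which mirrors the point that deleted logs leave nothing to learn from unless preferences were programmed ahead of time.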
Let’s talk about the privacy switch on the Josh Nano. It’s a little switch that turns red when you flip it off, giving the user more confidence that the system is indeed unable to listen to them. How did that come about?
There are a number of microphone devices out there that have the ability to mute but typically it’s a software-controlled mute, and I remember hearing people say in the early days of the Amazon Echo that they didn’t trust its mute function. Did it really disable the microphone? Is it really not listening or is it just turning on a red light that makes you think it’s not listening? I’m not sure.
When you flip that switch on the Josh Nano, though, we physically disconnect the microphone. There’s a physical connection that’s broken. There’s no way that device could be listening to you.
Also, on a lot of other devices from mass-market companies, the mute is on the back or on the bottom or somewhere that’s hard to see. We decided to make it the only physical switch on the face of the product, so when you approach it and see that one switch, it’s super easy to know what it does.
Dennis Burger is an avid Star Wars scholar, Tolkien fanatic, and Corvette enthusiast who somehow also manages to find time for technological passions including high-end audio, home automation, and video gaming. He lives in the armpit of Alabama with his wife Bethany and their four-legged child Bruno, a 75-pound American Staffordshire Terrier who thinks he’s a Pomeranian.
© 2023 Cineluxe LLC