August 19, 2016
Andy Gstoll is the chief marketing officer of Wikitude.
Andy Gstoll of Wikitude describes the opportunities and challenges of working with 3-D content in augmented reality applications.
PwC: Andy, can you please tell us about yourself and your company?
Andy Gstoll: I’m the chief marketing officer of Wikitude in Salzburg, Austria, where I have been based since Wikitude’s founding in 2009. In early 2016, I moved to Silicon Valley to run our new US office and to spearhead our initiatives in this part of the world. We have come a long way since 2009 when we were a typical startup, working out of a garage in Salzburg. Today we have more than 25 people and a large global augmented reality [AR] community around us.
At first we focused on a B2C [business-to-consumer] proposition, developing and increasing the install base of our AR browser called Wikitude. After some shifts in the market and the realization that the DNA of our team is better suited for technology and R&D [research and development], we changed our business model in 2012 and have since been focusing on developing AR tools and technology that we license to developers, mobile agencies, and OEMs [original equipment manufacturers] around the world.
PwC: What capabilities do your tools and technology provide?
Andy Gstoll: Our core product, the Wikitude SDK, addresses three main categories of AR technology. The first is GEO AR, which displays points of interest around users, anchored to their location. This feature is also known as GPS- or sensor-based AR, and it is the most mature AR technology. It is very relevant for use cases in the travel industry or for location-based services.
The second category is computer vision based and is called 2-D image recognition and tracking. It is predominantly used to augment posters, books, magazines—anything that is flat and has a 2-D surface.
Most recently we have launched the third category of AR technology, our 3-D tracking capabilities, which enable app developers to augment spaces, rooms, objects, and buildings. The world consists of mostly 3-D spaces and objects, and providing this technology is a natural progression for us.
Many developers are using our tools to integrate 3-D models into their AR apps. Wikitude Studio, which is a web-based AR content management system and creation tool, allows people to author 3-D content through a simple drag-and-drop interface. The experience may involve showing 3-D content, such as a 3-D graphics animation, on top of the cover of a magazine, brochure, or billboard.
We’re also working on next-generation tools that will allow developers to author in 3-D. Developers can map a 3-D space or capture a 3-D object. Then, they can use this map for augmentation. For example, in a warehouse they would take video footage and upload it to the authoring tool. In the authoring tool, they could define where the augmentations should be to help the workers in this warehouse to find certain objects or to perform certain tasks.
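One way to picture the output of such an authoring step is a list of labeled anchors expressed in the mapped space's coordinate frame, which the app can query against the worker's current position. The sketch below is purely illustrative; the class and function names are my assumptions, not Wikitude's API:

```python
from dataclasses import dataclass
import math

@dataclass
class Anchor:
    """One authored augmentation, placed in the mapped space (hypothetical)."""
    label: str   # e.g. "pallet A-17" or "fire extinguisher"
    x: float     # position in the map's coordinate frame, meters
    y: float
    z: float

def anchors_near(anchors, px, py, pz, radius):
    """Return the labels of anchors within `radius` meters of a position,
    e.g. to decide which augmentations the worker's device should show."""
    found = []
    for a in anchors:
        if math.dist((a.x, a.y, a.z), (px, py, pz)) <= radius:
            found.append(a.label)
    return found
```

In a real system the device's pose would come from the 3-D tracking itself; here it is just passed in as coordinates.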
“The world we live and work in is 3-D, and a lot of augmentation will make sense only when it is anchored to 3-D spaces.”
PwC: So you see opportunity in 3-D content for AR?
Andy Gstoll: Yes, we do see tremendous opportunity. As I mentioned, the world we live and work in is 3-D, and a lot of augmentation will make sense only when it is anchored to 3-D spaces. There are many use cases in operations, maintenance, training, services, and other enterprise processes. Digital content can augment various environments to solve real business problems and challenges.
An enterprise might have a factory with lots of machinery and might need to train a new employee to perform maintenance on the machinery. The easiest way to transfer knowledge would be to give the employee a smartglasses headset that augments reality during the training or that eliminates the training process altogether.
We’re working with a car manufacturer to build 3-D use cases for the company’s service technicians. They would work through a procedure while wearing AR glasses that would show them how to test or diagnose the braking system, remove particular components, and carry out some necessary repair tasks, so they know what to do for a specific car model.
PwC: What role are standards and compatibility issues playing in authoring AR content?
Andy Gstoll: Standards are a challenge, specifically with regard to 3-D content. When a car or a piece of machinery is designed, it is designed by using software on a desktop and it has huge amounts of data. If someone is designing the door of a car, that 3-D model produced with any CAD [computer-aided design] software might consist of millions of polygons. The model could be huge in size and therefore not suitable for rendering on a mobile device.
So you have this huge 3-D model, which you would love to display as an augmentation for the use cases I described. But that is not possible now, because the file formats are not compatible with mobile devices. They’re simply too big. It is not practical to run a 2-gigabyte 3-D model on a smartphone.
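A rough back-of-envelope estimate (my numbers, not Wikitude's) shows how quickly raw triangle data adds up for a CAD-grade model:

```python
def raw_mesh_bytes(n_triangles, bytes_per_vertex=32):
    """Rough size of an unindexed triangle mesh: three vertices per
    triangle, each carrying position, normal, and texture coordinates
    (3 + 3 + 2 floats at 4 bytes each = 32 bytes). Illustrative only."""
    return n_triangles * 3 * bytes_per_vertex

# A car door modeled at 5 million polygons:
size_gb = raw_mesh_bytes(5_000_000) / 1e9
# Roughly half a gigabyte of geometry alone, before textures or the
# rest of the vehicle—well beyond what a phone can comfortably render.
```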
You need to reduce the files and encode them into something we can work with. We have a 3-D encoder, which optimizes big files, reduces them in size, and turns them into a proprietary format. That makes it possible to run and display the files within our technology.
“It is not practical to run a 2-gigabyte 3-D model on a smartphone. You need to reduce the files and encode them into something we can work with.”
PwC: How do you reduce the file size?
Andy Gstoll: It depends. If you reduce a car door, you probably don’t lose a lot of detail if you stick to just the outside view. When a designer modeled the door, he or she probably designed not only the exterior but also every screw in the interior. If you need the screws in your AR environment, then that’s another story.
When you reduce files, there is often a step that cuts out elements that are not needed for the specific use case. I don’t want to get too technical, but there are also ways to compress the files and make them small enough to be rendered on a mobile device.
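Wikitude's encoder is proprietary, but one standard reduction technique, vertex clustering, conveys the basic idea: snap vertices to a coarse grid, merge the ones that land in the same cell, and drop the triangles that collapse. This is a generic, self-contained sketch, not the actual encoder:

```python
import math

def decimate_by_clustering(vertices, triangles, cell_size):
    """Reduce a triangle mesh by merging all vertices that fall into the
    same grid cell of side `cell_size`. Coarser grids mean fewer polygons
    and smaller files, at the cost of fine detail."""
    new_index = {}      # grid cell -> index in the reduced vertex list
    new_vertices = []
    remap = []          # old vertex index -> new vertex index
    for v in vertices:
        cell = tuple(math.floor(c / cell_size) for c in v)
        if cell not in new_index:
            new_index[cell] = len(new_vertices)
            new_vertices.append(v)
        remap.append(new_index[cell])
    # Rebuild triangles, dropping any that collapsed (two corners merged).
    new_triangles = []
    for a, b, c in triangles:
        a2, b2, c2 = remap[a], remap[b], remap[c]
        if len({a2, b2, c2}) == 3:
            new_triangles.append((a2, b2, c2))
    return new_vertices, new_triangles
```

Production pipelines use more careful schemes (quadric error decimation, texture baking), but the trade-off is the same: spatial resolution for file size.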
PwC: Is this process automated or manual?
Andy Gstoll: It is a combination. The 3-D models within large data sets could be converted automatically without touching each one, because they’re all of the same nature—maybe they’re all screws. You might be able to do batch encoding.
But 3-D models for specific design purposes are another problem. Perhaps an architect wants to show an interior design of a building, and it should look as realistic as possible for a client who’s buying it. Then you must make sure all the elements you display are properly reduced. You might end up working on every single aspect. In general, it depends on the use case.
PwC: What role do other authoring companies such as CAD vendors play in AR authoring?
Andy Gstoll: Right now in early 2016, there are AR platforms like Wikitude. There are also companies that provide software for creating CAD models. There are different software providers for architectural solutions. And there are different software providers for factory design and whatever else. All these companies have their own solutions. But they don’t have a button that says “export to AR.” That capability will happen, and eventually the two technologies will speak to one another without any issues.
Today, most of these tools are not 100 percent compatible. AR is still new. When these tools were designed, they were not designed to produce content that could run for AR on smartphones or tablets—let alone head-mounted displays, which normally have even less processing power and capabilities than tablets and smartphones.
“AR technology must handle many different use cases, such as the recognition of a building, the tracking of an object, or the augmentation on top of a surface.”
PwC: What can we expect in the future from AR authoring solutions?
Andy Gstoll: AR technology must handle many different use cases, such as the recognition of a building, the tracking of an object, or the augmentation on top of a surface. You will need to use different algorithms for these different scenarios. The functions you try to perform, such as object recognition or face detection, also vary with the use cases. These differences create silos of functionality.
In our conversations with ecosystem partners, the word combining comes up a lot. In the future, when we have more powerful devices and when we are further along with our research, there will be algorithm trees that combine different approaches. Depending on the context, you could pull one algorithm or another and then everything could run in parallel, which is not quite possible today. The ultimate goal is to have one solution for everything. Imagine if people could point their phones or look through their smartglasses at everything and everyone around them to get more valuable information on what they see.
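In essence, the "combining" idea is a fan-out: hand the same camera frame to several recognizers at once and merge whatever each one finds. A minimal sketch using Python's standard thread pool; the detector names are made up for illustration:

```python
from concurrent.futures import ThreadPoolExecutor

def recognize_frame(frame, detectors):
    """Run several independent recognizers on the same camera frame in
    parallel and collect their results. `detectors` maps a name to a
    function that takes the frame and returns its detections."""
    with ThreadPoolExecutor(max_workers=len(detectors)) as pool:
        futures = {name: pool.submit(fn, frame)
                   for name, fn in detectors.items()}
        return {name: future.result() for name, future in futures.items()}
```

A context-aware version would first pick which detectors are worth running at all, then dispatch only that subset—the "algorithm tree" idea described above.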
We offer an API [application programming interface] that allows us to integrate other computer vision technologies to run in parallel with ours, if required. All these approaches are different ways of dealing with this problem—this potentially temporary problem—until we can run everything in parallel at lightning speed in the not-so-distant future.