P5 Runway – Team CAKE

Group number and name: #13, Team CAKE
Team members: Connie, Angie, Kiran, Edward

Project Summary

Runway is a 3D modelling application that makes 3D manipulation more intuitive by bringing virtual objects into the real world, allowing natural 3D interaction with models using gestures.

Tasks Supported

The tasks cover the fundamentals of navigation in 3D space, as well as 3D painting. The easiest task is translation and rotation of the camera; this allows the user to examine a 3D scene. Once the user can navigate through a scene, they may want to be able to edit it. Thus the second task is object manipulation. This involves object selection, and then translation and rotation of the selected object, thus allowing a user to modify a 3D scene. The third task is 3D painting, allowing users to add colour to objects. In this task, the user enters into a ‘paint mode’ in which they can paint faces various colours using their fingers as a virtual brush.

Task Choice Discussion

From user testing of our low-fi prototype, we found that our tasks were natural and understandable for the goal of 3D modelling with a 3D gestural interface. 3D modelling requires being able to navigate through the 3D space, which is our first task of camera (or view) manipulation. Object selection and manipulation (the second task) are natural functions in editing a 3D scene. Our third task of 3D painting allows an artist to add vibrancy and style to their models. Thus our tasks have remained the same from P4. In fact, most of the insight we gained from P4 was in our gesture choices, and not in the requirements for performing our tasks.

Interface Design

Revised Interface Design

The primary adjustments we made to our design concerned the gestures themselves. From our user tests for P4, we found that vertex manipulation and rotation were rather unintuitive. This, in addition to our discovery that the original implementations would not be robust, prompted us to change the way these gestures worked. We added the space bar to vertex manipulation (hold space to drag the nearest vertex), both to distinguish vertex manipulation from object translation and to make vertex manipulation more controlled. For rotation, we changed our implementation to something with a more intuitive axis (one stationary fist/pointer acts as the center of rotation), so that the gesture itself is better-defined for the sake of implementation.
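The space-held vertex drag can be sketched as follows. This is an illustrative Python sketch rather than our actual C# Unity script; the helper names `nearest_vertex` and `drag_vertex` are hypothetical.

```python
import math

def nearest_vertex(vertices, fingertip):
    """Return the index of the mesh vertex closest to the fingertip."""
    def dist(a, b):
        return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))
    return min(range(len(vertices)), key=lambda i: dist(vertices[i], fingertip))

def drag_vertex(vertices, index, fingertip):
    """Move the grabbed vertex to follow the fingertip position."""
    vertices[index] = tuple(fingertip)

# While the space bar is held, grab the nearest vertex once,
# then update its position every frame from the tracked fingertip.
verts = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
grabbed = nearest_vertex(verts, (0.9, 0.1, 0.0))
drag_vertex(verts, grabbed, (1.2, 0.3, 0.0))
```

Releasing the space bar would simply clear `grabbed`, leaving the vertex at its last dragged position.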

Two main features that we added were calibration (elaborated on below) and statuses. The statuses are on-screen text prompts that show data fetched from the Leap (e.g. what hand configuration(s) are currently being tracked), the mode, the current color (if relevant), and some debugging information about the mesh and the calibration. These statuses are displayed at screen depth in the four corners, in unassuming white text. They are primarily for information and debugging purposes, and hopefully do not detract from the main display.

Everything else in our design remained largely the same, including the open-handed neutral gesture, pointing gestures for object manipulation and painting, and fists for camera/scene manipulation. We also mapped mode and color changes to keyboard keys somewhat arbitrarily, as this is easy to adjust. Currently, modes are specified by (1) calibration, (2) object manipulation, and (3) painting. Colors are adjusted using triangle brackets to scroll through a predefined list of simple colors. We hope to adjust this further from user testing feedback.
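The keyboard mapping described above can be sketched as a small dispatch table; this is an illustrative Python sketch (our actual implementation is a C# Unity script), and the particular color palette shown is hypothetical.

```python
MODES = {"1": "calibration", "2": "object_manipulation", "3": "painting"}
COLORS = ["red", "orange", "yellow", "green", "blue", "purple"]  # hypothetical palette

class InputState:
    def __init__(self):
        self.mode = "calibration"
        self.color_index = 0

    def handle_key(self, key):
        # Number keys switch modes; triangle brackets scroll the color list.
        if key in MODES:
            self.mode = MODES[key]
        elif key == ">":
            self.color_index = (self.color_index + 1) % len(COLORS)
        elif key == "<":
            self.color_index = (self.color_index - 1) % len(COLORS)

    @property
    def color(self):
        return COLORS[self.color_index]

state = InputState()
state.handle_key("3")  # switch to painting mode
state.handle_key(">")  # scroll to the next color
```

Because the mapping is a simple table, remapping keys based on user testing feedback is a one-line change.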

Update Storyboards

Our storyboards have remained the same from P4, since our tasks from P4 are suitable for our system. These storyboards can be found here.

Sketches of Unimplemented Features

The two major features that we have not implemented are head tracking and third-person viewing (both of which are hard to sketch but easily described). Head tracking would allow us to adjust the object view such that a user can view the side of the object by simply moving his or her head to the side (within limits). Software for third-person viewing can render the display in its relative 3D space in a video, so that we can demo this project more easily (this is difficult to depict specifically in a sketch because any sketches are already in third-person). Both of these features are described in more detail below.

Prototype Description

Implemented Functionality

Our current prototype provides most of the functionality required for performing our three tasks. The most fundamental task that we did not incorporate into our low-fi prototype was calibration of the gesture space and view space. This was necessary so that the system knows where the user’s hands and fingers are relative to where the 3D scene objects appear to float in front of the user. The calibration process is a simple process that must be completed before any tasks can be performed, and requires the user to place their fingertip exactly in the center of where a target object appears (floating in 3D space) several times. This calibration process is demonstrated in Video A. Once this calibration is completed, a pointer object appears and follows the user’s fingertip. This pointer helps provide feedback for the user, showing that the system is following them correctly. It also helps them if they need to fine-tune the calibration, which we have implemented on a very low level where various keyboard keys change the values of the calibration matrix.
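Our actual calibration solves for a calibration matrix (using the DotNumerics library in C#), but the idea can be illustrated with a simplified Python sketch: given several fingertip positions recorded at known target locations, fit an independent scale and offset per axis by least squares. This assumes the Leap and scene axes are aligned, which the full matrix calibration does not require.

```python
def fit_axis(xs, ys):
    """Least-squares fit of y = a*x + b for a single axis."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    var = sum((x - mx) ** 2 for x in xs)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / var
    return a, my - a * mx

def calibrate(leap_points, scene_points):
    """Fit per-axis scale/offset mapping Leap coordinates to scene coordinates."""
    return [fit_axis([p[i] for p in leap_points],
                     [q[i] for q in scene_points]) for i in range(3)]

def apply(calib, p):
    """Map a Leap-space point into scene space."""
    return tuple(a * x + b for (a, b), x in zip(calib, p))

# Targets shown at known scene positions; fingertip positions recorded by the Leap.
scene = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
leap  = [(10, 20, 30), (12, 20, 30), (10, 24, 30), (10, 20, 35)]
calib = calibrate(leap, scene)
```

Once `calib` is computed, every tracked fingertip is passed through `apply` before driving the on-screen pointer, which is why the pointer appears to sit exactly on the user's fingertip.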

After calibration has been performed, the user can start performing the 3D tasks. As mentioned above, there are three modes in our prototype: View mode, Object Manipulation mode, and Painting mode. In all three of these modes, we allow camera manipulation, including translation and rotation, using fist gestures as described above. Basic camera manipulation is shown in Video B.

In Object Manipulation mode, we support translation and rotation of individual objects in the scene, as well as vertex translations. Translation requires the user to place a single fingertip onto the object to be moved, after which the object follows the fingertip until the user returns to the neutral hand gesture. Rotation involves first placing one fingertip into the object, and then using the other fingertip to rotate the object around the first fingertip. These operations are shown in Video C. Finally, vertex translation is activated by holding down the spacebar; then, when the user moves suitably close to a vertex on an object, they can drag that vertex to a desired position and release it by releasing the spacebar. Vertex manipulation is shown in Video D.
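The two-fingertip rotation can be sketched as follows: the stationary fingertip is the pivot, and the moving fingertip's previous and current positions define an axis and angle of rotation. This is an illustrative Python sketch using Rodrigues' rotation formula, not our actual C# Unity code.

```python
import math

def sub(a, b): return tuple(x - y for x, y in zip(a, b))
def add(a, b): return tuple(x + y for x, y in zip(a, b))
def dot(a, b): return sum(x * y for x, y in zip(a, b))
def scale(a, s): return tuple(x * s for x in a)
def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1], a[2]*b[0] - a[0]*b[2], a[0]*b[1] - a[1]*b[0])

def rotate_about(point, pivot, axis, angle):
    """Rotate a point about an axis through the pivot (Rodrigues' formula)."""
    v = sub(point, pivot)
    n = math.sqrt(dot(axis, axis))
    k = scale(axis, 1.0 / n)  # unit rotation axis
    rotated = add(add(scale(v, math.cos(angle)),
                      scale(cross(k, v), math.sin(angle))),
                  scale(k, dot(k, v) * (1 - math.cos(angle))))
    return add(pivot, rotated)

# The stationary fingertip is the pivot; the moving fingertip's previous and
# current positions define the rotation applied to each object vertex.
pivot, prev, curr = (1, 0, 0), (2, 0, 0), (1, 1, 0)
u, w = sub(prev, pivot), sub(curr, pivot)
axis = cross(u, w)
angle = math.acos(dot(u, w) / math.sqrt(dot(u, u) * dot(w, w)))
```

Applying `rotate_about` with this `axis` and `angle` to every vertex of the selected object carries the previous fingertip position onto the current one, so the object appears to follow the moving finger around the stationary one.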

Finally, the Painting mode allows the user to select a paint color using the keyboard and paint individual faces of a mesh using a pointing fingertip. When a fingertip approaches suitably close to a face on a mesh, that face is painted with the current color. This allows the user to easily paint large swathes of a mesh in arbitrary shapes. This process is shown in Video E.
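The face-painting test can be sketched as a proximity check; this Python sketch uses the distance from the fingertip to each face's centroid as a simplification (an exact point-to-triangle distance would be more precise), and the names and threshold are illustrative rather than taken from our actual code.

```python
import math

def centroid(face, vertices):
    """Average the positions of a face's vertices."""
    pts = [vertices[i] for i in face]
    return tuple(sum(c) / len(pts) for c in zip(*pts))

def paint_nearby_faces(faces, vertices, face_colors, fingertip, color, radius):
    """Paint every face whose centroid lies within `radius` of the fingertip."""
    for i, face in enumerate(faces):
        if math.dist(centroid(face, vertices), fingertip) <= radius:
            face_colors[i] = color

# A two-triangle quad; sweeping the fingertip near a face paints it.
verts = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 0)]
faces = [(0, 1, 2), (1, 3, 2)]
colors = [None, None]
paint_nearby_faces(faces, verts, colors, (0.3, 0.3, 0.0), "red", radius=0.2)
```

Because the check runs every frame, dragging the fingertip across the mesh paints a swathe of faces along the finger's path.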

We implemented these features because they are central to performing our tasks, and we had enough time to implement these features to a reasonable level of usability.

Future Functionality

All of our basic functionality is in place. However, there are several things that should be completed for our final system, mentioned in the previous section.

  • The most important such task from a user standpoint is to add head tracking. This will allow the user to move their head and see the appropriately differing view of the scene while still maintaining the calibration between viewpoint and gesture. We will accomplish this by using an Augmented Reality toolkit that can provide the position and orientation of fiducial markers (which look like barcodes) in a camera frame; attaching a marker to the 3D glasses will allow us to track the user’s head position and therefore viewpoint relative to the monitor. We have not implemented this because the core scene manipulation functionality is more central to our system. Our system works fine without head tracking, as long as the user does not move their head too much.
  • For the purposes of demonstrating our project, we also want to use augmented reality toolkits to allow third-person viewers to see the 3D scene manipulation. Currently, as you can see in the first videos, the system looks rather unimpressive from a third-person point of view; only the active user can see that their finger appears in the same 3D location as the pointer. Adding this third-person view would allow us to take videos that show the 3D virtual scene “floating” in space just as the main user would see it. This will also be implemented using the Augmented Reality toolkit, which will allow such a third-person camera to know its position relative to the Leap sensor (which we will place at a specified position relative to a fiducial marker on the desk). Since this is clearly for demo purposes, it is not central to the application’s core functionality that is required for user testing.
  • Finally, if we have time we would like to add in an additional hardware component to provide tactile feedback to the user. We have several small vibrating actuators that we can use with an arduino; this will allow us to provide a glove to the user that will vibrate a fingertip when it intersects an object. This would add a whole new dimension to our system, but we want to make sure that our core functionality is already well in place before extending our system like this.
  • Some already implemented parts of our system can also use improvement – for example, several parts of the mesh processing code will likely require optimization for use on larger meshes. Similarly, we can experiment with our handling of Leap sensor data to increase robustness of tracking; for example, some sensing inaccuracies can be seen in the videos below, where an extended fingertip is not detected.
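The head-tracking idea from the first item above could be approximated very simply: offset the virtual camera by the tracked head displacement so the scene appears fixed in space as the head moves. This Python sketch is a deliberate simplification under assumed coordinates; a full implementation would also use an off-axis projection matrix, which this sketch omits.

```python
def head_tracked_camera(default_cam, default_head, head):
    """Offset the virtual camera by the user's head displacement so the
    scene appears fixed in space (parallax). Off-axis projection omitted."""
    offset = tuple(h - d for h, d in zip(head, default_head))
    return tuple(c + o for c, o in zip(default_cam, offset))

# A fiducial marker on the 3D glasses gives the head position relative to
# the monitor (hypothetical coordinates, in meters).
cam = head_tracked_camera((0.0, 0.0, -0.6), (0.0, 0.0, -0.6), (0.1, 0.05, -0.6))
```

The same marker-tracking machinery would supply `head` each frame, so no extra hardware beyond the glasses-mounted marker and a camera is needed.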

Wizard of Oz Techniques

No Wizard of Oz techniques were used in this prototype. Our system already has all of its basic functionality, so this was not necessary.

External Code

We are building off of the Unity framework as a platform for development, which provides much of the basic functionality of an application: input, rendering, and scene management. Conveniently, the Leap Motion SDK provides a Unity extension that allows us to access Leap sensor data from within Unity. We also use the DotNumerics Linear Algebra C# library for some calibration code.

We borrow several classes from the sample Unity code provided with the Leap SDK. Otherwise, the C# Unity scripts are all written by our team. For our unimplemented portions, namely the third-person view and head tracking, we will be using the NYARToolkit C# library for tracking and pose extraction.


Video A – This video shows the basic setup with stereoscopic 3D monitor and Leap. It also demonstrates the calibration workflow.

Video B – This video demonstrates camera manipulations, including how to translate and rotate the scene.

Video C – This video shows object manipulation, including how to select, translate, and rotate objects.

Video D – This video shows vertex translation, where the user can deform the object by selecting and moving a vertex on the mesh.

Video E – This video shows 3D painting, where the user can easily paint faces on the mesh using gestures.