P5 – Group 10 – Team X

Group 10 – Team X

Junjun Chen (junjunc), Osman Khwaja (okhwaja), Igor Zabukovec (iz), Alejandro Van Zandt-Escobar (av)


A “Kinect Jukebox” that lets you control music using gestures.

Supported Tasks:

1 (Easy). The first task we have chosen to support is the ability to play and pause music with specific gestures. This is our easy task, and we’ve found through testing that it is a good way to introduce users to our system.

2 (Medium). The second task is being able to set “breakpoints” with gestures, and then use gestures to go back to a specific breakpoint. When a dancer reaches a point in the music that they may want to go back to, they set a breakpoint. Then, they can use another gesture to go back to that point in the music easily.

3 (Hard). The third task is to be able to change the speed of the music on the fly, by having the system follow the speed of a specific move. The user would perform the specific move (likely the first step in a choreographed piece) at the regular speed of the music, to calibrate the system. Then, when they are ready to use this “follow mode” feature, they will start with that move, and the music will follow the speed at which they performed that move for the rest of the session.
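The calibration idea behind task 3 can be sketched as a ratio of two durations. This is a minimal sketch in Python rather than our actual Max/MSP patch. It assumes the gesture recognizer can report how long each performance of the move took, in seconds. The function name `playback_rate` and the clamping bounds are hypothetical, chosen only for illustration.

```python
def playback_rate(calibration_seconds, performed_seconds,
                  min_rate=0.5, max_rate=2.0):
    """Return the speed multiplier implied by one performance of the move.

    A move completed in half the calibration time implies the music
    should play twice as fast. The clamping bounds are assumptions that
    keep a misrecognized gesture from producing an extreme tempo.
    """
    if calibration_seconds <= 0 or performed_seconds <= 0:
        raise ValueError("durations must be positive")
    rate = calibration_seconds / performed_seconds
    # Clamp the result into [min_rate, max_rate].
    return max(min_rate, min(max_rate, rate))
```

Under these assumptions, the rate would be computed once when the dancer performs the move in "follow mode" and then kept for the rest of the session.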

Changes in tasks and rationale

Our choice of tasks has not changed much over P3 and P4. Keeping the first task was an easy decision, as it is simple, easy to explain to users and testers, and easy to implement. This makes it a good way to introduce users to the system. We’ve also learned from testing that starting and stopping the music is something dancers do a lot during rehearsal, so it is also a useful feature. Our second and third tasks have morphed a little (as described below), but we didn’t really change the set of tasks, as we thought they covered what we wanted the system to do: make rehearsals easier. Being able to go back to specific points in the music and having the music follow the dancers were both well received in our initial interviews with dancers and in P4 testing.

Revised Interface Design

Our first task didn’t change, but our second task has morphed a little from P4. We had originally planned to allow only one breakpoint, as it was both easier to implement and simpler to explain, which was ideal for a prototype. However, since one of our users said that he would definitely want a way to set more than one, we are including that in our system. Our third task has also changed slightly. We had thought that the best way to approach our idea, which was to have the music follow the dancer as they danced, was to have it follow specific moves. Because it did not seem feasible to do that for a whole choreography (it would require the user to go through the whole dance at the “right” speed first for calibration), we decided to follow only one move in P4. This did not work well either, as dancers do not just isolate one move from a dance, even during practice. So instead, we’ve decided to follow one move, but use that one move to set the speed for the whole session.
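Supporting more than one breakpoint mostly means keeping a sorted list of positions in the track. The following is a minimal sketch in Python, not our Max/MSP implementation. The class name `Breakpoints` is hypothetical, and it assumes one possible behavior for the "go back" gesture: jumping to the most recent breakpoint before the current playback position, or to the start of the track if there is none.

```python
import bisect


class Breakpoints:
    """Sorted collection of positions (in seconds) marked during playback."""

    def __init__(self):
        self._positions = []

    def add(self, position_seconds):
        # Insert in sorted order, ignoring an exact duplicate.
        i = bisect.bisect_left(self._positions, position_seconds)
        if i == len(self._positions) or self._positions[i] != position_seconds:
            self._positions.insert(i, position_seconds)

    def previous(self, current_seconds):
        """Return the latest breakpoint strictly before the current position.

        Falls back to 0.0 (start of the track) when no breakpoint precedes
        the current position; that fallback is an assumption of this sketch.
        """
        i = bisect.bisect_left(self._positions, current_seconds)
        return self._positions[i - 1] if i > 0 else 0.0
```

Letting the dancer pick an arbitrary breakpoint, rather than the previous one, would need either one gesture per breakpoint or a selection step in the GUI.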

Updated Storyboards:

Our storyboards for the first two tasks are unchanged, as we did not change those tasks very much. We did update the storyboard for the third task:

Sketches of Unimplemented Features:

A mockup of our GUI:

Some other sketches of our settings interface can be found on our P3 blog post (https://blogs.princeton.edu/humancomputerinterface/2013/03/29/group-10-p3/) under the “Prototype Description” heading.

Overview and Discussion:

Implemented functionality:

We implemented the core functionality of each of our three tasks, which includes recognizing the gesture using the Kinect and generating the appropriate OSC messages that we can then pick up to process the music. As seen in the pictures, with the software we’re using (Kinetic Space), we’re able to detect gestures for pausing, playing, and setting a breakpoint, as well as detect the speed of the gesture (Figures A1 and A2). We are then able to process and play music files using our program (Figure B1), written in Max/MSP. As shown in the figure, we are able to change the tempo of the music, set breakpoints, and pause and play the music.

Functionality Left Out:

We decided to leave out some of the settings interface (allowing users to check and change customized gestures), as we believed that this would not affect the functionality of our prototype too much. Also, the software we are using for gesture recognition includes a rudimentary version of this, which is sufficient for testing (Figure A3). We also realized that we are not able to change the playback rate of an .mp3 file (because it’s compressed), so we need to use .wav files. It should be simple to include, as part of our system, a way for users to convert their .mp3 files into .wav files, but we’ve decided to leave this out for now, as we can do user testing just as well without it (it would just be more convenient for users if we supported .mp3 files).

Wizard of Oz Techniques:

Our prototype is essentially two parts: capturing and recognizing gestures from the Kinect, and playing and processing music based on signals we get from the first part. The connection between the two can be done using OSC messages, but we need the Pro version of the software we are using to recognize gestures for this to work. We’ve been in contact with the makers of that software, who are willing to give us a copy if we provide them a video of our system as a showcase. For now, though, we are using a Wizard of Oz technique to fake this connection. Figure B2 shows the interface between our music manipulation backend, GUI, and the Kinect program.
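To make the OSC connection concrete, here is a minimal sketch in Python of how an OSC 1.0 message is laid out on the wire: a null-padded address string, a null-padded type tag string starting with a comma, then big-endian arguments. It is not the output of the gesture software. The addresses used with it (for example `/jukebox/tempo`) are hypothetical, since the addresses our system would actually use are not fixed here.

```python
import struct


def _pad(data: bytes) -> bytes:
    """Null-terminate and pad to a multiple of 4 bytes, as OSC strings require."""
    data += b"\x00"
    return data + b"\x00" * (-len(data) % 4)


def osc_message(address: str, *args) -> bytes:
    """Encode an OSC 1.0 message with int32, float32, and string arguments."""
    type_tags = ","
    payload = b""
    for arg in args:
        if isinstance(arg, float):
            type_tags += "f"
            payload += struct.pack(">f", arg)  # big-endian 32-bit float
        elif isinstance(arg, int):
            type_tags += "i"
            payload += struct.pack(">i", arg)  # big-endian 32-bit integer
        elif isinstance(arg, str):
            type_tags += "s"
            payload += _pad(arg.encode("ascii"))
        else:
            raise TypeError(f"unsupported OSC argument: {arg!r}")
    return _pad(address.encode("ascii")) + _pad(type_tags.encode("ascii")) + payload
```

A packet built this way, such as `osc_message("/jukebox/tempo", 1.5)`, could be sent over UDP with Python's standard `socket` module, and a Max patch could receive it with a `udpreceive` object. The host and port would depend on the setup.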

Code from Other Sources

We are using the Kinetic Space tool for gesture recognition (https://code.google.com/p/kineticspace/). For now, we are using the free, open-source version, but we plan to upgrade to the Pro version (http://www.colorfulbit.com).


Figure A1 (The Kinetic Space system recognizing our “breakpoint” gesture.)

Figure A2 (The Kinetic Space system recognizing our “stop” gesture. Note that it also captures the motion speed (“similar” for this instance), which we are using for task 3.)



Figure A3 (We are able to set custom gestures using Kinetic Space.)


Figure B1 (A screenshot of our backend for audio manipulation, made with Max/MSP).

Figure B2 (A screenshot of the interface between the different parts of our system. This is how people will control it, and it also shows communication with the Kinect program.)


A2 – Junjun Chen


I did most of my observations before COM 313, Monday 1:30. I arrived 15 minutes early, to an empty classroom. The first person arrived around 1:20. She got out her tablet and checked her email for a couple of minutes. Then, she opened the readings we had for this class, and flipped through them for the rest of the time. More people started arriving around 1:25. (A couple of students arrived with headphones on.) One girl got out her laptop, and continued a reading she had open. She would occasionally switch tabs to browse the web: Facebook, Tumblr. Another girl had her calendar open (as well as many other windows). After checking her schedule, she opened a reading for another class, then switched to her email. She started writing an email, referencing her schedule occasionally.

Several other people also checked email/schedules, then opened up their notes for this class, started a new heading, and waited. Several read back over readings they had for this class, as well as the notes they had taken from last week. Most students arrived within a couple minutes of 1:30, which seems in line with what most people I’ve talked to have told me: that there is really not much time between classes, and it often takes the whole 10 minutes to get from one class to the next.

The professor came in right around 1:30 and passed out notes. Then he started an attendance sign-in sheet.


  1. An app to submit comments/questions on the lecture to the professor.
  2. An app that shows highlights from the previous lecture.
  3. A way to help organize windows/notes/browser tabs for a class and open/close them together.
  4. An app that breaks readings into small chunks that you can read in spare minutes.
  5. An app that makes group scheduling easier/more automatic by syncing with your calendar.
  6. An app that tells you which friends are in classes close by so you can get together for lunch.
  7. App for ordering food from late meal.
  8. An app that calculates the time it would take to get from your current location to your next class, and tells you when it’s time to leave.
  9. An app to find an open seat (least disruptive) for the late student.
  10. An app that finds the shortest path to class from current location.
  11. A way for professors to save their settings for lights/projector.
  12. An app that lets you check in to class for attendance/sign in.
  13. An app that plays 5 minute clips of audio to learn a foreign language on the way to class (Alex Zhao).
  14. An app that reads your readings to you (text to speech), so you can listen as you’re walking to class.
  15. An app that makes it easier to add events from your email to your calendar.

Favorite Ideas:

1. Number 8: Many people arrive early to class with nothing to do, and others arrive late, suggesting that it is difficult for some people, including myself, to judge how long it takes to get from one location to another.
2. Number 14: It takes advantage of the time students spend walking to class (which takes up most of the 10 minutes), and breaking down a week’s readings into 10 minute segments helps prevent procrastination.


Number 8: An app that calculates the time it would take to get from your current location to your next class, and tells you when it’s time to leave.

A screen showing your classes.


Add a class: Class Name, Date, Time, Location.


Edit a class: Class Name, Date, Time, Location.


An alert will show up on your phone when it’s time to go.
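The alert logic behind this screen can be sketched in a few lines. This is a minimal sketch in Python, not part of the paper prototype. It assumes the walking time to the class location is already known in minutes; estimating it from the current location is the harder part and is not shown. The function names and the two-minute buffer are hypothetical.

```python
from datetime import datetime, timedelta


def time_to_leave(class_start, walk_minutes, buffer_minutes=2):
    """Return the latest moment to set off and still arrive before class."""
    return class_start - timedelta(minutes=walk_minutes + buffer_minutes)


def should_alert(now, class_start, walk_minutes, buffer_minutes=2):
    """True from the leave time until the class has started."""
    leave_at = time_to_leave(class_start, walk_minutes, buffer_minutes)
    return leave_at <= now < class_start
```

For a 1:30 class that is an eight-minute walk away, this sketch would start alerting at 1:20 with the default buffer.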


Number 14: An app that reads your readings to you (text to speech), so you can listen as you’re walking to class.

A screen of your current documents. (Main page)


Add a new document. (New Document page)


Choose one to read. (Page 2)
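Breaking a reading into sections that fit a walk to class could work roughly as follows. This is a minimal sketch in Python, not part of the paper prototype. The function name `split_into_segments` is hypothetical, and the speaking rate of 150 words per minute is an assumed default that a real text-to-speech engine would replace with its own rate.

```python
def split_into_segments(text, minutes=10, words_per_minute=150):
    """Split text into consecutive chunks that each take about `minutes` to read aloud."""
    words = text.split()
    chunk_size = minutes * words_per_minute
    # Slice the word list into fixed-size runs; the last chunk may be shorter.
    return [
        " ".join(words[i:i + chunk_size])
        for i in range(0, len(words), chunk_size)
    ]
```

Splitting on word count alone can cut a sentence in half, so a fuller version would probably move each boundary to the nearest sentence or paragraph break.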



I tested the prototype for idea #14 on three students. I first gave each user the prototype (starting on the Main page), explained the purpose of the app, and had them navigate freely through the app. Then, I asked each of them to complete the following tasks on the prototype: 1. Select a document and play it. 2. Skip to the next section. 3. Add a new document. 4. Delete a document. I observed that all of the users had little trouble completing most of the tasks, and their feedback confirmed this.

Select reading (users 2 and 3):


Play the current selection (users 2 and 3):


Add a new paper/document (user 2):


The first thing I realized with the first user was that I needed a back button from Page 2 back to the Main page. I added this before testing with users 2 and 3. Before I gave them specific tasks, only one user pressed the ‘+’ button on the Main page; the others just clicked on an individual document, which took them to Page 2.


The delete button placement: The second user had some trouble deleting a document, as she didn’t see the delete button. The third user also commented that the delete button was not where she expected it. Perhaps having a delete button on the Main page (instead of on Page 2) would be more intuitive. Also, the delete button should show a prompt asking if the user really wants to delete the document.

Some general points:

Adding a document may be difficult if the users don’t have it on their phones (and they are not very likely to). It would be a hassle to download the paper onto the phone or find the URL of the paper. Perhaps it would be better to have a web interface for managing papers.

The first student, a psychology major, brought up the point that many scientific papers come with graphs and pictures, which the current prototype wouldn’t handle very well. It would be better if each graph were shown on the screen when it is referenced.