Final Project – Team X | Human-Computer Interface Technology

Group 10 — Team X
–Junjun Chen (junjunc),
–Osman Khwaja (okhwaja),
–Igor Zabukovec (iz),
–Alejandro Van Zandt-Escobar (av)

Description:
A “Kinect Jukebox” that lets you control music using gestures.

Previous Posts:
P1: https://blogs.princeton.edu/humancomputerinterface/2013/02/22/p1-team-x/
P2: https://blogs.princeton.edu/humancomputerinterface/2013/03/11/p2-team-x/
P3: https://blogs.princeton.edu/humancomputerinterface/2013/03/29/group-10-p3/
P4: https://blogs.princeton.edu/humancomputerinterface/2013/04/09/group-10-p4/
P5: https://blogs.princeton.edu/humancomputerinterface/2013/04/22/p5-group-10-team-x/
P6: https://blogs.princeton.edu/humancomputerinterface/2013/05/06/6085/

Video Demo:

Our system uses gesture recognition to control music. The user chooses a music file with our Netbeans GUI and can then use gestures for controls such as pause and play. The system also supports setting “breakpoints” with a gesture: when the dancer reaches a point in the music that he may want to return to, he uses a gesture to set a breakpoint, and can later use another gesture to jump back to that point. The user can also change the speed of the music on the fly, using gestures for speed up, slow down, and return to normal. Each time the slow down or speed up gesture is performed, the music slows down or speeds up by one increment.
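
As an illustration only, the control logic described above can be sketched in Python. This is not the actual Max/MSP patch: the gesture names, the 0.1 speed step, and the 0.5 to 2.0 speed bounds are all assumptions chosen for the example.

```python
from dataclasses import dataclass
from typing import Optional

SPEED_STEP = 0.1                   # assumed increment per speed gesture
MIN_SPEED, MAX_SPEED = 0.5, 2.0    # assumed playback-rate bounds


@dataclass
class JukeboxState:
    """Hypothetical model of the player state driven by recognized gestures."""
    playing: bool = False
    position: float = 0.0          # seconds into the track
    speed: float = 1.0             # 1.0 means normal speed
    saved_position: Optional[float] = None   # the "breakpoint"

    def handle(self, gesture: str) -> None:
        """Apply one recognized gesture (gesture names are hypothetical)."""
        if gesture == "play":
            self.playing = True
        elif gesture == "pause":
            self.playing = False
        elif gesture == "set_breakpoint":
            self.saved_position = self.position
        elif gesture == "goto_breakpoint":
            # Ignore the gesture if no breakpoint has been set yet.
            if self.saved_position is not None:
                self.position = self.saved_position
        elif gesture == "speed_up":
            self.speed = min(MAX_SPEED, round(self.speed + SPEED_STEP, 2))
        elif gesture == "slow_down":
            self.speed = max(MIN_SPEED, round(self.speed - SPEED_STEP, 2))
        elif gesture == "normal_speed":
            self.speed = 1.0
        else:
            raise ValueError(f"unknown gesture: {gesture}")

    def tick(self, dt: float) -> None:
        """Advance the track by dt seconds of wall-clock time."""
        if self.playing:
            self.position += dt * self.speed
```

In this sketch, repeating the speed up gesture raises the rate one step at a time until it reaches the assumed upper bound, which mirrors the incremental behavior described above.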

Youtube Video Link

Changes:

We improved the weights for the predefined gestures in Kinetic Space. Our testing in P6 indicated that the system struggled to recognize some gestures, and this was the main source of frustration for our users. By changing the weights Kinetic Space places on certain parts of the body (for example, weighting the arms more heavily when a gesture is mainly based on arm movement), we can improve recognition.
We finished an improved GUI made in Netbeans and connected it to our Max/MSP controller. We want the interface to be as simple and easy to use as possible.
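
To show why per-body-part weights matter, here is a minimal Python sketch of a weighted pose distance. It does not reproduce how Kinetic Space scores gestures internally; the function name, the joint names, and the weighted-average formula are assumptions made for illustration.

```python
import math


def weighted_pose_distance(pose, template, weights):
    """Weighted average of per-joint Euclidean distances.

    pose, template: dict mapping joint name -> (x, y, z)
    weights: dict mapping joint name -> non-negative weight
             (joints missing from this dict get weight 0)
    """
    total_weight = sum(weights.get(joint, 0.0) for joint in template)
    if total_weight <= 0:
        raise ValueError("at least one template joint needs a positive weight")
    weighted_sum = 0.0
    for joint, target in template.items():
        w = weights.get(joint, 0.0)
        weighted_sum += w * math.dist(pose[joint], target)
    return weighted_sum / total_weight


# Hypothetical arm gesture: the hand matches, but the dancer's hip has moved.
template = {"right_hand": (0.0, 1.0, 0.0), "hip": (0.0, 0.0, 0.0)}
pose = {"right_hand": (0.0, 1.0, 0.0), "hip": (3.0, 4.0, 0.0)}

uniform = weighted_pose_distance(pose, template, {"right_hand": 1.0, "hip": 1.0})
arms_only = weighted_pose_distance(pose, template, {"right_hand": 1.0, "hip": 0.0})
```

With uniform weights the hip movement inflates the distance, while the arm-weighted version ignores it, which is the effect the weight tuning is meant to achieve for a moving dancer.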

Goals and Design Evolution:
While our main goal (to create a system that makes it easier for dancers to interact with their music during practice) and design (using a Kinect and gesture recognition) have not changed much over the semester, the lower-level tasks we wanted the system to accomplish have evolved. The main reasons for this have been technical: we found early on through testing that for the system to be useful, rather than a hindrance, it must be able to recognize gestures with very high fidelity. This task is further complicated by the fact that the dancer would be moving a lot during practice.
For example, one of our original goals was to have our system follow the speed of the dancer, without the dancer having to make any specific gestures. We found that this was not feasible within the timeframe of the semester, however. Many of the recognition algorithms we looked at used machine learning (so they worked better with many examples of a gesture), and many required knowing roughly where a gesture begins and ends (so they would not work well with gestures embedded in a dance).

We also had to essentially abandon one of our proposed functionalities. We thought we would be able to implement a system that made configuring a recognizable gesture a simple task, but after working with the gesture recognition software, we saw that setting up a gesture requires finely tuning the customizable weights of the different body parts to get even basic functionality. We quickly realized that implementing a system that automated that customization would take a very long time.

Critical Evaluation:
Based on the positive feedback we received during testing, we feel that this could be turned into a useful real-world system. Many of our users have said that being able to control music easily would be useful for dancers and choreographers, and as a proof of concept, we believe our prototype has worked well. However, from our final testing in P6, we found that users’ frustration levels would increase if they had to repeat a gesture even once. There is therefore a large gap between our current prototype and a real-world system. Despite the users’ frustrations during testing, they did indicate in the post-evaluation survey that they would be interested in trying an improved iteration of our system and that they thought it could be useful for dancers.
We’ve learned several things from our design, implementation, and evaluation efforts. Firstly, we’ve learned that although the Kinect was launched in 2010, the general population is still not very familiar with it. Secondly, we’ve found that the Kinect development community, while not small, is quite new. Microsoft’s development support, including its SDKs, is mainly for Windows, though there are community SDKs for Mac. From testing, we’ve found that users are less familiar with this application space than with window-based GUIs, but that they are generally very interested in gesture-based applications.

Next Steps:

Moving forward, we’d like to make the system more customizable. We have not found a way to do this with the Kinetic Space gesture recognition software we’re using (we don’t see any way to pass information, such as user-defined gestures, into the system), so we may have to implement our own gesture recognition. The gesture recognition algorithms we looked at seemed to share a basic structure: they look at the x, y, z positions of various points and limbs and compare their movement (with margins of error perhaps determined through machine learning). We did not tackle this implementation challenge for the prototype, as we realized that the gesture recognition would need to be rather sophisticated for the system to work well. With more time, however, we would like to do our gesture recognition and recording in Max/MSP so that we could integrate it with our music playing software, and then perhaps embed the Kinect video feed in the Netbeans interface.
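
The position-comparison idea above can be sketched in a few lines of Python. This is a deliberately simple stand-in, not any of the algorithms we looked at: it tracks a single joint, resamples both paths to a fixed number of points, and uses a hand-picked tolerance rather than a learned margin of error. The function names and the default of 16 samples are assumptions.

```python
import math


def resample(path, n):
    """Linearly resample a list of (x, y, z) points to exactly n points."""
    if len(path) < 2 or n < 2:
        raise ValueError("need at least two input points and n >= 2")
    out = []
    for i in range(n):
        t = i * (len(path) - 1) / (n - 1)   # fractional index into path
        lo = int(math.floor(t))
        hi = min(lo + 1, len(path) - 1)
        frac = t - lo
        out.append(tuple(a + frac * (b - a) for a, b in zip(path[lo], path[hi])))
    return out


def matches(candidate, template, tolerance, n=16):
    """True if the mean point-to-point distance is within the tolerance."""
    c = resample(candidate, n)
    t = resample(template, n)
    mean_error = sum(math.dist(p, q) for p, q in zip(c, t)) / n
    return mean_error <= tolerance


# Hypothetical hand paths: a sideways swipe recorded at two frame rates.
swipe_template = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
swipe_attempt = [(0.0, 0.05, 0.0), (0.5, 0.05, 0.0), (1.0, 0.05, 0.0),
                 (1.5, 0.05, 0.0), (2.0, 0.05, 0.0)]
raise_attempt = [(0.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 2.0, 0.0)]
```

Resampling lets paths recorded with different numbers of frames be compared, but the sketch still assumes the start and end of the gesture are known, which is exactly the limitation that makes recognition inside a continuous dance difficult.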
We still like the idea of having the music follow the dancer, without any other input, and that would be something we’d like to implement if we had more time. To do so, we would need the user to provide a “prototype” of the dance at regular speed. Then, we might extract specific “gestures” or moves from the dance, and change the speed of the music according to the speed of those gestures.
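
One way the speed mapping might work, assuming the individual moves have already been detected and timed (the hard part, which this sketch skips): compare how long each move took in the regular-speed “prototype” with how long it takes in the current run, and scale the music by the average ratio. The function name and the rate bounds are assumptions.

```python
def playback_rate(reference_durations, observed_durations,
                  min_rate=0.5, max_rate=2.0):
    """Estimate a music playback rate from move durations (in seconds).

    reference_durations: durations of each move in the regular-speed prototype
    observed_durations: durations of the same moves in the current run
    Returns 1.0 for normal speed, below 1.0 when the dancer is slower.
    """
    if not reference_durations or len(reference_durations) != len(observed_durations):
        raise ValueError("need matching, non-empty duration lists")
    if any(d <= 0 for d in list(reference_durations) + list(observed_durations)):
        raise ValueError("durations must be positive")
    ratios = [ref / obs for ref, obs in zip(reference_durations, observed_durations)]
    rate = sum(ratios) / len(ratios)
    # Clamp so one mistimed move cannot push the music to an extreme speed.
    return max(min_rate, min(max_rate, rate))
```

For example, a move that took 2 seconds in the prototype and 4 seconds in the current run gives a rate of 0.5, so the music would play at half speed.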
As mentioned before, we would also like to implement a gesture configuration feature. This may even be easier once we move to Max/MSP for gesture recognition, but at this point that is only speculation.
We’d also like to do some further testing. In the testing we’ve done so far, we’ve had users come to a room we’ve set up and use the system as we’ve asked them to. We’d like to ask dancers if we can go into their actual practice sessions, and see how they use the system without our guidance. It would be informative to even leave the system with users for a few days, have them use it, and then get any feedback they have.

Source Code:
Source Code

Third Party Code:

Kinetic Space: http://www.colorfulbit.com/ We used Kinetic Space to do our Kinect gesture recognition. We set up predefined gestures in Kinetic Space and had it send OSC messages to our Max/MSP program, which processed and played the music.
JavaOSC: http://www.illposed.com/software/javaosc.html We imported the JavaOSC library to send OSC messages from the interface to Max/MSP, which processed and played the music.
quit executable: http://www.macupdate.com/app/mac/31821/quit We used the quit executable in the Java interface to close Max/MSP and Kinetic Space.
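
For readers unfamiliar with OSC, the following Python sketch shows what a message looks like on the wire, following the OSC 1.0 encoding rules (null-terminated strings padded to a multiple of four bytes, big-endian 32-bit arguments). It is not the JavaOSC code from our interface, and the address /jukebox/speed is a made-up example rather than an address our Max/MSP patch actually listens for.

```python
import struct


def _osc_string(text):
    """Encode an OSC string: ASCII, null-terminated, padded to 4 bytes."""
    data = text.encode("ascii") + b"\x00"
    return data + b"\x00" * (-len(data) % 4)


def osc_message(address, *args):
    """Build a binary OSC message with int32, float32, and string arguments."""
    type_tags = ","
    payload = b""
    for arg in args:
        if isinstance(arg, bool):
            raise TypeError("bool arguments are not handled in this sketch")
        if isinstance(arg, int):
            type_tags += "i"
            payload += struct.pack(">i", arg)     # big-endian int32
        elif isinstance(arg, float):
            type_tags += "f"
            payload += struct.pack(">f", arg)     # big-endian float32
        elif isinstance(arg, str):
            type_tags += "s"
            payload += _osc_string(arg)
        else:
            raise TypeError(f"unsupported argument type: {type(arg).__name__}")
    return _osc_string(address) + _osc_string(type_tags) + payload


# Hypothetical message asking the player to run at 1.5x speed.
packet = osc_message("/jukebox/speed", 1.5)
```

The resulting bytes would typically be sent as a single UDP datagram to whatever port the receiving program is listening on.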

Demo Materials:
Demo Materials