P6 – PostureParrot

Group 7, Team Colonial

David, John, Horia

Project Summary

The PostureParrot helps users maintain good back posture while sitting.


The system that is being evaluated is the functional prototype of the PostureParrot and its accompanying GUI.  The purpose of this experiment is to determine whether our target audience will find the operation of the PostureParrot to be simple, intuitive and instructive.  It needs to be easy for users to wear the product and learn from its audible feedback and from the GUI’s visual feedback.

Implementation and Improvements

Link to P5: https://blogs.princeton.edu/humancomputerinterface/2013/04/22/p5-postureparrot/

Between P5 and P6, we altered the list of tasks and reintroduced a GUI that allows the user to set their default back posture, as well as alter the allowed wiggle room (the degree to which the user can move without being notified) and time allowance (the total time that the user can deviate from the default posture before being notified). The GUI originally allowed the user to set the default back posture, as well as see changes in separate areas of the back over time; through our tests, we discovered that it was simpler, more refined, and more accessible to produce a compact device with a single component rather than a device that monitored multiple parts of the back. Over time, we discovered that the fabric “reset” button embedded in the device was extremely unreliable and, as a result, moved this functionality to our GUI.



To select our participants, we looked for Princeton students who studied while sitting.  To find these students, we went to an eating club library and asked sitting students if they would like to take part in our usability study.  Additionally, we asked them each a few questions to make sure that they were a part of our target user group.  For example, we asked them if they studied at desks a lot of the time or if they ever experienced back pain.


We performed the test at a table that was set up in an eating club.  We put a standard desk chair up against the table.  The only other equipment we used was our PostureParrot connected to a laptop.


(Easy) Set the default back posture of the PostureParrot.  This requires that you have already attached the device and opened the GUI.  The user should position themselves so that they have good back posture.  Then they should press the button on the GUI that says “Set Default Posture”.

(Medium) Deviate your back from the desired back posture and respond to the PostureParrot’s audible feedback.  After the user has set a default back posture,  the device will make a noise when the user deviates too far from it.  The user must correct their back posture back to the default to make the noise stop.

(Hard) Use the GUI to adjust the PostureParrot’s wiggle room and time allowance.  While the device is attached and the GUI is running, the user adjusts the values associated with wiggle room and time allowance and observes the effects.  The user should select a wiggle room and time allowance that they feel comfortable with.
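The interaction between the default posture, wiggle room, and time allowance in the three tasks can be sketched as follows. This is a minimal illustration, not the actual Arduino/Processing code: the `PostureMonitor` class, its field names, the use of a single angle in degrees, and the choice to measure the time allowance as one continuous deviation are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PostureMonitor:
    """Hypothetical model of the notification rule described in the tasks."""
    wiggle_room_deg: float    # allowed deviation before the timer starts (assumed unit: degrees)
    time_allowance_s: float   # how long a deviation may last before the tone sounds
    default_angle_deg: Optional[float] = None
    _deviation_started: Optional[float] = None

    def set_default(self, angle_deg: float) -> None:
        # Task 1: the "Set Default Posture" button stores the current reading.
        self.default_angle_deg = angle_deg
        self._deviation_started = None

    def update(self, angle_deg: float, now_s: float) -> bool:
        """Return True when the device should be sounding its tone (Task 2)."""
        if self.default_angle_deg is None:
            return False  # nothing to compare against yet
        if abs(angle_deg - self.default_angle_deg) <= self.wiggle_room_deg:
            self._deviation_started = None  # back within the wiggle room: stop the timer
            return False
        if self._deviation_started is None:
            self._deviation_started = now_s  # deviation just began
        return now_s - self._deviation_started >= self.time_allowance_s


monitor = PostureMonitor(wiggle_room_deg=10, time_allowance_s=3)
monitor.set_default(0)
print(monitor.update(20, now_s=1))  # False: deviation has only just started
print(monitor.update(20, now_s=4))  # True: deviated for the full time allowance
print(monitor.update(2, now_s=5))   # False: posture corrected, tone stops
```

Under this model, Task 3 amounts to changing `wiggle_room_deg` and `time_allowance_s` while the loop keeps running.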


For each participant, we first requested that they fill out the consent form and provide their demographic data. We then gave them a quick demo, opening up the GUI and demonstrating how to attach the device to the shoulder. After beginning our video recording, we asked the participant to sit before the laptop and attach the device themselves before proceeding to the three tasks. For each task, we explained what the participant was trying to achieve and asked them to say aloud any concerns or observations as they arose (which we recorded by taking notes, along with any critical incidents that we saw). Once a task was complete, we paused to explain the goal of the next task. Once all three tasks were complete, we asked each participant to take a brief survey.

Test Measures

Our main goal for testing was to evaluate the level of difficulty associated with the different tasks.  In addition to making standard observations, we had the user fill out a questionnaire where they self-reported values pertaining to their satisfaction and level of difficulty with different aspects of the device.  The following were our questions:

  • Please rate the difficulty of task 1 (setting the default posture)
  • Please rate the difficulty of task 2 (deviating from default back posture)
  • Please rate the difficulty of finding a good “wiggle-room” value
  • Please rate the difficulty of finding a good “time-allowance” value
  • Please rate how intuitive the device was over-all
  • Please provide any additional comments / potential improvements

Results and Discussion

Through our observational notes, we noticed that users were confused when asked to set the default back posture. This could be because there was no confirmation when a user pressed the button to set their desired back posture. To alleviate this, we plan to make the Arduino beep in confirmation when a new default is set. We also discovered a bug in our GUI: due to some odd communication behavior between Processing and Arduino, the values for both wiggle room and time allowance became distorted when the increment/decrement buttons were pressed in quick succession. Fixing this problem will involve looking at the values that are being passed between the two programs, as well as how various delays (intentional delays, tone delays, etc.) are affecting results.

Through our observational notes, we learned that wiggle room and time allowance were not intuitive on their own and required additional explanation. In future iterations, this may simply require a small text snippet in the GUI that briefly explains what each value represents. We also found that our default value for wiggle room was far too lenient; in our final iteration, we will refine the default values so that the device responds appropriately to those using it, especially first-time users. We also discovered that it is sometimes difficult to find your original posture again, especially when the wiggle room is relatively unforgiving. One potential way of addressing this in our next version is to have the tone respond to your deviation from the base posture; for example, it could increase in frequency the farther you deviate. However, this functionality may be unnecessary, since users can always reset their desired back posture.
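One way the proposed deviation-dependent tone could work is a clamped linear mapping from deviation to pitch. The sketch below is only an illustration of that idea: the function name, the 440 Hz to 1760 Hz range, and the 45-degree ceiling are assumed values, not settings from our prototype.

```python
from typing import Optional


def tone_frequency_hz(deviation_deg: float,
                      wiggle_room_deg: float,
                      max_deviation_deg: float = 45.0,  # assumed ceiling for the mapping
                      low_hz: float = 440.0,            # assumed pitch just outside the wiggle room
                      high_hz: float = 1760.0) -> Optional[float]:
    """Map how far the user has strayed to a tone pitch; None means stay silent."""
    excess = abs(deviation_deg) - wiggle_room_deg
    if excess <= 0:
        return None  # still inside the wiggle room
    span = max(max_deviation_deg - wiggle_room_deg, 1e-9)  # avoid dividing by zero
    fraction = min(excess / span, 1.0)                     # clamp beyond the ceiling
    return low_hz + fraction * (high_hz - low_hz)


print(tone_frequency_hz(5, wiggle_room_deg=10))     # None
print(tone_frequency_hz(27.5, wiggle_room_deg=10))  # 1100.0
```

A falling pitch would then tell the user they are moving back toward the default posture, which is the cue that was missing when the wiggle room was strict.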

From our questionnaire, we discovered that the first two tasks were relatively easy; this shows that although our interface was slightly surprising without a notification, it was still an easy task to accomplish. The last task – regarding the two variables – was generally more difficult; although this was most likely due to the distorted values, it would be useful to find out if there were additional reasons for why this task was unintuitive. For the additional comments part of the questionnaire, we received a comment about how the adhesive on the device made it difficult to keep it on the user’s shoulder.  Another comment that we received stated that it was possible to achieve the same “good posture” angle when slouching.  Although this is a valid point, the user did have to work to find this same angle and our adjustable wiggle room and time allowance should compensate for this.  To completely eliminate this issue we would need to keep track of lower back posture too, something that would require the device to be more cumbersome (what we moved away from after P5).
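To make "relatively easy" and "more difficult" concrete, the mean rating for each question can be computed from the three questionnaire responses recorded in the observational notes (scale 1 to 7, where 7 is easy or very intuitive). This is a small sketch; treating an answer circled as "5-6" as 5.5 is our assumption for the purpose of averaging, and the question labels are shortened for the example.

```python
from statistics import mean

# Ratings in the order Max J., Zhexiang W., Andy H.
# "5-6 (circled together)" is entered as 5.5, which is an assumption.
responses = {
    "Task 1 (set default posture)": [7, 6, 6],
    "Task 2 (deviate from default)": [7, 6, 5.5],
    "Finding a wiggle-room value": [3, 2, 5.5],
    "Finding a time-allowance value": [3, 5, 5.5],
    "Overall intuitiveness": [7, 6, 4],
}

means = {question: round(mean(ratings), 2) for question, ratings in responses.items()}
for question, value in means.items():
    print(f"{question}: {value}")
```

With only three respondents these means are indicative at best, but they match the pattern described above: the two posture tasks average above 6, while the two tuning questions average 3.5 and 4.5.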

There were some limitations that we found while conducting our usability study.  Due to our small sample size, it was difficult to get an accurate idea of what users thought were optimal wiggle room and time allowance values.  It would be interesting to track the final wiggle room and time allowance values settled upon by many different users.  Another limitation of our sample was that our test users all reported very little back pain.  All of our users were also male, meaning that we were not able to distinguish differences in usage by gender.  After tracking each user’s major, we also thought that it would be interesting to see how different lifestyles (such as those associated with computer science majors, English majors, etc.) would affect usage of the product.  This would also require a larger sample.


Consent and Demographic Form



Name: ________________________

Please rate the difficulty of task 1 (setting the default posture)
1 – difficult 2 3 4 5 6 7 – easy

Please rate the difficulty of task 2 (deviating from default back posture)
1 – difficult 2 3 4 5 6 7 – easy

Please rate the difficulty of finding a good “wiggle-room” value
1 – difficult 2 3 4 5 6 7 – easy

Please rate the difficulty of finding a good “time-allowance” value
1 – difficult 2 3 4 5 6 7 – easy

Please rate how intuitive the device was, over-all
1 – not intuitive 2 3 4 5 6 7 – very intuitive

Any additional comments / potential improvements:

Observational Notes & Questionnaire Responses

Max J.

Task #1

  • 7:06 Clicks default posture and laughs (potentially because nothing happened on either the interface or the device)

Task #2

  • 7:06 Laughs (potentially because he needs to bend his back to a great degree)

Task #3

  • 7:08 Moves GUI around screen
  • 7:08 Finds a bug: if user clicks buttons too quickly, values are corrupted
  • 7:08 Device falls off shoulder and needed to be placed on the shoulder again
  • 7:09 Experiments with wiggle room until a desirable value is found

Questionnaire Responses

  1. 7
  2. 7
  3. 3 (“buggy”)
  4. 3
  5. 7
  6. Additional Comments: Better tape [emphasis his]

Zhexiang W.

Task #1

  • 7:25 Confused by no confirmation after selecting default

Task #2

  • 7:26 Comments that the default wiggle room is extremely large
  • 7:27 Wonders if there is a time delay / if he should move slower while testing

Task #3

  • 7:27 Minimizes the range of wiggle room, is pleased with stricter values

Questionnaire Responses

  1. 6
  2. 6
  3. 2
  4. 5
  5. 6
  6. Additional Comments: None

Andy H.

Task #1

  • 7:45 Success!

Task #2

  • 7:45 Tries different ranges of motions: forward-back, left-right, diagonally

Task #3

  • 7:47 Has difficult time finding original position
  • 7:48 Tries device moving forward and backward, but not any other direction

Questionnaire Responses

  1. 6
  2. 5-6 (circled together)
  3. 5-6
  4. 5-6
  5. 4
  6. Additional Comments: Needs to keep track of lower back posture too – same “angle” can be achieved w/ both slouching and “good” posture

P6 – NavBelt Pilot Usability Test

Group 11 – Don’t worry about it.

Amy, Daniel, Krithin, Jonathan, Thomas

Project Summary

The NavBelt will make navigating around unfamiliar places safer and more convenient.


The NavBelt is a system for discreetly providing directions to its wearer. It comprises a belt with four embedded vibrating motors (‘buzzers’), an electronic compass, and a connection to a GPS-enabled smartphone. The phone computes a walking route (including intermediate waypoints) to a destination specified by the user on a familiar map interface, and, based on real-time location information obtained from the GPS, sends a signal to the Arduino indicating the direction the user needs to move to the next waypoint; the Arduino then measures the user’s current orientation using the electronic compass and activates the appropriate buzzer to let the user know which direction to move. The experiment will test how intuitive and effective our system is to use by testing a user’s ability to input directions into the interface, follow the directions with minimal reference to the map, and know when they’ve arrived at the destination.

Implementation and Improvements

The link to our P5 submission is: http://blogs.princeton.edu/humancomputerinterface/2013/04/22/p5-expressive-navbelt-working-prototype/. The changes we made to the prototype since P5 include:

  • More durable construction of belt. The wires and buzzers are sewn into place, and now all four buzzers’ leads are soldered to a single four-pin connector which can be plugged directly into the Arduino. The Arduino and battery pack are also attached to the sides of the belt.

  • The phone component uses GPS to find the user’s current location and computes the absolute bearing along which the user should move to the next waypoint.

  • Our prototype now includes a compass module, and the Arduino uses that to compensate for the user’s orientation to calculate the relative bearing that the user should move along.

  • We preprogrammed a route for a tester to complete; this eliminates much of the wizard-of-Oz control from the P5 as we no longer signal each turn manually.

  • The phone UI is no longer a paper prototype.  A tester enters a destination into Google Maps on an actual phone.
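The division of work between the phone (absolute bearing from GPS) and the Arduino (relative bearing from the compass) can be sketched as below. This is an illustrative sketch, not our phone or Arduino code: the function names are made up for the example, the bearing uses the standard great-circle initial-bearing formula, and the front/right/back/left placement of the four buzzers is an assumption.

```python
import math

# Assumed buzzer placement around the belt, clockwise from the wearer's front.
BUZZERS = ("front", "right", "back", "left")


def absolute_bearing_deg(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Initial compass bearing (0 = north, 90 = east) from point 1 to point 2."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    y = math.sin(dlon) * math.cos(phi2)
    x = math.cos(phi1) * math.sin(phi2) - math.sin(phi1) * math.cos(phi2) * math.cos(dlon)
    return math.degrees(math.atan2(y, x)) % 360.0


def relative_bearing_deg(absolute_deg: float, heading_deg: float) -> float:
    """Direction to the waypoint measured clockwise from where the user is facing."""
    return (absolute_deg - heading_deg) % 360.0


def buzzer_for(relative_deg: float) -> str:
    """Pick the buzzer whose 90-degree sector contains the relative bearing."""
    sector = int(((relative_deg % 360.0) + 45.0) // 90.0) % 4
    return BUZZERS[sector]


# Waypoint lies due east while the user faces north: the right buzzer fires.
print(buzzer_for(relative_bearing_deg(90.0, 0.0)))   # right
# Same waypoint once the user has turned to face east: the front buzzer fires.
print(buzzer_for(relative_bearing_deg(90.0, 90.0)))  # front
```

With four buzzers, each one covers a 90-degree sector in this sketch, so the user turns until the front buzzer is the one vibrating.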



We e-mailed listservs in search of interested participants. Our three testers are non-international Princeton students who had never been to the target destination, the Davis International Center.  They are:

  1. Helen Yang ‘14 – AB English

  2. Marjorie Lam ‘13 – AB Psychology

  3. Shawn Du ‘14 – AB Economics


Our apparatus includes a belt with four attached buzzers, an Arduino, a GPS-enabled Android smartphone, a battery pack, and a breadboard containing the compass and all the wires. The user wears the belt such that the buckle is near the user’s left hip, and the Arduino and breadboard are attached to the buckle; the battery pack is attached to the belt along the back. The user could either hold the phone upright or leave it in his/her pocket while traveling. This time, we held our tests outdoors. We mapped out a route from Friend Center to Davis International Center (78 Prospect Avenue), a location that many students are not familiar with. We recorded the route’s GPS coordinates using MyTracks.


Our tasks have not changed from P5:

Task 1. Hard: Choose the destination and start the navigation system.

It should be obvious to the user when the belt “knows” where they want to go and they can start walking. For this task, we use Google Maps and wizard-of-oz the transition between choosing the destination and beginning their journey, since we currently have hardcoded the route and there is no way for the user to change it.

Task 2. Medium: Figure out when to turn.

When the user reaches a waypoint on the route, the phone vibrates and the buzzers start guiding the user to the next waypoint. The user’s goal is to notice these signals, understand them, and translate the buzzing signals into a new revised heading.

Task 3. Easy: Know when you’ve reached the destination and should stop walking.

This is signalled by all four buzzers vibrating, and then no vibrating thereafter.

We chose to keep our tasks the same because we think they are very appropriate gauges of the ease/difficulty of our system’s usability and a good measure of how intuitive/non-intuitive our system is to first-time users.


We gave each user-tester a “pre-test demographic survey”, obtained their consent, and read them the demo and interview scripts. The following table breaks the procedure down into the steps for the user and the steps for the testers:

| Procedure for User | Procedure for Testers |
| Task 1 | |
| Start outside entrance to Friend Center | |
| | Provide user with address |
| Key in “78 Prospect Ave” in Google Maps | |
| | Switch to prerecorded route map in MyTracks app. |
| Observe route on map | Provide any clarification re. route as required |
| | Switch to NavBelt app, with predefined waypoints |
| Task 2 | |
| Observe buzzers start to buzz. Buzzer check: slowly turn on the spot and verify that they feel each buzzer in turn | Tell them to do that check. |
| Proceed to walk, following buzzers to make turns as appropriate | Explain the “think-aloud” process and video-record the user, staying behind them as to not inadvertently guide them to the destination. Maintain critical incident log. Try to stay slightly behind user, following their lead instead of inadvertently giving them directions. Intervene only to clear up some confusion or when something unexpected has gone wrong (e.g. phone app crashing) |
| Task 3 | |
| On observing phone vibration and all four belt buzzers go off at the same time: stop (at destination) | |


Test Measures

Task 1:

We did not make detailed observations here, because there is still a significant wizard-of-oz component to this task in our prototype. This happens because we chose to leave most of the functionality relating to this task unimplemented, as it is a purely software problem, and is the component furthest removed from the interesting HCI problem we’re trying to solve.

Task 2:

  • Task time – how long it takes to complete each route segment

    • rationale: To know how much longer it takes to use our belt versus a map on a phone. We can also use it as a proxy measure for confusion, because confused people walk more slowly.

    • In measuring this we took the time it took the user to travel from one waypoint to the next and subtracted the time spent on direct interactions with the experimenters, since we occasionally had to step in to correct for e.g. accidental user presses on the phone app that we know are going to be impossible in a production version.

  • Number of errors – coded by humans watching the video

    • rationale: To know how accurately a user can follow a GPS route using the belt.

    • We had to be careful in distinguishing between three types of errors:

      • trivial errors (phone screen reorientation, phone app restarting, accidental user taps on the debug controls in the phone app, etc)

      • inherent errors in our system leading to poor directions for the user (e.g. issues relating to discretization error in the direction signal from the phone)

      • errors from external conditions (poor GPS signals near trees and buildings).

    • We only count the latter two kinds of errors, since the first set can be trivially eliminated by making changes in the phone application, which we already know will need an overhaul to better support task 1.

  • Self-reported measures of satisfaction or dissatisfaction

    • rationale: in addition to how well the belt works, we want to know how intuitive the feedback system is to a user and whether it’s physically comfortable, things that are hard for us to measure externally.

Task 3:

  • Whether user noticed the four-buzzer signal at end of route and stopped

    • rationale: Simplest way to measure if the users were able to perform the task or not.

  • Self-reported measure of how easy that task was

    • rationale: So we can measure whether or not a user stopped because the user realized s/he has reached the destination and not from confusion.

Results and Discussion

All three participants successfully arrived at the destination, using only the belt for guidance. However, they moved much more slowly than they would have if they had known the route. A member of our team who knew the route (our control), walking at a comfortable pace, completed the journey in 5 minutes 51 seconds without the NavBelt. Excluding false starts and delays, our testers took about twice that long on average; including them added about 5 more minutes. Our users seemed to spend some time at the start getting comfortable with the belt, but they overcame this learning curve fairly quickly: the ratio of the average user’s time to the control time in the second path segment (2.80, excluding the third user, who was an outlier since he walked that segment twice) was much higher than the same ratio in the fifth segment (1.35), even though these segments were of similar length.

Time taken per route segment (min:sec):

| Segment | Waypoints | Krithin (control) | Helen | Marjie | Shawn |
| Finding initial waypoint | -1 -> 0 | 1:15 | 2:24 | 1:58 | 2:13 |
| Walking south down Olden street | 0 -> 1 | 1:22 | 3:29 | 4:11 | 1:29 |
| Crossing Olden | 1 -> 2 | 0:10 | 0:21 | 0:58 | 0:12 |
| Crossing Prospect | 2 -> 3 | 0:14 | 0:21 | 0:21 | 0:18 |
| Walking east along Prospect | 3 -> 4 | 1:38 | 2:01 | 2:48 | 1:48 |
| Walking south to Davis Center | 4 -> 5 | 1:12 | 2:21 | ??? | 2:22 |
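The ratios quoted above can be reproduced from the table with a few lines of Python. This sketch reads each table entry as minutes:seconds and treats the trailing ":00" as a formatting artifact, which is an assumption; it is supported by the fact that Krithin's six segments then add up to the 5:51 control time. The helper names are ours, and Marjie's missing final segment is left as `None`.

```python
def to_seconds(stamp: str) -> int:
    """Convert a 'min:sec' string such as '1:22' to seconds."""
    minutes, seconds = stamp.split(":")
    return int(minutes) * 60 + int(seconds)


# Segments in order: -1->0, 0->1, 1->2, 2->3, 3->4, 4->5
control = ["1:15", "1:22", "0:10", "0:14", "1:38", "1:12"]  # Krithin, who knew the route
testers = {
    "Helen":  ["2:24", "3:29", "0:21", "0:21", "2:01", "2:21"],
    "Marjie": ["1:58", "4:11", "0:58", "0:21", "2:48", None],  # last segment not recorded
    "Shawn":  ["2:13", "1:29", "0:12", "0:18", "1:48", "2:22"],
}


def segment_ratio(segment: int, exclude: tuple = ()) -> float:
    """Average tester time on one segment divided by the control time."""
    times = [
        to_seconds(row[segment])
        for name, row in testers.items()
        if name not in exclude and row[segment] is not None
    ]
    return (sum(times) / len(times)) / to_seconds(control[segment])


control_total = sum(to_seconds(t) for t in control)
print(divmod(control_total, 60))                        # (5, 51)
print(round(segment_ratio(1, exclude=("Shawn",)), 2))   # 2.8  (second segment)
print(round(segment_ratio(4), 2))                       # 1.35 (fifth segment)
```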


Although we originally had concerns that crossing streets could be problematic, as there were several turns in close succession there, none of our users encountered problematic signals from the belt, and they consistently crossed those waypoints relatively quickly. From our data table, it looks like Marjie (our second tester) was an outlier here, but she was the exception that proves the rule: the video transcript shows that she was aware of which way the belt indicated she should go, but knew from her memory of the map that there was an alternative route as well, and took that time to point it out to us. It was also good to note that users treated the belt’s buzzing as just one signal rather than an absolute mandate; they still remembered to look and wait for vehicles to pass before crossing.

We learned during this study that our NavBelt is a feasible way of guiding a person from one place to another. However, it is not yet the most intuitive way to convey directions to a user. We also learned that having a more robust prototype makes all of the tasks significantly easier. In earlier lo-fi tests, we had problems with users receiving signals intermittently or even losing connection to a buzzer altogether, which caused major confusion as they missed turns or got lost. However, during our hi-fi prototype testing we ascertained that even with functioning electronics, the system can still be confusing, and that this is due to design choices, not the hardware problems we experienced in the lo-fi testing.  For example, the NavBelt gave confusing signals because of poor GPS reception near trees and buildings.

Ideally, we would also change the form factor of the belt to make the compass more robust. Most of the components (the buzzers and the wires connecting them to the Arduino) were sewn onto the fabric of the belt, but the Arduino itself, the breadboard, and the battery pack were attached using only electrical tape. One of our testers kept hitting the compass with her elbow, leading to it being potentially jostled out of position and giving incorrect readings. This user also asked if she could wear her jacket over the belt, which would be nice but is currently impossible due to the fragility of the setup and technical limitations of the compass.


Demographic questionnaire, scripts, consent forms, and post-test questionnaire are included here: https://docs.google.com/document/d/1Oac209Ao0ppf9nADVIFVcyrarSpfRdbEhgtbsLaYB4A/edit


Full logs of time taken to arrive at each waypoint are at:





P6: AcaKinect

Your group number and name

Group 25 — Deep Thought

First names of everyone in your group

Neil, Harvest, Vivian, Alan

1-sentence project summary

AcaKinect is voice recording software that uses a Kinect for gesture-based control, allowing content creators to record multiple loops and beat sequences and organize them into four sections on the screen; this offers a more efficient and intuitive music recording interface for those less experienced with the technical side of music production.


We are evaluating the ability of musicians with various levels of technical and musical expertise to use the AcaKinect system to make music. AcaKinect’s functionality is a subset of what most commercial loop pedals and loop pedal emulators are capable of; it is intentionally designed to not provide a “kitchen-sink experience,” so that musicians – many of whom are not also technicians or recording engineers – may easily jump in and learn how to use looping and sequencing without having to climb the steep learning curve involved in using commercially available looping products. Thus, this experiment is a good model of what real users would do when faced with this system for the first time; we are much more concerned with allowing users to jump straight in and start making music than with enabling experienced users to use advanced functionality.

Implementation and Improvements

The implementation can be found here. We have fixed several instabilities and exceptions that may cause the program to crash; we have also added prototype indicators so that the user knows how many loops are in each column.



Participant 1 is an English major who has some technical music production background, and has used Garageband and Audacity to record raps. Participant 2 is a literature major with formal classical training in violin and piano, but no prior experience in the technical aspects of music production. Participant 3 is an ORFE major with formal training in flute and piano and also no prior experience with technical music production. Participants 2 and 3 were chosen to be baseline users: how well can musicians with no background in recording or producing music use this system without any prior training? Participant 1, who has some recording experience, allows us to determine whether a little prior technical knowledge makes the learning process much quicker.


A laptop running Processing is connected to a Kinect and a microphone, and optionally a set of speakers; the user stands an appropriate distance from the Kinect and sings into the microphone. Note that the microphone should ideally be highly directional, such that the live-monitored output from the speakers does not generate feedback. Most ideally, the user would be monitoring the sound using a pair of headphones; this would eliminate the feedback issue. However, in our tests, we simply used the laptop’s speakers, which were quiet enough not to be picked up excessively by our microphone, which was an omni mic. Experiments were all conducted in a music classroom in Woolworth, so that testers could feel free to sing loudly.


Task 1: Record a simple loop in any one of the sections. It can have whistling, percussion, singing, etc. (This is meant to test the user’s ability to grasp the basic mechanics of recording music using this system; calibrating and then signaling to record a track are tested here.)

Task 2: Record any number of loops in three different sections and incorporate the delete feature. (This adds in the core functionality provided by the column abstraction; we would like to see if users structure their recordings by grouping similar parts into the same column. Deletion is added into the core gestures.)

Task 3: Create organization in the loop recording. For example, record two percussive sections in the first section, one vocal section in the second section, and maybe a whistling part in the third section. (This encourages users who may not have used columns to structure their recordings to do so; we want to see whether they pick up on this design choice or ignore it entirely.)


First, the users read and signed the consent form, and then filled out the questionnaire. We then demonstrated the main workings of the system, including the concept of columns and the requisite gestures to interact with the system. Next, we read the description of each task and allowed the user to try out the task uninterrupted; if the user had questions, we answered them, but we did not prompt if the user was struggling with some aspect of the task and had not explicitly asked for assistance. Finally, when the user had completed the tasks, we asked a few general questions about the experience to get overall opinions.

The test setup. Here, you can see the positions of the laptop, Kinect sensor, user, and microphone.

We tested in Woolworth music hall, to allow users to be loud in recording music.

Test Measures:

  • Timestamps of significant events (e.g. starting and stopping recordings): we want to gain a general picture of the amount of time users spent in certain states of the system, to see if they are dwelling on any action or having trouble with certain parts of the system.
  • Timestamps of significant observations (e.g. struggling with the delete gesture): we want to track how often and how long users have problems or otherwise interesting interactions with the system, so we can identify potential areas of improvement.
  • Length of time needed to perform a “begin recording” gesture. Since this is one of the most fundamental gestures needed, we want to make sure that users are able to do this quickly and effortlessly; long delays here would be catastrophic for usability.
  • Length of time needed to perform a “delete track” gesture. This isn’t quite as central as the recording gesture, but it is one that may still be frequently used, especially if users want to rerecord a loop that wasn’t perfect; we also want this to be fast and accurate, and if the user has to repeat the gesture multiple times, it will take too long.

Results and Discussion

  • We discovered that a primary problem was text visibility: users were simply unable to read and distinguish the various messages on the screen indicating when a recording was about to start, so in certain cases, when the screen read “get ready…”, the user would start singing immediately after the recording gesture. In some ways, this is a good problem to have, since it is fairly simple to fix by providing more contrast and more prominent textual cues that stand out against the background and the rest of the UI, so that users know exactly when to start recording. This also applies to the beat loops, which are currently just shown as numbers on the screen; once the block metaphor is implemented, this should be less of a problem as well.

  • There were several issues with using the Kinect to interact with the system. We found that in order for gestures to be accurately recognized, the user must be standing reasonably upright and facing the Kinect; tilting of the body can cause gestures to not be recognized. While we do provide the skeletal outline onscreen, it seems that the users either did not recognize what it was, or just did not look at it to figure out whether gestures would be picked up or not; thus, some gestures (like deletes) were performed repeatedly in ways that would just not be picked up by the Kinect. We found that slow delete gestures worked much better than fast ones, but the users did not seem to realize this, even after a few attempts. In order to fix this, we could provide some indication that the Kinect saw some sort of gesture, but did not know what to make of it; a message on screen along the lines of “Sorry, didn’t catch that gesture!” might go a long way in helping the users to do gestures in a way that is more consistently recognizable. In addition, it seemed that users had some issues moving side to side through columns, as there is actually a considerable distance to move physically if the user is standing far enough back from the Kinect. We do not really consider this a problem, but rather just something that the user needs to acclimate to; perhaps a clearer indication of which column the user is in would help in sending the message that recording in various columns is highly dependent on the user’s physical location, which also helps to reinforce the idea that different structural parts of the music belong in different spatial locations.

  • A more fundamental issue we have currently is that our code and gestures do not deal with the addition of the microphone properly; in this case, we didn’t use a mic stand (which is a reasonable assumption, since most home users who have never touched music recording software would probably also not have a mic stand). Thus, when the user holds the microphone naturally, the recording gesture is less intuitive to perform, but still possible, since it is a one-handed gesture; the two-handed delete gesture is far more difficult to perform while simultaneously trying to hold a microphone attached to a long cable. Thus, a reasonable idea would be to adapt the gestures to be one-handed, so that one hand can always be holding the microphone in a reasonable location, and so that the presence of a hand holding a mic in front of the face does not confuse the Kinect’s skeleton tracking.

  • We also have to work out some technical issues with the Kinect recognizing the user; one problem we saw was that the calibration pose can easily be mistaken for a record gesture if the user has already calibrated. Similar problems also occur if the user ducks out of the frame and then reenters, or to a lesser extent when the user’s skeleton data is temporarily lost and then reacquired; however, these are purely technical issues, not design issues.

  • We believe that repeating these tests with a larger population of testers will not produce vastly different results, since most of the problem spots and observations we found were raised by multiple users and were acknowledged to be potential areas for improvement. In addition, our test users closely approximated the type of users we want for this system, having a good amount of musical experience but very limited technical experience; their feedback was corroborated by the other testers, which suggests that their suggestions and problems correspond fairly well with what we would have seen given a larger population.
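
As a rough sketch of the “didn’t catch that” idea from the first bullet above, the recognizer could sort each attempt into three buckets rather than two. This is not our actual code: the 0-to-1 match score and both cutoff values are assumptions made up for illustration.

```python
from enum import Enum


class GestureResult(Enum):
    RECOGNIZED = "recognized"
    UNCLEAR = "unclear"  # something gesture-like happened, but no good match
    NONE = "none"        # nothing gesture-like at all


def classify_attempt(match_score: float,
                     accept_at: float = 0.8,
                     notice_at: float = 0.4) -> GestureResult:
    """Sort one gesture attempt into a bucket by its match score.

    match_score is a hypothetical 0..1 confidence from the recognizer;
    accept_at and notice_at are illustrative cutoffs, not tuned values.
    """
    if match_score >= accept_at:
        return GestureResult.RECOGNIZED
    if match_score >= notice_at:
        return GestureResult.UNCLEAR
    return GestureResult.NONE


def feedback_message(result: GestureResult) -> str:
    """Return the on-screen text for an attempt ('' means show nothing)."""
    if result is GestureResult.UNCLEAR:
        return "Sorry, didn't catch that gesture!"
    return ""
```

The middle bucket is the point: a fast delete that almost matched would produce a message instead of silence, which is what our testers were missing.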


Consent form:




INVESTIGATING GROUP: Group 25 (Neil C., Harvest Z., Alan T., Vivian Q.)

        The following informed consent is required by Princeton University for any research study conducted at the University.  This study is for the testing of a project prototype for the class “Human Computer Interaction” (COS 436) of the Spring 2013 semester.

Purpose of Research:

        The purpose of this prototype test is to evaluate the ease of use and the functionality of our gesture-based music recording software. The initial motivation for our project was to make simpler software for beat sequencing, allowing one person to easily make complex a cappella recordings. We will be interviewing three students for prototype testing. You are being asked to participate in this study because we want to get more insight on usability and how to further improve our gesture-based system.


You will be asked to listen to our basic tutorial on how to use the gesture-based system and perform three tasks using the prototype. The tasks require that you sing and/or make noises to simulate music recording. We expect your participation to take about 10-15 minutes of your time, and you will not be compensated for your participation but will earn our eternal gratitude.


Your answers will be confidential. The records collected during this study will be kept private. We will not include any information that will make it possible to identify you (such as name/age/etc). Research records will be kept in my desk drawer and only select individuals will have access to your records. If the interview is audio or video recorded, we will destroy the recording after it has been transcribed.

Risks or Discomforts/Benefits:

The potential risks associated with the study include potential embarrassment if the participant is not good at singing; however, we will not judge any musical quality. Additionally, since we are using a gesture-based system, the participant may strain their muscles or injure themselves if they fall.


        We expect the project to benefit you by giving the group feedback in order to create a simpler, more efficient system. Music users and the participant themselves will be able to use the system in the future. We expect this project to contribute to the open-source community for Kinect music-making applications.

I understand that:

        A.     My participation is voluntary, and I may withdraw my consent and discontinue participation in the project at any time.  My refusal to participate will not result in any penalty.

        B.     By signing this agreement, I do not waive any legal rights or release Princeton University, its agents, or you from liability for negligence.

I hereby give my consent to be the participant in your prototype test.





Audio/Video Recordings:

With your permission, we would also like to tape-record the interview. Please sign below if you agree to be photographed and/or audio/videotaped.

I hereby give my consent for audio/video recording:





 Demographic questionnaire:

Here’s a questionnaire for you.


Age  _____


Gender  _____


Education (i.e. Princeton) __________________


Major (ARC, FRE, PHI…) _________



Have you ever in your life played a musical instrument? (Voice counts) List the instruments you have played.





Tell us about any formal musical training you’ve had or any groups you are involved in (private lessons, conservatory programs, Princeton music classes, music organizations on campus, etc).





Have you ever used music recording or live performance software (Audacity, Garageband, Logic, ProTools, Ableton, Cubase, etc)? List all that you’ve used and describe how you’ve used them.





Have you ever used music recording/performance hardware (various guitar pedals like loop/delay/fx, mixer boards, MIDI synthesizers, sequencers, etc)? List all that you’ve used and describe how you’ve used them.





Are you a musical prodigy?


Yes ______      No ______



Are you awesome?


Yes ______

 Demo script:

Your Job…


Using  the information from our tutorial, there are a few tasks we’d like you to complete.



First, we’d like you to record a simple loop in any one of the sections. It can have whistling, percussion, singing, etc.


Next, we’d like you to record any number of loops in three different sections and incorporate the delete feature.


Finally, we’d like you to create an organization to your loop recording. For example, record two percussive sections in the first section, one vocal section in the second section, and maybe a whistling in the third section. The specifics don’t matter, but try to incorporate structure into your creation.


Additionally, we’d like you to complete the final task another time, to see how quickly and fluidly you can use our system. We’ll tell you exactly what to record this time: 2 percussive sections, 1 vocal (singing section), and 1 whistling section.

Post-task questionnaire:

AcaKinect Post Trial Survey


1. What is your overall opinion of the device?


2. How did you feel about the gestures? Were they difficult or confusing in any way?


3. Are there any gestures you think we should incorporate or replace?


4. What about any functionalities you think our product would benefit from having?


5. Do you feel that with practice, one would be able to fluidly control this device?


6. Any final comments or questions?

Raw data:

Lessons learned from the testing:

  • Main problem — text visibility! Need to make the text color more visible against the background and UI clearer, so the users know exactly when to start recording and can see the beat loops.

  • Realized our code did not deal with the addition of the microphone properly, such as how the user naturally holds the mic (interferes with the recording gesture), or the delete function (when holding the mic, hands do not touch)

  • Person’s head needs to be perpendicular to the Kinect or their gestures won’t record as well

  • People didn’t utilize the skeletal information displayed on screen, seems like they didn’t understand how the gestures translated

  • Slow delete gestures were best, compared to fast gestures which were hard to track by the system

  • Calibration pose mistaken for a record gesture if it is not the first time calibrating (user leaves screen and then returns), need to check for that.

  • We found out that the Minim library we used requires us to manually set the microphone as audio input, so during the second subject test we determined that we were actually using the laptop’s built-in mic, which accounted for the reduced quality of sound and feedback from previous loop recordings.
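
The calibration check mentioned in the list above could look something like the following. This is only a sketch of one possible approach, not what our code currently does; the pose names and returned action names are hypothetical, and it assumes the pose detector can already label a pose as "calibration".

```python
class GestureSession:
    """Tracks whether the user has calibrated, to disambiguate one pose."""

    def __init__(self) -> None:
        self.calibrated = False

    def on_pose(self, pose: str) -> str:
        """Map a detected pose name to an action name.

        Pose names ('calibration', 'record', 'delete') are hypothetical.
        """
        if pose == "calibration":
            if self.calibrated:
                # Already calibrated: swallow the pose so it cannot be
                # misread as a record gesture.
                return "ignore"
            self.calibrated = True
            return "calibrate"
        if not self.calibrated:
            return "prompt_calibration"
        return pose  # 'record' or 'delete' pass through once calibrated

    def on_skeleton_lost(self) -> None:
        """Call when the user leaves the frame; require recalibration."""
        self.calibrated = False
```

Resetting the flag when the skeleton is lost covers the case where the user leaves the screen and returns, since the next calibration pose is then treated as a calibration again.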



Test Subject #1:



  1. 20

  2. M

  3. Princeton

  4. English

  5. Nope

  6. N/A

  7. Yes; garageband, Audacity (Recording Raps)

  8. No

  9. No

  10. No


00:00 Begin reading of the demo script

00:13 Subject agrees current music recording interfaces are hard to use

02:30 Question about why there are different boxes. We explain that it is a way to organize music.

04:09 Begin testing for Task 1

04:25 Not clear about how to calibrate, asks a question

04:45 Tries to start singing before the recording starts, can’t see messages or numbers on the screen very well, doesn’t think the blue text color is visible

05:01 Begin testing for Task 2

05:37 Attempts the same delete gesture six times. Subject thinks the delete function is hard to use. The quick motions of the subject are hard for the Kinect to track. Also, the subject was holding the mic, so their hands were not perfectly together. This caused our gesture detection to not register the control request.

06:45 Subject says, “I can’t see when it’s recording” referring to the text. Color is hard to distinguish against the background.

06:50 Begin testing for Task 3

07:00 Gestures are harder to detect on edges, subject’s attempt to delete on the edge needs to be repeated several times

07:42 Recording did not register when holding the mic up at the same time. Need more flexibility with the gesture command.

Time testing (Recording tracks): 5s, 6s, 8s, 9s

Time testing (Deleting tracks):  20s, 10s, 15s, 11s



Test Subject #2:



  1. 18

  2. F

  3. Rutgers

  4. Literature

  5. Yes; violin and piano

  6. Private lessons

  7. No

  8. No

  9. No

  10. Yes


00:00 Begin reading of the demo script

02:26 Begin testing for Task 1

02:35 Ask about how to configure (get the Kinect to recognize their body)

02:48 Subject says that they can’t see the information on the screen

03:28 Subject starts recording and begins singing immediately, because the “Countdown…” message was not clearly visible.

04:07 Subject repeats that the information on screen is hard to see

04:46 Begin testing for Task 2

05:03 Subject missed the start of the recording because they couldn’t see the on-screen text

05:17 Second attempt to record in the same column

05:46 Delete action was not successful

05:51 Delete succeeded

06:25 Begin testing for Task 3

07:44 Needed to stop and reconfigure because the Kinect lost tracking of body

09:32 The subject did not put their arm high enough so the recording command was not detected

Time testing (Recording tracks): 5s, 4s, 5s, 5s

Time testing (Deleting tracks):  7s, 10s, 15s, 11s

Test Subject #3:



  1. 20

  2. F

  3. Princeton

  4. ORFE

  5. Yes; piano and flute

  6. Private lessons, high school orchestra

  7. No

  8. No

  9. No

  10. Yes


00:00 Begin reading of the demo script

02:10 Begin testing for Task 1

02:11 Subject didn’t remember to calibrate before attempting to record

02:20 Calibration

02:22 Subject records

02:54 Begin testing for Task 2

03:30 Subject easily uses the delete command

03:41 All recorded tracks deleted

Summary: not much trouble, only calibration

04:42 Begin testing for Task 3

04:50 Records first (leftmost) section

05:11 Records second section

05:30 Records third section

05:56 Records fourth section. Problem in the last section because subject only used right hand to signal the recording command, but in the last section the right hand was off the screen. The hand was out of the image and not picked up, so after a few tries the subject had to switch to the left hand.

Time testing (Recording tracks): 7s, 8s, 8s, 8s

Time testing (Deleting tracks):  7s, 8s, 11s, 12s
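
To compare the timed trials across subjects, the per-subject averages can be computed directly from the numbers logged above. The short script below does only that; the times are copied from the raw data, and the helper name `mean_times` is just for illustration.

```python
from statistics import mean

# Times in seconds, copied from the "Time testing" lines in the raw data.
recording = {
    "Subject 1": [5, 6, 8, 9],
    "Subject 2": [5, 4, 5, 5],
    "Subject 3": [7, 8, 8, 8],
}
deleting = {
    "Subject 1": [20, 10, 15, 11],
    "Subject 2": [7, 10, 15, 11],
    "Subject 3": [7, 8, 11, 12],
}


def mean_times(trials):
    """Return the average time per subject for one kind of trial."""
    return {subject: mean(times) for subject, times in trials.items()}


recording_means = mean_times(recording)
deleting_means = mean_times(deleting)

for subject in recording:
    print(f"{subject}: record {recording_means[subject]:.2f}s, "
          f"delete {deleting_means[subject]:.2f}s")
```

For every subject the average delete time is longer than the average record time, which matches the observation that the delete gesture was the harder one to get recognized.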


P6: The GaitKeeper

a) Group 6, GARP

b) Alice, Rodrigo, Phil, Gene

c) Our product, the GaitKeeper, is an insole pad that can be inserted into a shoe, and an associated device affixed to the user’s body, that together gather information about the user’s gait for diagnostic purposes.

d) The GaitKeeper can be placed inside a shoe and uses flex/pressure sensors throughout its surface to register data about a user’s gait. This information can be loaded into a GUI, where users can see a heat map of the pressure on the bottom of their foot changing with time. By making data collection and analysis simple, we intend to allow runners to observe the eccentricities of their own gait without the aid of more expensive devices. We also hope to make the analysis comprehensive enough that a running store operator can use it to better advise a customer, or a sports medicine practitioner can use it to diagnose gait problems in patients. Our experiments are meant to test whether the prototype is simple enough to operate in all our intended use cases and whether the data analysis is comprehensive enough to be worthwhile.

e) The previous writeup can be found here: https://blogs.princeton.edu/humancomputerinterface/2013/04/21/p5-garp/

Here are the changes we have made since P5:

  • The sole is now connected to the Arduino. We have soldered all of the wires to the breadboard.

  • The Arduino and breadboard are now attached to the velcro band that will hold them up.

  • Basic Arduino code has been written to collect data, but it must be connected to a laptop and the data is not fully formatted.

  • The GUI has been tweaked and some of the buttons are now functional. The GUI still does not directly respond to data input from the prototype itself.

f) i. Participants:

The first participant was an employee at a local running store. We are envisioning the GaitKeeper as being used by employees at running stores to help with custom shoe fitting. This participant is one of our potential users and we wanted to see whether the product provided information that previous services have been unable to provide. Our second participant is an avid student runner who has run regularly for years. After hearing about our project, he volunteered to give it a try and provide useful feedback.

Our third participant was a less frequent runner, but tends to run for longer distances.  He had never been to a running store for gait analysis.  We considered him a typical runner and a good indicator of whether the product might have a market outside of the hard-core running group.

ii. Apparatus:

To test the device, we asked users to place the device’s sole into their shoe. The wires go to the back of the sole, up the backside of the shoe and the leg, into the Arduino/Breadboard strapped to the back of the user’s waist. We asked users to put it on themselves to see if they had any trouble placing the sole. Then, we asked them to run around and see whether the device impaired their running in any way. In the case of the first participant, we conducted the test on the local running store’s treadmill. In the case of the other participants, we asked them to run around outside and we followed them with a laptop to collect data.

iii. Tasks:

In the first task (easy) the user looks at the display of a past run to evaluate their gait. They use the computer interface to examine the heat map from an “earlier run” and see if the gait has any eccentricities. In this task, they must be able to recognize if any part of the gait stands out. They must also be able to navigate the data as it changes with time and understand what the information means. Ideally, this step should provide them with actionable intelligence that they can use on their next run.

The second task (medium) is a user putting on the device for the first time with minimal instruction. This will allow us to understand whether the device is simple enough for the average user to install. This will also allow us to observe whether the placement of wires is inhibiting the usability of the device. If the device is intended for use by the average runner, usability must be a very high priority for us.

Lastly, for the third task (hard), the user goes for a run with the device on. We need to know whether the device is placed in such a way that it will not affect their gait. Feedback on the device’s weight and comfort is very important in this task. To complete this task, the user must plug the device into the computer at the end of their run and input the data using the UI.

iv. Procedure:

For the first task, we asked users to look at our mock GUI and explore it. We asked them if they could tell us anything interesting from this example person’s gait. In the second task, we asked them to sit down, install the device’s sole in their shoe to the best of their abilities and strap on the device to their waist. In the last task, we asked the user to run with the device and connect it to a computer for data input.

g) Test Measures

  • In the first task, we simply measured how long it took for the user to make an observation about the person’s gait. We felt that this was directly dependent on the usability of the GUI.

  • In the second task, we measured how long it took for the user to install the device correctly. If it was complicated, we expected the user to take a long time.

  • In the third task, we took sample data from the user. This is important for further development of the heatmap of the GUI.
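
To give a sense of how the sample data from the third task could feed the heat map, here is a minimal sketch that scales one frame of raw sensor readings to intensities between 0 and 1. It is an illustration rather than our GUI code: the function name is made up, and the default 0 to 1023 range assumes the readings come straight from a 10-bit Arduino analog input.

```python
def normalize_frame(readings, lo=0, hi=1023):
    """Scale raw sensor readings to 0.0..1.0 heat-map intensities.

    lo and hi are the assumed raw range (10-bit analog input by default);
    readings outside that range are clamped.
    """
    span = hi - lo
    return [min(1.0, max(0.0, (r - lo) / span)) for r in readings]


# One made-up frame with four sensors, for illustration only.
frame = [0, 1023, 2000, -5]
print(normalize_frame(frame))  # [0.0, 1.0, 1.0, 0.0]
```

A fixed raw range keeps colors comparable between frames, and it would also make it straightforward to draw a numeric scale next to the heat map.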

h) Results and Discussion

The running store employee gave us some very useful feedback. She mentioned that there was a fair amount of bunching of wires at her toes, which was caused by the prototype crumpling as it was put into the shoe. The employee suggested that removing the insole of the shoe might be a good idea. While putting on the device, the employee needed help: the velcro straps were difficult to manage alone. She said that it felt similar to a field belt and was an acceptable weight. When the ankle strap was not velcroed, the wires flopped around and were almost stepped on. The extra velcro helped, but required our explanation of how it should be used. When asked about actually using the device, she said that she could definitely run normally for a short amount of time, which is all that is needed for a store fitting, but the wires might make going for an actual run with the device difficult.

Our second user did not think the sole’s thickness was uncomfortable. He thought the material was a little sticky and was caught on the adhesive we used to keep the device together. The resizing system of the sole was not used, but the user easily understood its worth and how to potentially use it. He liked the idea of having live feedback on his gait and picked up the GUI easily. Lastly, this user would have preferred to have another sole so he would not have to take it off to measure his other foot.

The third user found the thickness to be acceptable, and did not mention the wires.  When asked, he said that he noticed the feeling of the wires, but did not find them irritating.  He had large feet, and did not make use of the resizing system.  He enjoyed the UI a lot, although he had some difficulty understanding the forwards/backwards navigation through screens.  He thought the heat map was interesting, but asked for a scale to indicate how much pressure there was (as a science major, he felt that it would be nice to see the actual values).  He had some difficulty interpreting the results, and asked us what sort of shoe we would recommend based on the results, which were simulated.

If we can, we would like to make a thinner but more rigid foot pad to reduce the bunching in the toe that the first user mentioned. It might also be a good idea to have something to attach the velcro to so that the device does not get tangled up when put away in storage (which happened between tests!). We also hope to complete our interface and tweak a few buttons to make it more understandable.

i) Appendices

This is the demo script: https://docs.google.com/document/d/1lhKJ_DIPd-ytAH5JQWKngYscE7y46kkFy9fNnUEut2c/edit?usp=sharing

This is the user consent form: https://docs.google.com/document/d/1BLw8CSQ7SaTtjhxvk9GwS88LrsquEdwAS13M1yFo5j0/edit?usp=sharing

This is the user questionnaire: https://docs.google.com/document/d/1WERwLGZSJTUI0_e95mPIFo-oXCknQVvQ5wQgmq-U2ZE/edit?usp=sharing


Pictures and videos from testing:







P6 Grupo Naidy

Group Number: 1

Group Members: Kuni Nagakura, Avneesh Sarwate, John Subosits, Joe Turchiano, Yaared Al-Mehairi

Project Summary: Our project is a serving system for restaurants, called ServiceCenter, that allows waiters to efficiently manage orders and tasks by displaying information about their tables.


Here, we describe the methods and rationale behind a user study geared to aid us in the design of our serving system. The goal of our system is to present pertinent information to servers in restaurants, allowing them to manage tasks more efficiently. The prototype being evaluated is a single laptop connected to a display projector, which serves as the motherboard. The laptop acts as both an order input interface for the servers and a kitchen-side order status interface for the kitchen staff (this task was performed by a group member). We chose to use a laptop as the basis of our hardware because the primary purpose of our testing was for functionality and usability, and any interaction with a device in our system is a mouse-screen interaction, which a laptop provides well. Furthermore, the laptop allowed us to display customer information on a large screen, which is how we envision our motherboard looking in our final design. As we have discussed, the purpose of our user testing was to gain insight into future design improvements. This being the case, our user study consisted of three tasks, ranging from easy to hard, that covered the main actions that a server would perform using our system, described in the sections below.

Implementation and Improvements:

Our P5 post can be found here: http://blogs.princeton.edu/humancomputerinterface/2013/04/22/p5-grupo-naidy/

We do not have any new changes to our implementation, as we were preparing for P6.


The participants in our research and testing were all members of the university community. They were selected purely on a volunteer basis without any specific requirements; we did give them a quick demographic survey in order to gain a deeper perspective on who we were working with and how they would relate to our target audience. All of our test subjects were familiar with computers and used them regularly; one of our test subjects had some prior experience working at a restaurant. Overall, we found our participants to be related only tangentially to the target user group, but since the primary purpose of our testing was for functionality and the tasks were relatively simple, we found this was sufficient. In particular, we had one user who had experience in the service industry–his post-testing interviews were incredibly helpful. Otherwise, our subjects did not have waiting experience.


The testing was conducted in the dining room of one of Princeton’s eating clubs, at times when meals were not being served. We found this setting to be ideal, since it nearly replicated the environment of a restaurant and provided the necessary equipment (plates, glasses, a pitcher) without having to impose on any third party. A laptop computer was used to run the prototype system, and it was connected to a projector to show the order display on a large, highly visible screen – more or less how we imagine the final product to be used. The computer ran both the order display interface, into which the test subject entered new orders, as well as the “kitchen staff” interface, which was used to update the orders. The kitchen interface’s data was sent over a simulated wireless connection to better replicate the intended use circumstances.


In our working prototype, we have chosen to support an easy, medium, and hard task. For the easy task, users are simply asked to request help if they ever feel overwhelmed or need any assistance. This task is important to our system, because one of our functionalities is the ability to request assistance from waiters who may have a minute to spare and are watching the motherboard.

For our medium task, we ask users to input order information into the system for their table. Once users input orders, the motherboard will be populated with orders and the kitchen will also be notified of the orders. This is a medium task since this is simply inputting information.

Finally, for our hard task, we ask our users to utilize customer statuses (i.e., cup statuses and food orders) to determine task order. This task covers both ensuring that cups are always full and that the prepared food is not left out for too long. The motherboard’s main functionality is to gather all this information in an easy-to-read manner, allowing waiters to simply glance at the customer status to determine in what order they should do things.


The testing procedure was designed to place the test subject in as accurate a simulated restaurant environment as possible.  The tasks that can be accomplished with the system had to be performed interspersed with each other, as they would be in a real restaurant, rather than in isolation.  Three of the group members sat at different tables and each played the part of 3 to 4 customers.  The tester acted as the waiter for about 15 minutes while the customers ordered a standard, prescribed combination of appetizers, entrees, and desserts.  The tester was responsible for entering the orders into the system and for bringing the “food” (actually paper labels) to the table after it had been prepared and plated.  Each dish took a specific length of time before it was ready.  Periodically, generally about twice a test, the tester would have to fill the water glasses at each table since the motherboard indicated that they were empty.  Participants generally did a good job of announcing both what they were doing, e.g. “I am refilling the water glasses,” and what they were thinking, e.g. “It’s confusing that red means the water glasses need to be refilled but that the food isn’t ready yet, so I don’t have to do anything.”

Test Measures: 

“Turn around time” – time between a dish being put out by the kitchen and being delivered. This was measured by hand, because electronic measurements depended on users correctly using the system, which would have made the metric dependent on other variable factors (discussed later).

Post-Test ratings by users (Self reported Likert scale 1-5. The results of these questionnaires are provided in the Appendix):

  • “Recoverability” – Whether the system made it easy to recover from mistakes. We felt that this was important because in a system where “online” decisions are being made, the system must be especially robust to user error.
  • “Easiness to Learn” – Since this system would be deployed in existing restaurants, the restaurants would want to minimize the time that users (waiters) would take to adapt to the new system and start working at their maximum efficiency. Thus, we felt this an important metric to measure.

Results and Discussion: 

Testers had some trouble interpreting the graphical display on the motherboard.  In particular, they suggested that the color coding of various events could be made more intuitive and that the presence of empty water glasses could be indicated in a more obvious way than the number of full or half-empty ones.  Basically, the display of things that need to be urgently addressed should grab the user’s attention and colors should be consistent. If red is to indicate attention for cups, it should not indicate “not ready” (and thus requiring no attention) for orders.  Perhaps in the future, the red circle that contains the number of empty water glasses can be made to flash.  On the bright side, users seemed to have a fairly easy time entering orders into the terminal for the kitchen to prepare. One thing that we were interested in seeing was whether or not order input would be a hindrance to servers.

Overall, the testers were fairly efficient in the use of the system. We had wanted to put a “heavy load” on the testers to see how they would respond to a lot of activity, and we did this by scheduling what we thought was a high number of orders (9 “orders” comprising about 25 dishes) over the course of 10-12 minutes. The average “turn around time” of the users was quite low. We decided to split the data into instances where the waiter was in the “kitchen” area when the dish came out and instances where the waiter was not. Unsurprisingly, when the waiter was around, the average turn around time was negligible. When the user was not in the kitchen, the average across users was about 30 seconds per order.
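
The split described above amounts to averaging two groups of hand-timed observations separately. The sketch below shows that calculation; the function name and the sample values are made up purely for illustration and are not our measurements.

```python
from statistics import mean


def average_turnaround(observations):
    """Average turn around time, split by whether the waiter was present.

    observations: list of (seconds_waited, waiter_in_kitchen) pairs.
    Returns None for a group that has no observations.
    """
    in_kitchen = [t for t, present in observations if present]
    away = [t for t, present in observations if not present]
    return {
        "in_kitchen": mean(in_kitchen) if in_kitchen else None,
        "away": mean(away) if away else None,
    }


# Made-up example values, NOT data from our tests.
sample = [(1, True), (3, True), (20, False), (50, False)]
print(average_turnaround(sample))  # {'in_kitchen': 2, 'away': 35}
```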

However, we realized after the fact that we did not have a “control set” of turn around times, which would ideally come from waiters working in a real restaurant environment. We noticed that waiters had the most trouble when sets of orders from different tables came out at the same time, and in a real restaurant setting, there could be more tables, and thus more of these challenging situations. Our setup used only 3 “tables” of guests placing orders, but it may have been more accurate to have a larger number of tables placing orders with a lower frequency. We also noticed that none of the users decided to call for help in this trial. This is most likely because we used paper plates during the trial, which allowed users to pick up as many plates as needed when delivering orders. The physical space we were using to test was also quite small, allowing users to move back and forth between patrons and the “kitchen” very quickly and to deliver a large number of plates without much delay. Since this system is meant to help with actions whose durations are very short, environmental influence on action time can be proportionally very significant, and care must be taken to recreate an accurate environment to get meaningful data.

Users seemed to figure out very quickly how to enter the data into the interface for a new order, and by the time they were entering their second or third full order they had become quite efficient with the system. However, we noticed that several test subjects seemed to forget about the requirement to delete the orders after they were ready and delivered. Since there is a limited number of order slots for each table, any order input on a full table would not go through. This resulted in at least one mistake where an order was lost in translation from paper to prototype when the user did not look up at the screen immediately to check that the data had been entered successfully. We have also considered making the ready orders disappear automatically to avoid this problem, but this raises the additional problem of the system needing to know when the plate has been taken out. In this case, at the very least a warning to tell the user that a table is full of orders would be a good addition to the prototype.
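
One way to add the full-table warning would be to make the order entry fail loudly instead of silently dropping the order. The following is a minimal sketch of that idea, not our prototype’s code; the class names and the slot limit of 4 are assumptions for illustration.

```python
class TableFullError(Exception):
    """Raised when a table has no free order slots."""


class Table:
    def __init__(self, name, max_orders=4):
        # max_orders is an illustrative limit, not the prototype's value.
        self.name = name
        self.max_orders = max_orders
        self.orders = []

    def add_order(self, dish):
        """Add an order, or raise so the UI can warn the waiter."""
        if len(self.orders) >= self.max_orders:
            raise TableFullError(
                f"{self.name} is full: clear delivered orders first"
            )
        self.orders.append(dish)

    def clear_delivered(self, dish):
        """Remove an order once it has been delivered to the table."""
        self.orders.remove(dish)
```

The input screen could catch `TableFullError` and show its message, so the waiter finds out at entry time rather than by glancing up at the motherboard later.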


All surveys, forms, and scripts are located in this dropbox folder. All the collected raw data is also located in this folder.

P6 User Testing of Gesture Glove

Group 15 : The LifeHackers

Prakhar Agarwal, Gabriel Chen, and Colleen Carroll

The Gesture Glove in a Sentence

Our project is a glove that allows us to control (simulated) phone actions by sensing various hand gestures.

Our System

The system being evaluated simulated commonly used phone functions that users could perform off the screen using the Gesture Glove we built in the previous assignment. The sensor readings that mapped to built-in gestures let users perform 3 tasks (see the Tasks section under Method). The purpose was to see if different users would be able to easily and intuitively perform the 3 tasks, and to look for potential improvements that could be implemented in future iterations of our system. The rationale of the experiment was that if any of the users had difficulty with any of the tasks, then improvements would need to be made in order to let all users interface with the system comfortably and conveniently.

Implementation and Improvements

Our submission for P5 can be found here: http://goo.gl/DB4Sq. Since P5, we left the general structure of the prototype the same for P6. A couple of quick changes were made to the interface, though:

  • We changed the threshold values for a number of the built-in gestures to better match the different hand sizes and maneuverability of different people.
  • We lowered the delay between cycles of glove readings to allow for higher sensitivity.



Our users were chosen out of students in public places. We tried to vary gender and technical background. The first was a 20 year old male named Josh who was studying in Brown Hall, and is a computer science major. The second was a 21 year old female named Hannah who is a Chemical and Biological Engineering major with a certificate in the Woodrow Wilson School. She was using her iPhone in Frist. Lastly, we chose a  20 year old male named Eddie who was studying in Frist. He is an Economics major who owns an iPhone.


We conducted our test in a secluded corner in Frist campus center. Our equipment was our laptops, one of which was used for the phone simulator, and the Gesture Glove. Two members of our team recorded critical incidents, while the other read the demo script to the user.


The first and easiest task we implemented is picking up a phone call. A user simply puts his hand into the shape of a phone to pick up a call and then can do the ‘okay’ hand motion in order to end the call.

Our second task is more difficult, as there are more commands and gestures to be recognized. This task allows users to control a music player with the glove. By using a preset set of hand motions, a user can play and pause music, navigate between a number of tracks, and adjust the volume of the music that is playing.

Finally, our last task involved allowing users to set a custom password represented by a series of three custom hand gestures, which the user can then use to unlock his phone. This is the most difficult task, as it involves setting gestures oneself and then remembering the sequence of gestures that were previously set in order to unlock the phone.


Users were chosen from public areas and asked if they would be able to spare 5 minutes for our project study. The study was focused on 3 main tasks, which the users had to complete as described above. One team member prompted the user to complete these tasks using the following demo script: http://tinyurl.com/cqwktog. The users wore the Gesture Glove and interacted with a simulated smartphone screen on the computer, while two members noted the critical incidents of the testing session.

Test Measures

The bulk of our study was on qualitative measures because of the nature of the tasks that we asked the users to complete. Picking up the phone and hanging up take a trivial amount of time. The users were asked to experiment with the music player, which meant any amount of time could be used. Lastly, unlocking the phone took exactly 9 seconds each time a user attempted to unlock. For these reasons we did not measure time per task.

The following metrics were studied:

  • number of attempts to successfully unlock the phone
  • qualitative response to the system based on a Likert scale (sent to the users as a questionnaire at the following link: http://tinyurl.com/p6questionnaire with these results: http://tinyurl.com/p6likert )
  • general feedback during study – positive or negative
  • observations of how users made gestures during session
  • time to set password

Results and Discussion

Some of the most common user feedback was that unlocking the screen was hard to do. Our original implementation has users enter a gesture, hold it for 3 seconds until a green dot shows up on screen, and then move on to the next gesture. The purpose of this was to keep the password secure. If the program told you as soon as you made one right or wrong gesture, a thief could eventually unlock your phone by process of elimination. However, we received feedback that unlocking took too long or too short a time, or should not require looking at the screen. It was apparent that we were sacrificing usability for security. We also realized that on current unlocking systems different people choose between a more complex password for security or a simple password for convenience.  Considering all of the design trade-offs involved, we decided to choose a middle road. We will provide some basic security, but let the system be flexible enough for users to be able to make and use a password according to their preferences.
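The deferred feedback described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the constants, and the use of strings to stand for gestures are hypothetical and not taken from our prototype code. The point it shows is that success or failure is reported only after all three gestures are entered, so a wrong guess reveals nothing about which gesture was wrong.

```python
HOLD_SECONDS = 3      # each gesture is held for 3 seconds (from the write-up)
PASSWORD_LENGTH = 3   # a password is a series of three gestures


def check_unlock(stored, attempt):
    """Return True only if the whole attempt matches the stored password.

    No per-gesture result is exposed: the caller learns success or failure
    only after all gestures have been entered.
    """
    if len(stored) != PASSWORD_LENGTH or len(attempt) != PASSWORD_LENGTH:
        return False
    matches = [s == a for s, a in zip(stored, attempt)]
    return all(matches)


def unlock_duration():
    """Seconds one unlock attempt takes, whether or not it succeeds."""
    return HOLD_SECONDS * PASSWORD_LENGTH


stored = ["fist", "okay", "phone"]  # hypothetical gesture labels
print(check_unlock(stored, ["fist", "okay", "phone"]))
print(check_unlock(stored, ["fist", "phone", "okay"]))
print(unlock_duration())
```

A more flexible version, in line with the middle road we describe, might let the user choose the hold time or the number of gestures.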

Almost everyone had issues using the thumb sensors. Even our team, who by this point are very used to using the system, occasionally has issues with it. Upon closer observation during the usability testing, we realized that users don’t always press the same part of the thumb. This varies even for a particular individual. They may sometimes press the very tip, sometimes the middle of the area between the tip and the knuckle, and sometimes very close to the knuckle. What is even harder than just hitting the thumb on the right spot (without looking at it) is to get the forefinger and thumb sensors right on top of each other in order to activate both at once. Our conclusion is that the thumb needs larger pressure sensors. With proper equipment, we could imagine getting a sensor that covers the entire surface of the part of the thumb above the knuckle. Because the thumb is critical in a lot of gestures (it is used to activate the pressure sensors of the other fingers), we believe this would be a very important fix in future iterations.

The interface for setting a password clearly needed more instruction. The main problem was that users did not realize that they needed to press set before making the gesture and then press save. This could be fixed with a simple line of instruction at the top. A more complicated problem is the visualization of the gesture that the system registers while the user is setting the password. We had a user whose gesture was not registering as what he intended. This was apparent to us from the visualization, but he did not notice. This could have resulted in the user setting a password that he thinks is one sequence of gestures while the machine thinks it is something else, so that the user’s unlocking would rarely, if ever, be successful. This tells us that we may want a visualization that is even easier for a user to understand, for example, a visualization of a hand doing the gesture that the machine is registering. We also had a user who made his password, then couldn’t remember what he had done for the password and made a simpler one instead. This again tells us that he couldn’t read the visualization easily enough to quickly recall what he had done. This is further justification for a more intuitive visualization.

Overall, the system got some very positive reactions. Though we definitely have a number of improvements to make, we got comments throughout user testing like “This is sick!” and “Awesome!” These recommendations for improvement as well as the positive reactions are reflected in the responses we got from the questionnaire we had participants submit after testing the interface (link to results can be found in the Appendix). Along with asking for subjective feedback, we had users rank how much they agree with certain statements on a Likert scale where 1 represented “Strongly Disagree” and 5 represented “Strongly Agree.” The average results are shown in the table below. As we see, users had the most trouble using the password interface. Through testing, we found that users’ difficulties were sometimes caused by technical glitches (the wiring got unplugged) and other times by the issues we discussed above. We have tried to address these concerns as well as possible in the discussion above.

It was easy to use the gesture glove to pick up and end phone calls.


It was easy to use the glove to control the music player.


The interface to set a password with the glove was intuitive and easy to use.


The interface to unlock the phone with the glove was intuitive and easy to use.


The gestures chosen for built in functionality made sense and would be easy to remember.



Materials given to or read to users :

  • consent form and demographic info : http://tinyurl.com/p6consent
  • task script: http://tinyurl.com/cqwktog
  • post-interview questionnaire: http://tinyurl.com/p6questionnaire
  • demo script:   We didn’t demonstrate our system before letting users test it; instead, we demonstrated the built-in gestures that we wanted to exhibit before each task, while the user had the glove on. This way, the gestures could be explained more efficiently and the users could use them right away.

Raw Data


P6 – Group 8 – Backend Cleaning Inspectors

Your group number and name.

Group 8 – Backend Cleaning Inspectors


First names of everyone in your group.


Tae Jun




A 1-sentence project summary.

Our project is to make a device to improve security and responsibility in the laundry room.


Introduction (5 points). 1 paragraph. Introduce the system being evaluated and state the purpose and rationale of the experiment.

We have built a laundry security system prototype that is intended to be used in Princeton University laundry rooms to help protect students’ personal clothing when it is left alone to be washed and/or dried.  The purpose of this experiment is to test out three tasks that will be performed by users of our system.  We wish to observe and analyze test participants attempting each task and to determine if there are any changes we need to make in the implementation of these three tasks.


Implementation and Improvements (5 points). Provide a link to your P5 submission. If you have changed your working prototype at all since submitting P5, supply a brief bulleted list of the changes made since P5. (It is not necessary to change your prototype from P5 before doing P6.) This section should be no more than 1 paragraph.



Method (10 points).

Participants: 1 paragraph. (Who they are, how they were selected.)


Participant 1)  21 year old Junior, Male, MAE major


Participant 2)  19 year old Freshman, Female, undecided


Participant 3)  19 year old Freshman, Male, undecided/Econ


See demographic questionnaire results below for more information.


We set up our study in a laundry room in Butler, and approached the first three people to enter the room.  They all agreed to do the study and both genders were represented.

Apparatus: 1 paragraph. (What equipment did you use, where did you conduct the


We used our model prototype to test the three tasks.  Our model prototype currently does not need any wizard-of-oz techniques, so we were able to use the prototype on its own.  We tested it in a public laundry room at Princeton University.


Tasks. ~1/2 page. Describe the tasks you have chosen to support in this working

prototype (3 short descriptions, 2–3 sentences each; should be one easy, one

medium, one hard). If you have not changed the tasks from P5, you can re-use your text from P5 here. Otherwise, if you have changed the tasks, explain how and why. In any case, explain why you have chosen these tasks.

1. Current User: Locking the machine:

– The Current User inputs an ID number into the locking unit keypad. The product will then try to match this number to a netid in our system. If it finds a match, it will ask the user if this is their netid. They can then answer yes or no, and if yes, the machine locks. This netid is also the one that will receive emails if an alert is sent. This task is medium in difficulty, as the user has to ensure that he/she enters the right 9-digit number.

2. Next User: Sending message to current user that laundry is done and someone is waiting to use the machine:

– When there are no machines open, the next user can press a button to send an alert at any time during the cycle. When the button is pressed and the ID of the next user is verified, an alert will be sent to the current user saying someone is waiting to use the machine. The difficulty of this task is easy/medium, as the next user has to input his/her 9-digit ID number.

3. Current User: Unlock the machine:

– If the machine is currently locked and the current user wishes to unlock the machine, the current user must input his Princeton ID number. Once he has done this, the system checks this ID and tries to find a match to a netid. If it does, it will ask the user if this is his netid and, on yes, it will unlock. This is a medium/hard task, as the user must input his number and confirm to unlock.
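The lock and unlock flow in tasks 1 and 3 can be sketched roughly as follows. This is a minimal Python illustration, not our device's firmware: the `LaundryLock` class, the `NETIDS` table, and the sample ID are hypothetical, and the rule that only the netid that locked the machine can unlock it is our assumption about the intended behavior.

```python
# Hypothetical ID-to-netid table; the real data source is not shown here.
NETIDS = {"123456789": "jdoe"}


class LaundryLock:
    def __init__(self, directory):
        self.directory = directory
        self.owner = None  # netid of the current user while locked

    def lookup(self, id_number):
        """Return the netid for a 9-digit ID, or None if there is no match."""
        if len(id_number) != 9 or not id_number.isdigit():
            return None
        return self.directory.get(id_number)

    def lock(self, id_number, confirmed):
        """Task 1: lock if the ID matches a netid and the user confirms it."""
        netid = self.lookup(id_number)
        if self.owner is None and netid is not None and confirmed:
            self.owner = netid  # this netid would receive alert emails
            return True
        return False

    def unlock(self, id_number, confirmed):
        """Task 3: unlock if the confirmed netid is the one that locked."""
        netid = self.lookup(id_number)
        if self.owner is not None and netid == self.owner and confirmed:
            self.owner = None
            return True
        return False


machine = LaundryLock(NETIDS)
print(machine.lock("123456789", confirmed=True))
print(machine.unlock("123456789", confirmed=True))
```

The `confirmed` argument stands in for the yes/no answer the user gives when the device displays the matched netid.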


We chose these specific tasks because they are the three main (and only) tasks our system is designed to execute, at least in this model. In future models, additional features could be implemented.


Newly added to the machine was a grace period allowed after the first alert had been sent (if the alert button is pushed while the laundry is still running, aka during “laundry time”, then the alert is sent immediately after “laundry time” is finished).  This specific feature did not have any impact on the above tasks; however, we did explain the concept to the participants and observed their reactions to it.
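The timing rule for the alert and grace period can be written out as a small sketch. The function names are hypothetical, times are measured in minutes since the start of the cycle, and the 10-minute grace period is only an example value, since the write-up does not fix a length.

```python
def alert_send_time(press_time, cycle_end):
    """An alert requested during "laundry time" waits until the cycle ends;
    one requested after the cycle has finished goes out right away."""
    return max(press_time, cycle_end)


def grace_period_end(press_time, cycle_end, grace_minutes):
    """The grace period starts when the alert is actually sent."""
    return alert_send_time(press_time, cycle_end) + grace_minutes


# Button pressed at minute 20 of a 35-minute cycle, with an example
# grace period of 10 minutes (an assumed value).
print(alert_send_time(20, 35))
print(grace_period_end(20, 35, 10))
```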

Procedure. 1 paragraph. Describe how you conducted the study.

Once we found a willing participant, we obtained his consent using our consent script, and then proceeded to describe the basics of what our system was and how it works, leaving out any important instructions on how to perform the three tasks.  We also described the “think aloud” procedure following the protocol given at http://www.hu.mtu.edu/~njcarpen/hu3120/pdfs/thinkaloud.pdf.  We then watched and listened to the participant attempt the three tasks and observed any and all things that they said and did.  We then repeated this procedure for the other two participants.


Test Measures (5 points). Describe what you measured and why. Bullet points are


We measured the following:

  • task time – the time it took to complete each task, measured in seconds

  • self-reported satisfaction with each task

  • number of errors or mistakes

Results and Discussion (25 points). ~4 paragraphs. Provide results of your tests.

Describe what you learned from the user study. Document any changes that you plan to make in your prototype as a result of the study.


Times (seconds)

Task 1

1 – 20

2 – 54

3 – 27


Task 2

1 – 18

2 – 24

3 – 22


Task 3

1 – 16

2 – 14

3 – 32


Satisfaction (1-5)

Task 1

1 – 5

2 – 3

3 – 5


Task 2

1 – 5

2 – 5

3 – 5


Task 3

1 – 5

2 – 5

3 – 4


Number of errors (#)

Task 1

1 – 0

2 – 1

3 – 0


Task 2

1 – 0

2 – 0

3 – 0


Task 3

1 – 0

2 – 0

3 – 1
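For reference, the average time per task can be computed from the times listed above. This is just a convenience sketch in Python; with only three participants the averages should be read as rough indications, not as statistically meaningful results.

```python
from statistics import mean

# Task times in seconds for participants 1, 2 and 3, copied from the lists above.
times = {
    "Task 1": [20, 54, 27],
    "Task 2": [18, 24, 22],
    "Task 3": [16, 14, 32],
}

averages = {task: round(mean(values), 1) for task, values in times.items()}
for task, avg in averages.items():
    print(f"{task}: {avg} s")
```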


Overall, the results of our user study were encouraging.  The users generally had a good feel for what was required of them to accomplish the tasks set before them.  The first user was able to accomplish all tasks with no problems.  The second user had a bit of trouble with locking the machine, and the third user had a small problem with unlocking it.



The second user had a problem with locking the machine.  She couldn’t figure out what the machine was asking for when it said “Enter ID: “.  At Princeton, every student is assigned a Princeton ID number when they enroll at the school.  However, this number is only used rarely, as most activities only require the student to swipe their prox.  This is probably the main reason behind the problem.  The third participant made a mistake when typing in his ID again to unlock the machine.  We judged this as a rather inconsequential error, but noted that there may be easier ways to authenticate a user than having them type in a 9-digit number, which the user most likely does not have memorized, each time they lock or unlock the machine.  Additionally, although not specifically part of any of the tasks, users one and three expressed a little bit of confusion when informed about the concept of the grace period.  Although inconsequential to the tasks they were asked to perform, it was still noted and taken into consideration.


Cool! Comments

All three users, when prompted for the second task, made some kind of remark about how they thought the ability to send an email to the “current user” of the machine was really cool.  We saw this as a very good sign for our product.


Although we mostly received very positive reviews from each participant, there are some areas where the product could be improved for future usability tests.  First, we could arrange an easier way for user authentication.  Possibilities for this would include a prox swiper, a netid input (rather than Princeton ID), or other Princeton identification means.  Second, we could attempt to inform the user more about what is going on with the machine currently.  A couple of the users expressed confusion about the concepts of the grace period and alert sending when the laundry was not done.  This could be easily improved in future versions.

Appendices (5 points).

Provide all things you read or handed to the participant: consent form,

demographic questionnaire, demo script, post-task questionnaire and/or interview


consent form,  demo script, demographic questionnaire


Also provide raw data (i.e., your merged critical incident logs, questionnaire

responses, etc.)


Demographic Questionnaire Responses:

Participant 1:

1.) Male

2.) 21

3.) single

4.) summer internship

5.) college junior

6.) MAE

7.) every 2 or 3 weeks

8.) the keypad on the princeton doors?

9.) Nope


Participant 2:

1.) Female

2.) 19

3.) single

4.) dining hall

5.) college freshman

6.) undecided

7.) every week

8.) No, I don’t think so?

9.) Yes, had my laundry taken out of dryer and put in washer, when some of the things shouldnt be washed.  It made me very mad.


Participant 3:

1.) Male

2.) 19

3.) single

4.) unemployed

5.) high school

6.) college freshman

7.) every two weeks on Sundays

8.) Princeton keypads?

9.) Not really.  Have had it taken out before, but I didn’t care too much.


Post-Demo Questionnaire


Based on the following scale from 1 to 5, please rate your agreement with the following questions:

0.) Strongly disagree

1.) Disagree

2.) Slightly Disagree

3.) Slightly Agree

4.) Agree

5.) Strongly Agree


1. I felt secure with my laundry using this device.

-Subject 1: 4

-Subject 2: 5

-Subject 3: 4


2. I felt less guilty pulling the laundry out of the machine after the grace period is expired.

-Subject 1: 5

-Subject 2: 5

-Subject 3: 5


3. I believe this device will increase efficiency in the laundry room.

-Subject 1: 4

-Subject 2: 3

-Subject 3: 5


4. I would be less irritated if my laundry was taken out after a warning and the grace period.

-Subject 1: 4

-Subject 2: 4

-Subject 3: 4


5. When mass-produced, we expect this device to take $30/device to manufacture. I believe this is a reasonable investment by the university.

-Subject 1: 5

-Subject 2: 4

-Subject 3: 5


6. I would definitely use this device in my daily laundry interactions.

-Subject 1: 5

-Subject 2: 5

-Subject 3: 4


7. It was easy to (secure the laundry machine) using the device (Task I/II/III)

-Subject 1: 5/5/5

-Subject 2: 5/5/5

-Subject 3: 5/4/5

8. The time it takes to perform the task I/II/III is reasonable

-Subject 1: 5/5/5

-Subject 2: 5/5/5

-Subject 3: 5/5/5


1. How long do you think the Grace Period should be?

-Subject 1: 10 minutes

-Subject 2: 5 minutes

-Subject 3: 7 minutes


P6 — Epple

Group Number 16: Epple

Andrew, Brian, Kevin, and Saswathi

Our project is an interface through which controlling web cameras can be as intuitive as turning one’s head.

Our system uses a Kinect to monitor a person’s head orientation and uses this to remotely control the angle of a servo motor on top of which a web camera is mounted.  This essentially allows a user to remotely control the camera view through simple head movements.  The purpose of our system is to enable the user to intuitively and remotely control web cameras and thus engage in a realm of new web chat possibilities.  Normally web chat sessions end up being one-on-one experiences that fall apart once a chat partner leaves the view of the camera.  With our system, we aim to allow for more dynamic web chat sessions in which there may be multiple chat partners on the other side of the camera, and these partners can move freely.


P6 – Name Redacted

Group 20 — Name Redacted

Brian, Ed, Matt, and Josh

Summary: Our group is creating an interactive and fun way for middle school students to learn the fundamentals of computer science without the need for expensive software and/or hardware.

Introduction: The purpose of our system is to introduce students to the basic concepts and principles of computer science and to teach them in a fun, intuitive, and interactive way. The purpose of this experiment is to gauge various aspects of our system such as intuitiveness, ease of use, and difficulty, and to ascertain any changes we could make that would improve the user’s overall experience.  In particular, we want to see how easy or hard each of the three tasks is and whether or not the tasks are intuitive.  Our bigger concern is that the tasks are not intuitive.  Every student of computer science will find different topics to be of varying difficulty, but if our system is not intuitive, no student will be able to learn from it.


Implementation and Improvements:

  • Since P5, we created a better debugging mechanism and initialization feature, which we have now made into our easy task.

  • There are no wizard-of-oz techniques in P6 whereas there were those techniques in P5.

  • The TOY Program has a compilation phase and a running phase to allow jumps to future labels.  The user does not notice the difference between the two phases to provide a better level of abstraction.

  • The TOY Program goes line by line waiting 1.5 seconds during execution so the users can see how the registers, etc. change.

  • The TOY Program has error messages where there is either a runtime or compile time error.
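The split into a compilation phase and a running phase can be illustrated with a small two-pass sketch in Python. The two-instruction language used here (`PRINT`, `JUMP`, and labels ending in a colon) is hypothetical and much simpler than our actual tag set; the sketch only shows why collecting labels first lets a jump target a label that appears later, and where compile-time and runtime errors would arise. The 1.5 second pause between lines is left out.

```python
def compile_program(lines):
    """Phase 1: record where each label points and drop the label lines."""
    labels, instructions = {}, []
    for line in lines:
        line = line.strip()
        if line.endswith(":"):
            labels[line[:-1]] = len(instructions)
        elif line:
            instructions.append(line.split())
    # Check every jump target now, so a bad label is a compile-time error.
    for op, *args in instructions:
        if op == "JUMP" and args[0] not in labels:
            raise ValueError(f"unknown label: {args[0]}")
    return labels, instructions


def run_program(labels, instructions):
    """Phase 2: execute; jumps to labels defined later already resolve."""
    output, pc = [], 0
    while pc < len(instructions):
        op, *args = instructions[pc]
        if op == "PRINT":
            output.append(args[0])
            pc += 1
        elif op == "JUMP":
            pc = labels[args[0]]
        else:
            raise RuntimeError(f"unknown instruction: {op}")
    return output


program = ["PRINT a", "JUMP end", "PRINT skipped", "end:", "PRINT b"]
labels, instructions = compile_program(program)
print(run_program(labels, instructions))
```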




Participants: Our participants were all Princeton University students with varying degrees of computer science background, ranging from no background to only two introductory courses.  We chose these participants because they did not have a lot of computer science background, and our target audience is people who are trying to learn computer science but do not have much formal training.  The participants’ ages ranged from 19 to 22.  One came with an engineering background, one with a social science background, and one with an anthropology background.  All three participants were male and each had some prior level of teaching experience.  When choosing participants, we wanted students who also had taught, so that we could simultaneously test both groups in our market.  With only three participants, we really did not have a lot of opportunity to create a diverse group of participants.


Apparatus: Our project uses paper tags with the corresponding text labels, a projector, a webcam and a laptop.  In a school, the projector will already be connected and set up to a laptop, so we set that up for the testers.  We asked the users to connect the webcam and then follow the initialization instructions so that the program can map the webcam coordinates to the projector coordinates.  The paper tags already had tape on the back of them so that the users could easily put the tags onto the wall.  We conducted the test in an eating club since there was a space about as large as a classroom that allowed us to project on a large open wall.



Program Initialization (EASY):  This task is the initialization of the program, which requires the users to initialize the webcam and projector space.  This task also includes key commands that allow the users to debug their own code by showing them which lines are grouped together as rows.  Part of the idea for the task came from an expansion of the Tutorial feedback, since the tutorial should really begin when the program is first loaded on the teacher’s laptop and be geared towards helping users debug their programs.


Numbers (MEDIUM): In this task, users will be introduced to number systems and learn how to convert between them. Users will be able to make numbers in binary, hexadecimal, octal, or decimal and see the number converted in real time into the other number systems.
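The conversion that the Numbers task displays can be sketched with Python's built-in `int` and `format`. The function name `to_all_bases` is hypothetical, and this is only an illustration of the arithmetic, not the code that reads the tags.

```python
def to_all_bases(digits, base):
    """Parse a digit string in the given base and show it in all four systems."""
    value = int(digits, base)
    return {
        "binary": format(value, "b"),
        "octal": format(value, "o"),
        "decimal": str(value),
        "hexadecimal": format(value, "X"),
    }


print(to_all_bases("1011", 2))
print(to_all_bases("FF", 16))
```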

TOY Program (HARD): In the TOY programming environment, users (primarily students) will be free to make and experiment with their own programs. Users create programs by placing instruction tags in the correct position and sequence on the whiteboard. The program can then be run, executing instructions in sequence and providing visual feedback to users to help them understand exactly what each instruction does.

Task Selection: We chose the “Program Initialization” task because it is the first step in using our system and every user would have to complete this task. Once this step is completed, it makes sense to have the user try out each of the applications we developed for the platform.

Procedure: We made sure each participant consented to the study with the knowledge that their image and/or video of them may be published in a blog post on the publicly searchable web.  We introduced each participant to the goals of the project as stated in a previous section of this write-up.  We then gave the participants a high-level introduction to the system, explaining how the tags are used as well as what the tasks entail.  We introduced the participant to each task and asked them to perform the task.  We offered the participants a post-survey questionnaire, which is in the Appendices section of this write-up.

Test Measures: We asked the participants to fill out a questionnaire that asked them to rate our program on a scale from 1 to 5 on several different aspects.  These aspects were Overall Ease of Use, Overall Intuitiveness, Initialization: Difficulty of Task, Assembly: Difficulty of Task, and Numbers: Difficulty of Task.  We also analyzed how users interacted with the program by taking less quantitative measurements about how much it seemed that they struggled with a given task, or how easy a task was, etc.


Each tag has a unique visual marker.

Participant 1 Using “Numbers” Program

Participant 2 Performing the Initialization Task

TOY Program


Results and Discussion:


Average Rating (out of 5)

Overall Ease of Use


Overall Intuitiveness


Initialization: Difficulty of Task


Assembly: Difficulty of Task


Numbers: Difficulty of Task


Overall we felt that the tests were rather successful, not only in reinforcing the feedback that we had received in previous iterations of the prototype but also in showing us how to further improve our system.  To start, the feedback was extremely positive.  We received an overall average “Ease of Use” rating on our anonymous survey of 4.66 out of 5.  On the “Intuitiveness” of our system, we received an overall average rating of 4.33 out of 5.  Both of these numbers were high for users of both moderate and limited CS background.  This tells us that we were able to provide users with an interesting and intuitive educational experience despite the fact that they had very limited training and knowledge of CS concepts.  Users had very positive verbal feedback.  One of the users who had no computer science background even remarked upon the “theme of rows” that seemed to be going across our different applications. This was really interesting because horizontal tag rows are a fundamental metaphor built into the system, and it is something that someone could pick up without seeing a line of code.  Of the three tasks, it was clear both from their self-reported values and from observing the users that the “Numbers” task was the most confusing. They primarily seemed to take issue with the location where numbers were displayed in relation to the ID tags placed on the board. We will focus more energy on cleaning up that task and making it more visually intuitive and appealing.


Going forward from the positive information given, much of the constructive criticism that we received centered around the depth of the product.  Our users seemed to grasp the fundamentals of putting tags on the wall.  Our users even started to develop different user patterns.  One user, for example, would constantly “compile” his assembly code to check for errors while most just wrote the program and compiled once at the end.  The place where they struggled was with the tasks being too “abstract.”  This idea of abstractness has been on our minds since our first contextual inquiry when it was suggested to us.  Users wanted to know “how the initialization step worked.”  They wanted more information about the conversion process from binary to hex.


Jumping off from comments like these, we have ideas of specific changes we want to make to the system to help increase its connection with the users.  We are going to change the number system a little so that we display exactly how the numbers are generated (2^0 + 2^1 + 2^3 = 11).  We are going to include more documentation for each of the three tasks.  It was a little confusing where to put the base tags for the numbers method.  We may create rows for the number method so that the user knows where to put the hexadecimal, decimal, and binary tags.  It was also unclear that the base numbers had to be to the far left of each row, so that would be included in the documentation.  Similarly, the testers asked what the arguments for various commands were, and this needs to be cleared up in the documentation about the program.  All of these specifics boil down to two main themes.  We need to tailor the UI to make visual suggestions to the user as to what they should do next (e.g., showing with visual boxes how many arguments print takes).  We also need to increase the visual documentation so that the user can get answers to questions about what’s going on “behind the curtain.”
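The expansion we plan to display, such as 2^0 + 2^1 + 2^3 = 11, could be generated along these lines. This is a sketch in Python with a hypothetical function name, not code from our system.

```python
def binary_expansion(value):
    """Write a non-negative integer as a sum of the powers of two it contains."""
    if value == 0:
        return "0 = 0"
    terms = [f"2^{bit}" for bit in range(value.bit_length()) if (value >> bit) & 1]
    return " + ".join(terms) + f" = {value}"


print(binary_expansion(11))
```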


We are also working on tag persistence.  Users don’t understand that if they walk in front of the camera, they block its field of view, so we want to make the camera less vulnerable to view blocking by having it remember tags for a few seconds.  We also learned a few things about the space that our project needs to be in to work.  The sunlight was causing problems for the camera, since the black and white tags were not being picked up by the software.  We needed to move the projector closer to the wall, which created a smaller output image.  The code section for the TOY Program was a little small for fitting three tags on one row.
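One way tag persistence might work is to keep a timestamp per tag and only forget a tag once it has gone unseen for longer than a hold time. The sketch below is in Python; the `TagMemory` class and the 3-second default are assumptions for illustration, since we have not settled on an implementation or a hold time.

```python
class TagMemory:
    """Remembers tags for a short time after the camera last saw them."""

    def __init__(self, hold_seconds=3.0):
        self.hold_seconds = hold_seconds  # assumed value, to be tuned
        self.last_seen = {}               # tag id -> time of last detection

    def update(self, detected_ids, now):
        """Record this frame's detections and return the tags still 'present'."""
        for tag_id in detected_ids:
            self.last_seen[tag_id] = now
        # Forget tags that have been hidden for longer than the hold time.
        self.last_seen = {
            tag_id: seen
            for tag_id, seen in self.last_seen.items()
            if now - seen <= self.hold_seconds
        }
        return set(self.last_seen)


memory = TagMemory(hold_seconds=3.0)
print(memory.update({1, 2}, now=0.0))   # both tags visible
print(memory.update(set(), now=2.0))    # user blocks the camera briefly
print(memory.update({2}, now=4.0))      # tag 1 has been hidden too long
```

With this approach, a user walking briefly in front of the camera would not cause the tags to disappear from the program, while a tag that is actually removed from the wall drops out after the hold time.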



Consent form – Updated from P4:



Interface Feedback and Demographic form –



The demographics are on the second page of the form.


P6 – Group 12 – Do You Even Lift?

a. Your group number and name.

Group Number: 12

Group Name: Do You Even Lift?

b. First names of everyone in your group.

Adam, Andrew, Matt, Peter.

c. A 1-sentence project summary.

Our project is a Kinect based system that watches people lift weights and gives instructional feedback to help people lift more safely and effectively.

d. Introduction

We have designed a Kinect-based virtual lifting coach. The system has a “TeachMe” mode, where it provides step-by-step instruction in how to perform a lift with proper technique, along with real-time feedback on the user’s form. It also has a “JustLift” mode, where more experienced users can quickly begin a lifting session, taking advantage of the Kinect to track their reps and evaluate their form. Information about the set is tracked for each user. We will run a series of three experiments, with one experiment to evaluate each of the above tasks. To test TeachMe, we will have users perform squats before and after using the TeachMe feature, and evaluate the form of their squats before and after. The purpose of the TeachMe functionality is to quickly improve a user’s form, so we will expect to see an improvement in their form after TeachMe. To evaluate the JustLift functionality, we will have the users perform a quick lift session. The primary feature of JustLift is how quickly a user is able to begin a lifting session. A secondary feature is seamless, unobtrusive feedback. To evaluate these, we will time how long it takes the user to begin a JustLift, then assess the quality of JustLift’s feedback using a post-hoc usability survey. Finally, to evaluate the session tracker, we will provide users with a “fake” dataset and ask them to perform specific tasks, then assess the functionality using a post-hoc usability survey.

e. Implementation and Improvements

Link to p5: https://blogs.princeton.edu/humancomputerinterface/2013/04/22/do-you-even-lift-p5/

  • Created a graph view for data on the “My Progress” Section of the application

  • Created a place for users to input login information

  • Wrote user data to logs after each set of exercises in “Just Lift”

  • Added voice commands to progress through the “TeachMe” section

f. Method 

i. Participants:

We selected three participants with a variety of experience and backgrounds. We tried to find users who had previously received different types of instruction, as well as both male and female users. We also selected users from different majors (Psychology, English, and Computer Science) to cover a wide range of technical experience. All users were in the same age range, as we selected users from the pool of college students around us. If we had had more participants in the study, we would have selected participants from a broader range of ages, as well as those who had never lifted before.

ii. Apparatus: 

We used a Kinect for Windows, driven by an HP Pavilion dv6000 laptop. We conducted the test in an empty room at Terrace, to maximize convenience for our selected participants.

iii. Tasks:

Our first task is to provide users with feedback on their lifting technique. In the current iteration, this functionality is implemented in the “Just Lift” mode for the back squat. Here, the user performs a set of exercises and the system gives them feedback about their performance on each repetition.

Our second task is to create a guided tutorial for new lifters. In this prototype, we implemented this in the “Teach Me” mode for the back squat. Here, the system takes the user through each step of the exercise. The user is required to perform each step correctly before they are able to learn the next one.

Our third task is to track users between sessions. Our system stores data and feedback from the user’s lifts and allows them to view their data in the “My Progress” page. We created a fake data set to use while testing with users, as they did not have the opportunity to generate their own data for multiple workouts over the course of several days.

iv. Procedure

We first asked the user to perform several squats to establish a baseline for comparison. Next, we asked the user to navigate through the “Teach Me” progression and then perform three more back squats (again without the Kinect). Next, we had the user try the “Just Lift” section of the application by performing three sets of three squats each, and then use the interface to learn about flaws in their form. Lastly, the user used the “My Progress” section of the application to answer questions about past performance using a set of pre-fabricated data. These questions were in the form of “How much weight did you lift on the first set?” and “What advice was recommended for rep X of set Y?”

g. Test Measures

  • Teach me

    • We asked the user to perform three squats before and after completing the “TeachMe” component. We noted problems in the user’s technique before using the system, then looked to see if the problems were still present after instruction.

      • The primary function of the “TeachMe” page is to teach users how to correctly do a squat, including fixing pre-existing problems with their form. If our system is successful in teaching them how to do a squat and fixing problems with their form, we shouldn’t observe these problems after they have used “TeachMe”.

    • We recorded the time it took for the user to navigate through the “Teach Me” progression.

      • The time it took will give insight into how easy it was for the user.

  • Just Lift

    • We asked the user survey questions related to how useful the data from “Just Lift” was and how easy it was to access.

      • We measured this because we wanted to better understand if the user thought the system met its goals of providing useful, easy to understand lifting feedback.

    • We asked the user to complete a variety of tasks, which encompassed many of the most common features a real user of the system might use. We noted mistakes that they made, as well as tasks that the user found difficult.

      • We designed the tasks to cover many common real world use cases. If users had problems with these tasks in the study, it’s likely that users would have difficulty with them in the real world.

  • My Progress

    • We asked the user survey questions related to how easy it was to accomplish tasks related to data retrieval and how useful the data was.

      • Ease of use and usefulness of data are two of the most important metrics for evaluating this section of the interface.

    • We recorded the time it took for the user to look up data.

      • The time it took will give insight into how easy it was for the user.

    • We asked the user to complete a variety of tasks, which encompassed many of the most common features a real user of the system might use. We noted mistakes that they made, as well as tasks that the user found difficult.

      • We designed the tasks to cover many common real world use cases. If users had problems with these tasks in the study, it’s likely that users would have difficulty with them in the real world.

h. Results and Discussion

Survey Results*

Format: Value – (Mean, Std Dev)

Age – (21.666, 0.577)

Ease of Feedback Navigation – (3.666, 1.52)

Usefulness of Feedback – (4, 1)

Helpfulness of feedback – (4, 0)

Likelihood of Use – (4.3333, 0.577)

Usefulness of Data Tracking –  (3.666, 0.577)

Ease of Data Tracking Navigation – (4.555, 0.577)

* Note that these numbers come from a sample size of three, so they carry no statistical significance.
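For readers who want to reproduce this kind of (Mean, Std Dev) summary, here is a minimal sketch. It assumes the sample standard deviation (the n − 1 formula), which appears consistent with the reported values for three participants. The response lists are hypothetical examples chosen for illustration, not the study's raw data, and `summarize` is a helper name invented for this sketch.

```python
from statistics import mean, stdev


def summarize(responses):
    """Return (mean, sample standard deviation), rounded to three decimals."""
    return round(mean(responses), 3), round(stdev(responses), 3)


# Hypothetical 1-5 ratings from three participants (NOT the study's raw data).
hypothetical_ratings = {
    "all participants agree": [4, 4, 4],
    "ratings spread by one point": [3, 4, 5],
}

for label, ratings in hypothetical_ratings.items():
    print(label, summarize(ratings))
```

With only three responses, a single differing rating moves the standard deviation substantially, which is another reason to read these figures as descriptive only.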

In our first test, the user performed 3 squats and had good form. During the “Teach Me” progression, he encountered some bugs with the voice control but was able to get through and finish the tutorial. When he performed several squats after the “Teach Me” progression, he said “Shoulder width apart, knees out, hips back” before squatting, an indication that our system had made an impression even on an experienced lifter. In the “Just Lift” test, the user was not sure how to end a set (this is done by walking out of the Kinect frame). In the “My Progress” page, he did not realize that he could display advice about a repetition by clicking the colored tabs.

In our second test, when the user performed his 3 initial squats, his knees were very close together. During the “Teach Me” progression he said, “my knees are too close together, that’s interesting.” In his squats after the “Teach Me” progression, his knees were wider. This was exciting to see, as he had taken advice from our system and adjusted his form accordingly. Unlike our previous user, this user clicked the colored tabs right away to show his advice/feedback in the “My Progress” page.

In our third test, our user had average knee depth and narrow knee width. Her movement was a bit restricted by jeans, so she had difficulty achieving our system’s standards. She also questioned whether our advice concerning knee depth aligned with correct squat form. In her squats after the “Teach Me” progression, her knee depth was adequate and her knee width was a bit wider than previously but still narrow. In the other two tests, “Just Lift” and “My Progress”, she navigated through the tasks more quickly than the other two users.

Mean completion times:

Teach me – 3:40

Just Lift – 2:50

My Progress – 1:40
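As a small aside on how such means can be computed, the sketch below averages per-participant task times written as minutes:seconds. The helper names and the sample times are hypothetical and used only for illustration; the individual times behind the means above are not reproduced here.

```python
def parse_mmss(text):
    """Convert a 'minutes:seconds' string into a number of seconds."""
    minutes, seconds = text.split(":")
    return int(minutes) * 60 + int(seconds)


def format_mmss(total_seconds):
    """Convert a number of seconds into a 'minutes:seconds' string."""
    minutes, seconds = divmod(round(total_seconds), 60)
    return f"{minutes}:{seconds:02d}"


def mean_time(times):
    """Average a list of 'minutes:seconds' strings."""
    seconds = [parse_mmss(t) for t in times]
    return format_mmss(sum(seconds) / len(seconds))


# Hypothetical per-participant times for one task (not the study's data).
print(mean_time(["1:30", "2:00", "2:30"]))
```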

It was interesting watching users actually interact with our system. Certain aspects of it that we took for granted, such as how to make different sets or examine progress, were less intuitive than we had hoped for one of the users. On the other hand, we had a user who flew through the menus and pages and had no trouble with the system. There are certainly improvements that we can make that would allow all users to have that same quick and easy interaction with the prototype. Based on these results and our observations, several changes are necessary. The “Teach Me” page needs some more work to make the audio cues for moving to the next step less obtrusive. We will add a page before “Just Lift” that is displayed to users who are not logged in and that gives basic instructions (such as a reminder that sets are ended by walking out of the frame). We would also like to make the colored tabs on the “Just Lift” and “My Progress” pages appear more interactive; the tips screen before the “Just Lift” page should also help alleviate these pain points. We will also improve the graphing on the “My Progress” page to include labels on the x-axis, as it wasn’t immediately evident to one user what the bars represented.

i. Appendices

i. Provide all things you read or handed to the participant: consent form, demographic questionnaire, demo script, post-task questionnaire and/or interview script.

Link to Demographic Questionnaire/Post-task Questionnaire: https://docs.google.com/forms/d/1xZzlkckqPsCYrMtvMhjsGpaRMbEu6Q7DTgcpefDYKLE/viewform

Link to consent form: https://www.dropbox.com/s/v7c78zbzqvk1rrk/Do%20you%20even%20lift%20–%20consent%20form.doc

Link to script: http://pastebin.com/WATL9pAV

ii. Also provide raw data (i.e., your merged critical incident logs, questionnaire responses, etc.)

Link to raw data (user notes): https://www.dropbox.com/s/n7rvhj2ccxq9i7h/HCI%20-%20P6%20-%20Group%2012%20-%20User%20observations.docx

Link to Demographic Questionnaire/Post-task Questionnaire Results :