AuraCarbon

Made by Shengzhi Wu, Laura Rodriguez and adev

An AI home assistant that acts as an extension of you by learning your voice, speech patterns, and tone in order to communicate as you in situations that take up excess time and energy. The aim is to free up time for richer interactions with the people who are important to you.

Created: May 10th, 2019


Intention

For this project, we were interested in the advancement of artificial intelligence and natural language processing, and the potential misuse that can come from placing this technology in the hands of users. Taking signals from current technology such as Google Duplex, we envision a future where AI agents are able to mimic the tone and speech patterns of our voices, becoming extensions of us. We were also exploring the tension between what designers, developers, and engineers intend the technology they create to be used for and the ways users can appropriate that technology for purposes outside its original intent.

We designed the Aura Carbon, an AI home assistant that acts as an extension of you by learning your voice, speech patterns, and tone in order to communicate as you in situations that take up excess time and energy. The aim is to free up time for richer interactions with the people who are important to you.


Prototype

Exhibit Design

For the exhibition, we decided to build a story that showed the progression of the development, release, and recall of the Aura Carbon. Since we were telling a time-based story, we used multiple touchpoints to focus on different elements of the story. We also incorporated a timeline so the viewer could easily see the story in its entirety.


Designing/testing the exhibit space

Aura Carbon Prototypes

We prototyped three versions of the Aura Carbon for our exhibition to help support the story we were telling.

(1) A model to represent the first prototype of the Aura Carbon, as built by the Aura engineers.

The purpose of this model was to support the development phase of the story. We made the prototype look distinctly different from the released version of the Aura Carbon; the contrast makes the released version look more real and finished.

(2) A model to represent the first publicly released version of the Aura Carbon.

The purpose of this model was to support the release portion of the story. This prototype was used for the interactive/conversational element of our exhibit design. Because it was supposed to represent a commercially released product, we 3D printed it and finished it by sanding it smooth.


Making the Aura Carbon Model

(3) A model to represent a hacked version of the Aura Carbon, the one used in the financial fraud scandal that was part of our timeline story.

The purpose of this model was to support the hacking scandal portion of our story. We made the prototype look like someone had tried to crack into it by taking a finished version and breaking into it ourselves. We also incorporated details like wires coming out of the casing to imply it had been hacked.

Physical Computing Prototype

For the final product, we added lighting effects to communicate the different states of the device: a listening state, a speaking state, and a rest state. We used a Particle Photon and a 24-LED NeoPixel ring to prototype the effects. The listening state is a spinning ring; the speaking state is a pulsing color that shifts randomly between blue and purple, matching Aura's branding; the rest state is a slow breathing effect that also transitions between blue and purple.
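
The two sketches included later on this page coordinate these states over USB serial using single-character commands. As a quick reference, this is the mapping the code implements:

// Serial protocol between the Processing sketch and the Particle Photon.
// Processing writes one character per state change; the Photon reads it
// in loop() and selects the matching LED effect:
//   '0' -> rest state      (slow breathing, ChangeBrightness())
//   '1' -> listening state (spinning ring, loopLed())
//   '2' -> speaking state  (pulsing color, speak())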

Conversational Interaction

For the conversational interface, we decided to pre-record a few key scenarios rather than prototype a functional voice recognition system. It is hard to anticipate how audiences will interact with the device, and we expected the exhibition to be too noisy for real voice recognition. By letting audiences trigger each scenario individually, we could still demonstrate all the key contexts in which the device would be used.

We then finalized four key scenarios: onboarding of voice mimicking, accessing a user's personal data, reserving a restaurant, and answering a call from the user's mom. We wrote scripts for each dialogue and recorded audio for each.

We also wanted to build a visual interface that helps audiences see the conversation and understand the role of each speaker. As such, we built a Processing prototype that displays captions along with the recorded audio. The Processing code reads audio files paired with text files; the text files contain the script, the speaker, and the timing of every line of dialogue. When an audio file is playing, the related caption is displayed simultaneously.
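
Each scenario pairs an audio file (aura1.mp3 through aura4.mp3) with a script file (dialog1.txt through dialog4.txt). Based on how the sketch parses each line, a script line consists of a speaker ID (1 = Aura, 2 = User, 3 = Restaurant, 4 = Mom), a start time in seconds, and the caption text. The lines below illustrate the format only; the wording is not from our actual scripts:

1 0 Hi, I'm Aura. Please read the following sentence so I can learn your voice.
2 7 The quick brown fox jumps over the lazy dog.
1 14 Thank you. I can now speak as you.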

To let audiences trigger the dialogue easily, we used a physical dial. Daragh lent us a PowerMate dial, and we configured it to send keystrokes that trigger the different dialogue contexts.

Ultimately, we used projection mapping to project images onto the table from above, so the imagery aligns with the physical objects as if the projection were an extension of them. The Processing code also communicates with the Particle Photon in real time, so it can trigger the different lighting modes while the dialogue is playing.

Timeline and Supporting Articles


Product Video

We developed a product video that was incorporated into the timeline to support the product-release Aura Carbon prototype. The video mimics the promotional videos that current tech companies release for their products. Its purpose was to help show the tension between the intention of the product and its misuse.

Aura Exhibition (video by Shengzhi Wu): https://youtu.be/R5ZQ8vdxflA
Processing code
/* This code was created by Shengzhi Wu (www.wushengzhi.xyz) for the Responsive Mobile Environments class project Aura Carbon.
It uses a PowerMate dial to trigger various pieces of conversation and communicates with a Particle Photon to change the light effects of a 24-LED NeoPixel ring.
All you need to change is the serial port: update the index in the Serial.list() call in setup().
You can use the printArray(Serial.list()) call to print all the available ports and find the one that starts with "usb".
The PowerMate sends "U" and "D" keys to trigger the changes, so you need to use the PowerMate app and configure it to mimic pressing the "U" and "D" keys.
*/


import processing.serial.*;
import ddf.minim.*;
import ddf.minim.analysis.*;
Minim minim;
AudioPlayer[] audio;   // one recorded audio track per scenario
String[][] lines;      // raw script lines per scenario
String[][] time;       // start time (in seconds) of each line
String[][] speaker;    // speaker ID of each line
String[][] dialog;     // caption text of each line
float a = 0;           // phase of the breathing-ring animation
float ringChange;      // current ring size offset
float speed = 1;       // animation speed, set per speaker
float ringSize = 1;    // amplitude multiplier for the ring animation
BeatDetect beat;
int totalNum = 4;      // number of scenarios
int index = 0;         // currently selected scenario
Serial myPort;         // Create object from Serial class
String val;            // Data received from the serial port
char state = '0';      // last state sent to the Photon: '0' rest, '1' listening, '2' speaking
PFont font;
int normal = 22;       // default caption text size
int selected = 21;     // text size of the selected dial-menu item
void setup() {
  font = loadFont("Futura-Medium-48.vlw");

  //size(1000, 1000);
  fullScreen();
  lines = new String[totalNum][];
  time = new String[totalNum][];
  speaker = new String[totalNum][];
  dialog = new String[totalNum][];
  audio = new AudioPlayer[totalNum];
  minim = new Minim(this);

  for (int j=0; j<totalNum; j++) {
    lines[j] = loadStrings("dialog" + (1+j)+ ".txt");
    time[j] = new String[lines[j].length];
    speaker[j] =new String[lines[j].length];
    dialog[j] = new String[lines[j].length];
    println("there are " + lines[j].length + " lines");
    for (int i = 0; i < lines[j].length; i++) {
      String[] words = split(lines[j][i], ' ');
      speaker[j][i] = words[0];
      time[j][i] = words[1];
      //println(words[1]);
      String remove = words[0] + " " + words[1]  + " ";
      String[] dialogs = split(lines[j][i], remove);
      dialog[j][i] = dialogs[1];
      //println(dialogs[1]);
    }
    audio[j] = minim.loadFile("aura" + (j+1)+ ".mp3");
  }
  // Load a soundfile from the /data folder of the sketch and play it back


  //audio[0].play();
  beat = new BeatDetect();
  
  String portName = Serial.list()[13]; // change the index (13 here) to match your USB port; see printArray(Serial.list()) below
  
  myPort = new Serial(this, portName, 9600);
  println(portName);
  printArray(Serial.list());
  //println(lines[0].length);
  // println(dialog[0].length);
  //printArray(time[0]);
  textFont(font, 32);
  noCursor();
}
void draw() {
  //scale(1.2);
  background(0);
  pushMatrix();
  translate(60,-150);
  int playTime = (int)audio[index].position()/1000;
  //println((int)audio[index].position()/1000, (int)audio[index].length()/1000);
  if (playTime >= (int)audio[index].length()/1000) {
    if (state != '0') {
      myPort.write('0');
      state = '0';
      //println("tirgger rest");
    }
    displayGuide();
    speed = 0.8;
  } else {
    for (int i = 1; i < lines[index].length; i++) {

      if (playTime >int(time[index][i-1]) && playTime < int(time[index][i])+1) {
        displayUsersDialog(speaker[index][i], dialog[index][i]);
      }
    }
  }
  breathing();
  drawRing();
  popMatrix();
  drawDialRing();
  //text("yes", 80, height/2, 200, 240);
  //println(frameRate);
  //displayGuide();
}

void displayUsersDialog(String _speaker, String s) {
  //print(_speaker);
  //println(s);
  textSize(normal);
  textAlign(LEFT, TOP);
  fill(255);
  if (int(_speaker) ==1) {
    fill(#FF82F8);
    s = "Aura:" + "\n"  + "\n"+ s;

    text(s, width/2 -600, height/2-100, 350, 340);
    ringSize =1;
    if ( beat.isOnset() ) {
      speed = 10;
    } else {
      speed = 2;
    }

    if (state != '2') {
      myPort.write('2');
      state = '2';
      println("tirgger on");
    }
  } else if (int(_speaker) >= 2 && int(_speaker) <= 4)
  {
    // Speakers 2-4 (User, Restaurant, Mom) share the same caption layout
    // on the right side of the ring.
    String[] names = {"User:", "Restaurant:", "Mom:"};
    fill(255);
    s = names[int(_speaker) - 2] + "\n"  + "\n"+ s;
    text(s, width/2 +250, height/2 -100, 350, 340);
    speed = 0.8;
    ringSize = 0.8f;

    if (state != '1') {
      myPort.write('1');
      state = '1';
      println("trigger off");
    }
  }
}
void drawRing() {
  noFill();
  beat.detect(audio[index].mix);
  stroke(255);
  ellipse(width/2, height/2, 330 + ringChange, 330 + ringChange);
}
void breathing() {
  // Oscillate the ring diameter; speed and ringSize are set per speaker.
  ringChange = sin(a) * 20 * ringSize;
  a += 0.05 * speed;
}
//void mousePressed() {
//  postRequest();
//}
void keyPressed() {
  println(index);
  if (keyCode == 'U') {
    ChangeIndexPlus();
    println("U is pressed");
  }
  if (keyCode == 'D') {

    println("D is pressed");
    ChangeIndexMinus();
  }
  if (keyCode == 'P') {

    println("P is pressed");
    if (audio[index].isPlaying()) {
      audio[index].pause();
    } else {
      audio[index].play();
    }
  }
}
void ChangeIndexPlus() {
  audio[index].pause();
  audio[index].rewind();
  if (index  < totalNum-1) {
    index ++;
  } else {
    index =totalNum -1;
  }
  audio[index].play();
}

void ChangeIndexMinus() {
  audio[index].pause();
  audio[index].rewind();
  if (index  > 0) {
    index --;
  } else {
    index =0;
  }
  audio[index].play();
}

void displayGuide() {
  fill(255, ringChange *5 + 150);
  textSize(normal);
  textAlign(CENTER);
  text("Please turn the dial to trigger another dialogue.", width-280, height -350);
}

void drawDialRing() {
  pushMatrix();
  translate(-500, -90);
  scale(1.2);
  textSize(normal - 5);
  textAlign(RIGHT, CENTER);
  //ellipse(width-180, height -200, 180, 180);
  if (index==0) {
    fill(255);
    textSize(selected);
    ellipse(width-280, height -150, 8, 8);
  } else {
    fill(255, 120);
    textSize(normal- 3);
  }
  text("Learning voice", width-300, height -150);
  if (index==1) {
    fill(255);
    textSize(selected);
    ellipse(width-290, height -200, 8, 8);
  } else {
    fill(255, 120);
    textSize(normal- 3);
  }
  text("Accessing data", width-310, height -200);
  if (index==2) {
    fill(255);
    textSize(selected);
    ellipse(width-280, height -250, 8, 8);
  } else {
    fill(255, 120);
    textSize(normal- 3);
  } 
  text("Booking restaurant", width-300, height -250);
  

  if (index==3) {
    fill(255);
    textSize(selected);
    ellipse(width-260, height -300, 8, 8);
  } else {
    fill(255, 120);
    textSize(normal- 3);
  } 
  text("Call from mom", width-280, height -300);
  popMatrix();
}
// This #include statement was automatically added by the Particle IDE.
#include <neopixel.h>
#include <math.h>
/*
 * Project NeoPixel
 * Description: NeoPixel LED effects for the Aura Carbon (listening, speaking,
 * and rest states), driven over serial by the accompanying Processing sketch.
 */
// IMPORTANT: Set pixel COUNT, PIN and TYPE
#define PIXEL_PIN D0
#define PIXEL_COUNT 24
#define PIXEL_TYPE SK6812RGBW

Adafruit_NeoPixel pixels = Adafruit_NeoPixel(PIXEL_COUNT, PIXEL_PIN, PIXEL_TYPE);
int brightness =0;
int dir =5;
char val; // Data received from the serial port

// setup() runs once, when the device is first turned on.
void setup(){
  pixels.begin();
  Serial.begin(9600); // Start serial communication at 9600 bps
  Particle.variable("val",val);
}
void loop(){
    
    
   if (Serial.available()) 
   { // If data is available to read,
     val = Serial.read(); // read it and store it in val
      
   }
   
    
    if (val == '1') 
    { // If 1 was received
        loopLed();
         
    } else if(val == '2') {
         //ChangeBrightness();
        speak();
        
   }else if(val == '0'){
        ChangeBrightness();
   }else{
        loopLed();
   }
}

void loopLed(){
 for(byte x = 0; x < 20; x++){
    rotatePixels(true);
    delay(50);
  }
  for(byte x = 0; x < 20; x++){
    rotatePixels(false);
    delay(50);
  }
}

void rotatePixels(bool pixColour){
  // pixColour is currently unused (the colour/direction switching was
  // commented out). Advance the "comet" one pixel per call, wrapping
  // around the ring; an int with an explicit wrap avoids the original
  // static char overflowing after many calls.
  static int currentPos = 0;
  currentPos = (currentPos + 1) % PIXEL_COUNT;

  // Comet head: ramp the cyan/white brightness up to a peak and back
  // down over seven pixels.
  uint32_t head[] = {
    pixels.Color(0, 10, 10, 5),
    pixels.Color(0, 20, 20, 10),
    pixels.Color(0, 30, 30, 15),
    pixels.Color(0, 50, 50, 20), // brightest point
    pixels.Color(0, 30, 30, 15),
    pixels.Color(0, 20, 20, 10),
    pixels.Color(0, 10, 10, 5)
  };
  for (int i = 0; i < 7; i++) {
    pixels.setPixelColor((currentPos + i) % PIXEL_COUNT, head[i]);
  }

  // Dim tail for every remaining pixel of the ring. (The original set
  // offsets +8..+23 individually and skipped +7; this loop covers it.)
  uint32_t tail = pixels.Color(0, 5, 5, 5);
  for (int i = 7; i < PIXEL_COUNT; i++) {
    pixels.setPixelColor((currentPos + i) % PIXEL_COUNT, tail);
  }

  pixels.show();
}



void ChangeBrightness(){
  // Rest state: the classic "breathing" curve. exp(sin(t)) ranges from 1/e
  // to e, so subtracting 1/e (~0.36787944) and scaling by 15 maps it to
  // roughly 0..35, breathing the green channel against a steady blue.
  float level = (exp(sin(millis()/2000.0*3.14159)) - 0.36787944)*15.0;

  uint32_t c = pixels.Color(0, level, 30, 5);

  for( int i = 0; i < pixels.numPixels(); i++ ){
     pixels.setPixelColor(i, c); // set a color
  }
  pixels.show();
  delay( 50 );
}
void speak(){
  // Speaking state: same breathing curve but with a random amplitude each
  // frame, so the color flickers between blue and purple-ish tones.
  float level = (exp(sin(millis()/2000.0*3.14159)) - 0.36787944)* random(10,50);

  uint32_t c = pixels.Color(0, 30, level, 5);

  for( int i = 0; i < pixels.numPixels(); i++ ){
     pixels.setPixelColor(i, c); // set a color
  }
  pixels.show();
  delay( 100 );
}
void LoopingRing(){
  // Helper (not called in the current loop): fills the whole ring solid red.
  for( int i = 0; i < pixels.numPixels(); i++ ){
     uint32_t c = pixels.Color(255,0,0,0); // red
     pixels.setPixelColor(i, c); // set a color
  }
  pixels.show();
  delay( 10 );
}

Precedents

To build our future scenario, we took cues from current applications of technology, specifically Google Duplex. We chose Google Duplex as the existing technology to build from because it provided a basis to ground our idea of how the technology we were working with would be used. We thought about how it would continue to expand and evolve with the advancement of natural language processing and artificial intelligence. Currently, Google Duplex can sound convincingly human and can be used to do things like book an appointment. We believed that this human-like voice would eventually become advanced enough to mimic the voice of the user.



Process

Early on we had a clear idea of the topic and question we wanted to explore for this project, so we focused on refining that idea and working at a scope we could execute, building a scenario that would feel believable. We narrowed the scope because we felt it would be more approachable and believable, especially since we wanted to design a finished, released product as opposed to a research experiment. Originally we wanted the design of the actual product to be more organic looking, like a pebble. However, once we decided on using the round LED ring, we changed the shape so that it would work better with the movement of the light. We also originally had dates on our backboard timeline; after our first crit, we decided to remove the dates and use "versions" instead, referencing the product cycles that current tech companies use.


Reflection


1. One thing that worked really well was displaying the dialogue captions on the table. They help audiences understand who is speaking and see clearly what is being said in a noisy environment. Triggering pre-recorded audio also avoids asking users to speak to a voice system while still conveying the context clearly. Many audience members enjoyed the interactivity and spent a lot of time at our booth.

2. What did not work as expected: because we had multiple pieces of content, it was confusing what to look at first and what to look at next. Our initial expectation was that audiences would read the introduction and watch the product video to get a sense of what the product is about, then listen to the dialogue and the fake media news to understand how it was being misused. But we found that most audiences interacted with the dial directly, which made it difficult for them to grasp the rationale and concept at first.

3. Another big challenge was that multiple sound pieces play at the same time. We had a product video and several dialogues, all loud, so we had to turn down the volume of the product video to keep the dialogue audio clear. Without its audio narration, though, the product video does not make much sense. We believe using headphones and adding captions to the product video would largely solve this problem.


Courses

48528 Responsive Mobile Environments


