Sherlock

Sherlock as it watches you.

Intent

In this investigation, we were tasked with studying how A.I. and technology intersect with everyday interactions and domestic rituals. We approached this by exploring the ritualistic behavior of staring at a screen. In the past, when smart devices and televisions were nowhere to be found, people spent more time socializing, reading books, engaging in outdoor activities, pursuing the arts, and generally diversifying their time among a number of activities. Now, we live in a day and age where our devices captivate us and siphon significant portions of our time and attention. We readily spend hours staring at screens with little to no understanding about the innerworkings of the mechanisms and electronic systems that lie beneath. Our goal is to create an unsettling interaction between people and screens to provoke more intentional thinking about behaviors involving screens such as subconsciously checking your phone for new notifications.

Context

The theme of superstition and machine assisted beliefs didn’t quite resonate with us. So, we decided to approach the module without it and instead focus on the everyday rituals in our lives.
The Unroll by Meijie Hu was the first inspiration for us, as we both agreed that social media usage was a part of many people’s daily ritual. https://meijie-hu.com/Oueksmorphism-1
We also were heavily inspired by the eyes in the museum following unsuspecting patrons. In the museum, much of the spookiness we felt came from the patrons being unaware they were being recorded. We wondered how it would feel to have something watch you, and while you know about the recording you don’t understand what it’s getting from the recording.

Prototype/Outcome

A short demonstration of how our artifact operates.

- https://youtube.com/shorts/wyWNBQ1QbbI?feature=share

A Tweet from Sherlock

Our artifact is a bot-like entity that watches a person standing in front of it, tracks that person’s movements, and generates a tweet about what it thinks that person is doing. It works by using ultrasonic sensors to gather data on what objects are in its environment. If the left sensor reads a value that is of higher priority than the other sensors, the eyes projected on the TFT screen are instructed to move left. If the left sensor and middle sensor both read a similar priority that is higher than the priority of the right sensor, the eyes are instructed to move toward the center left area of the screen. In general, the location of the eyes on the TFT screen is mapped to input data from each of the three ultrasonic sensors.

A schematic drawing of our artifact with dimensions (in millimeters) and a bill of materials.

To generate tweets (example above), we used the PySerial, OpenAI, and OpenCV Python libraries. First, the bot starts a timer when somebody is in front of it. After 5 seconds, the bot prints a command to the serial monitor to generate a tweet. When that happens, Python prompts GPT-3 to generate a tweet from the perspective of a robot of what the human it's watching might be doing. Then, that text is extracted from GPT-3, overlaid onto a template image (attached below) of a fake tweet, and displayed on a separate laptop screen.

Code

//____________________________________________________________________________________________________________________________
//LIBRARIES AND DEFINITIONS---------------------------------------------------------------------------------------------------
#include <Adafruit_GFX.h>    // Core graphics library
#include <Adafruit_ST7789.h> // Hardware-specific library for ST7789
#include <SPI.h>
#include <Adafruit_NeoPixel.h>

#define TFT_CS        D10
#define TFT_RST        -1 // Or set to -1 and connect to Arduino RESET pin
#define TFT_DC         D9
#define TFT_COPI D11  // Data out
#define TFT_SCLK D13  // Clock out

#define BLACK    0x0000
#define BLUE     0x001F
#define RED      0xF800
#define GREEN    0x07E0
#define CYAN     0x07FF
#define MAGENTA  0xF81F
#define YELLOW   0xFFE0 
#define WHITE    0xFFFF




//____________________________________________________________________________________________________________________________
//GLOBAL VARIABLES------------------------------------------------------------------------------------------------------------
Adafruit_ST7789 tft = Adafruit_ST7789(TFT_CS, TFT_DC, TFT_COPI, TFT_SCLK, TFT_RST);
float p = 3.1415926;
uint16_t screenWidth = 320;
uint16_t screenHeight = 172;
uint16_t cx = screenWidth/2;
uint16_t cy = screenHeight/2;

//Eye Constants
uint16_t numLines = 10; //Number of vertical lines that make up eye
uint16_t space = 5; //Distance between vertical lines that make up eye
uint16_t eyeWidth = (numLines-1)*space; //Height of vertical lines that make up eye
uint16_t eyeHeight = 80; //Height of vertical lines that make up eye
uint16_t eyeColor = WHITE;
uint16_t eyex = cx;
uint16_t eyey = cy;
int screenLimit = 6; //Controls when eye has reached edge of screen
bool eyesWereClosed = false; //Keeps track of last time eyes were closed

//Ultrasonic Sensor Constants
const int echoPin1 = 8;
const int trigPin1 = 7;
const int echoPin2 = 6;
const int trigPin2 = 5;
const int echoPin3 = 4;
const int trigPin3 = 3;

//LED
#define LED_PIN    2
#define LED_COUNT 12
Adafruit_NeoPixel strip(LED_COUNT, LED_PIN, NEO_GRB + NEO_KHZ800);


//Other Stuff
unsigned long inViewTime = 0; //keeps track of how long someone has been in view
bool inViewTimerStarted = false; //keeps track of whether or not a timer has already been started
unsigned long inViewTimerStartTime = millis(); //beginning time for inViewTimer
bool tweet = false; //Tells us whether or not we should tweet
bool hasPrintedTweet = false; //Tells us whether or not a tweet has already been made for the person in view.




//____________________________________________________________________________________________________________________________
//INITIALIZATION--------------------------------------------------------------------------------------------------------------
void setup(void) {

  //SPI.begin();  // init SPI

  Serial.begin(9600);
  Serial.print(F("Hello! ST77xx TFT Test"));

  // OR use this initializer (uncomment) if using a 1.47" 172x320 TFT:
  tft.init(172, 320);
  tft.setSPISpeed(32000000);
  tft.setRotation(3);
  //tft.setSPIFreqency(1000000); // set 1 MHz
  //tft.initR(); // init @ 1 MHz.
  Serial.println(F("Initialized"));

  //Screen initialization
  tft.fillScreen(BLACK);
  for (int i=0;i<numLines;i++) {
    tft.drawFastVLine((eyex-eyeWidth/2)+(i*space), eyey-(eyeHeight/2), eyeHeight, WHITE);
  }

  //Ultrasonic sensor initialization
  pinMode(trigPin1, OUTPUT);
  pinMode(echoPin1, INPUT);
  pinMode(trigPin2, OUTPUT);
  pinMode(echoPin2, INPUT);
  pinMode(trigPin3, OUTPUT);
  pinMode(echoPin3, INPUT);

  //LED
  strip.begin();           // INITIALIZE NeoPixel strip object (REQUIRED)
  strip.show();            // Turn OFF all pixels ASAP
  strip.setBrightness(255); // Set BRIGHTNESS to about 1/5 (max = 255)
  for (int i=0; i < 12; i++) {
    strip.setPixelColor(i, 0, 0, 255);
  }
}




//____________________________________________________________________________________________________________________________
//LOOP------------------------------------------------------------------------------------------------------------------------
void loop() {
  //Ultrasonic sensors
  int dist1 = getDist(trigPin3, echoPin3); //Right
  int dist2 = getDist(trigPin2, echoPin2); //Mid
  int dist3 = getDist(trigPin1, echoPin1); //Left
  //Serial.print("Left: ");
  //Serial.print(dist3);
  //Serial.print("cm           ");
  //Serial.print("Mid: ");
  //Serial.print(dist2);
  //Serial.print("cm           ");
  //Serial.print("Right: ");
  //Serial.print(dist1);
  //Serial.print("cm           ");
  //Serial.print("Time in View (s): ");
  //Serial.print(inViewTime);
  //Serial.print("s            ");
  //Serial.print("Print tweet?   ");
  //Serial.println(tweet);

  //Moving eye
  if (dist1 < 60 || dist2 < 60 || dist3 < 60) {
    startInViewTimer();
    strip.fill(0x0000FF);
    strip.show();
    if (eyesWereClosed == true) openEyes(tft, eyex, eyey);
    if (abs(dist3 - dist2) <= 8 && abs(dist2 - dist1) <= 8) moveTowardCenter(tft, eyex, eyey);
    else {
      if (dist3 < dist2 && dist3 < dist1 && abs(dist3 - dist2) >= 8) moveLeft(tft, eyex, eyey);
      else if (dist1 < dist2 && dist1 < dist3 && abs(dist2 - dist1) >= 8) moveRight(tft, eyex, eyey);
      else {
        if (abs(dist3 - dist2) <= 8) moveTowardCenterLeft(tft, eyex, eyey);
        else if (abs(dist2 - dist1) <= 8) moveTowardCenterRight(tft, eyex, eyey);
        else {
          moveTowardCenter(tft, eyex, eyey);
        }
      }
    }
  }
  else {
    strip.clear();
    strip.show();
    closeEyes(tft, eyex, eyey);
    endInViewTimer();
  }

  //Decides whether or not a tweet should be made
  printTweet();
  if (tweet) Serial.println("Generate tweet");
}






//____________________________________________________________________________________________________________________________
//HELPER FUNCTIONS------------------------------------------------------------------------------------------------------------

//Decides whether or not a tweet should be made
void printTweet() {
  if (inViewTime >= 4 && hasPrintedTweet == false) {
    tweet = true;
    hasPrintedTweet = true;
  }
  else {
    tweet = false;
  }
}

//Begins a timer of how long somebody has been in view
void startInViewTimer() {
  if (inViewTimerStarted == false) {
    inViewTimerStartTime = millis();
    inViewTimerStarted = true;
  }
  inViewTime = abs(millis() - inViewTimerStartTime)/1000;
}

//Ends the timer for how long somebody has been in view
void endInViewTimer() {
  inViewTime = 0;
  inViewTimerStarted = false;
  hasPrintedTweet = false;
}

//Eye close animation
void closeEyes(Adafruit_ST7789 screen, uint16_t xmid, uint16_t ymid) {
  uint16_t x = xmid - (eyeWidth/2);
  uint16_t y = ymid - (eyeHeight/2);
  for (int i=0; i < numLines; i++) {
    screen.drawFastVLine(x + (i*space), y, eyeHeight - 10, BLACK);
  }
  eyesWereClosed = true;
}

//Eye open animation
void openEyes(Adafruit_ST7789 screen, uint16_t xmid, uint16_t ymid) {
  uint16_t x = xmid - (eyeWidth/2);
  uint16_t y = ymid - (eyeHeight/2);
  for (int i=0; i < numLines; i++) {
    screen.drawFastVLine(x + (i*space), y, eyeHeight - 10, WHITE);
  }
  eyesWereClosed = false;
}

//Takes center position of eye and moves it toward the center of the screen
void moveTowardCenter(Adafruit_ST7789 screen, uint16_t xmid, uint16_t ymid) {
  if (xmid < cx) moveRight(screen, xmid, ymid);
  if (xmid > cx) moveLeft(screen, xmid, ymid);
}

//Takes center position of eye and moves it toward center left part of screen
void moveTowardCenterLeft(Adafruit_ST7789 screen, uint16_t xmid, uint16_t ymid) {
  int xCenterLeft = screenWidth/3;
  if (xmid < xCenterLeft) moveRight(screen, xmid, ymid);
  if (xmid > xCenterLeft) moveLeft(screen, xmid, ymid);
}

//Takes center position of eye and moves it toward center right part of screen
void moveTowardCenterRight(Adafruit_ST7789 screen, uint16_t xmid, uint16_t ymid) {
  int xCenterRight = (screenWidth*2)/3;
  if (xmid < xCenterRight) moveRight(screen, xmid, ymid);
  if (xmid > xCenterRight) moveLeft(screen, xmid, ymid);
}

//Takes center position of eye and moves it to the right
void moveRight(Adafruit_ST7789 screen, uint16_t xmid, uint16_t ymid) {
  uint16_t x = xmid - (eyeWidth/2);
  uint16_t y = ymid - (eyeHeight/2);
  if(x + eyeWidth + space <= screenWidth - screenLimit) {
    screen.drawFastVLine(x, y, eyeHeight, BLACK);
    screen.drawFastVLine(x+eyeWidth+space, y, eyeHeight, WHITE);
    eyex+=space;
  }
}

//Takes center position of eye and moves it to the left
void moveLeft(Adafruit_ST7789 screen, uint16_t xmid, uint16_t ymid) {
  uint16_t x = xmid - (eyeWidth)/2;
  uint16_t y = ymid - (eyeHeight/2);
  if(x - space >= screenLimit) {
    screen.drawFastVLine(x - space, y, eyeHeight, WHITE);
    screen.drawFastVLine(x + eyeWidth, y, eyeHeight, BLACK);
    eyex-=space;
  }
}

//Returns distance detected by ultrasonic sensor
int getDist(int trigPin, int echoPin) {
  long duration, cm;
  digitalWrite(trigPin, LOW);
  delayMicroseconds(2);
  digitalWrite(trigPin, HIGH);
  delayMicroseconds(10);
  digitalWrite(trigPin, LOW);
  duration = pulseIn(echoPin, HIGH);
  cm = microsecondsToCentimeters(duration);
  return cm;
}

//For Ultrasonic Sensor From https://www.tutorialspoint.com/arduino/arduino_ultrasonic_sensor.htm
long microsecondsToInches(long microseconds) {
   return microseconds / 74 / 2;
}

long microsecondsToCentimeters(long microseconds) {
   return microseconds / 29 / 2;
}

Click to Expand

#Tweet Generation Code in Python

import serial
import openai 
import cv2 
#tweak how it needs you to close the image
openai.api_key = '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'
image = cv2.imread('example_post.png')

# Window name in which image is displayed
window_name = 'Image'

ser = serial.Serial()
ser.port = 'COM4'
ser.baudrate = 115200
ser.timeout=0.1
ser.open()

#make call to OpenAI
def generate_tweet(input):
    completion = openai.ChatCompletion.create(model="gpt-3.5-turbo-0301", messages=[{"role": "user", "content": input}])
    text = completion.choices[0].message.content
    return text

#convert OpenAI output into multiple strings in an array in order to overlay onto image on seperate lines
def format_tweet(tweet):
    text_arr = tweet.split()
    text_arr_reformatted = []
    current_line = ""
    line_char_count = 0
    for i in text_arr:
        line_char_count = line_char_count + len(i)
        if line_char_count > 42:
            text_arr_reformatted.append(current_line)
            line_char_count = len(i) + 1
            current_line = i + " "
        else:
            current_line = current_line + i + " "
            print(current_line)
            line_char_count = line_char_count + 1
    text_arr_reformatted.append(current_line)
    return text_arr_reformatted

#turns set area of template white then writes formatted text over the white space
def edit_image(text_arr):
    new_image = cv2.imread('example_post.png')
    white = (255, 255, 255)
    image[55:200] = white
    # font
    font = cv2.FONT_HERSHEY_SIMPLEX
    
    # org (x, y)
    org = (25, 80)
    
    # fontScale
    fontScale = 0.55
    
    #Color in BGR
    color = (0, 0, 0)
    
    # Line thickness of 2 px
    thickness = 1
    y0, dy = 80, 15
    for i, line in enumerate(text_arr):
        y = y0 + i*dy
        cv2.putText(image, line, (25, y ), font, fontScale, color, thickness)
    return new_image


prompt = "Without using emojis, pick a random activity and make a tweet in the style of a robot about watching someone doing the activity"
#loops constantly checking for serial monitor output until ctrl+C in terminal
demo = True
while(demo):
    output = ser.readline().decode('ASCII')
    print(output)
    if "Generate tweet" in output:
        print(prompt)
        tweet = generate_tweet(prompt)
        print("New Tweet - " + tweet)
        new_image = edit_image(format_tweet(tweet))
        cv2.imwrite('new_tweet' + str(test) + ".png", image)
        cv2.imshow('New Tweet', image)
        cv2.waitKey(0)
        cv2.destroyAllWindows()
ser.close()

Click to Expand

Process

We took a parallel development approach to the project. While David worked more with the Arduino, I focused on the tweet generation. For the tweet generation, I broke the project down into components. I needed to be able to read the Arduino output, use OpenAI to generate a tweet, and take an image template and edit it with the generated text. I started by doing preliminary research on each component to determine feasibility. From there, I worked on each component one by one until I finished it. At the same time, David worked on the sensor/Arduino side of things, primarily spending most of his time on the TFT screens, which was causing serious issues with its screen clearing speed.
When faced with major obstacles both David and I relied heavily on documentation that existed online to try to solve our issues.
One major design decision we had to make was whether or not to generate the tweet based on randomness or based on a trained dataset using Edge Impulse. We decided on going with a random tweet as we decided it’d be too difficult to classify different actions based just on the proximity sensors' movement detection.

Tweet Template we used

Using the TFT Screen to create an eye.

Wiring the ultrasonic rangers to the TFT screens and using the data from the sensors to change the output on the screens.

Assembling the laser cut parts.

Placing the electronics in the shell and wiring the device.

The final product.

Open Questions and Next Steps:

Next Steps - Based on the feedback we received during the in-class demo, we feel the next step would be to add a third screen for a mouth. This would allow Sherlock to better express himself, whether it’s in a nice or evil way. Another suggestion we received was pivoting slightly to a “mind reading” perspective, which we thought was very interesting. It would certainly add to the spookiness of our project. Sherlock would try to follow you and look into your eyes, and its output would be a mind reading instead of a tweet. Finally, we also loved the idea of adding a texting feature, where if you walked past Sherlock he would let you know he saw you.
One thing we don’t really know how to approach is how to make our project more interactive. Despite our best efforts it still required a little bit of input from our end during the demo in order to make things work, even without the tech errors. How could we make the design work seamlessly without us there?

Reflection

We are both happy with the success of the project, although we certainly did not achieve all of our ambitions. We started with a very complicated idea, and we managed to simplify it while still keeping the key elements that we wanted. However, the tweet generation was not as detailed as we would’ve liked and the ultrasonic sensors were not nearly as cooperative as we’d hoped they’d be. Ultimately, with additional time and perhaps a camera instead of the ultrasonic sensor, we could’ve achieved our initial vision.