Tracking Object Movements with Machine Learning (OpenCV)

Note: This is very nascent research and has, until now, not been externally validated.

‘Plants respond to positive and negative emotional voices differently.’ - Wait! What? Plants respond? What do they do? That was my reaction when I was first confronted with the research I present hereafter.

Exemplary object tracking of a flying bell pepper

Initial Results from Manual Object Detection

Initial research from my colleague Josephine indicates that certain plants (we are conducting experiments with the mimosa, i.e. Mimosa pudica, and the codario, i.e. Codariocalyx motorius) raise and accelerate their leaves differently depending on the type of music (1) or vocal emotion (2) they are exposed to:

(1) Codario and Different Music Types

Music Types

These graphs illustrate the time the codario takes to respond to different auditory stimuli such as ‘silence’, ‘metal’, ‘techno’, ‘classical’, and ‘yodeling’ (credit for the latter goes to our mentor, who is Swiss). One may notice that the response to yodeling was considerably faster than to silence. The next analysis examines whether we may attribute this to a plant’s happiness or sadness when being exposed to yodeling. :smile:

(2) Codario and Different Emotions

Surprisingly enough, the following analysis indicates that the codario actually lifts its leaves four times higher, and keeps them up longer, when hearing happy voices compared to sad ones. For this experiment, we use an artistic data set in which professional actors repeat the same sentences in both very positive and very negative ways (you don’t even want to know how often we had to listen to the same sentence :laughing:).

Emotions

The colors represent different leaves of the same plant; note that the left y-axis differs between the plants.

Data Collection

It was quite surprising to learn that plants move differently when a person speaks happily or sadly. Hence, we are replicating these early results with increasing amounts of data and automating the data processing with Machine Learning techniques. To that end, we created more of the most important currency of modern times: data.

Sound File Preparations

By writing a :snake: script, we were able to create different sound files, usually constructed such that an auditory stimulus is followed by 10 or 60 seconds of silence before the next one starts. The emotions (positive vs. negative) are alternated to keep confounding variables like sunlight exposure, time of day, or room temperature constant.

An exemplary snippet of the sound-file creation script:

from pydub import AudioSegment  # pydub loads and concatenates the WAV files

# happy_sounds / sad_sounds: lists of .wav file paths (defined elsewhere)
wavset = []
counter = 0
while counter < (len(happy_sounds) + len(sad_sounds)) // 2:
    wavset.append(happy_sounds[counter])           # positive stimulus
    wavset.append("extra_sounds/silence_60s.wav")  # 60 s of silence
    wavset.append(sad_sounds[counter])             # negative stimulus
    wavset.append("extra_sounds/silence_60s.wav")  # 60 s of silence
    counter += 1
wavs = [AudioSegment.from_wav(wav) for wav in wavset]
combined = wavs[0]
for wav in wavs[1:]:  # append the remaining segments one after another
    combined += wav

While creating these audio files, I learned:

Video Screening

Our mentor provided us with data sets by filming his plants while they were exposed to our different hour-long sound files. We are super grateful for the resulting video files, as they contain a lot of visual and audio noise. This is particularly useful because you want realistic data sets, and the world, as we all know, is messy and full of noise. Hence, we have to do a significant amount of data wrangling.

My Quest: Automating ‘Happiness Analysis with Machine Learning’

We evaluated different Machine Learning tools (dlib, OpenCV, scikit-image) and the whole process of moving-image, a.k.a. video, analysis to indirectly infer emotions from the plants.

A Cloud Supercomputer, Machine Learning Tools, and Modern Coding Techniques

Next, we set up a cloud-based infrastructure so as not to torture our little MacBooks. This cloud service is called Google Colab and gives us plenty of processing power (e.g., 26 GB of RAM) to process our videos quickly. This is what the development infrastructure looks like:

Google Colab Overview
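To give a concrete impression of what such a notebook involves, here is a minimal, purely illustrative sketch of typical first cells (checking the available RAM and mounting Google Drive). It uses standard Colab and psutil utilities and is not our actual setup code:

import psutil
print(f"RAM available to the runtime: {psutil.virtual_memory().total / 1e9:.1f} GB")

from google.colab import drive
drive.mount("/content/drive")  # makes e.g. video files stored on Drive accessible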

The current version of the object tracker, introduced below, can track many regions of interest at the same time, producing comprehensive amounts of data. Please find an example hereafter:

Loads of Tracking Points

Overview of Current Object Tracker

The current object tracker tool consists of five data processing steps: uploading a video, choosing hyperparameters for the given video (still somewhat of an iterative approach), processing and viewing the video, the actual data manipulation and analysis, and, as output, an interactive visualization of the movement. In the following, a brief overview of what happens in each step, and what I learned while developing it, is presented. Please find a video hereafter of an early version of the object tracker deployed on the mimosa, to get an idea of what is happening:

Step 0: Upload Video

Upload Feature

What is going on?

What did I learn?
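The upload code itself is not shown here, but as a rough sketch of how a video upload typically looks in a Colab notebook (the files.upload() helper is part of Colab; the variable names and the single-file assumption are mine):

from google.colab import files

uploaded = files.upload()          # opens the browser's file picker
video_path = next(iter(uploaded))  # name of the uploaded video file
print(f"Uploaded {video_path} ({len(uploaded[video_path]) / 1e6:.1f} MB)")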

Step 1: Choose Hyperparameters

Hyperparams

What is going on?

What did I learn?
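The tool's concrete hyperparameters are not listed here, so the following is only an illustrative guess at how such settings could be collected in one place; every name and value is hypothetical:

hyperparams = {
    "max_corners": 40,      # how many points/regions of interest to track
    "quality_level": 0.01,  # minimum quality for a point to be selected
    "min_distance": 10,     # minimum pixel distance between tracked points
    "frame_step": 1,        # analyse every n-th frame of the video
}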

Step 2: Process & View the Video

Video Processing

What is going on?

What did I learn?
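As a sketch of how many points can be followed at once with OpenCV, here is one plausible approach using Lucas-Kanade optical flow (cv2.goodFeaturesToTrack and cv2.calcOpticalFlowPyrLK); the actual tool may use a different tracker, and the file name is made up:

import cv2

cap = cv2.VideoCapture("mimosa_experiment.mp4")  # hypothetical file name
ok, first_frame = cap.read()
prev_gray = cv2.cvtColor(first_frame, cv2.COLOR_BGR2GRAY)

# Select up to 40 well-textured points (e.g. leaf tips) to follow
points = cv2.goodFeaturesToTrack(prev_gray, maxCorners=40,
                                 qualityLevel=0.01, minDistance=10)

trajectories = []  # collected as (frame, x, y) tuples
frame_idx = 0
while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # Estimate where each point moved between consecutive frames
    new_points, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, gray, points, None)
    for (x, y), found in zip(new_points.reshape(-1, 2), status.ravel()):
        if found:
            trajectories.append((frame_idx, float(x), float(y)))
    prev_gray, points = gray, new_points
    frame_idx += 1
cap.release()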

Step 3: Analysis

Data Analysis

What is going on?

What did I learn?

Data manipulation and analysis with :panda_face:
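As a sketch of the kind of pandas wrangling such a step might perform, assuming the tracker output is a list of (frame, x, y) tuples as in the sketch above (the column names and the frame rate are my assumptions, not the tool's real ones):

import pandas as pd

# Hypothetical tracker output: (frame, x, y) tuples, e.g. from the sketch above
trajectories = [(0, 120.0, 200.0), (1, 120.4, 198.7), (2, 120.9, 196.9)]

df = pd.DataFrame(trajectories, columns=["frame", "x", "y"])
fps = 30                                # assumed frame rate of the recording
df["time_s"] = df["frame"] / fps

# In image coordinates y grows downwards, so a rising leaf means a smaller y;
# express the movement as "lift" relative to the starting position.
per_frame = df.groupby("time_s", as_index=False)["y"].mean()
per_frame["lift"] = per_frame["y"].iloc[0] - per_frame["y"]
print(per_frame)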

Step 4: Interactive Visualization

Altair

What is going on?

What did I learn?
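As a sketch of how an interactive movement chart can be built with Altair (the data frame and the encodings are illustrative, not the tool's exact output):

import altair as alt
import pandas as pd

# Hypothetical output of the analysis step: leaf lift over time
movement = pd.DataFrame({"time_s": [0, 1, 2, 3, 4],
                         "lift":   [0.0, 0.8, 2.6, 3.1, 2.4]})

chart = (
    alt.Chart(movement)
    .mark_line(point=True)
    .encode(
        x=alt.X("time_s:Q", title="Time (s)"),
        y=alt.Y("lift:Q", title="Leaf lift (px)"),
        tooltip=["time_s", "lift"],
    )
    .interactive()  # adds zooming and panning in the notebook
)
chart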

Step 5: Additional Features

Additional Features

For development and testing purposes, I incorporated additional functionality that I will not dive into any deeper for now. It encompasses:

Next Steps

Now that the data pipeline is functional and the initial data analyses are running, a tool for inferring happy or sad emotions will be added. The image hereafter depicts promising tracking spots.

Proper Tracking Spots

Thank you for your comments and your interest!