TP 4: Intent recognition

Exercise 1: Automatic detection of speaker’s intention from supra-segmental features

The aim of this exercice is to develop a human feedback classifier : positive (approval) / negative (prohibition). This classifier might be used to teach robots and/or to guide robot’s learning.

In [0]:
import urllib.request
import numpy as np
import pandas as pd
from google.colab import files as google_files

import itertools
import matplotlib.pyplot as plt
In [2]:
!pip install ggplot
import ggplot
Collecting ggplot
  Downloading https://files.pythonhosted.org/packages/48/04/5c88cc51c6713583f2dc78a5296adb9741505348c323d5875bc976143db2/ggplot-0.11.5-py2.py3-none-any.whl (2.2MB)
    100% |████████████████████████████████| 2.2MB 6.2MB/s 
Requirement already satisfied: scipy in /usr/local/lib/python3.6/dist-packages (from ggplot) (0.19.1)
Collecting brewer2mpl (from ggplot)
  Downloading https://files.pythonhosted.org/packages/84/57/00c45a199719e617db0875181134fcb3aeef701deae346547ac722eaaf5e/brewer2mpl-1.4.1-py2.py3-none-any.whl
Requirement already satisfied: pandas in /usr/local/lib/python3.6/dist-packages (from ggplot) (0.22.0)
Requirement already satisfied: matplotlib in /usr/local/lib/python3.6/dist-packages (from ggplot) (2.1.2)
Requirement already satisfied: six in /usr/local/lib/python3.6/dist-packages (from ggplot) (1.11.0)
Requirement already satisfied: statsmodels in /usr/local/lib/python3.6/dist-packages (from ggplot) (0.8.0)
Requirement already satisfied: cycler in /usr/local/lib/python3.6/dist-packages (from ggplot) (0.10.0)
Requirement already satisfied: patsy>=0.4 in /usr/local/lib/python3.6/dist-packages (from ggplot) (0.5.0)
Requirement already satisfied: numpy in /usr/local/lib/python3.6/dist-packages (from ggplot) (1.14.5)
Requirement already satisfied: pytz>=2011k in /usr/local/lib/python3.6/dist-packages (from pandas->ggplot) (2018.4)
Requirement already satisfied: python-dateutil>=2 in /usr/local/lib/python3.6/dist-packages (from pandas->ggplot) (2.5.3)
Requirement already satisfied: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.1 in /usr/local/lib/python3.6/dist-packages (from matplotlib->ggplot) (2.2.0)
Installing collected packages: brewer2mpl, ggplot
Successfully installed brewer2mpl-1.4.1 ggplot-0.11.5
/usr/local/lib/python3.6/dist-packages/ggplot/utils.py:81: FutureWarning: pandas.tslib is deprecated and will be removed in a future version.
You can access Timestamp as pandas.Timestamp
  pd.tslib.Timestamp,
/usr/local/lib/python3.6/dist-packages/ggplot/stats/smoothers.py:4: FutureWarning: The pandas.lib module is deprecated and will be removed in a future version. These are private functions and can be accessed from pandas._libs.lib instead
  from pandas.lib import Timestamp
/usr/local/lib/python3.6/dist-packages/statsmodels/compat/pandas.py:56: FutureWarning: The pandas.core.datetools module is deprecated and will be removed in a future version. Please use the pandas.tseries module instead.
  from pandas.core import datetools
In [0]:
def list_from_URL(file_URL, function_applied=None):
  lines_bytes = urllib.request.urlopen(file_URL).readlines()
  lines = []

  for line in lines_bytes:
    line = line.decode("utf-8").rstrip()
    
    if function_applied is not None:
      line = function_applied(line)
    
    lines.append(line)
   
  return lines

def plot_confusion_matrix(cm, classes,
                          normalize=True,
                          title='Confusion matrix',
                          cmap=plt.cm.Blues):
    """
    From: http://scikit-learn.org/stable/auto_examples/model_selection/plot_confusion_matrix.html
    This function prints and plots the confusion matrix.
    Normalization can be applied by setting `normalize=True`.
    """
    if normalize:
        cm = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]
        print("Normalized confusion matrix")
    else:
        print('Confusion matrix, without normalization')

    print(cm)

    plt.imshow(cm, interpolation='nearest', cmap=cmap)
    plt.title(title)
    plt.colorbar()
    tick_marks = np.arange(len(classes))
    plt.xticks(tick_marks, classes, rotation=45)
    plt.yticks(tick_marks, classes)

    fmt = '.2f' if normalize else 'd'
    thresh = cm.max() / 2.
    for i, j in itertools.product(range(cm.shape[0]), range(cm.shape[1])):
        plt.text(j, i, format(cm[i, j], fmt),
                 horizontalalignment="center",
                 color="white" if cm[i, j] > thresh else "black")

    plt.tight_layout()
    plt.ylabel('True label')
    plt.xlabel('Predicted label')

1. Extraction of prosodic features ($f_0$ and energy)

In [0]:
# # /!\ NO NEED TO EXECUTE THIS CELL AGAIN !!!
# 
# 
# filenames = list_from_URL('https://raw.githubusercontent.com/youqad/Neurorobotics_Intent-Recognition/master/filenames.txt')
# filenames = list(set(filenames))
# 
# files = []
# indices = []
# 
# for file in filenames:
# 
#     URL_f0 = 'https://raw.githubusercontent.com/youqad/Neurorobotics_Intent-Recognition/master/data_files/{}.f0'.format(file)
#     file_dicts = [{key:val for key, val in zip(['time', 'f0'], map(float, l.split()))} for l in list_from_URL(URL_f0)]
# 
#     URL_en = 'https://raw.githubusercontent.com/youqad/Neurorobotics_Intent-Recognition/master/data_files/{}.en'.format(file)
#     for l, d in zip(list_from_URL(URL_en), file_dicts):
#       d["file"] = file
#       d["en"] = float(l.split()[1])
#       d["label"] = file[-2:]
# 
#     files.extend(file_dicts)
# 
# # How `files` looks like:
# # files = [ 
# #           {"file": "cy0001at", "time": 0.02, "f0": 0., "en": 0.},
# #           {"file": "cy0001at", "time": 1.28, "f0": 0., "en": 0.},
# #           ...
# #           {"file": "li1450at", "time": 0.02, "f0": 0., "en": 0.},
# #           {"file": "li1450at", "time": 1.56, "f0": 404., "en": 65.}
# #         ]
# 
# pd.DataFrame(files).to_csv('data.csv', encoding='utf-8', index=False) # To reuse it next time
# google_files.download('data.csv')
In [5]:
# loading training data
df = pd.read_csv('https://raw.githubusercontent.com/youqad/Neurorobotics_Intent-Recognition/master/data.csv').set_index('file')

df1 = df.loc[df['label'] != 'at']
df1.head()
Out[5]:
en f0 label time
file
li1377pw 0.0 0.0 pw 0.02
li1377pw 39.0 0.0 pw 0.04
li1377pw 40.0 0.0 pw 0.06
li1377pw 39.0 0.0 pw 0.08
li1377pw 39.0 0.0 pw 0.10

2. Extraction of functionals (statistics) : mean, maximum, range, variance, median, first quartile, third quartile, mean absolute of local derivate

In [6]:
print(df1.columns.values)

#df.groupby('file').mean().head()
#df1.groupby('file').max().head()
#df1.groupby('file').var().head()
#df1.groupby('file').median().head()
df1.groupby('file').quantile([.25, .75]).head()
['en' 'f0' 'label' 'time']
Out[6]:
en f0 time
file
cy0007pw 0.25 41.00 0.0 0.525
0.75 66.00 189.5 1.535
cy0008pw 0.25 41.00 0.0 0.270
0.75 64.50 192.0 0.770
cy0009pw 0.25 40.75 0.0 0.395
In [7]:
list_features  = ['mean', 
                  'max',
                  ('range', lambda x: max(x)-min(x)),
                  'var',
                  'median',
                  ('1st_quantile', lambda x: x.quantile(.25)),
                  ('3rd_quantile', lambda x: x.quantile(.75)),
                  ('mean_absolute_local_derivate', lambda x: abs(x.diff()).mean())
                 ]

df1.groupby('file')['f0','en'].agg(list_features)
Out[7]:
f0 en
mean max range var median 1st_quantile 3rd_quantile mean_absolute_local_derivate mean max range var median 1st_quantile 3rd_quantile mean_absolute_local_derivate
file
cy0007pw 92.284314 257.0 257.0 10372.542128 0.0 0.0 189.50 13.683168 52.313725 71.0 71.0 228.455057 52.0 41.00 66.00 2.970297
cy0008pw 78.431373 250.0 250.0 9930.090196 0.0 0.0 192.00 26.440000 47.725490 70.0 70.0 321.963137 43.0 41.00 64.50 3.960000
cy0009pw 69.065789 243.0 243.0 8927.182281 0.0 0.0 182.25 12.853333 49.473684 74.0 74.0 260.839298 42.0 40.75 66.00 3.520000
cy0010pw 29.196078 221.0 221.0 4696.178994 0.0 0.0 0.00 15.267327 46.049020 77.0 77.0 165.789652 42.0 41.00 50.75 3.306931
cy0011pw 110.743590 230.0 230.0 9290.400932 172.0 0.0 192.50 7.506494 53.653846 71.0 71.0 258.125375 62.0 41.25 66.00 2.337662
cy0012pw 74.539474 224.0 224.0 7363.451754 0.0 0.0 167.00 6.346667 50.250000 67.0 67.0 241.523333 42.5 41.00 66.00 2.320000
cy0013pw 81.343137 597.0 597.0 11985.257329 0.0 0.0 186.75 26.158416 48.960784 69.0 69.0 198.849932 42.5 40.00 63.00 2.871287
cy0014ap 114.578431 321.0 321.0 13517.612599 164.0 0.0 210.25 9.049505 53.558824 74.0 74.0 259.278684 55.0 41.00 70.00 2.752475
cy0015ap 107.976562 375.0 375.0 12282.180549 161.5 0.0 181.25 12.047244 52.109375 73.0 73.0 219.184793 56.0 41.00 64.00 2.314961
cy0016ap 87.125714 399.0 399.0 20819.478358 0.0 0.0 198.50 20.149425 48.354286 76.0 76.0 198.827783 42.0 41.00 59.00 2.655172
cy0017ap 143.225490 528.0 528.0 27929.225878 85.0 0.0 236.75 26.435644 51.931373 73.0 73.0 239.193263 51.5 41.00 65.50 3.089109
cy0018ap 101.934211 505.0 505.0 27713.342281 0.0 0.0 223.00 25.360000 48.078947 75.0 75.0 246.953684 42.0 41.00 60.50 3.706667
cy0019ap 123.401961 448.0 448.0 15774.599204 175.0 0.0 189.75 27.821782 53.411765 72.0 72.0 214.145603 57.0 42.00 65.00 3.980198
cy0020ap 128.509804 435.0 435.0 20712.628616 81.5 0.0 214.75 37.148515 53.637255 77.0 77.0 234.233450 53.0 42.00 66.00 4.396040
cy0021ap 129.179775 477.0 477.0 30549.717314 0.0 0.0 232.00 27.022727 51.887640 78.0 78.0 323.169050 47.0 41.00 70.00 3.431818
cy0022ap 128.205882 488.0 488.0 28859.373034 0.0 0.0 297.50 26.079208 50.411765 78.0 78.0 227.551543 43.0 41.00 64.75 3.465347
cy0023ap 79.775281 510.0 510.0 23343.562564 0.0 0.0 0.00 18.363636 45.820225 77.0 77.0 231.058223 42.0 40.00 49.00 3.454545
cy0024ap 130.674157 448.0 448.0 26981.835802 0.0 0.0 233.00 30.681818 50.584270 72.0 72.0 272.336568 47.0 40.00 64.00 3.840909
cy0025ap 138.426966 471.0 471.0 24146.383810 0.0 0.0 243.00 14.818182 51.719101 76.0 76.0 308.295199 51.0 41.00 69.00 3.000000
cy0111ap 109.398438 427.0 427.0 17347.895116 0.0 0.0 207.50 16.346457 51.445312 77.0 77.0 215.776513 48.0 42.00 64.00 2.661417
cy0120ap 122.052174 421.0 421.0 17961.120061 0.0 0.0 256.00 24.000000 54.156522 74.0 74.0 244.291076 53.0 43.00 67.50 4.245614
cy0121ap 145.980392 397.0 397.0 14935.563968 181.0 0.0 219.25 19.504950 55.205882 76.0 76.0 223.036401 56.5 43.00 67.75 3.128713
cy0122ap 100.887640 503.0 503.0 22841.850868 0.0 0.0 191.00 26.318182 50.101124 78.0 78.0 259.751021 45.0 42.00 62.00 3.636364
cy0123ap 133.500000 504.0 504.0 31086.066038 0.0 0.0 273.25 29.849057 50.611111 76.0 76.0 306.166667 45.0 41.25 66.00 4.452830
cy0124ap 154.225490 443.0 443.0 23927.206076 179.5 0.0 280.50 32.455446 53.656863 73.0 73.0 229.851388 55.0 42.00 67.00 4.039604
cy0125pw 130.565789 264.0 264.0 10602.648947 179.5 0.0 207.50 8.666667 54.460526 70.0 70.0 252.705088 63.0 42.00 66.00 2.320000
cy0126pw 125.565789 238.0 238.0 8605.822281 173.0 0.0 191.50 21.733333 53.276316 66.0 66.0 194.175965 58.5 44.75 63.00 3.040000
cy0127pw 70.039216 227.0 227.0 8007.899437 0.0 0.0 176.00 15.128713 49.078431 66.0 66.0 168.587847 44.0 42.00 61.00 2.594059
cy0128pw 85.280899 262.0 262.0 11070.158836 0.0 0.0 194.00 21.545455 49.123596 69.0 69.0 227.132278 46.0 41.00 62.00 3.272727
cy0129pw 56.921569 278.0 278.0 9794.793725 0.0 0.0 94.50 20.560000 44.843137 69.0 69.0 277.254902 42.0 40.50 56.00 4.200000
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
li1369pw 43.865169 237.0 237.0 6894.958887 0.0 0.0 0.00 17.704545 48.123596 75.0 75.0 233.200460 44.0 41.00 58.00 4.477273
li1370pw 60.764045 219.0 219.0 7465.136874 0.0 0.0 164.00 14.113636 48.224719 77.0 77.0 275.835291 41.0 40.00 63.00 3.204545
li1371pw 59.637255 222.0 222.0 7603.659192 0.0 0.0 172.75 20.415842 50.823529 77.0 77.0 201.275480 47.5 43.00 61.00 4.217822
li1372pw 56.715789 228.0 228.0 7916.780067 0.0 0.0 166.50 14.191489 48.210526 76.0 76.0 281.189250 41.0 39.00 64.00 3.340426
li1373pw 33.089888 252.0 252.0 6119.810010 0.0 0.0 0.00 23.795455 49.247191 78.0 78.0 219.870020 48.0 44.00 54.00 4.386364
li1374pw 100.921739 276.0 276.0 13637.142944 0.0 0.0 226.50 10.385965 53.939130 81.0 81.0 377.075210 44.0 39.50 74.00 2.596491
li1375pw 60.656863 288.0 288.0 10495.494952 0.0 0.0 154.50 22.297030 52.117647 80.0 80.0 241.292953 49.5 42.00 63.75 4.336634
li1376pw 52.843750 244.0 244.0 7305.030512 0.0 0.0 166.00 13.968504 48.789062 77.0 77.0 238.766179 42.0 40.00 62.00 3.259843
li1377pw 46.807143 227.0 227.0 7228.128006 0.0 0.0 0.00 18.633094 47.535714 76.0 76.0 198.581449 40.5 39.00 57.00 3.021583
li1378pw 74.148438 456.0 456.0 10953.891179 0.0 0.0 182.00 19.291339 49.210938 78.0 78.0 247.868541 42.5 40.00 63.00 3.496063
li1379pw 80.452174 258.0 258.0 10093.162166 0.0 0.0 191.50 13.385965 49.304348 78.0 78.0 253.283753 46.0 39.00 62.50 3.438596
li1380pw 31.549020 262.0 262.0 5971.457969 0.0 0.0 0.00 18.118812 45.872549 78.0 78.0 195.379635 41.0 39.00 51.00 3.762376
li1381pw 74.558824 273.0 273.0 10618.229179 0.0 0.0 187.75 21.524752 49.107843 76.0 76.0 253.621918 41.5 39.00 65.75 3.207921
li1382pw 45.321739 233.0 233.0 7227.132418 0.0 0.0 0.00 11.929825 46.278261 76.0 76.0 243.062243 40.0 39.00 54.00 2.842105
li1383pw 101.613208 262.0 262.0 11907.610872 0.0 0.0 218.25 15.428571 54.537736 77.0 77.0 348.060467 56.0 40.00 73.00 2.780952
li1384pw 88.333333 283.0 283.0 12162.561056 0.0 0.0 199.75 21.128713 51.529412 75.0 75.0 257.776354 43.5 40.00 66.00 3.108911
li1385pw 69.943820 260.0 260.0 9993.576353 0.0 0.0 180.00 12.318182 48.314607 74.0 74.0 296.331716 41.0 39.00 65.00 3.136364
li1386pw 74.773913 239.0 239.0 9939.369489 0.0 0.0 195.00 23.807018 50.573913 76.0 76.0 272.843173 46.0 40.00 66.00 3.631579
li1387pw 79.200000 257.0 257.0 10927.528058 0.0 0.0 202.25 17.294964 51.400000 78.0 78.0 258.471942 43.0 40.00 67.00 3.007194
li1388pw 72.200000 259.0 259.0 9759.873381 0.0 0.0 179.00 17.381295 50.864286 79.0 79.0 221.830370 46.5 40.00 64.00 3.294964
li1389pw 98.813725 255.0 255.0 11732.252087 0.0 0.0 212.75 11.603960 54.960784 79.0 79.0 339.067754 58.5 40.00 72.00 2.871287
li1390pw 104.400000 319.0 319.0 12673.698246 0.0 0.0 207.50 10.666667 54.782609 79.0 79.0 349.943555 54.0 40.00 73.00 2.578947
li1391pw 44.450980 249.0 249.0 7208.844108 0.0 0.0 0.00 13.940594 46.303922 76.0 76.0 224.352262 40.0 39.00 55.50 3.207921
li1392pw 69.780488 228.0 228.0 8296.519121 0.0 0.0 167.50 15.851852 48.512195 72.0 72.0 258.104788 40.5 39.00 63.75 3.259259
li1393pw 98.276316 235.0 235.0 10066.495965 56.5 0.0 199.25 15.173333 51.815789 74.0 74.0 298.098947 53.0 40.00 67.00 3.386667
li1394pw 60.947368 342.0 342.0 9737.223860 0.0 0.0 168.00 31.600000 48.684211 75.0 75.0 250.885614 44.5 39.00 64.00 3.973333
li1395pw 44.381579 257.0 257.0 7569.119123 0.0 0.0 0.00 17.653333 46.618421 77.0 77.0 247.999123 41.0 39.75 56.75 4.106667
li1396pw 62.044944 286.0 286.0 9285.043412 0.0 0.0 162.00 15.613636 46.314607 71.0 71.0 243.672625 41.0 39.00 61.00 2.704545
li1399pw 52.876404 255.0 255.0 8754.155005 0.0 0.0 87.00 16.431818 47.719101 77.0 77.0 288.545199 41.0 39.00 62.00 3.227273
li1400pw 90.304348 443.0 443.0 10714.248665 0.0 0.0 187.00 20.438596 52.234783 77.0 77.0 239.268955 55.0 40.00 66.00 3.245614

373 rows × 16 columns

3. Check functionals for both voiced (i.e. $f_0\neq 0$) and unvoiced segments. Which segments are suited for the approach ?

In [8]:
voiced = df1.loc[df1['f0']!=0].groupby('file')['f0','en'].agg(list_features)
#voiced

voiced.all = df1.loc[df1['f0']!=0].groupby('label')['f0','en'].agg(list_features)
voiced.all
Out[8]:
f0 en
mean max range var median 1st_quantile 3rd_quantile mean_absolute_local_derivate mean max range var median 1st_quantile 3rd_quantile mean_absolute_local_derivate
label
ap 289.501415 597.0 521.0 11013.029606 272.0 199.0 370.5 24.964026 73.309982 93.0 93.0 88.163801 74.0 68.0 79.0 3.470493
pw 192.443702 597.0 522.0 2702.078031 191.0 170.0 218.0 14.381682 71.629624 91.0 91.0 84.713679 72.0 65.0 79.0 3.028357
In [9]:
unvoiced = df1.loc[df1['f0']==0].groupby('file')['en'].agg(list_features)
#unvoiced

unvoiced.all = df1.loc[df1['f0']==0].groupby('label')['f0','en'].agg(list_features)
unvoiced.all
Out[9]:
f0 en
mean max range var median 1st_quantile 3rd_quantile mean_absolute_local_derivate mean max range var median 1st_quantile 3rd_quantile mean_absolute_local_derivate
label
ap 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 46.383407 94.0 94.0 239.212611 43.0 41.0 55.0 3.981904
pw 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 47.554654 91.0 91.0 231.134295 47.0 40.0 58.0 3.481900

4. Build two databases by randomly extracting examples : Learning database ($60\%$) and Test database

In [0]:
def train_test(df=df1, train_percentage=.4, seed=1):
  
  voiced = df.loc[df['f0']!=0].groupby('file')['f0','en'].agg(list_features)
  unvoiced = df.loc[df['f0']==0].groupby('file')['en'].agg(list_features)

  X, Y = {}, {}

  X['voiced'], Y['voiced'] = {}, {}
  X['unvoiced'], Y['unvoiced'] = {}, {}


  X['voiced']['all'] = np.array(df.groupby('file')['f0','en'].agg(list_features))
  Y['voiced']['all'] = np.array(df.loc[df['f0']!=0].groupby(['file']).min().label.values)

  X['unvoiced']['all'] = np.array(unvoiced)
  Y['unvoiced']['all'] = np.array(df.loc[df['f0']==0].groupby(['file']).min().label.values)
  
  np.random.seed(seed)
  
  for type in ['voiced', 'unvoiced']:
    n = len(X[type]['all'])
    ind_rand = np.random.randint(n, size=int(train_percentage*n)) # random indices
    train_mask = np.zeros(n, dtype=bool)
    train_mask[ind_rand] = True
    X[type]['train'], X[type]['test'] = X[type]['all'][train_mask],  X[type]['all'][~train_mask]
    Y[type]['train'], Y[type]['test'] = Y[type]['all'][train_mask],  Y[type]['all'][~train_mask]
  
  return X, Y

X1, Y1 = train_test()
In [11]:
col = ['mean', 'max', 'range', 'var', 'median', '1st_quantile', '3rd_quantile', 'mean_absolute_local_derivate']
col = ['f0_'+c for c in col]+['en_'+c for c in col]

voi = pd.DataFrame(X1['voiced']['all'], columns=col).assign(label=Y1['voiced']['all'])

ggplot.ggplot(voi, ggplot.aes(x='f0_mean', y='f0_var', color='label')) +\
    ggplot.geom_point() +\
    ggplot.scale_color_brewer(type='qual', palette='Set1') +\
    ggplot.xlab("Mean") + ggplot.ylab("Var") + ggplot.ggtitle("Voiced: $f_0$")
Out[11]:
<ggplot: (-9223363274381464551)>
In [12]:
col = ['mean', 'max', 'range', 'var', 'median', '1st_quantile', '3rd_quantile', 'mean_absolute_local_derivate']

unvoi = pd.DataFrame(X1['unvoiced']['all'], columns=col).assign(label=Y1['unvoiced']['all'])

ggplot.ggplot(unvoi, ggplot.aes(x='var', y='mean_absolute_local_derivate', color='label')) +\
    ggplot.geom_point() +\
    ggplot.scale_color_brewer(type='qual', palette='Set1') +\
    ggplot.xlab("Variance") + ggplot.ylab("Mean absolute of local derivate") + ggplot.ggtitle("Unvoiced: $en$")
Out[12]:
<ggplot: (-9223363274381464446)>

5. Train a classifer (k-NN method)

In [13]:
# Scikit Learn's kNN classifier:
# Just to test, but we will implement it ourselves of course!
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

def sklearn_knn(k, X, Y):
  for type in ['voiced', 'unvoiced']:
    kNN = KNeighborsClassifier(n_neighbors=k)
    kNN.fit(X[type]['train'], Y[type]['train'])

    print("Accuracy score for {}: {:.2f}".format(type, accuracy_score(Y[type]['test'],
                                                                  kNN.predict(X[type]['test']))))
sklearn_knn(3, X1, Y1)
Accuracy score for voiced: 0.92
Accuracy score for unvoiced: 0.60
In [14]:
# Our own implementation!
from scipy.spatial.distance import cdist
from sklearn.metrics import confusion_matrix
from collections import Counter

def kNN(k, X, Y, labels=["pw", "ap"]):
    # auxiliary function: label prediction (by majority vote)
    # based on the nearest neighbors
    def predicted_label(ind_neighbors):
        label_neighbors = tuple(Y['train'][ind_neighbors])
        return Counter(label_neighbors).most_common(1)[0][0]
    
    # Pairwise distances between test and train data points
    dist_matrix = cdist(X['test'], X['train'], 'euclidean')
    y_predicted = []

    for i in range(len(X['test'])):
        ind_k_smallest = np.argpartition(dist_matrix[i, :], k)[:k]
        y_predicted.append(predicted_label(ind_k_smallest))
    
    # Confusion matrix: C[i, j] is the number of observations 
    # known to be in group i but predicted to be in group j
    return confusion_matrix(Y['test'], np.array(y_predicted), labels=labels)

plt.figure()
cm = kNN(10, X1['voiced'], Y1['voiced'])
plot_confusion_matrix(cm, classes=["pw", "ap"],
                      title='Confusion matrix, with normalization')
plt.show()

cm2 = kNN(3, X1['unvoiced'], Y1['unvoiced'])
plot_confusion_matrix(cm2, classes=["pw", "ap"],
                      title='Confusion matrix, with normalization')
plt.show()
Normalized confusion matrix
[[0.95934959 0.04065041]
 [0.09836066 0.90163934]]
Normalized confusion matrix
[[0.515625   0.484375  ]
 [0.30434783 0.69565217]]

Exercice 2: Detection of multiple intents

We consider the following intents : "Approval", "Prohibition" and "Attention"

1. Extract the prosodic features ($f_0$ and energy) and their functionals

In [15]:
# Easy-peasy! All the work has been done before: all we have to do now is to use 
# the DataFrame `df` instead of `df1`

df.groupby('file')['f0','en'].agg(list_features).head()
Out[15]:
f0 en
mean max range var median 1st_quantile 3rd_quantile mean_absolute_local_derivate mean max range var median 1st_quantile 3rd_quantile mean_absolute_local_derivate
file
cy0001at 110.609375 402.0 402.0 27607.511657 0.0 0.0 331.0 23.777778 47.296875 73.0 73.0 327.323165 41.5 40.0 67.0 3.809524
cy0002at 105.640449 430.0 430.0 27108.528345 0.0 0.0 251.0 15.636364 47.337079 76.0 76.0 269.112360 41.0 40.0 62.0 3.272727
cy0005at 110.609375 402.0 402.0 27607.511657 0.0 0.0 331.0 23.777778 47.296875 73.0 73.0 327.323165 41.5 40.0 67.0 3.809524
cy0006at 105.640449 430.0 430.0 27108.528345 0.0 0.0 251.0 15.636364 47.337079 76.0 76.0 269.112360 41.0 40.0 62.0 3.272727
cy0007pw 92.284314 257.0 257.0 10372.542128 0.0 0.0 189.5 13.683168 52.313725 71.0 71.0 228.455057 52.0 41.0 66.0 2.970297

2. Develop a classifier for these three classes

In [0]:
X, Y = train_test(df=df)
In [17]:
sklearn_knn(3, X, Y)
Accuracy score for voiced: 0.72
Accuracy score for unvoiced: 0.46
In [18]:
plt.figure()
cm = kNN(3, X['voiced'], Y['voiced'], labels=["pw", "ap", "at"])
plot_confusion_matrix(cm, classes=["pw", "ap", "at"],
                      title='Confusion matrix, with normalization')
plt.show()

cm2 = kNN(3, X['unvoiced'], Y['unvoiced'], labels=["pw", "ap", "at"])
plot_confusion_matrix(cm2, classes=["pw", "ap", "at"],
                      title='Confusion matrix, with normalization')
plt.show()
Normalized confusion matrix
[[0.89344262 0.09016393 0.01639344]
 [0.07633588 0.5648855  0.35877863]
 [0.01886792 0.32075472 0.66037736]]
Normalized confusion matrix
[[0.44094488 0.31496063 0.24409449]
 [0.2295082  0.47540984 0.29508197]
 [0.2300885  0.36283186 0.40707965]]