Commit e35e5537 authored by stehess

initial commit of source code for publication

parent d07e324f
MIT License
Copyright (c) 2020 pdz
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
# sample-data-repo
# Hand-eye-coordination (HEC or EHC)
This README provides the information needed to use the framework to recognize hand-eye coordination (HEC) patterns in eye-tracking videos.
Note: the software has been developed and tested on Windows 10 Education only.
## License
The code and the models in this repo are released under the [MIT License](https://gitlab.ethz.ch/pdz/eye-hand-coordination/-/blob/master/LICENSE).
## Installation
# First step: Download and install anaconda
# Second step: Create environment from .yml file
conda env create -f HEC_CNN_env.yml
# Third step: Activate conda environment
conda activate HEC_CNN_env
# FYI - Save an environment with
conda env export > HEC_CNN_env.yml
## Citation
If you use our code in your research or wish to refer to the baseline results, please use the following BibTeX entry.
>>>>>>>> CHECK THIS CITATION BEFORE RELEASE <<<<<<<<<<<<<<<
@article{NonLocal2020,
  author  = {Stephan Wegner and Felix Wang and Sophokles Ktistakis and Julian Wolf and Quentin Lohmeyer and Mirko Meboldt},
  title   = {FILL IN TITLE HERE},
  journal = {PLOS ONE},
  year    = {2020}
}
## Structure of the data
**Video files:**
.avi files with name *name_base{i}.avi*
**Gaze coordinate files:**
.txt files with name *name_base{i}.txt*
*Headings in file:*
| RecordingTime [ms] | Point of Regard Binocular X [px] | Point of Regard Binocular Y [px] | Video Time [h:m:s:ms] |
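A minimal sketch for loading such a gaze file with pandas is shown below; the tab separator and the example file path are assumptions about the SMI export and may need adjusting.

```python
# Sketch only: assumes a tab-separated export with exactly the headings listed above.
import pandas as pd

gaze = pd.read_csv('data/raw/gaze/EHC_Y_P1.txt', sep='\t')
xy = gaze[['Point of Regard Binocular X [px]', 'Point of Regard Binocular Y [px]']].to_numpy()
print(gaze['RecordingTime [ms]'].head())
```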
**Labels for Mask-RCNN:**
.json file with name *labels_yps.json*
*Structure in file:*
{"bg": 0, "Obj1": 1, "Ojb2": 2, "Obj3":3, "Obj4": 4}
*(You can have as many objects as you wish, according to your trained model)*
**Ground truth files for 3D-ConvNet:**
.csv files with name *behaviour_ground_truth_name_base{i}.csv*
*Headings in file (exported from SMI BeGaze 3.6):*
| Frame time | Frame number | Behaviour |
**Video times to cut original videos:**
.txt file with name *video_times.txt*
*Headings in file:*
| Name | start [s] | end [s] |
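The sketch below illustrates how such start/end times could be applied to trim one recording with OpenCV; it is an illustration only, and the repository's own cutting routine (including how video_times.txt is parsed) may differ.

```python
# Illustration only: trim one video between start_s and end_s with OpenCV.
import cv2

def cut_video(src, dst, start_s, end_s):
    cap = cv2.VideoCapture(src)
    fps = cap.get(cv2.CAP_PROP_FPS)
    size = (int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)), int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT)))
    out = cv2.VideoWriter(dst, cv2.VideoWriter_fourcc(*'XVID'), fps, size)
    cap.set(cv2.CAP_PROP_POS_MSEC, start_s * 1000)
    while cap.get(cv2.CAP_PROP_POS_MSEC) <= end_s * 1000:
        ok, frame = cap.read()
        if not ok:
            break
        out.write(frame)
    cap.release()
    out.release()

# Example call with placeholder times
cut_video('data/raw/videos/EHC_Y_P1.avi', 'EHC_Y_P1_cut.avi', start_s=10.0, end_s=60.0)
```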
## Definitions in definitions.py
Open definitions.py in your favorite source code editor (e.g. [Atom](https://atom.io/) or [PyCharm](https://www.jetbrains.com/de-de/pycharm/)).
### Define the name base for your files (video (.avi), gaze coordinates (.txt), and behaviour ground truth (.csv))
# As an example:
name_base = 'EHC_Y_P'
### Define which scripts you want to run
# Please choose which operation(s) you want to start by choosing 0 (operation is not started) or 1 (operation will be started)
operation = {'2DCNN_Inference': 1,
             'extract_features': 1,
             'create_segments': 1,
             '3DCNN_train_class': 1,
             '3DCNN_predict': 1,
             'post-processing': 1
             }
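Each flag set to 1 enables the corresponding pipeline stage when main.py is run. As a rough sketch of how such flags can drive a pipeline (the stage functions below are hypothetical stand-ins, not the repository's actual API):

```python
# Illustrative dispatch only; the stage functions are placeholders for the real pipeline steps.
from definitions import operation  # importing definitions.py also runs its module-level code

def run_2dcnn_inference():      print('2D-ConvNet inference')
def extract_features():         print('feature extraction')
def create_segments():          print('segment creation')
def train_3dcnn_classifier():   print('3D-ConvNet training')
def predict_with_3dcnn():       print('3D-ConvNet prediction')
def post_process_predictions(): print('post-processing')

stages = {'2DCNN_Inference': run_2dcnn_inference,
          'extract_features': extract_features,
          'create_segments': create_segments,
          '3DCNN_train_class': train_3dcnn_classifier,
          '3DCNN_predict': predict_with_3dcnn,
          'post-processing': post_process_predictions}

for name, flag in operation.items():
    if flag:
        stages[name]()  # run only the stages switched on in `operation`
```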
### Define which IDs are in the training and in the test set
# The rule for naming is name_base{i}, both for video and gaze coordinate files
train_val_nums = [2,3,4,5,6,7,8,9,12,13,14,15,16,17,18,19,20,21,24,25,26,27,28,29,30,31,32,33,34]
test_nums = [1,10,11,22,23]
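Because every file follows the name_base{i} convention, the train/validation and test file lists can be derived directly from these ID lists. A small sketch, assuming definitions.py is importable and the directory layout shown under "Structure of the project":

```python
# Sketch: derive raw-data file paths from the ID lists; adapt if your layout differs.
import os
from definitions import name_base, train_val_nums, test_nums

def raw_paths(ids, root='data/raw'):
    videos = [os.path.join(root, 'videos', f'{name_base}{i}.avi') for i in ids]
    gaze = [os.path.join(root, 'gaze', f'{name_base}{i}.txt') for i in ids]
    return videos, gaze

train_videos, train_gaze = raw_paths(train_val_nums)
test_videos, test_gaze = raw_paths(test_nums)
```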
### Define if you want to run training or test
# choose mode: "train" or "test"
mode = "train"
### For the 2D-ConvNet, the definitions are
# choose: "original" or "black" background
image_type = 'original'
# The weights below worked well for the paper's use case; please adapt them to your specific use case
# 'w_pen': 1.0, 'w_phone': 0.96, 'w_pillow': 1.3, 'w_smart': 1.6
class_weights = [1, 0.96, 1.3, 1.6]
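The four weights correspond to the classes named in the comment above (pen, phone, pillow, smart). If you retrain the 2D-ConvNet yourself, they would typically be applied as a per-class weighting; a minimal sketch assuming a Keras-style fit call, which is not necessarily how the repository's training code applies them:

```python
# Sketch only: convert the list into the per-class dict that a Keras-style fit call expects.
from definitions import class_weights

class_weight = {idx: w for idx, w in enumerate(class_weights)}
# model.fit(x_train, y_train, class_weight=class_weight)  # model / x_train / y_train are placeholders
```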
### For the 3D-ConvNet, the definitions are
#### For inference
path_to_load_classification_network = ROOT_DIR + "\\" + r"models\ThreeDCNN\classification_NN\22i_TCNN_class_acc_0.65652174.h5"
HEC_classes = ['Background', 'Guiding', 'Directing', 'Checking', 'Observing']
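A minimal sketch of loading the classification network and predicting a single segment, assuming the .h5 file is a Keras model; the segment shape below is a random placeholder, and the real prediction pipeline lives in src/ThreeDCNN/prediction.

```python
# Sketch only: adapt the segment shape and preprocessing to the trained model.
import numpy as np
from tensorflow.keras.models import load_model
from definitions import path_to_load_classification_network, HEC_classes

net = load_model(path_to_load_classification_network, compile=False)
segment = np.random.rand(1, 16, 112, 112, 3).astype('float32')  # (batch, frames, height, width, channels) - assumed
probs = net.predict(segment)[0]
print(HEC_classes[int(np.argmax(probs))])
```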
#### For training
* During the training, the model is saved every 5 epochs.
* Define the learning rate and the number of epochs for the training.
* The training set is divided into k folds for the training/validation split (see the index-mapping sketch after this block).
* You can define with which fold the training starts.
# Parameters for training the 3D CNN
hyperparam = hyperparam  # Hyperparameters for tuning the 3D CNN
length = length  # number of hyperparameter sets
# k-fold cross validation: start and end point define the validation set, the remaining samples form the training set
starts = []  # start of the split, for k=5: 0.01, 0.21, 0.41, 0.61, 0.81
ends = []  # end of the split, for k=5: 0.20, 0.40, 0.60, 0.80, 1.00
k = 5
step = 1 / k
a = 0
while a < k:
    start = round((1 / 100 + 10 * step * a / 10), 2)
    end = round((start + step - 1 / 100), 2)
    ends.append(end)
    starts.append(start)
    a += 1
runs = len(starts)
# learning rate for training the model
learning_rate = 1e-3
epochs = 50
# decision which fold is the start fold (0, 1, 2, 3, 4)
start_fold = 0
# choose if undersampling should be applied (True or False)
undersampling = False
# Choose model = 'None'
model = 'None'
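To make the fold fractions concrete, the sketch below maps one pair of start/end fractions onto the list of training IDs to obtain a validation fold. This only illustrates the idea; it is not the repository's splitting code.

```python
# Illustration only: slice one validation fold out of the training IDs.
from definitions import train_val_nums, starts, ends, start_fold

def split_fold(ids, start, end):
    n = len(ids)
    val = ids[int(round(start * n)):int(round(end * n))]
    train = [i for i in ids if i not in val]
    return train, val

train_ids, val_ids = split_fold(train_val_nums, starts[start_fold], ends[start_fold])
print(val_ids)  # with k=5 and start_fold=0 this is roughly the first fifth of train_val_nums
```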
##### Restart training
* You can restart the training from one of the saved models (a restart sketch follows below).
# Definition of the path to the to-be retrained model
path_to_load_retrain_network = ROOT_DIR + "\\" + r"models\ThreeDCNN\classification_NN\temp_training\BGtrain_05s\3DCNN_color_f1_GP_epoch_40.h5"
# choose the model to re-train
model = path_to_load_retrain_network
# number of the epoch at which training restarts
if model == 'None':
    start_epoch = 0
else:
    start_epoch = 40
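A minimal restart sketch, assuming a Keras model and a training loop that accepts an initial epoch; the repository's own training code may restore the optimizer state and data generators differently.

```python
# Sketch only: resume training from a saved checkpoint (the fit call is commented out
# because train_data is a placeholder for the actual data generator).
from tensorflow.keras.models import load_model
from definitions import model, epochs, start_epoch

if model != 'None':
    net = load_model(model)  # restores the architecture and the weights saved at start_epoch
    # net.fit(train_data, epochs=epochs, initial_epoch=start_epoch)
```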
### Definitions for post-processing
# Define if you want to use BG in the OOI distribution - HEC plot
use_bg = False
## Run code
# Go to the folder of the project, open the command line, and run
python main.py
## Structure of the project
definitions.py
LICENSE
main.py
opti.py
README.md
data
|
-- datasets
| |
| -- dadaset_gt
| | |
| | - behaviour_ground_truth_name_base{i}.txt
| |
| -- extracted_images
| | |
| | -- test
| | -- train_val
| |
| -- filled_values_segment
| | |
| | -- test
| | -- train_val
| |
| -- masked_videos
| | |
| | -- blacked_mask_videos
| | | |
| | | - name_base{i}_black.avi
| | |
| | -- labels_mask
| | | |
| | | - name_base{i}.csv
| | |
| | -- original_mask_videos
| | |
| | - name_base{i}_masked.avi
| |
| - test_filled_values_id_label_map.csv
| - test_segment_dataset.csv
| - train_filled_values_id_label_map.csv
| - train_segment_dataset.csv
|
-- raw
|
-- gaze
| |
| - name_base{i}.txt
|
-- ground_truth
| |
| - behaviour_ground_truth_name_base{i}.csv
|
-- labels
| |
| - labels_yps.json
|
-- video_times
| |
| - video_times.txt
|
-- videos
|
- name_base{i}.avi
logs
models
|
-- ThreeDCNN
| |
| -- classification_NN
| |
| --temp_train
|
-- TwoDCNN
|
-mrcnn
|
- mask_rcnn_hands.h5
- mask_rcnn_yps.h5
reports
|
-- figures
| |
| -- acc
| -- loss
| -- ClassifiactionRep
| -- ComparisonPredTrue
| -- ConfusionMat
| -- temp_acc_loss
|
-- predictions
src
|
-- ThreeDCNN
| |
| -- dataset_creation
| | |
| | - __init__.py
| | - create_segments.py
| | - extract_features.py
| | - utils.py
| |
| -- models
| |
| -- classifiaction_network
| | |
| | - __init__.py
| | - classifiaction_model.py
| | - train_model.py
| | - utils.py
| |
| -- data_generator
| | |
| | - __init__.py
| | - ThreeDimCNN_datagenerator.py
| |
| -- post_processing
| | |
| | - __init__.py
| | - post_process.py
| | - utils.py
| |
| -- prediction
| | |
| | - __init__.py
| | - predict.py
| | - utils.py
| |
| - __init__.py
|
-- TwoDCNN
|
-- models
| |
| - 2DCNN_inference.py
| - __init__.py
| - makse_mask_gaze_video.py
| - utils.py
|
-- mrcnn
|
- __init__.py
- config.py
- LICENSE
- model.py
- parallel_model.py
- utils.py
- visualize.py
venv
|
- HEC_CNN_env.yml
{"bg": 0, "Pen": 1, "Phone": 2, "Pillow":3, "Smart": 4}
"""
Stephan Wegner & Vithurjan Visuvalingam
pdz, ETH Zürich
2020
This file contains all definitions you might want to adapt for running the software code
"""
import os
import json
import numpy as np
from opti import hyperparam, length
#path
ROOT_DIR = os.path.dirname(os.path.abspath(__file__))
CONFIG_PATH = os.path.join(ROOT_DIR, 'configuration.conf')
name_base = 'EHC_Y_P'
# Please choose which operation(s) you want to start by choosing 0 (operation is not started) or 1 (operation will be started)
operation = {'2DCNN_Inference': 0,
             'extract_features': 0,
             'create_segments': 0,
             '3DCNN_train_class': 0,
             '3DCNN_predict': 1,
             'post-processing': 1
             }
#configurations
train_val_nums = [2,3,4,5,6,7,8,9,12,13,14,15,16,17,18,19,20,21,24,25,26,27,28,29,30,31,32,33,34]
test_nums = [1,10,11,22,23]
# choose: "train" or "test"
mode = "test"
##############################
### Definitions for 2D CNN ###
##############################
#choose: "original" or "black"
image_type='original'
# 'w_pen': 1.0,'w_phone':0.96,'w_pillow':1.3, 'w_smart': 1.6
class_weights = [1, 0.96, 1.3, 1.6]
# Read out the OOIs for the 2D_CNN; 'Hands' is appended after the OOIs
path_to_labels = ROOT_DIR + '\\data\\raw\\labels\\labels_yps.json'
with open(path_to_labels) as json_file:
    data = json.load(json_file)
OOI_data = np.array(list(data.items()))
OOIs = []
for i in range(len(OOI_data)):
    OOIs.append(str(OOI_data[i][0]))
OOIs.append('Hands')
##############################
### Definitions for 3D CNN ###
##############################
###################
#### For inference
path_to_load_classification_network = ROOT_DIR + "\\" + r"models\ThreeDCNN\classification_NN\2020Dec24_0035_0.81_TCNN_class_acc_0.6784.h5"
HEC_classes = ['Background', 'Guiding', 'Directing', 'Checking', 'Observing']
#################
#### For training
# During the training, the model is saved every 5 epochs.
# Define learning rate and number of epochs for the training
# The training set is divided into k folds for the training/validation split.
# You can define with which fold the training starts.
# Parameters for training the 3D CNN
hyperparam=hyperparam #Hyperparameters for training the 3D CNN
length=length # number of hyperparameter sets
#k-fold cross validation, start and end point defined for validation set, remaining samples are collected in training set
starts=[] #start of the split, for k=5: 0.01, 0.21, 0.41, 0.61, 0.81
ends=[] #end of the split, for k=5: 0.20, 0.40, 0.60, 0.80, 1.00
k = 5
step = 1 / k
a = 0
while a < k:
    start = round((1 / 100 + 10 * step * a / 10), 2)
    end = round((start + step - 1 / 100), 2)
    ends.append(end)
    starts.append(start)
    a += 1
runs=len(starts)
# decision, which fold is start fold (0,1,2,3,4)
start_fold=0
# choose if undersampling should be applied (True or False)
undersampling = False
##### Restart training
# You can restart the training from one of the saved models
path_to_load_retrain_network = ROOT_DIR + "\\" + r"models\ThreeDCNN\classification_NN\temp_training\epoch_15.h5"
# choose model via path_to_load_retrain_network or 'None'
model = 'None'
# number of epoch to restart training
if model == 'None':
    start_epoch = 0
else:
    start_epoch = 15
# new learning rate for training the model
learning_rate= 1e-3
epochs = 50
##################################
### Definitions for post-processing
##################################
# Define if you want to use BG in the OOI distribution - HEC plot
use_bg = False