# Hand-eye coordination (HEC or EHC)
This README provides the information needed to use the framework to recognize hand-eye coordination (HEC) patterns in eye-tracking videos.
Note: the code has been developed and tested on Windows 10 Education only.
## License
The code and the models in this repo are released under the [MIT License](https://gitlab.ethz.ch/pdz/eye-hand-coordination/-/blob/master/LICENSE).
## Installation
```bash
# First step: Download and install Anaconda
# Second step: Create the environment from the .yml file
conda env create -f HEC_CNN_env.yml
# Third step: Activate the conda environment
conda activate HEC_CNN_env
# FYI - save an environment with
conda env export > HEC_CNN_env.yml
```
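Once the environment is activated, a quick import check can confirm the setup. This is only a minimal sketch under the assumption that HEC_CNN_env provides TensorFlow/Keras (which the .h5 models under models/ suggest); consult HEC_CNN_env.yml for the actual package list.
```python
# Minimal sanity check - assumes HEC_CNN_env ships TensorFlow (see HEC_CNN_env.yml).
import sys
import tensorflow as tf

print("Python:", sys.version.split()[0])
print("TensorFlow:", tf.__version__)
```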
## Citation
If you use our code in your research or wish to refer to the baseline results, please use the following BibTeX entry.
**CHECK THIS CITATION BEFORE RELEASE**

```bibtex
@article{NonLocal2020,
  author  = {Stephan Wegner and Felix Wang and Sophokles Ktistakis and Julian Wolf and Quentin Lohmeyer and Mirko Meboldt},
  title   = {FILL IN TITLE HERE},
  journal = {PLOS ONE},
  year    = {2020}
}
```
## Structure of the data
**Video files:**
.avi files named *name_base{i}.avi*

**Gaze coordinate files:**
.txt files named *name_base{i}.txt*

*Headings in file:*

| RecordingTime [ms] | Point of Regard Binocular X [px] | Point of Regard Binocular Y [px] | Video Time [h:m:s:ms] |
| --- | --- | --- | --- |

**Labels for Mask-RCNN:**
.json file named *labels_yps.json*

*Structure in file:*
`{"bg": 0, "Obj1": 1, "Obj2": 2, "Obj3": 3, "Obj4": 4}`

*(You can include as many objects as you wish, according to your trained model.)*

**Ground truth files for the 3D-ConvNet:**
.csv files named *behaviour_ground_truth_name_base{i}.csv*

*Headings in file (exported from SMI BeGaze 3.6):*

| Frame time | Frame number | Behaviour |
| --- | --- | --- |

**Video times to cut the original videos:**
.txt file named *video_times.txt*

*Headings in file:*

| Name | start [s] | end [s] |
| --- | --- | --- |
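As a small orientation example, the label file can be read with Python's json module. This is a hypothetical sketch, not code from the project: only the file name labels_yps.json, its key/value structure, and the data/raw/labels/ location (see the project structure below) are taken from this README.
```python
import json

# Hypothetical helper: read labels_yps.json and build an id -> class-name lookup.
# File name, structure, and location are taken from this README; the helper itself is not project code.
def load_label_map(path="data/raw/labels/labels_yps.json"):
    with open(path, "r") as f:
        name_to_id = json.load(f)              # e.g. {"bg": 0, "Obj1": 1, "Obj2": 2, ...}
    id_to_name = {v: k for k, v in name_to_id.items()}
    return name_to_id, id_to_name

name_to_id, id_to_name = load_label_map()
print(id_to_name)                              # e.g. {0: 'bg', 1: 'Obj1', 2: 'Obj2', ...}
```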
## Definitions in definitions.py
Open definitions.py in your favorite source code editor (e.g. [Atom](https://atom.io/) or [PyCharm](https://www.jetbrains.com/de-de/pycharm/)).
### Define the name base for your files (video (.avi), gaze coordinates (.txt), and behaviour ground truth (.csv))
```python
# As an example:
name_base = 'EHC_Y_P'
```
### Define which scripts you want to run
```python
# Choose which operation(s) to run: 0 = operation is not started, 1 = operation is started
operation = {'2DCNN_Inference': 1,
             'extract_features': 1,
             'create_segments': 1,
             '3DCNN_train_class': 1,
             '3DCNN_predict': 1,
             'post-processing': 1
             }
```
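To illustrate how such a 0/1 flag dictionary is typically consumed, here is a hedged sketch of a dispatch loop. The handler functions are placeholders, not the project's actual entry points in main.py.
```python
# Hypothetical dispatch sketch: run every operation whose flag is set to 1.
operation = {'2DCNN_Inference': 1, 'extract_features': 0, 'create_segments': 1}

def run_selected(operation, handlers):
    for name, flag in operation.items():
        if flag == 1:                      # 1 = operation is started, 0 = operation is skipped
            handlers[name]()

# Dummy handlers for illustration; the real project wires its own functions here.
handlers = {name: (lambda n=name: print("running", n)) for name in operation}
run_selected(operation, handlers)
```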
### Define which IDs are in the training and test sets
```python
# The naming rule is name_base{i}, both for video and gaze coordinate files
train_val_nums = [2,3,4,5,6,7,8,9,12,13,14,15,16,17,18,19,20,21,24,25,26,27,28,29,30,31,32,33,34]
test_nums = [1,10,11,22,23]
```
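To make the name_base{i} naming rule concrete, the IDs above can be expanded into file names. The paths follow the data/raw/ layout shown in the project structure below; building the lists like this is an illustration, not the project's actual loading code.
```python
# Illustration of the name_base{i} naming rule; paths follow the data/raw/ layout below.
name_base = 'EHC_Y_P'
train_val_nums = [2, 3, 4, 5]   # shortened for the example
test_nums = [1, 10]

train_videos = [f"data/raw/videos/{name_base}{i}.avi" for i in train_val_nums]
train_gaze   = [f"data/raw/gaze/{name_base}{i}.txt"   for i in train_val_nums]
test_videos  = [f"data/raw/videos/{name_base}{i}.avi" for i in test_nums]

print(train_videos[0])   # data/raw/videos/EHC_Y_P2.avi
```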
### Define whether you want to run training or testing
```python
# choose mode: "train" or "test"
mode = "train"
```
### For the 2D-ConvNet, the definitions are
```python
# choose "original" or "black" background
image_type = 'original'
# The weights for the paper's use case work well; please adapt them to your specific use case
# 'w_pen': 1.0, 'w_phone': 0.96, 'w_pillow': 1.3, 'w_smart': 1.6
class_weights = [1, 0.96, 1.3, 1.6]
```
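If the class weights are handed to a Keras training call, they are usually supplied as an index-to-weight dictionary. Whether the project passes them exactly like this is an assumption; the weight values and their object mapping come from the comment above.
```python
# Assumed usage of the class weights with Keras (not verified against the project code).
# List position mirrors the comment above: pen, phone, pillow, smart.
class_weights = [1, 0.96, 1.3, 1.6]
class_weight_dict = {i: w for i, w in enumerate(class_weights)}   # {0: 1, 1: 0.96, 2: 1.3, 3: 1.6}

# model.fit(x_train, y_train, class_weight=class_weight_dict)     # hypothetical call
```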
### For the 3D-ConvNet, definitions are
#### For inference
```python
path_to_load_classification_network = ROOT_DIR + "\\" + r"models\ThreeDCNN\classification_NN\22i_TCNN_class_acc_0.65652174.h5"
HEC_classes = ['Background', 'Guiding', 'Directing', 'Checking', 'Observing']
```
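A hedged sketch of how the saved .h5 classifier could be loaded and its output mapped onto HEC_classes. The model path and class list come from the definitions above; the preprocessing and exact segment shape are not documented here, so a placeholder probability vector stands in for a real prediction.
```python
import numpy as np
from tensorflow.keras.models import load_model

# Sketch only: load the saved classifier and map a softmax output onto the class names.
# Path (written with forward slashes here) and class list are taken from the definitions above.
HEC_classes = ['Background', 'Guiding', 'Directing', 'Checking', 'Observing']
model = load_model("models/ThreeDCNN/classification_NN/22i_TCNN_class_acc_0.65652174.h5")
model.summary()                                   # inspect the segment shape the network expects

probs = np.array([0.1, 0.6, 0.1, 0.1, 0.1])       # placeholder output, one value per class
print(HEC_classes[int(np.argmax(probs))])         # -> 'Guiding'
```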
#### For training
* During the training, the model is saved every 5 epochs.
* Define the learning rate and the number of epochs for the training.
* The training set is divided into k folds for the training/validation split.
* You can define which fold the training starts with (see the sketch after this code block).
```python
# Parameters for training the 3D CNN
hyperparam = hyperparam  # hyperparameters for tuning the 3D CNN
length = length          # number of hyperparameter sets

# k-fold cross-validation: start and end points define the validation set,
# the remaining samples are collected in the training set
starts = []  # start of the split, for k=5: 0.01, 0.21, 0.41, 0.61, 0.81
ends = []    # end of the split, for k=5: 0.20, 0.40, 0.60, 0.80, 1.00
k = 5
step = 1 / k
a = 0
while a < k:
    start = round((1 / 100 + 10 * step * a / 10), 2)
    end = round((start + step - 1 / 100), 2)
    ends.append(end)
    starts.append(start)
    a += 1
runs = len(starts)

# learning rate for training the model
learning_rate = 1e-3
epochs = 50
# decide which fold is the start fold (0, 1, 2, 3, 4)
start_fold = 0
# choose whether undersampling should be applied (True or False)
undersampling = False
# keep model = 'None' to train from scratch (see "Restart training" below to resume from a saved model)
model = 'None'
```
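To make the start/end fractions concrete, here is a hypothetical sketch of how one (start, end) pair could be mapped onto validation and training indices; the exact rounding and indexing in the project's split code may differ.
```python
# Hypothetical illustration of the k-fold fractions computed above (k=5 gives the pairs
# (0.01, 0.20), (0.21, 0.40), ..., (0.81, 1.00)); the project's own indexing may differ.
def fold_indices(start, end, n_samples):
    lo = int(round((start - 0.01) * n_samples))   # 0.01 -> first sample
    hi = int(round(end * n_samples))              # 0.20 -> roughly n_samples / 5
    val_idx = list(range(lo, hi))
    train_idx = [i for i in range(n_samples) if i not in val_idx]
    return train_idx, val_idx

train_idx, val_idx = fold_indices(0.01, 0.20, n_samples=29)   # 29 = len(train_val_nums) above
print(val_idx)                                    # indices used for validation in this fold
print(len(train_idx), "samples remain for training")
```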
##### Restart training
* You can restart the training from one of the saved models.
```python
# Definition of the path to the model to be retrained
path_to_load_retrain_network = ROOT_DIR + "\\" + r"models\ThreeDCNN\classification_NN\temp_training\BGtrain_05s\3DCNN_color_f1_GP_epoch_40.h5"
# choose the model to re-train
model = path_to_load_retrain_network
# number of the epoch at which to restart the training
if model == 'None':
    start_epoch = 0
else:
    start_epoch = 40
```
### Definitions for post-processing
```python
use_bg = False
```
## Run code
```bash
# Go to the folder of the project, open the command line, and run:
python main.py
```
## Structure of the project
```
definitions.py
LICENSE
main.py
opti.py
README.md
data
|
-- datasets
| |
| -- dadaset_gt
| | |
| | - behaviour_ground_truth_name_base{i}.txt
| |
| -- extracted_images
| | |
| | -- test
| | -- train_val
| |
| -- filled_values_segment
| | |
| | -- test
| | -- train_val
| |
| -- masked_videos
| | |
| | -- blacked_mask_videos
| | | |
| | | - name_base{i}_black.avi
| | |
| | -- labels_mask
| | | |
| | | - name_base{i}.csv
| | |
| | -- original_mask_videos
| | |
| | - name_base{i}_masked.avi
| |
| - test_filled_values_id_label_map.csv
| - test_segment_dataset.csv
| - train_filled_values_id_label_map.csv
| - train_segment_dataset.csv
|
-- raw
|
-- gaze
| |
| - name_base{i}.txt
|
-- ground_truth
| |
| - behaviour_ground_truth_name_base{i}.csv
|
-- labels
| |
| - labels_yps.json
|
-- video_times
| |
| - video_times.txt
|
-- videos
|
- name_base{i}.avi
logs
models
|
-- ThreeDCNN
| |
| -- classification_NN
| |
| --temp_train
|
-- TwoDCNN
|
-mrcnn
|
- mask_rcnn_hands.h5
- mask_rcnn_yps.h5
reports
|
-- figures
| |
| -- acc
| -- loss
| -- ClassifiactionRep
| -- ComparisonPredTrue
| -- ConfusionMat
| -- temp_acc_loss
|
-- predictions
src
|
-- ThreeDCNN
| |
| -- dataset_creation
| | |
| | - __init__.py
| | - create_segments.py
| | - extract_features.py
| | - utils.py
| |
| -- models
| |
| -- classifiaction_network
| | |
| | - __init__.py
| | - classifiaction_model.py
| | - train_model.py
| | - utils.py
| |
| -- data_generator
| | |
| | - __init__.py
| | - ThreeDimCNN_datagenerator.py
| |
| -- post_processing
| | |
| | - __init__.py
| | - post_process.py
| | - utils.py
| |
| -- prediction
| | |
| | - __init__.py
| | - predict.py
| | - utils.py
| |
| - __init__.py
|
-- TwoDCNN
|
-- models
| |
| - 2DCNN_inference.py
| - __init__.py
| - makse_mask_gaze_video.py
| - utils.py
|
-- mrcnn
|
- __init__.py
- config.py
- LICENSE
- model.py
- parallel_model.py
- utils.py
- visualize.py
venv
|
- HEC_CNN_env.yml
```