# Hand-eye-coordination (HEC or EHC)

This README provides the information needed to use the framework for recognizing hand-eye coordination (HEC) patterns in eye-tracking videos.
Note: The code has been developed and tested only on Windows 10 Education.

## License
The code and the models in this repo are released under the [MIT License](https://gitlab.ethz.ch/pdz/eye-hand-coordination/-/blob/master/LICENSE).

## Installation

    # First step:   Download and install Anaconda
    
    # Second step:  Create the environment from the .yml file
    conda env create -f HEC_CNN_env.yml
        
    # Third step:   Activate the conda environment
    conda activate HEC_CNN_env
        
    # FYI - Export (save) the current environment with
    conda env export > HEC_CNN_env.yml

## Citation
If you use our code in your research or wish to refer to the baseline results, please use the following BibTeX entry.   
**TODO: Check this citation before release.**

    @article{NonLocal2020,
        author =   {Stephan Wegner and Felix Wang and Sophokles Ktistakis and Julian Wolf and Quentin Lohmeyer and Mirko Meboldt},
        title =    {FILL IN TITLE HERE},
        journal =  {PLOS ONE},
        year =     {2020}
    }


## Structure of the data
**Video files:**    
.avi files with name *name_base{i}.avi*

**Gaze coordinate files:**  
.txt files with name *name_base{i}.txt*  
*Headings in file:*  
| RecordingTime [ms] | Point of Regard Binocular X [px] | Point of Regard Binocular Y [px] | Video Time [h:m:s:ms] |

**Labels for Mask-RCNN:**   
.json file with name *labels_yps.json*  
*Structure in file:*    
{"bg": 0, "Obj1": 1, "Ojb2": 2, "Obj3":3, "Obj4": 4}    
*(You can have as many object as you wish, according to your trained model)*

**Ground truth files for 3D-ConvNet:**     
.csv files with name *behaviour_ground_truth_name_base{i}.csv*    
*Headings in file (exported from SMI BeGaze 3.6):*     
| Frame time | Frame number | Behaviour |

**Video times to cut the original videos:**   
.txt file with name *video_times.txt*   
*Headings in file:*     
| Name | start [s] | end [s] |
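
As a hedged illustration (not part of the framework), these input files could be loaded in Python roughly as follows; the tab separator for the gaze and video-time exports is an assumption and may need to be adapted to your export settings:

    # Illustrative sketch only; file names follow the conventions above,
    # separators and column handling are assumptions.
    import json
    import pandas as pd

    name_base = 'EHC_Y_P'
    i = 1

    # gaze coordinates of one recording (separator assumed to be tab)
    gaze = pd.read_csv(f"{name_base}{i}.txt", sep="\t")

    # label map for the Mask-RCNN objects
    with open("labels_yps.json") as f:
        labels = json.load(f)   # e.g. {"bg": 0, "Obj1": 1, ...}

    # behaviour ground truth for the 3D-ConvNet
    gt = pd.read_csv(f"behaviour_ground_truth_{name_base}{i}.csv")

    # start/end times used to cut the original videos (separator assumed to be tab)
    video_times = pd.read_csv("video_times.txt", sep="\t")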
   
    
## Definitions in definitions.py
Open definitions.py in your favorite source code editor (e.g. [Atom](https://atom.io/) or [PyCharm](https://www.jetbrains.com/de-de/pycharm/)).

### Define the name base for your files (video (.avi), gaze coordinates (.txt), and behaviour ground truth (.csv))
    # As an example:
    name_base = 'EHC_Y_P'

### Define which scripts you want to run

    # Please choose which operation(s) to run: 0 = operation is skipped, 1 = operation is started
    operation = {'2DCNN_Inference':     1,
                 'extract_features':    1,
                 'create_segments':     1,
                 '3DCNN_train_class':   1,
                 '3DCNN_predict':       1,
                 'post-processing':     1
                }
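
How main.py consumes these flags is defined by the framework itself; purely as a hypothetical illustration, such a dictionary could be evaluated like this:

    # Hypothetical sketch only; the actual dispatch logic lives in main.py
    for step, enabled in operation.items():
        if enabled:
            print(f"running pipeline step: {step}")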
    

### Define which IDs are in the training and in the test set

    # the rule for naming is name_base{i}, both for video and gaze coordinate files
    train_val_nums = [2,3,4,5,6,7,8,9,12,13,14,15,16,17,18,19,20,21,24,25,26,27,28,29,30,31,32,33,34]
    test_nums = [1,10,11,22,23]
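
As an illustration of the naming rule (direct concatenation of name_base and the ID is assumed), the IDs above map to file names like this:

    # Illustration only; uses name_base, train_val_nums and test_nums from above
    train_val_videos = [f"{name_base}{i}.avi" for i in train_val_nums]   # e.g. 'EHC_Y_P2.avi'
    test_gaze_files  = [f"{name_base}{i}.txt" for i in test_nums]        # e.g. 'EHC_Y_P1.txt'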

### Define whether you want to run training or testing
    # choose mode: "train" or "test"
    mode = "train"
    
### Definitions for the 2D-ConvNet
    # choose: "original" or "black" background
    image_type='original'

    # These class weights worked well for the paper's use case; please adapt them to your specific use case
    # 'w_pen': 1.0, 'w_phone': 0.96, 'w_pillow': 1.3, 'w_smart': 1.6
    class_weights = [1, 0.96, 1.3, 1.6]
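
Purely as a hedged sketch (assuming a Keras-style training call, which this README does not show), such a list could be turned into the class_weight mapping expected by model.fit:

    # Hypothetical illustration only: list above -> Keras-style class_weight dict
    class_weights = [1, 0.96, 1.3, 1.6]   # order: pen, phone, pillow, smart (see comment above)
    class_weight = {idx: w for idx, w in enumerate(class_weights)}
    # model.fit(..., class_weight=class_weight)   # hypothetical call, not the repo's exact code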

### Definitions for the 3D-ConvNet
    
#### For inference
    path_to_load_classification_network = ROOT_DIR + "\\" + r"models\ThreeDCNN\classification_NN\22i_TCNN_class_acc_0.65652174.h5"    
    HEC_classes = ['Background', 'Guiding', 'Directing', 'Checking', 'Observing']
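
As a hypothetical illustration (the exact output shape of the network is an assumption), a prediction can be mapped back to these class names via argmax:

    # Illustration only: mapping a 5-class softmax output to an HEC class name
    import numpy as np
    probs = np.array([0.05, 0.70, 0.10, 0.10, 0.05])   # example network output
    predicted = HEC_classes[int(np.argmax(probs))]     # -> 'Guiding'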
    
#### For training   
* During the training, the model is saved every 5 epochs.
* Define learning rate and number of epochs for the training
* The training set is divided into k folds for the training/validation split.
* You can define with which fold the training should start.
 

    # Parameters for training the 3D CNN
    hyperparam = hyperparam   # hyperparameters for tuning the 3D CNN
    length = length           # number of hyperparameter sets
    
    # k-fold cross validation: start and end points define the validation set,
    # the remaining samples are collected in the training set
    starts = []   # start of each split, for k=5: 0.01, 0.21, 0.41, 0.61, 0.81
    ends = []     # end of each split,   for k=5: 0.20, 0.40, 0.60, 0.80, 1.00
    
    k = 5
    step = 1 / k
    
    for a in range(k):
        start = round(0.01 + step * a, 2)
        end = round(start + step - 0.01, 2)
        starts.append(start)
        ends.append(end)
    runs = len(starts)
    
    # learning rate for training the model
    learning_rate= 1e-3
    epochs = 50
    
    # choose which fold to start with (0, 1, 2, 3, 4)
    start_fold = 0
    
    # choose whether undersampling should be applied (True or False)
    undersampling = False
    
    # set model = 'None' to train from scratch (see "Restart training" below)
    model = 'None'
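
Purely as a hedged illustration of what these fractions mean (the actual split logic is implemented inside the framework and may differ), the start/end values of one fold could be applied to the sample IDs like this:

    # Hypothetical illustration only: slicing the training IDs into a validation
    # and a training subset using the start/end fractions of one fold
    samples = train_val_nums                  # participant IDs defined above
    n = len(samples)
    lo = int(round(starts[start_fold] * n))
    hi = int(round(ends[start_fold] * n))
    val_samples = samples[lo:hi]
    train_samples = samples[:lo] + samples[hi:]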

##### Restart training    
* You can restart the training from one of the saved models
 
 
    # Path to the model that should be retrained
    path_to_load_retrain_network = ROOT_DIR + "\\" + r"models\ThreeDCNN\classification_NN\temp_training\BGtrain_05s\3DCNN_color_f1_GP_epoch_40.h5"
 
    
    # choose model to re-train
    model = path_to_load_retrain_network
           
    # epoch number from which to restart the training
    if model == 'None':
        start_epoch = 0
    else:
        start_epoch = 40    
    
    
### Definitions for post-processing
    
    use_bg = False
    
## Run code
    
    # Go to the project folder, open a command line, and run
    python main.py 

## Structure of the project

    definitions.py
    LICENSE
    main.py
    opti.py
    README.md
    
    data
    |
    -- datasets
    |   |
    |   -- dadaset_gt
    |   |   |
    |   |    - behaviour_ground_truth_name_base{i}.txt
    |   |
    |   -- extracted_images
    |   |   |
    |   |   -- test
    |   |   -- train_val
    |   |
    |   -- filled_values_segment
    |   |   |
    |   |   -- test
    |   |   -- train_val
    |   |
    |   -- masked_videos
    |   |   |
    |   |   -- blacked_mask_videos
    |   |   |   |
    |   |   |   - name_base{i}_black.avi
    |   |   |
    |   |   -- labels_mask
    |   |   |   |
    |   |   |   - name_base{i}.csv
    |   |   |
    |   |   -- original_mask_videos
    |   |       |
    |   |       - name_base{i}_masked.avi
    |   |
    |   - test_filled_values_id_label_map.csv
    |   - test_segment_dataset.csv
    |   - train_filled_values_id_label_map.csv
    |   - train_segment_dataset.csv
    |
    -- raw
        |
        -- gaze
        |   |
        |   - name_base{i}.txt
        | 
        -- ground_truth
        |   |
        |   - behaviour_ground_truth_name_base{i}.csv
        |
        -- labels
        |   |
        |   - labels_yps.json
        |
        -- video_times
        |   |
        |   - video_times.txt    
        |
        -- videos
            |
            - name_base{i}.avi
    
    logs
    
    models
    |
    -- ThreeDCNN
    |   |
    |   -- classification_NN
    |       |
    |       --temp_train
    |
    -- TwoDCNN
        |
        -- mrcnn
            |
            - mask_rcnn_hands.h5
            - mask_rcnn_yps.h5
    
    reports
    |
    -- figures
    |   |
    |   -- acc
    |   -- loss
    |   -- ClassifiactionRep
    |   -- ComparisonPredTrue
    |   -- ConfusionMat
    |   -- temp_acc_loss
    |
    -- predictions
    
    src
    |
    -- ThreeDCNN
    |   |
    |   -- dataset_creation
    |   |   |
    |   |   - __init__.py
    |   |   - create_segments.py
    |   |   - extract_features.py
    |   |   - utils.py
    |   |
    |   -- models
    |       |
    |       -- classifiaction_network
    |       |   |
    |       |   - __init__.py
    |       |   - classifiaction_model.py
    |       |   - train_model.py
    |       |   - utils.py
    |       |
    |       -- data_generator
    |       |   |
    |       |   - __init__.py
    |       |   - ThreeDimCNN_datagenerator.py
    |       |
    |       -- post_processing
    |       |   |
    |       |   - __init__.py
    |       |   - post_process.py
    |       |   - utils.py
    |       |
    |       -- prediction
    |       |   |
    |       |   - __init__.py
    |       |   - predict.py
    |       |   - utils.py
    |       |
    |       - __init__.py
    |
    -- TwoDCNN
        |
        -- models
        |   |
        |   - 2DCNN_inference.py
        |   - __init__.py
        |   - makse_mask_gaze_video.py
        |   - utils.py
        |
        -- mrcnn
            |
            - __init__.py
            - config.py
            - LICENSE
            - model.py
            - parallel_model.py
            - utils.py
            - visualize.py
    
    venv
    |
    - HEC_CNN_env.yml
    
    
    
    
