Chris Sarnowski · 1db4989e
--- a/How-to-run.md
+++ b/How-to-run.md
+## xTract How to run
+Note: All executable programs have a help section. Use -help to view usage instructions.
+
+### 1. Open a terminal or a command line and create a directory for the analysis
+
+```
+>mkdir xtract_analysis
+>cd xtract_analysis
+```
+
+### 2. Export your identifications (IDs) in the xtract format using the xQuest/xProphet viewer (v. 2.2.3).
+
+Note that you need to install the latest version of xQuest (V2_1_3) to have the export option available.
+To export the IDs from the viewer, set "Select type of report" to "xtract csv", then select your FDR cutoff and the type of cross-link (use only target cross-links). Important: Uncheck the box "Filter by unique ids (top hit)".
+Copy the exported file (e.g. IDfile.csv) to the xtract_analysis folder.
+
+### 3. Convert your raw mass spectrometer data to profile-mode mzXML files.
+
+To convert the RAW mass spectrometer files to the mzXML format, we recommend msconvert, which is available from [http://proteowizard.sourceforge.net](http://proteowizard.sourceforge.net). This step needs to be performed on a Windows system. msconvert provides a graphical user interface for converting raw MS data to different formats. When converting the raw files, select mzXML as the output format, use 32-bit binary encoding precision, and uncheck ‘Use zlib compression’. The spectral data must be encoded with 32-bit binary precision.
+
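Because 32-bit, uncompressed encoding is mandatory, it can be worth checking the converted files before continuing. The following is a minimal sketch and not part of xTract: `check_mzxml_encoding` is a hypothetical helper, and it assumes that the `<peaks>` elements written by msconvert carry `precision` and `compressionType` attributes as in the mzXML schema. It inspects only the encoding attributes, not whether the spectra are profile or centroided.

```shell
# Hypothetical helper (not an xTract program): returns 0 if the given mzXML
# file looks like 32-bit, uncompressed data, and 1 otherwise.
check_mzxml_encoding() {
    file="$1"
    # Any 64-bit peak list violates the 32-bit requirement.
    if grep -q 'precision="64"' "$file"; then
        echo "$file: found 64-bit peak encoding" >&2
        return 1
    fi
    # zlib-compressed peak lists must be avoided as well.
    if grep -q 'compressionType="zlib"' "$file"; then
        echo "$file: found zlib-compressed peaks" >&2
        return 1
    fi
    # Finally, require at least one explicit 32-bit peak list.
    if ! grep -q 'precision="32"' "$file"; then
        echo "$file: no 32-bit peak encoding found" >&2
        return 1
    fi
    return 0
}
```

A call such as `check_mzxml_encoding /path/to/profile/mzXMLs/run.mzXML` (the path is a placeholder) prints a message and returns a non-zero status if the file should be converted again.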
+## xTract pipeline
+### 4. Add the precursor intensities to your IDs file.
+
+`>xtract_add_precursor_intensity.pl -input IDfile.csv -mzxmlpath /path/to/profile/mzXMLs/`
+
+Note: Adapt the path to your mzXML files. This program generates the output file IDfile.precursor_intensity.csv.
+
+### 5. Preparations for xTract
+#### 5.a Get the standard definition file.
+
+`>xtract_prep.pl`
+
+This copies the standard definition file to your current working directory. Edit the file if needed.
+
+#### 5.b Generate a list file (e.g. name it "files") with the files you want to extract.
+
+```
+	wathomas_C1305_191.mzXML
+
+	wathomas_C1305_192.mzXML
+
+	...
+```
+
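If all mzXML files of the experiment sit in one directory, the list file can be generated rather than typed. This is a minimal sketch: `make_list_file` is a hypothetical helper, not an xTract program, and it assumes that plain file names, one per line, are an acceptable layout for the list file.

```shell
# Hypothetical helper (not an xTract program): write the base names of all
# .mzXML files found in a directory to a list file, one name per line.
make_list_file() {
    mzxml_dir="$1"
    list_file="$2"
    for f in "$mzxml_dir"/*.mzXML; do
        # Skip the unexpanded pattern when the directory holds no mzXML files.
        [ -e "$f" ] || continue
        basename "$f"
    done > "$list_file"
}
```

For example, `make_list_file /path/to/mzXMLs/ files` would write the list to a file named "files" in the current directory. Remove any runs you do not want to extract before continuing.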
+#### 5.c Run xtract_prep.pl to generate the folder structure, decoy IDs and parameter files for each extraction.
+
+`>xtract_prep.pl -mzxmlpath /path/to/mzXMLs/ -decoy -listdef files -idfile IDfile.precursor_intensity.csv -nq`
+
+Note: This program carries out several tasks.
+
+```
+	1. Read definition file.
+	2. Creation of ID and IPC folders.
+	3. Generation of decoy and IPC commands.
+	4. Edit and copy the definition file to the folders.
+	5. Generation and execution of decoy commands.
+	6. Generation and execution of IPC commands.
+	7. Generation of command files for IPC.
+	8. Create xTract commands for batch submission.
+```
+	
+### 6. Run xTract.
+
+`>xtract_run.pl -listdef files`
+
+Note: This executes xtract.pl sequentially for each mzXML file.
+
+If a queuing system like qsub is available on your server you can use the flags -submit and -qcommand to submit the jobs to the queue.
+
+To check the current progress you can use the program xtract_progress_c.pl.
+
+### 7. Merge the results, generate the peak group statistics by mProphet and generate the retention time normalization file.
+Note: Prepare a file called "consensusruns" that lists MS runs (the base names, i.e. without the .mzXML extension) that represent the sample well, for example one run from each fraction. They are used for the retention time (TR) regression to project the features.
+
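One way to build the consensusruns file is to pick the representative runs from the list file and strip the extension. This is a minimal sketch: `to_run_names` is a hypothetical helper, not an xTract program, and it assumes that xtract_normalizer.pl accepts plain run names, one per line.

```shell
# Hypothetical helper (not an xTract program): read file names on stdin and
# print the run base names, i.e. without surrounding whitespace, blank lines
# or the .mzXML extension.
to_run_names() {
    sed -e 's/^[[:space:]]*//' \
        -e 's/[[:space:]]*$//' \
        -e '/^$/d' \
        -e 's/\.mzXML$//'
}
```

For example, `grep '_191' files | to_run_names > consensusruns` would select the runs whose names contain `_191`; the pattern is only an illustration and has to be replaced by one that matches your representative runs.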
+```
+>xtract_merger.pl -pjx TR_Normalization -list files -runstats -exp_ms2basedIDs_only
+>cp consensusruns ./TR_Normalization/consensusruns
+>cd TR_Normalization
+>xtract_normalizer.pl -pjx TR_Normalization -consensusruns consensusruns
+>cp TR_Normalization.normalizer.csv ../
+```
+
+Note: These commands first merge a subset of the xTract results (those IDs with MS2 evidence in the extracted run) and run the mProphet statistics.
+
+In the next step these IDs are used to generate a retention time normalization file.
+
+This file is required to optimize the extraction procedure, which is carried out a second time in the next step.
+
+### 8. Prepare folders again with the TR normalization file and re-run xTract.
+
+```
+>xtract_prep.pl -mzxmlpath /path/to/mzXMLs/ -decoy -norm_csv TR_Normalization.normalizer.csv -noforce -noexec -listdef files -idfile IDfile.precursor_intensity.csv
+>xtract_run.pl -list files
+```
+
+### 9. Merge the results and run the mProphet statistics.
+
+`>xtract_merger.pl -list files -pjx EXP_NAME -runstats`
+
+Note: This generates the result folder EXP_NAME.
+
+### 10. Run xTract-analyzer
+Description: xTract-analyzer is the last step in the pipeline. It summarizes the results and calculates the fold changes and p-values for the individual identifications.
+
+#### 10.1 Prepare an experiment definition file.
+A sample definition file can be downloaded here.
+
+[EXP_def_T.xls](uploads/896765668c57335fce5cb67a981e4447/EXP_def_T.xls)
+
+The file is tab-separated, describes the experiments, and can be edited with a spreadsheet program.
+Adapt the definition file to your MS runs and convert it to a .def file using:
+
+`>xtract_write_exp_def.pl -input EXP_def_T.xls`
+
+Note: The program will ask you to define the reference experiment and the workflow that should be used. The program generates the file EXP_def_T.def which also contains the analysis parameters at the end. Adapt these parameters if necessary.
+
+#### 10.2 Run xtract_analyzer.pl
+
+Copy the EXP_def_T.def file to the EXP_NAME folder and change directory to this folder:
+
+```
+>cp EXP_def_T.def ./EXP_NAME
+>cd ./EXP_NAME
+>xtract_analyzer.pl -pjx EXP_NAME -def EXP_def_T.def
+```
+
+The quantification results are written to the file EXP_NAME.analyzer.quant.xls.
+
+The column headers are described here: [Column headers description](uploads/334c724669a5d89cf45ef70cefcea5c4/description_columns_quant_file.txt)
\ No newline at end of file