Align script
Aligning the sequences in a snapshot involves the following steps:
- decompress the snapshot, which is stored in '.provision.json.xz' format
- load a lookup file (key:seq_hash; value:aligned_sequence) holding the alignments of all previously aligned sequences
- split all the sequences that could not be found in the lookup file into N smaller snapshot modules
- run NextAlign against each snapshot module in parallel
- merge together all the aligned sequences (aligned using the lookup table or NextAlign) into a final '.provision.json' file
- update the lookup file with the new aligned sequences
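
The steps above can be sketched in Python roughly as follows. This is a minimal illustration, not the actual script: the hash function (SHA-256), the one-JSON-record-per-line layout of the snapshot, and the helper names `seq_hash`, `read_snapshot`, `split_chunks` and `align_all` are all assumptions. The `align_chunk` argument is a hypothetical callable standing in for whatever runs NextAlign on one snapshot module; it is expected to return the aligned sequences in the same order as its input.

```python
import hashlib
import json
import lzma
from concurrent.futures import ThreadPoolExecutor


def seq_hash(sequence):
    # Hypothetical lookup key; the real script may hash sequences differently.
    return hashlib.sha256(sequence.encode("utf-8")).hexdigest()


def read_snapshot(path):
    # Assumes the xz archive holds one JSON record per line.
    with lzma.open(path, "rt", encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


def split_chunks(items, n):
    # Distribute items round-robin over at most n non-empty chunks.
    chunks = [items[i::n] for i in range(n)]
    return [chunk for chunk in chunks if chunk]


def align_all(sequences, lookup, align_chunk, n_chunks=4):
    # Unique sequences whose hash is not in the lookup need a fresh alignment.
    missing = [s for s in dict.fromkeys(sequences) if seq_hash(s) not in lookup]
    chunks = split_chunks(missing, n_chunks)

    # Align the chunks in parallel; threads suffice if align_chunk
    # mostly waits on an external process.
    with ThreadPoolExecutor(max_workers=n_chunks) as pool:
        results = list(pool.map(align_chunk, chunks))

    # Update the lookup with the newly aligned sequences.
    for chunk, aligned in zip(chunks, results):
        for raw, ali in zip(chunk, aligned):
            lookup[seq_hash(raw)] = ali

    # Merge: every input sequence now resolves through the lookup.
    return [lookup[seq_hash(s)] for s in sequences]
```

After `align_all` returns, `lookup` contains both the old and the new alignments, so it can be written back to disk as the updated lookup file, and the returned list can be merged into the final '.provision.json' output.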