Add preprocessing script for aws genbank (!55)

rmuntean requested to merge aws_s3 into main Jul 16, 2022

We will use the AWS S3 dataset (based on GenBank, provided by Nextstrain) as the source from which the snapshots are created. This PR implements the preprocessing steps needed to turn the AWS S3 data into a snapshot suitable for our tracking changes tool. The snapshot is a JSON file that contains the following fields for each sequence:

Accession ID
Collection Date
Location
Genome Sequence (in the aligned format)
Owner Lab
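
For illustration, here is a minimal sketch of how such a snapshot could be composed. It is not the script from this merge request. It assumes the metadata arrives as a TSV and the aligned genomes as FASTA, joined on the accession ID. The column names (`accession`, `date`, `location`, `owner_lab`), the JSON keys, and the helper names `parse_fasta` and `build_snapshot` are all hypothetical.

```python
import csv
import io
import json


def parse_fasta(text):
    """Parse FASTA text into a dict mapping record ID to sequence."""
    sequences = {}
    current_id = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if current_id is not None:
                sequences[current_id] = "".join(chunks)
            # The record ID is the first whitespace-delimited token.
            current_id = line[1:].split()[0]
            chunks = []
        else:
            chunks.append(line)
    if current_id is not None:
        sequences[current_id] = "".join(chunks)
    return sequences


def build_snapshot(metadata_tsv, aligned_fasta):
    """Join metadata rows with aligned sequences into snapshot records.

    Column names and output keys are assumptions made for this sketch.
    """
    sequences = parse_fasta(aligned_fasta)
    snapshot = []
    for row in csv.DictReader(io.StringIO(metadata_tsv), delimiter="\t"):
        sequence = sequences.get(row["accession"])
        if sequence is None:
            # Skip metadata rows that have no aligned genome.
            continue
        snapshot.append(
            {
                "accession_id": row["accession"],
                "collection_date": row["date"],
                "location": row["location"],
                "sequence": sequence,
                "owner_lab": row["owner_lab"],
            }
        )
    return snapshot


# Made-up example data, not real GenBank records.
EXAMPLE_METADATA = (
    "accession\tdate\tlocation\towner_lab\n"
    "ACC0001\t2022-01-15\tExampleland\tExample Lab\n"
    "ACC0002\t2022-02-01\tElsewhere\tOther Lab\n"
)
EXAMPLE_FASTA = ">ACC0001\nACGT--AC\nGTNN\n"

if __name__ == "__main__":
    print(json.dumps(build_snapshot(EXAMPLE_METADATA, EXAMPLE_FASTA), indent=2))
```

In this sketch, sequences without a matching metadata row and metadata rows without an aligned genome are both left out of the snapshot. Whether the actual script drops or keeps such records is not stated here.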
