# wip_kaggle

The 'most_similar_papers_via_tfidf.py' script is the heart of this WIP. All other
files are only here to show what I have been exploring in the dataset. Sadly,
'most_similar_papers_via_tfidf.py' has a problem parsing some of the papers,
so it is advised to only try the code with the data from biorxiv_medrxiv.
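
For readers unfamiliar with the idea, here is a minimal sketch of finding similar texts via TF-IDF and cosine similarity. It is not the code from 'most_similar_papers_via_tfidf.py'; it uses only the Python standard library, and all function names (`tokenize`, `tfidf_vectors`, `cosine`, `most_similar`) as well as the toy texts are assumptions made up for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document, L2-normalised."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        # Smoothed IDF, so terms present in every document keep a small weight.
        vec = {
            term: count * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in counts.items()
        }
        norm = math.sqrt(sum(w * w for w in vec.values()))
        vectors.append({t: w / norm for t, w in vec.items()} if norm else {})
    return vectors


def cosine(a, b):
    """Cosine similarity of two L2-normalised sparse vectors."""
    if len(a) > len(b):
        a, b = b, a  # iterate over the shorter vector
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def most_similar(docs, query_index, top_n=3):
    """Return (index, similarity) pairs for the texts closest to docs[query_index]."""
    vectors = tfidf_vectors(docs)
    query = vectors[query_index]
    scores = [
        (i, cosine(query, v)) for i, v in enumerate(vectors) if i != query_index
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:top_n]


if __name__ == "__main__":
    # Made-up toy texts, not taken from the dataset.
    papers = [
        "coronavirus transmission in hospital settings",
        "hospital transmission of coronavirus among staff",
        "deep learning for protein structure prediction",
    ]
    print(most_similar(papers, query_index=0, top_n=2))
```

In this toy example the second text should rank first for the first text, since the two share several terms, while the third shares none. For a real corpus, a library vectorizer would likely be faster than this pure-Python version.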