My initial thought on where folks in this community might be able to help is in the realm of NLP. The tasks all involve sifting through a large number of articles to locate information that can be used to answer a list of questions. This looks like a good candidate for word SDRs created through semantic folding.
Word SDRs can be stacked and sparsified (Cortical.io has some videos on YouTube which explain how to do this), allowing one to create semantically relevant SDRs for sentences, paragraphs, papers, and/or collections of papers. The bits of one SDR can also be subtracted from another, which makes it possible to identify the main islands of semantic meaning.
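To make the stack/sparsify/subtract idea concrete, here is a minimal sketch using Python sets of active bit indices. The word SDRs are random toy fingerprints (real semantic-folding SDRs would come from an encoder such as Cortical.io's retina), and the sizes and helper names are my own illustrative choices, not an established API:

```python
import numpy as np

N = 2048         # SDR width (toy value; real retinas are larger)
SPARSITY = 40    # number of active bits to keep after sparsifying

rng = np.random.default_rng(0)

def random_word_sdr():
    """Stand-in for a semantic-folding word SDR: a random sparse bit set."""
    return set(rng.choice(N, size=SPARSITY, replace=False).tolist())

def stack(sdrs):
    """Overlay several SDRs, counting how often each bit is active."""
    counts = {}
    for sdr in sdrs:
        for bit in sdr:
            counts[bit] = counts.get(bit, 0) + 1
    return counts

def sparsify(counts, k=SPARSITY):
    """Keep only the k most frequently active bits (the shared semantic core)."""
    ranked = sorted(counts, key=counts.get, reverse=True)
    return set(ranked[:k])

def subtract(sdr, other):
    """Remove another SDR's bits, exposing the remaining islands of meaning."""
    return sdr - other

words = [random_word_sdr() for _ in range(10)]
sentence_sdr = sparsify(stack(words))       # sentence-level SDR from word SDRs
remainder = subtract(sentence_sdr, words[0])
```

The same stack-then-sparsify step applies at every level: word SDRs roll up into sentence SDRs, sentence SDRs into paragraph or document SDRs, and so on.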
Some places where this could be useful for CORD-19 are:
- A question can be broken down by a person into a set of keywords, which can then be encoded into an SDR and used to search for potentially relevant documents.
- If one article is found which is of particular interest, one could easily search for and return the other documents in the data set that are most similar to it in semantic meaning.
- A set of similar documents can be encoded into an SDR; one can then find the word whose SDR is most similar, subtract that word's SDR, and repeat, identifying the islands of semantics that are common to the group of documents. This could be used to quickly surface correlations between documents that might otherwise require a lot of deeper reading and study to identify, and the resulting terms could then be fed back into the search keywords.
If anyone is interested in working with me on this strategy to try to answer some of the questions, let me know. I will put some code up on GitHub if we find that it is actually useful for these sorts of tasks.