Current progress in Mark2Cure
Thanks to our wonderful community of citizen scientists and volunteers, the NFE2L1 entity recognition mission is now 99% complete! If you haven’t already completed quests in this mission, please help us finish it! The other entity recognition missions are also over two-thirds complete and are in need of your reading skills. If entity recognition is too easy, please try out our relationship extraction module, which can be quite challenging and may require more critical reading skills.
New project in development needs Mark2Curators!
Greg Stupp, a research scientist in the Su Lab here at TSRI, is working on structuring clinical/drug indication data, which currently exists only as free-form, unstructured text. This data has important implications for bioinformatics research aimed at drug repurposing, but we can’t build this database without your help. Greg will be building a new MediaWiki-based platform for crowdsourcing the annotation of clinical/drug indication text. The platform has the added advantage of structuring the information so that it can be widely (and openly) disseminated by importing it into Wikidata.
What needs to be done?
Greg is currently working on the interface and will need a few volunteers to provide feedback on the beta version as soon as it is available. After that, we will need our community of Mark2Curators to put what they’ve learned into practice. Annotating clinical/drug indication text will need Mark2Curators with BOTH the entity recognition AND relationship extraction skills. The rules for entity recognition are expected to be very similar for this task; however, we expect that more detailed relationships will be available, so additional relationship extraction training may be necessary.
The first data set is expected to be generated from 1,100 paragraphs, each of which will be very short but densely packed with information.
An example text may look like this:
MEKINIST® is indicated, as a single agent or in combination with dabrafenib, for the treatment of patients with unresectable or metastatic melanoma with BRAF V600E or V600K mutations, as detected by an FDA-approved test [see Clinical Studies (14.1)].
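To give a rough idea of what "structured" could mean for a sentence like the one above, here is a small Python sketch. The field names (`drug`, `relation`, `conditions`, and so on) are made up for illustration; they are not the schema that Greg's platform or Wikidata will actually use, and the values are simply read off the example text.

```python
from dataclasses import dataclass, field, asdict
import json


@dataclass
class Indication:
    """Hypothetical record for one annotated indication statement."""
    drug: str
    relation: str
    conditions: list
    combination_drugs: list = field(default_factory=list)
    mutations: list = field(default_factory=list)
    qualifiers: list = field(default_factory=list)


# Values below are taken directly from the example sentence.
mekinist = Indication(
    drug="MEKINIST",
    relation="indicated for treatment of",
    conditions=["unresectable melanoma", "metastatic melanoma"],
    combination_drugs=["dabrafenib"],
    mutations=["BRAF V600E", "BRAF V600K"],
    qualifiers=["mutation detected by an FDA-approved test"],
)

# Convert to a plain dictionary so it can be serialized or queried.
record = asdict(mekinist)
print(json.dumps(record, indent=2))
```

Once the entities and their relationships are captured this way, a question such as "which drugs are indicated for melanoma with a BRAF V600E mutation?" becomes a simple lookup rather than a reading exercise.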
How will this work?
The interface is still under construction, but it will primarily use queries and drop-down selection menus so that you can select the best representation of the concepts in the text and how they are related. As a Mark2Curator, you’ve already been trained to recognize most of the entities involved in this task; however, the interface may require a little more detail in your selection. We’ll delve into the interface a little more as it gets fleshed out (it’s still very preliminary at this point).
How can you help?
If you are interested in providing feedback for improving the interface, please contact me. I will be compiling a list of beta testers to help improve the user experience prior to the launch of the task. Given that the project is being built in the MediaWiki framework, there may be significant limitations on the features we can modify; however, your feedback will be crucial in ensuring that we provide users with the information they need to successfully complete the task in spite of any constraints imposed by the interface.