The Zebrafish Model Organism Database (ZFIN) has served the scientific community for over twenty years. Like the Rat Genome Database, which was previously featured in our Spotlight series, ZFIN manages an InterMine-based resource, called ZebrafishMine. The ZFIN team is based at the University of Oregon, and their data curation manager, Doug Howe, has kindly answered our questions about this important resource.
- In one tweet or less, introduce us to ZFIN:
ZFIN is the community resource for expertly curated genetic and genomic data involving the zebrafish (Danio rerio) as a model organism.
- Who is your target audience? Has your audience changed over time, and how has ZFIN evolved to meet the needs of your audience?
In the beginning, ZFIN was lovingly tagged as “zphone,” because of its role as a community contact for zebrafish researchers, students and publishers. Curators took an early role in establishing nomenclature guidelines and extracting gene models, loading mapping data and adding mutation details from papers even as early as 1996. In the early 2000s, ZFIN began cross linking with several other genomic data sets including GenBank, Vega, UniProt, Ensembl and many, many more resources.
In 2003, ZFIN began representing additional biological details including orthology to human, mouse and fly genes, more complicated genotypes including double and triple mutants, transgenics, morpholinos (and in 2013, CRISPRs and TALENs), gene ontology annotations, expression, phenotype and experimental conditions details. During the next decade, ZFIN curators also developed (and continue to develop) the Zebrafish Anatomical Ontology and several other biological term hierarchies that help define the language with which zebrafish data can be annotated. This ontological development and subsequent curation has allowed many other research scientists to develop more complex data retrieval pipelines, some of which have even begun identifying human diseases with aggregated, model organism phenotypes curated from papers and data loads using ontologies. It has also contributed to the unification of data from disparate sources in ZFIN.
Most recently, ZFIN has started incorporating human disease annotations both directly from papers and via zebrafish models of disease. Curators are excitedly discussing cross linking fish and their phenotypes with annotations to human diseases in order to help clinicians understand disease phenotypes in humans.
Throughout ZFIN’s history, researchers, students, data aggregators, bioinformaticians and other online genetic resources and databases have been central to the development of this resource. ZFIN was founded as a partnership of biological scientists and user interface developers. Maintaining this cross-disciplinary relationship, with user-centered design at its core, has helped ZFIN remain connected to the research community even in its most recent developments. With the advent of the Alliance of Genome Resources, ZFIN hopes to be even more useful to the basic science research communities of zebrafish and other model organisms, and increasingly to communities including clinicians.
- It looks like ZFIN originates from a 1994 Cold Spring Harbor meeting on zebrafish. Can you elaborate on this story? When did ZFIN begin collaborating with ZIRC and at what point in ZFIN’s history was InterMine integrated for the creation of ZebrafishMine?
ZFIN was started by Monte Westerfield after the 1994 Cold Spring Harbor meeting (the first open international zebrafish meeting). Initially it was funded by the NSF. In 1998 Monte received an NIH grant for ZIRC that included a database aim that supported ZFIN. This started the collaboration between the two resources. Independent funding for ZFIN was first secured in 2002.
ZebrafishMine was a collaboration between staff at ZFIN and the InterMine project starting in 2009. By 2009, ZFIN was nearly 15 years old. We created our InterMine instance to provide functionality for advanced ZFIN users and in response to wish lists we’d heard from researchers in our community. ZebrafishMine was up and running in 2013.
In ZebrafishMine, a user can search using lists, such as a gene list, and create personalized queries and templates to run at their convenience. This kind of personalization and querying of data at ZFIN can be essential to research that requires large data sets. ZFIN provides generic search interfaces that we believe are helpful to the majority of users. Alternatively, ZebrafishMine provides advanced searching and downloading of (as well as programmatic access to) this same data for users who have more specific use cases in mind, or who want to deal with data in bulk. In addition, the InterMine system promotes easy access to data for many model organism species, including fly, rat, mouse and yeast. Zebrafish users can see orthology in mouse and human, directly linked in ZebrafishMine.
Finally, the user interface of ZebrafishMine is the same as the user interface for many other organism mines. Once a user learns to use ZebrafishMine, it is quite easy to use all the other mines as well. This reduces barriers to inter-species data retrieval and helps satisfy one of ZFIN’s main goals: to aid researchers in determining genetic and environmental causes of human disease.
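To make the list-based searching and programmatic access described above a little more concrete, here is a minimal sketch of how a gene-list query might be assembled in Python. It is an illustration under stated assumptions, not ZFIN's or InterMine's own code: the base URL is a placeholder, the function names are hypothetical, the gene symbols are only examples, and the `Gene.symbol` and `Gene.primaryIdentifier` paths, the XML layout and the `/query/results` endpoint follow our understanding of InterMine's conventions and should be checked against the mine's own documentation. The sketch only builds the query and the request URL; it does not contact any server.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Placeholder, NOT the real address: substitute the service root of the mine you use.
BASE_URL = "https://mine.example.org/service"


def build_gene_list_query(symbols, views=("Gene.primaryIdentifier", "Gene.symbol")):
    """Return a PathQuery-style XML string selecting genes whose symbol is in `symbols`.

    Assumption: the mine exposes a `Gene` class with `symbol` and
    `primaryIdentifier` fields and accepts a "ONE OF" list constraint.
    """
    query = ET.Element("query", {"model": "genomic", "view": " ".join(views)})
    constraint = ET.SubElement(
        query, "constraint", {"path": "Gene.symbol", "op": "ONE OF"}
    )
    # One <value> child per gene symbol in the list.
    for symbol in symbols:
        ET.SubElement(constraint, "value").text = symbol
    return ET.tostring(query, encoding="unicode")


def build_results_url(query_xml, base_url=BASE_URL, fmt="json"):
    """Return the URL that would request results for the query (no request is sent)."""
    return f"{base_url}/query/results?" + urlencode({"query": query_xml, "format": fmt})


if __name__ == "__main__":
    # Example symbols only; any list of gene symbols could be used here.
    xml = build_gene_list_query(["pax6a", "shha"])
    print(xml)
    print(build_results_url(xml))
```

Because the other mines share the same interface, pointing the same sketch at a different mine should mostly be a matter of changing the base URL, provided that mine's data model uses the same class and field names.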
- Why is ZFIN unique and special? How does it differ from ZIRC?
ZFIN is the only freely available resource focused on integrating zebrafish research data in a searchable, downloadable, and easily accessible framework that allows the community to work together to turn data into knowledge. ZFIN is a curated resource: we work with researchers to quality check data, we accept direct data submissions, and we cross link data with other databases. ZFIN also provides the official nomenclature support for zebrafish genes and mutants, and hosts a community wiki for exchange and discussion of protocols and antibodies. We maintain the definitive reference data sets of zebrafish research information and link this information extensively to corresponding data in other model organism and human databases.
The Zebrafish International Resource Center (ZIRC; zebrafish.org), also located at the University of Oregon, is the primary stock center in the United States distributing fish strains and providing veterinary and pathology support services. ZFIN and ZIRC are independently funded by the NIH but work in close collaboration to provide fish strains and associated data in a coordinated fashion to the research community. Two other resource centers are also available: the Chinese Zebrafish Resource Center (CZRC) and the European Zebrafish Resource Center (EZRC). ZFIN provides links to all three of these stock centers when they have fish strains available for purchase.
- What level of traffic does ZFIN typically see?
ZFIN typically has two thousand users per day who spend an average of four minutes on various pages. Our most heavily used tools are “expression search” and our summary gene pages.
- Looking at the committees involved in ZFIN, it’s clear that ZFIN is an important hub for zebrafish-related knowledge. ZFIN’s job board activity for zebrafish-related positions is another testament to its value as a community resource. What is one of ZFIN’s greatest success stories so far?
Basic research in model organisms is done to provide insights into mechanisms of human disease and ultimately to illuminate paths to disease remediation or cure. The role ZFIN plays in this process is to gather zebrafish research data in a quality controlled and readily sharable format. ZFIN serves the research and clinical communities both as a data processor and repository and also, critically, as a source for the high quality expertly curated data used to facilitate biomedical discovery based on prior work. Researchers and clinicians are now using data obtained from ZFIN (and other model organism databases) combined with human clinical data as inputs to new algorithms designed to discover disease causing genes and define novel treatments. The critical role we play in this cyclical process of discovery is one of our great successes.
- Good resources have good documentation on how to use them; great resources have documentation on how to integrate and improve them. When did ZFIN decide to create a comprehensive guide for users to contribute data, and how long did it take to realize this guide?
Direct submission and integration of data has been a core function of ZFIN since its inception. As data, and hence the ZFIN database, have become more complex, it has become increasingly challenging for researchers to gather data and provide it in a readily sharable format. Data prepared without enough forethought on how it will be shared can lead to significant work on data transformation before it can be loaded into the ZFIN database. In 2015 we had the opportunity to contribute a chapter to Methods in Cell Biology, so we took that opportunity to provide a comprehensive data submission guide.
- What is in store for ZFIN? Does the uncertainty in Model Organism Database (MOD) funding affect ZFIN’s planned developments, and has there been a response to the GSA’s letter of support for MODs (in which ZFIN was mentioned)?
The GSA letter of support garnered over 10,000 signatures in a very short time prior to the TAGC meeting in Orlando. The letter, presented to NIH Director Francis Collins during that meeting, brought a clear positive message to the NIH leadership regarding how significant the model organism databases and the Gene Ontology Consortium are in facilitating NIH funded research and furthering the mission of the NIH itself. We continue to put the needs of our research community first and are focused on maintaining our level of service while the funding models and sources shift. One significant positive outcome of ongoing discussions about how to fund database resources in general has been the formation of the Alliance of Genome Resources, a consortium of six model organism databases and the Gene Ontology Consortium, whose aim is providing better cross-organism data integration and standardization. This is the next step in the evolution of these resources, as we prepare to tackle together some of the most challenging data integration issues that hamper broad use of model organism data in furthering our understanding of human disease. These are very exciting times, and I think our best days in support of the research community are ahead of us!
- Who is the team behind ZFIN and ZebrafishMine?
Though the following list represents ZFIN staff in 2016, prior members of our team contributed significantly to the success of this resource.
- Principal Investigator – Monte Westerfield
- Data Curation Manager – Doug Howe
- Technical and Project Manager – Anne Eagle
- Technical Team:
  - Patrick Kalita
  - Prita Mani
  - Ryan Martin
  - Christian Pich
  - Xiang Shao
  - Kevin Schaper
  - Sierra Taylor-Moxon
- Curation Team:
  - Yvonne Bradford
  - David Fashena
  - Ken Frazer
  - Holly Paddock
  - Sridhar Ramachandran
  - Leyla Ruzicka
  - Amy Singer
  - Ceri Van Slyke
  - Sabrina Toro
- Administrative Assistant – Jon Knight
- Literature Acquisition Assistant – Ruben Lancaster
ZebrafishMine is a collaboration between ZFIN at the University of Oregon and the InterMine project at the Cambridge Systems Biology Centre.
Thanks to Doug Howe and the rest of the ZFIN team for guiding us through this extremely useful and FREE tool. Be sure to check out the ZFIN-related plugins here and here in the plugin library. If you use ZFIN in your research, be sure to cite their publication:
ZFIN, the Zebrafish Model Organism Database: increased support for mutants and transgenics. Howe DG, Bradford YM, Conlin T, Eagle AE, Fashena D, Frazer K, Knight J, Mani P, Martin R, Moxon SA, Paddock H, Pich C, Ramachandran S, Ruef BJ, Ruzicka L, Schaper K, Shao X, Singer A, Sprunger B, Van Slyke CE, Westerfield M. (2013). Nucleic Acids Res. Jan;41(Database issue):D854-60. PMID: 23074187. DOI: 10.1093/nar/gks938
If you would like to contribute data to ZFIN, please see their recent guide here:
A scientist’s guide for submitting data to ZFIN. Howe DG, Bradford YM, Eagle A, Fashena D, Frazer K, Kalita P, Mani P, Martin R, Moxon ST, Paddock H, Pich C, Ramachandran S, Ruzicka L, Schaper K, Shao X, Singer A, Toro S, Van Slyke C, Westerfield M. (2016). Methods Cell Biol. 135:451-81. Epub 2016 May 12. PMID: 27443940. DOI: 10.1016/bs.mcb.2016.04.010