Data from: Sharing and re-use of phylogenetic trees (and associated data) to facilitate synthesis

Stoltzfus A, O'Meara B, Whitacre J, Mounce R, Gillespie EL, Kumar S, Rosauer DF, Vos RA

Date Published: October 23, 2012

DOI: http://dx.doi.org/10.5061/dryad.h6pf365t

 

Files in this package

Content in the Dryad Digital Repository is offered "as is." By downloading files, you agree to the Dryad Terms of Service. To the extent possible under law, the authors have waived all copyright and related or neighboring rights to this data (CC0).

Title Literature Sample1: conducted in April 2011, journals: American Journal of Botany & Evolution
Downloaded 84 times
Description LitSample1. To get a sense of current practices, AS and BO picked 2 journals, Evolution and Am J Bot, and looked at every one of the 32 regular articles in the April 2011 issues. Evolution is the premier trade journal for organismal evolutionary biologists. American Journal of Botany is a frequent venue for phylogenetic systematics.
Download LitSample1_Apr2011_AmJBot_Evol.csv (11.28 Kb)
Download README.txt (6.24 Kb)
Title Literature Sample2: 40 recently published phylogeny-related articles
Downloaded 60 times
Description We searched Thomson Reuters Web of Science (WoS) in May 2011 for articles matching 'phylogen*' in title or 'topic'. WoS sorted the results by 'relevance', and we picked 40 articles from the top of the list. We deliberately chose this approach to select articles that focus on phylogeny rather than mention it peripherally. However, because we do not know exactly what 'topic' and 'relevance' mean in this case (and WoS does not make its methodology clear to users), we cannot be certain what kind of sample this represents. Of the 40 articles, 38 report new trees, considerably more than the 27/40 expected by chance for an article that matches 'phylogen*' anywhere (see Literature Sample3, below). The file "LitSample2_40RecentPhylogenInDepth.csv" contains extensive notes on the 40 articles. This spreadsheet was populated by an online fillable form that is available from the authors on request (in case any reader would like to analyze their own literature sample).
Download LitSample2_40RecentPhylogenInDepth.csv (33.57 Kb)
Download README.txt (6.241 Kb)
Title Literature Sample3: 100 randomly-selected 'phylogen*' articles published in 2010
Downloaded 72 times
Description The sole purpose of this survey was to estimate the frequency of reports of new trees among 2010 publications. We first searched Web of Science for 2010 papers that matched 'phylogen*' in any field. Many of the 11,664 matching publications might be false positives, i.e., papers that refer to 'phylogen*' in some way but do not report a new tree. To estimate this fraction, we picked 100 papers at random. Each paper was assigned to BO, AS or RM for individual evaluation, with the result that 66 of the 100 papers reported a new tree. The file "LitSample3_100RandomPhylogen2010.csv" contains the results of the analysis of the sample of 100 publications. The spreadsheet records little beyond whether each paper reports a new tree. This spreadsheet was populated by an online fillable form that is available from the authors on request (in case any reader would like to analyze their own literature sample).
We also considered false negatives due to papers that report a new phylogeny but avoid the term 'phylogen*', using instead some term such as 'dendrogram', 'cladogram' or 'tree'. Because 'tree' has many non-phylogenetic uses, we used a restricted search methodology based on other terms associated with phylogenies, such as 'SSU' or 'cytb'. By comparing matches to 'SSU + tree -phylogeny' to those for 'SSU + phylogeny', we can estimate how often authors use 'tree' as a synonym while avoiding 'phylogeny'. The first search returned only about 1/100 as many hits as the second, and many of these referred to "trees" that were not phylogenetic trees. Thus, the results suggest that phylogeny synonyms would increase the yield by less than 1%.
We did not estimate false negatives due to poor indexing, or non-indexing, in Web of Science. Web of Science may contain information on articles that are indexed very incompletely, e.g., articles for which only the citation information is available, without keywords or abstract. A poorly indexed article that reports a phylogeny will only be found if 'phylogen*' appears in the title. We also did not estimate the number of false negatives due to phylogeny reports that are not indexed at all in Web of Science. This is difficult to estimate, but one approach would be to take a very carefully researched review article, e.g., on phylogeny of major reptile groups, and then assess what fraction of cited phylogeny articles can be found in WoS. Apropos, TimeTree has nearly a thousand articles in its database, and a substantial fraction are not indexed in PubMed.
Download LitSample3_100RandomPhylogen2010.csv (44.42 Kb)
Download README.txt (6.241 Kb)
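The estimate described for Literature Sample3 can be worked through numerically. The sketch below is illustrative only and is not part of the authors' analysis: it scales the 66/100 sample fraction to the 11,664 WoS matches and attaches a Wilson score interval. The choice of interval and all function and variable names are assumptions made for this example.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# Figures taken from the Literature Sample3 description.
wos_matches = 11664   # 2010 papers matching 'phylogen*' in any field
sample_size = 100     # papers picked at random
with_new_tree = 66    # sampled papers that reported a new tree

p_hat = with_new_tree / sample_size
low, high = wilson_interval(with_new_tree, sample_size)

# Scale the sample fraction (and its interval) up to all matching papers.
estimated_new_trees = round(p_hat * wos_matches)
estimated_range = (round(low * wos_matches), round(high * wos_matches))

print(f"fraction with new tree: {p_hat:.2f} ({low:.2f} to {high:.2f})")
print(f"estimated papers with new trees: {estimated_new_trees} {estimated_range}")
```

With these inputs the point estimate is roughly 7,700 publications reporting a new tree in 2010; the interval reflects only sampling error in the 100-paper sample, not the false negatives discussed in the description.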
Title Archive Sample: Analysis of all Dryad 2010 studies matching keyword 'phylogen*'
Downloaded 69 times
Description All TreeBASE entries have trees, but not all Dryad packages for phylogeny papers have decodable (i.e., not graphic) trees. Using the Dryad search interface in August, 2011, AS found 32 entries for 2010 studies in Dryad that match "phylogen". In this group, AS found one server error: <http://datadryad.org/handle/10255/dryad.1786>. Among the remainder, there were 24 packages without any phylogeny in decodable form, and 7 packages with one or more phylogenies in decodable form. Note that most of the NEXUS files do not have trees, and that there are trees in non-NEXUS formats, e.g., some are just Newick strings in text files (e.g., http://datadryad.org/handle/10255/dryad.1965). The file "ArchiveSample_AllDryad_2010_Phylogen.csv" is a spreadsheet with the results of this very brief analysis.
Download ArchiveSample_AllDryad_2010_Phylogen.csv (1.882 Kb)
Download README.txt (6.241 Kb)
Title User stories of barriers to data re-use encountered
Downloaded 113 times
Description As part of a MIAPA exercise, we gathered and analyzed stories of phylogeny use and re-use, based on our own experiences and those of colleagues who are sharing this information as a personal communication. This material provides a basis for many aspects of the taxonomy of barriers to re-use in the text, and for individual comments about problems that users experience, such as inconsistent names, re-doing analyses, etc.
Download UserStories_BarriersToReUse.pdf (122.0 Kb)
Download README.txt (6.241 Kb)
Title README
Downloaded 34 times
Description This file describes the contents of the supplementary data package for Stoltzfus et al., Sharing Phylogenetic Trees. The package includes this README file, a PDF file with user stories, and 4 spreadsheets (for 3 literature samples plus 1 quick analysis of Dryad content):
* LitSample1_Apr2011_AmJBot_Evol.csv - all pubs from the April 2011 issues of 2 journals
* LitSample2_40RecentPhylogenInDepth.csv - sample of 40 recent phylogen* pubs
* LitSample3_100RandomPhylogen2010.csv - random sample of 2010 phylogen* pubs
* ArchiveSample_AllDryad_2010_Phylogen.csv
* UserStories_BarriersToReUse.pdf - user stories and taxonomy of barriers
* README.txt - this file
This README file is Unicode (UTF-8) with Unix/Linux line endings. The .csv files are also in Unicode (UTF-8), with field delimiter symbol , (comma) and text delimiter symbol " (double quote mark).
Download README.txt (6.24 Kb)
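The file format stated in the README (UTF-8, comma field delimiter, double-quote text delimiter) maps directly onto Python's standard csv module. The following is a minimal sketch, assuming the first row of each spreadsheet holds column headers, which the package description does not confirm; the helper name load_rows is hypothetical.

```python
import csv


def load_rows(path):
    """Read one of the package's .csv files into a list of dicts.

    Assumes the first row contains column names (not confirmed by the README).
    """
    # newline="" lets the csv module handle line breaks inside quoted fields.
    with open(path, encoding="utf-8", newline="") as handle:
        reader = csv.DictReader(handle, delimiter=",", quotechar='"')
        return list(reader)
```

For example, load_rows("LitSample3_100RandomPhylogen2010.csv") would return one dict per row once the file has been downloaded to the working directory. Column names are left uninterpreted here because they are not listed on this page.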

When using this data, please cite the original publication:

Stoltzfus A, O'Meara B, Whitacre J, Mounce R, Gillespie EL, Kumar S, Rosauer DF, Vos RA (2012) Sharing and re-use of phylogenetic trees (and associated data) to facilitate synthesis. BMC Research Notes 5: 574. http://dx.doi.org/10.1186/1756-0500-5-574

Additionally, please cite the Dryad data package:

Stoltzfus A, O'Meara B, Whitacre J, Mounce R, Gillespie EL, Kumar S, Rosauer DF, Vos RA (2012) Data from: Sharing and re-use of phylogenetic trees (and associated data) to facilitate synthesis. Dryad Digital Repository. http://dx.doi.org/10.5061/dryad.h6pf365t