Reproducible and Reusable Science
Connecting research results to the underlying data and analysis is central to validating and extending scientific discoveries. Where possible, our tools encourage open data and clarity about methods, and they promote and enable data citation.
Computationally Assisted Explorations
We build analytical tools, such as Consilience and TwoRavens, that help researchers understand their data and discover new insights by connecting their own knowledge, expertise, and judgement with the vast array of quantitative methods available in computational analysis.
Interdisciplinary Quantitative Scientific Scope
While social science research informs many of our goals, our tools and research frameworks address broad methodological issues in quantitative science and are often employed in other research domains, including partnerships in the health and biomedical sciences, astronomy, and the humanities.
When Data are Not Open
While we support open data in all possible forms, the increasing ability of science to measure our lives brings increasing ethical responsibilities to safeguard privacy. We need solutions that preserve privacy while still giving science the fundamental ability to learn from data, access it, and replicate findings. DataTags and PrivateZelig are two of our solutions toward these goals.
Large-Scale Data Sets
In the coming years, the Data Science team will expand its software applications to handle large-scale data sets as Big Data science reaches all disciplines. This means extending Consilience to handle millions of text documents, and extending Zelig and Dataverse to handle terabyte- and petabyte-scale data sets.
Zelig: Everyone's Statistical Software allows a large body of different statistical models to be implemented and interpreted in a common framework and interface.
For almost a decade, Dataverse has been at the forefront of data publication, citation and preservation. We continue to innovate and expand to more domains, and interoperate with more systems.
This is the first public release of a new, interactive Web application to explore data, view descriptive statistics, and estimate statistical models.
DataTags guides data contributors through the applicable legal regulations to set an appropriate level of sensitivity for a dataset. That level is expressed as a machine-actionable tag, which can then be coupled with the data, tracked, and enforced in the data's future use.
Consilience allows you to simultaneously explore every existing clustering method in the literature to help you discover new clusters and patterns in your text documents.
The first public version of the new RBuild application will provide a continuous integration build solution for developers and contributors of R packages, from freshly developed code in Git to archived, published code on CRAN.