About Dataverse

P.I.: Gary King -- Project Lead: Mercè Crosas -- Team

Dataverse is a software application that enables institutions to host research data repositories. It provides a preservation and archival infrastructure, while researchers can share, keep control of, and get recognition for their data through an easy-to-access web browser interface. Dataverse supports the sharing of research data with a persistent data citation, as well as data publishing and management workflows with versioning and metadata standards.

The Harvard Dataverse Repository

Harvard Dataverse is a repository open and free* to all researchers worldwide to publish research data across all disciplines.

  • Each dataverse can contain any number of datasets.
  • Each dataset contains metadata and any number of data files.
  • Each data file can have a maximum size of 2 GB (currently).

* Please contact us if you are interested in depositing about 1 TB of data or more.
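
The containment rules listed above (a dataverse holds datasets, a dataset holds metadata and data files, and each file has a size cap) can be pictured with a small sketch. This is only an illustration, not the Dataverse software's actual data model: the class names, fields, and the `add_file` method are hypothetical, and the reading of "2GB" as 2 * 1024**3 bytes is an assumption.

```python
from dataclasses import dataclass, field

# Assumption: "2GB" is read as 2 * 1024**3 bytes; the page does not say
# whether it means decimal or binary gigabytes.
MAX_FILE_BYTES = 2 * 1024**3


@dataclass
class DataFile:
    """Hypothetical stand-in for one data file in a dataset."""
    name: str
    size_bytes: int


@dataclass
class Dataset:
    """Hypothetical dataset: metadata plus any number of data files."""
    title: str
    metadata: dict = field(default_factory=dict)
    files: list = field(default_factory=list)

    def add_file(self, data_file: DataFile) -> None:
        # Reject files above the per-file limit stated in the list above.
        if data_file.size_bytes > MAX_FILE_BYTES:
            raise ValueError(
                f"{data_file.name} exceeds the {MAX_FILE_BYTES}-byte limit"
            )
        self.files.append(data_file)


@dataclass
class Dataverse:
    """Hypothetical dataverse: a container for any number of datasets."""
    name: str
    datasets: list = field(default_factory=list)
```

For example, `Dataset(title="Survey").add_file(DataFile("results.csv", 1024))` succeeds, while a file one byte over `MAX_FILE_BYTES` raises `ValueError`.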

The Dataverse Software

The Dataverse software is open source and shared on GitHub. Each release is packaged and can be downloaded and installed by an institution or organization to host its own Dataverse repository.

History and Name

Dataverse development started in 2006 under the direction of Mercè Crosas. Dataverse was preceded by the Virtual Data Center, led by Gary King, Sidney Verba and Micah Altman.

We thank Ella King, who at age ten came up with this ingenious name for the project. Dataverse is the universe of data, a space where all data can be added, shared, reused, and expanded to advance research.

[Illustration: Beyond the PDF - Dataverse]

The name has even inspired some artists. This illustration was created as visual notes for the Beyond the PDF workshop in Amsterdam in 2013, by De Jongens van de Tekeningen, and is licensed under a Creative Commons Attribution 3.0 Unported License.