HEPData Output Formats
Aside from the normal HTML view of HEPData, most of our pages and data have a JSON equivalent. The JSON format allows programmatic access, from scripts in different languages such as Python or from applications such as Mathematica.
Add &format=json to the search URL or detailed record URL to get the results back in JSON.
If a user goes to the search page and filters on Collaboration: ATLAS, they will land at the URL:
By default, the search results are returned with up to 10 records on each page (size=10), but this number can be altered by passing size=1 up to size=100 in the search URL.
Multiple URL arguments should each be separated by an & character.
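As a minimal sketch of how these URL arguments combine, the following Python helper assembles a search URL that requests JSON. The base path https://www.hepdata.net/search/ and the collaboration filter name are assumptions made for illustration; only the format and size arguments come from the text above.

```python
from urllib.parse import urlencode

# Assumed search endpoint; compare it with the URL shown in your browser.
SEARCH_URL = "https://www.hepdata.net/search/"


def build_search_url(filters, size=10):
    """Return a search URL that asks for JSON with `size` records per page."""
    if not 1 <= size <= 100:
        raise ValueError("size must be between 1 and 100")
    params = dict(filters)
    params["format"] = "json"
    params["size"] = size
    # urlencode joins the arguments with '&' and URL-encodes the values.
    return SEARCH_URL + "?" + urlencode(params)


# The "collaboration" filter name is hypothetical.
print(build_search_url({"collaboration": "ATLAS"}, size=25))
```

The resulting string can be passed to any HTTP client, and the response body parsed as JSON.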
Record pages also have a JSON view, for example:
The light view reduces the size of the JSON by removing data tables and is useful for getting information about the whole record.
Data file formats
Conversion to various export formats is provided by a separate extensible package, hepdata-converter. Current output formats are:
A .root file with each table in a separate directory. For numeric data, a TGraphAsymmErrors object is written for each dependent variable. If the data has finite bin widths, then separate objects are also written for the central value of the data points and each of the uncertainties. If there is more than one independent variable, the appropriate ROOT object is chosen, for example, TH3F instead of a
yoda output is now in the YODA2 format, but there is still an option yoda1 to output the legacy YODA1 format.
To download in a specific format, append a format query parameter to the URL of a record, using one of:
Optional parameters can also be added (separated by an & character):
table=Table%201: provide the table name in order to download a specific table instead of all tables. Special characters should generally be URL-encoded, for example, %20 for a space. Omitting spaces also works, for example, table=Table1.
version=1: specify a particular version of a record. If omitted, the latest version will be returned.
light=true: when using format=json for a whole submission, this omits the data tables from the response.
rivet=ALICE_2016_I1419244: when using format=yoda1, specify the desired Rivet analysis name to be written in the YODA files if it does not match the automatically generated name.
For example, https://www.hepdata.net/record/ins1419244?format=yaml&table=Table1&version=1 returns Table 1 of version 1 of the record, in YAML format.
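The same URL can be assembled programmatically. The sketch below is one possible way to do it in Python; the function name build_record_url is hypothetical, and it assumes that record URLs follow the https://www.hepdata.net/record/ pattern seen in the example above.

```python
from urllib.parse import quote, urlencode


def build_record_url(record, fmt, table=None, version=None):
    """Return a download URL for a record in the given format."""
    params = {"format": fmt}
    if table is not None:
        params["table"] = table
    if version is not None:
        params["version"] = version
    # quote_via=quote encodes a space as %20 rather than '+'.
    query = urlencode(params, quote_via=quote)
    return f"https://www.hepdata.net/record/{record}?{query}"


# Table name without spaces, as in the example above.
print(build_record_url("ins1419244", "yaml", table="Table1", version=1))
# Table name with a space, which is URL-encoded as Table%201.
print(build_record_url("ins1419244", "yaml", table="Table 1"))
```

Leaving out table and version gives a URL for all tables of the latest version.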
HEPData record pages include JSON-LD metadata to provide machine-readable information about the record.
To download just the JSON-LD for a record page, use the HTTP header Accept: application/ld+json in your request, e.g.:
curl -OJLH "Accept: application/ld+json" https://www.hepdata.net/record/ins1419244
If you are using the DOI for a record rather than the HEPData URL, passing Accept: application/ld+json will give you the JSON-LD metadata directly from DataCite, which is less complete than the metadata produced by HEPData. To get the HEPData metadata via the DOI, use the header Accept: application/vnd.hepdata.ld+json, e.g.:
curl -OJLH "Accept: application/vnd.hepdata.ld+json" https://doi.org/10.17182/hepdata.72886.v2
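The two curl commands above have a straightforward Python counterpart. The following is a minimal sketch using only the standard library; the function names are hypothetical, and it assumes the response body parses as JSON.

```python
import json
from urllib.request import Request, urlopen


def jsonld_request(url, via_doi=False):
    """Build a request for JSON-LD metadata without sending it."""
    # A DOI URL needs the HEPData-specific media type to get HEPData's
    # metadata rather than the less complete DataCite metadata.
    if via_doi:
        accept = "application/vnd.hepdata.ld+json"
    else:
        accept = "application/ld+json"
    return Request(url, headers={"Accept": accept})


def fetch_jsonld(url, via_doi=False):
    """Send the request and parse the response (needs network access)."""
    with urlopen(jsonld_request(url, via_doi)) as response:
        return json.load(response)


request = jsonld_request("https://www.hepdata.net/record/ins1419244")
print(request.get_header("Accept"))
```

Calling fetch_jsonld("https://doi.org/10.17182/hepdata.72886.v2", via_doi=True) would follow the DOI redirect in the same way as the -L flag of curl.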
DOIs for resource files relating to a HEPData submission direct users to a landing page such as
landing_page=true URL argument will return JSON metadata for the resource file, while replacing it with view=true will download the resource file.
To download a resource file directly from its DOI, pass the relevant Accept header. For example, for DOI 10.17182/hepdata.89408.v3/r2:
curl -OJLH "Accept: application/x-tar" https://doi.org/10.17182/hepdata.89408.v3/r2
If you pass an Accept header that is not valid for the given resource, you will receive a response with a 406 Not Acceptable status code and an error message in JSON format such as:
"msg": "Accept header value 'application/zip' does not contain a valid media type for this resource. Expected Accept header to include one of 'application/x-tar', 'text/html', 'application/ld+json', 'application/vnd.hepdata.ld+json'"
The file_mimetype field provides the correct media type to use in the Accept header to download the resource.
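A client can use this field to recover from a 406 response. The sketch below shows one way to extract the media type for a retry; the function name accept_for_retry and the shortened sample body are illustrative assumptions, not output copied from HEPData.

```python
import json


def accept_for_retry(status_code, body):
    """Return the media type to retry with after a 406 response, else None."""
    if status_code != 406:
        return None
    try:
        payload = json.loads(body)
    except ValueError:
        # The body was not valid JSON, so there is nothing to retry with.
        return None
    if not isinstance(payload, dict):
        return None
    return payload.get("file_mimetype")


# Illustrative error body with a shortened message.
sample_body = json.dumps({
    "msg": "Accept header value 'application/zip' does not contain "
           "a valid media type for this resource.",
    "file_mimetype": "application/x-tar",
})
print(accept_for_retry(406, sample_body))
```

The returned value can then be sent as the Accept header in a second request to the same DOI.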
Note that both the curl command and the Python requests module have a default Accept header of */*, different from most web browsers, where */* has a lower weighting than text/html. The HEPData code will return the content if */* has weight 1 in a request to a landing page, so it is not strictly necessary to specify an explicit Accept header when using curl or Python requests, although it is still recommended to do so.
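To make the weighting concrete, the sketch below parses the q values of an Accept header. It only illustrates how the weights are read and is not HEPData's own negotiation code; the browser-style header is an illustrative example.

```python
def accept_weights(header):
    """Map each media type in an Accept header to its q weight (default 1.0)."""
    weights = {}
    for item in header.split(","):
        parts = [part.strip() for part in item.split(";")]
        media_type = parts[0]
        if not media_type:
            continue
        q = 1.0
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                q = float(value)
        weights[media_type] = q
    return weights


# Default header sent by curl and Python requests: */* has weight 1.
print(accept_weights("*/*"))
# Browser-style header (illustrative): */* ranks below text/html.
print(accept_weights("text/html,application/xhtml+xml,*/*;q=0.8"))
```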