sparcl package

sparcl.client module

Client module for SPARCL. This module interfaces with the SPARCL Server to get spectra data.

class sparcl.client.SparclClient(*, url='https://astrosparcl.datalab.noirlab.edu', verbose=False, show_curl=False, connect_timeout=1.1, read_timeout=5400, announcement=True)[source]

Bases: object

Provides an interface to the SPARCL Server. When using this to report a bug, set verbose to True and print your instance of this class. The output includes important information about the Client and Server that is useful to developers.

Parameters:
  • url (str, optional) – Base URL of SPARCL Server. Defaults to ‘https://astrosparcl.datalab.noirlab.edu’.

  • verbose (bool, optional) – Default verbosity is set to False for all client methods.

  • connect_timeout (float, optional) – Number of seconds to wait to establish connection with server. Defaults to 1.1.

  • read_timeout (float, optional) – Number of seconds to wait for server to send a response. Generally time to wait for first byte. Defaults to 5400.

  • announcement (bool, optional) – SPARCL announcements. Defaults to True.

Example

>>> client = SparclClient(announcement=False)
Raises:

Exception – Object creation compares the version from the Server against the one expected by the Client. Throws an error if the Client is a major version or more behind.
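As a sketch of the constructor options above, the hypothetical helper make_client below builds a client with verbose=True and longer timeouts, then prints it, as recommended for bug reports. The helper name, the stand-in class, and the chosen timeout values are assumptions; the class is passed in as an argument only so the snippet can run without a network connection.

```python
def make_client(client_cls=None, **overrides):
    """Build a SPARCL client with bug-report-friendly settings (hypothetical helper)."""
    if client_cls is None:
        # Imported lazily so the helper can be exercised with a stand-in class.
        from sparcl.client import SparclClient
        client_cls = SparclClient

    settings = {
        "verbose": True,         # recommended when reporting a bug
        "announcement": False,
        "connect_timeout": 3.0,  # seconds; illustrative value, default is 1.1
        "read_timeout": 600,     # seconds; illustrative value, default is 5400
    }
    settings.update(overrides)
    # Construction contacts the Server and may raise if the Client is
    # a major version or more behind.
    return client_cls(**settings)


class _StubClient:
    """Stand-in that only records its keyword arguments; not part of sparcl."""

    def __init__(self, **kwargs):
        self.kwargs = kwargs

    def __repr__(self):
        return f"_StubClient({self.kwargs})"


client = make_client(_StubClient, read_timeout=120)
print(client)  # with the real class, include this output in a bug report
```

Calling make_client() with no arguments would use the real SparclClient and therefore needs the sparcl package and network access.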

property all_datasets

Set of all DataSets available from Server

find(outfields=None, *, constraints={}, limit=500, sort=None, units=True, fmt=None, verbose=False)[source]

Find records in the SPARCL database.

Parameters:
  • outfields (list, optional) – List of fields to return. Only CORE fields may be passed to this parameter. Defaults to None, which will return only the sparcl_id and _dr fields.

  • constraints (dict, optional) – Key-value pairs of constraints to place on the record selection. Each key is a field name and each value is a list of values. Defaults to no constraints, which returns all records in the database subject to the limit parameter.

  • limit (int, optional) – Maximum number of records to return. Defaults to 500.

  • sort (list, optional) – Comma separated list of fields to sort by. Defaults to None (no sorting).

  • units (bool, optional) – Set to True to include units in the header for each applicable field. Defaults to True.

  • fmt (str, optional) – Output format for the results. If 'pandas', the results are automatically converted to a Pandas Dataframe object. Defaults to None, which returns a Found object.

  • verbose (bool, optional) – Set to True for in-depth return statement. Defaults to False.

Returns:

Contains header and records, unless fmt='pandas' is specified, in which case a Pandas DataFrame object is returned.

Return type:

Found

Example

>>> client = SparclClient(announcement=False)
>>> outs = ['sparcl_id', 'ra', 'dec']
>>> cons = {'spectype': ['GALAXY'], 'redshift': [0.5, 0.9]}
>>> found = client.find(outfields=outs, constraints=cons)
>>> sorted(list(found.records[0].keys()))
['_dr', 'dec', 'ra', 'sparcl_id']
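The call pattern above can be wrapped in a small function. This is a sketch under stated assumptions: find_positions is a hypothetical name, the constraint values are copied from the example above, and _StubClient merely imitates the documented find() keywords so the snippet runs offline.

```python
def find_positions(client, constraints, limit=500):
    """Return (sparcl_id, ra, dec) tuples for records matching the constraints."""
    found = client.find(
        outfields=["sparcl_id", "ra", "dec"],
        constraints=constraints,
        limit=limit,
    )
    # Each record is a dictionary keyed by field name.
    return [(rec["sparcl_id"], rec["ra"], rec["dec"]) for rec in found.records]


class _StubFound:
    """Stand-in for a Found object; only exposes .records."""

    def __init__(self, records):
        self.records = records


class _StubClient:
    """Imitates find() with canned records; not part of sparcl."""

    def find(self, outfields=None, *, constraints=None, limit=500):
        self.last_call = {
            "outfields": outfields,
            "constraints": constraints,
            "limit": limit,
        }
        rows = [
            {"sparcl_id": "id-1", "ra": 10.0, "dec": -1.5, "_dr": "X"},
            {"sparcl_id": "id-2", "ra": 11.0, "dec": 2.5, "_dr": "X"},
        ]
        return _StubFound(rows[:limit])


cons = {"spectype": ["GALAXY"], "redshift": [0.5, 0.9]}
stub = _StubClient()
positions = find_positions(stub, cons, limit=1)
```

With a real SparclClient in place of the stub, the same function would query the Server.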
get_all_fields(*, dataset_list=None)[source]

Get fields tagged as ‘all’ that are in DATASET_LIST. These are the fields used for the ALL value of the include parameter of client.retrieve().

Parameters:

dataset_list (list, optional) – List of data sets from which to get all fields. Defaults to None, which will return the intersection of all fields in all data sets hosted on the SPARCL database.

Returns:

List of fields tagged as ‘all’ from DATASET_LIST.

Example

>>> client = SparclClient(announcement=False)
>>> client.get_all_fields()
['data_release', 'datasetgroup', 'dateobs', 'dateobs_center', 'dec', 'exptime', 'flux', 'instrument', 'ivar', 'mask', 'model', 'ra', 'redshift', 'redshift_err', 'redshift_warning', 'site', 'sparcl_id', 'specid', 'specprimary', 'spectype', 'survey', 'targetid', 'telescope', 'wave_sigma', 'wavelength', 'wavemax', 'wavemin']
get_available_fields(*, dataset_list=None)[source]

Get the subset of fields that are in all (or selected) DATASET_LIST. This may be a bigger list than will be used with the ALL keyword to client.retrieve().

Parameters:

dataset_list (list, optional) – List of data sets from which to get available fields. Defaults to None, which will return the intersection of all available fields in all data sets hosted on the SPARCL database.

Returns:

Set of fields available from data sets in DATASET_LIST.

Example

>>> client = SparclClient(announcement=False)
>>> sorted(client.get_available_fields())
['data_release', 'datasetgroup', 'dateobs', 'dateobs_center', 'dec', 'exptime', 'extra_files', 'file', 'flux', 'instrument', 'ivar', 'mask', 'model', 'ra', 'redshift', 'redshift_err', 'redshift_warning', 'site', 'sparcl_id', 'specid', 'specprimary', 'spectype', 'survey', 'targetid', 'telescope', 'updated', 'wave_sigma', 'wavelength', 'wavemax', 'wavemin']
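To illustrate how the available fields can exceed the ALL fields, the sketch below compares the two lists printed in the examples above using plain set arithmetic. The hypothetical helper unknown_fields shows one way to check an include list before calling client.retrieve(). The field lists are copied from the example output and may differ on a live Server.

```python
# Copied from the get_all_fields() example output.
ALL_FIELDS = [
    'data_release', 'datasetgroup', 'dateobs', 'dateobs_center', 'dec',
    'exptime', 'flux', 'instrument', 'ivar', 'mask', 'model', 'ra',
    'redshift', 'redshift_err', 'redshift_warning', 'site', 'sparcl_id',
    'specid', 'specprimary', 'spectype', 'survey', 'targetid', 'telescope',
    'wave_sigma', 'wavelength', 'wavemax', 'wavemin',
]

# Copied from the get_available_fields() example output.
AVAILABLE_FIELDS = ALL_FIELDS + ['extra_files', 'file', 'updated']


def unknown_fields(include, available):
    """Return the requested field names that are not available, sorted."""
    return sorted(set(include) - set(available))


# Fields that are available but not tagged as 'all'.
extra = sorted(set(AVAILABLE_FIELDS) - set(ALL_FIELDS))
print(extra)  # ['extra_files', 'file', 'updated']

# 'fluxx' is a deliberate misspelling to show the check.
bad = unknown_fields(['flux', 'wavelength', 'fluxx'], AVAILABLE_FIELDS)
print(bad)  # ['fluxx']
```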
get_default_fields(*, dataset_list=None)[source]

Get fields tagged as ‘default’ that are in DATASET_LIST. These are the fields used for the DEFAULT value of the include parameter of client.retrieve().

Parameters:

dataset_list (list, optional) – List of data sets from which to get the default fields. Defaults to None, which will return the intersection of default fields in all data sets hosted on the SPARCL database.

Returns:

List of fields tagged as ‘default’ from DATASET_LIST.

Example

>>> client = SparclClient(announcement=False)
>>> client.get_default_fields()
['dec', 'flux', 'ra', 'sparcl_id', 'specid', 'wavelength']
login(email, password=None)[source]

Login to the SPARCL service.

Parameters:
  • email (str) – User login email.

  • password (str, optional) – User SSO password. If not given, the user is prompted to enter their SSO password.

Returns:

None.

Example

>>> client = SparclClient(announcement=False)
>>> client.login('test_user@noirlab.edu', 'testpw')
Logged in successfully with email='test_user@noirlab.edu'
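To avoid hard-coding credentials in a script, the email can be read from the environment. The sketch below is one possible approach, not part of sparcl: login_from_env and the variable names SPARCL_EMAIL and SPARCL_PASSWORD are hypothetical, and _StubClient stands in for a real client so the snippet runs offline.

```python
import os


def login_from_env(client, env=None):
    """Log in using SPARCL_EMAIL / SPARCL_PASSWORD (hypothetical variable names)."""
    env = os.environ if env is None else env
    email = env.get("SPARCL_EMAIL")
    if not email:
        raise RuntimeError("SPARCL_EMAIL is not set")
    # If no password is set, None is passed and login() prompts for it.
    client.login(email, env.get("SPARCL_PASSWORD"))
    return email


class _StubClient:
    """Records login() calls; not part of sparcl."""

    def __init__(self):
        self.calls = []

    def login(self, email, password=None):
        self.calls.append((email, password))


stub = _StubClient()
used = login_from_env(stub, {"SPARCL_EMAIL": "test_user@noirlab.edu"})
```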
logout()[source]

Logout of the SPARCL service.

Parameters:

None.

Returns:

None.

Example

>>> client = SparclClient(announcement=False)
>>> client.logout()
Logged-out successfully.  Previously logged-in with email None.
missing(uuid_list, *, dataset_list=None, countOnly=False, verbose=False)[source]

Return the subset of sparcl_ids in the given uuid_list that are NOT stored in the SPARCL database.

Parameters:
  • uuid_list (list) – List of sparcl_ids.

  • dataset_list (list, optional) – List of data sets from which to find missing sparcl_ids. Defaults to None, meaning all data sets hosted on the SPARCL database.

  • countOnly (bool, optional) – Set to True to return only a count of the missing sparcl_ids from the uuid_list. Defaults to False.

  • verbose (bool, optional) – Set to True for in-depth return statement. Defaults to False.

Returns:

A list of the subset of sparcl_ids in the given uuid_list that are NOT stored in the SPARCL database.

Example

>>> client = SparclClient(announcement=False)
>>> ids = ['ddbb57ee-8e90-4a0d-823b-0f5d97028076',]
>>> client.missing(ids)
['ddbb57ee-8e90-4a0d-823b-0f5d97028076']
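A common use of missing() is to split a list of ids into those the database holds and those it does not. The sketch below assumes only the documented behavior (a list of the absent ids is returned); split_present_missing and _StubClient are hypothetical names used for illustration.

```python
def split_present_missing(client, ids):
    """Partition ids into (present, absent) while preserving input order."""
    missing = set(client.missing(ids))
    present = [i for i in ids if i not in missing]
    absent = [i for i in ids if i in missing]
    return present, absent


class _StubClient:
    """Imitates missing() against a fixed set of stored ids; not part of sparcl."""

    def __init__(self, stored):
        self.stored = set(stored)

    def missing(self, uuid_list, *, dataset_list=None, countOnly=False, verbose=False):
        return [u for u in uuid_list if u not in self.stored]


stub = _StubClient(stored={"aaa", "ccc"})
present, absent = split_present_missing(stub, ["aaa", "bbb", "ccc"])
print(present)  # ['aaa', 'ccc']
print(absent)   # ['bbb']
```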
missing_specids(specid_list, *, dataset_list=None, countOnly=False, verbose=False)[source]

Return the subset of specids in the given specid_list that are NOT stored in the SPARCL database.

Parameters:
  • specid_list (list) – List of specids.

  • dataset_list (list, optional) – List of data sets from which to find missing specids. Defaults to None, meaning all data sets hosted on the SPARCL database.

  • countOnly (bool, optional) – Set to True to return only a count of the missing specids from the specid_list. Defaults to False.

  • verbose (bool, optional) – Set to True for in-depth return statement. Defaults to False.

Returns:

A list of the subset of specids in the given specid_list that are NOT stored in the SPARCL database.

Example

>>> client = SparclClient(announcement=False)
>>> found = client.find(outfields=['specid'], limit=2)
>>> specids = [f.specid for f in found.records]
>>> client.missing_specids(specids + ['6802933904984788992'])
['6802933904984788992']
retrieve(uuid_list, *, include='DEFAULT', dataset_list=None, limit=500, units=True, fmt=None, verbose=False)[source]

Retrieve spectra records from the SPARCL database by list of sparcl_ids.

Parameters:
  • uuid_list (list) – List of sparcl_ids.

  • include (list, optional) – List of field names to include in each record. Defaults to ‘DEFAULT’, which will return the fields tagged as ‘default’.

  • dataset_list (list, optional) – List of data sets from which to retrieve spectra data. Defaults to None, meaning all data sets hosted on the SPARCL database.

  • limit (int, optional) – Maximum number of records to return. Defaults to 500. Maximum allowed is 24,000.

  • units (bool, optional) – Set to True to include units in the header for each applicable field. Defaults to True.

  • fmt (str, optional) – Output format for the results. If 'specutils', the results are automatically converted to the most appropriate specutils object using to_specutils(). This may return a Spectrum, SpectrumCollection, or SpectrumList depending on the data. Requires the specutils package to be installed. If 'pandas', the results are automatically converted to a Pandas Dataframe object. Defaults to None, which returns a Retrieved object.

  • verbose (bool, optional) – Set to True for in-depth return statement. Defaults to False.

Returns:

Contains header and records, unless fmt='specutils' or fmt='pandas' is specified, in which case a specutils or Pandas DataFrame object is returned.

Return type:

Retrieved

Example

>>> client = SparclClient(announcement=False)
>>> ids = client.find(limit=1).ids
>>> inc = ['sparcl_id', 'flux', 'wavelength', 'model']
>>> ret = client.retrieve(uuid_list=ids, include=inc)
>>> type(ret.records[0].wavelength)
<class 'numpy.ndarray'>
retrieve_by_specid(specid_list, *, svc='spectras', format='pkl', include='DEFAULT', dataset_list=None, limit=500, units=True, fmt=None, verbose=False)[source]

Retrieve spectra records from the SPARCL database by list of specids.

Parameters:
  • specid_list (list) – List of specids.

  • include (list, optional) – List of field names to include in each record. Defaults to ‘DEFAULT’, which will return the fields tagged as ‘default’.

  • dataset_list (list, optional) – List of data sets from which to retrieve spectra data. Defaults to None, meaning all data sets hosted on the SPARCL database.

  • limit (int, optional) – Maximum number of records to return. Defaults to 500. Maximum allowed is 24,000.

  • units (bool, optional) – Set to True to include units in the header for each applicable field. Defaults to True.

  • fmt (str, optional) – Output format for the results. If 'specutils', the results are automatically converted to the most appropriate specutils object using to_specutils(). This may return a Spectrum, SpectrumCollection, or SpectrumList depending on the data. Requires the specutils package to be installed. If 'pandas', the results are automatically converted to a Pandas Dataframe object. Defaults to None, which returns a Retrieved object.

  • verbose (bool, optional) – Set to True for in-depth return statement. Defaults to False.

Returns:

Contains header and records, unless fmt='specutils' or fmt='pandas' is specified, in which case a specutils or Pandas DataFrame object is returned.

Return type:

Retrieved

Example

>>> client = SparclClient(announcement=False)
>>> sids = [4753625089450465280, 1254253099313293312]
>>> inc = ['specid', 'flux', 'wavelength', 'model']
>>> ret = client.retrieve_by_specid(specid_list=sids, include=inc)
>>> len(ret.records[0].wavelength)
4617
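Because limit is capped at 24,000 records per call, long specid lists have to be retrieved in pieces. The following is a minimal sketch of one way to do that, assuming it is acceptable to collect the records from each batch into a plain list. chunked, retrieve_specids_in_batches, and the stub classes are hypothetical names, not part of sparcl.

```python
MAX_LIMIT = 24_000  # documented maximum for the limit parameter


def chunked(items, size):
    """Yield consecutive slices of items, each at most size long."""
    if not 1 <= size <= MAX_LIMIT:
        raise ValueError(f"size must be between 1 and {MAX_LIMIT}")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def retrieve_specids_in_batches(client, specids, include="DEFAULT", batch_size=500):
    """Call retrieve_by_specid() once per batch and collect all records."""
    records = []
    for batch in chunked(list(specids), batch_size):
        ret = client.retrieve_by_specid(
            specid_list=batch, include=include, limit=len(batch)
        )
        records.extend(ret.records)
    return records


class _StubRetrieved:
    """Stand-in for a Retrieved object; only exposes .records."""

    def __init__(self, records):
        self.records = records


class _StubClient:
    """Imitates retrieve_by_specid() with fabricated records; not part of sparcl."""

    def __init__(self):
        self.batch_sizes = []

    def retrieve_by_specid(self, specid_list, *, include="DEFAULT", limit=500):
        self.batch_sizes.append(len(specid_list))
        return _StubRetrieved([{"specid": s} for s in specid_list[:limit]])


stub = _StubClient()
records = retrieve_specids_in_batches(stub, range(5), batch_size=2)
print(stub.batch_sizes)  # [2, 2, 1]
```

The same pattern applies to retrieve() with uuid_list in place of specid_list.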
token_expired(renew=False)[source]

POST http://localhost:8050/api/renew_token/
Content-Type: application/json

{ "refresh_token": "…" }

Returns an ‘access’ token.
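The docstring above reads like a description of an HTTP call. Assuming that reading is right, the request could be constructed with the Python standard library as sketched below. build_renew_request and the placeholder token are hypothetical. The request is built but not sent, because the response format is not documented here beyond returning an 'access' token.

```python
import json
import urllib.request


def build_renew_request(refresh_token, base_url="http://localhost:8050"):
    """Build (but do not send) the token-renewal POST request."""
    body = json.dumps({"refresh_token": refresh_token}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/renew_token/",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# The token value is a placeholder, not a real credential.
req = build_renew_request("PLACEHOLDER-REFRESH-TOKEN")
print(req.get_method(), req.full_url)
```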

property version

Return the version of the Server REST API used by this client. If the REST API changes such that the major version increases, a new version of this module will likely need to be used.

Returns:

API version (float).

Example

>>> client = SparclClient(announcement=False)
>>> client.version
13.0
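The major-version rule can be illustrated with a few lines of arithmetic. This sketch assumes the major version is the integer part of the float shown above; client_is_behind is a hypothetical helper and not the check the library itself performs.

```python
def client_is_behind(client_api, server_api):
    """True if the client's API is at least one major version behind the server's."""
    # Assumption: the major version is the integer part of the float.
    return int(server_api) - int(client_api) >= 1


print(client_is_behind(12.0, 13.0))  # True
print(client_is_behind(13.0, 13.9))  # False
```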
class sparcl.client.TokenAuth(token, renew_check)[source]

Bases: AuthBase

Attaches HTTP Token Authentication to the given Request object.

sparcl.exceptions module

exception sparcl.exceptions.AccessNotAllowed(error_message, error_code=None)[source]

Bases: BaseSparclException

exception sparcl.exceptions.BadInclude(error_message, error_code=None)[source]

Bases: BaseSparclException

Include list contains invalid data field(s).

exception sparcl.exceptions.BadPath(error_message, error_code=None)[source]

Bases: BaseSparclException

A field path starts with a non-core field.

exception sparcl.exceptions.BadQuery(error_message, error_code=None)[source]

Bases: BaseSparclException

Bad find constraints.

exception sparcl.exceptions.BadSearchConstraint(error_message, error_code=None)[source]

Bases: BaseSparclException

exception sparcl.exceptions.BaseSparclException(error_message, error_code=None)[source]

Bases: Exception

Base Class for all SPARCL exceptions.

to_dict()[source]

Convert a SPARCL exception to a Python dictionary.

exception sparcl.exceptions.NoCommonIdField(error_message, error_code=None)[source]

Bases: BaseSparclException

The field name for Science id field is not common to all Data Sets

exception sparcl.exceptions.NoIDs(error_message, error_code=None)[source]

Bases: BaseSparclException

The length of the list of original IDs passed to the reorder method was zero

exception sparcl.exceptions.NoRecords(error_message, error_code=None)[source]

Bases: BaseSparclException

Results did not contain any records

exception sparcl.exceptions.ReadTimeout(error_message, error_code=None)[source]

Bases: BaseSparclException

The server did not send any data in the allotted amount of time.

exception sparcl.exceptions.ServerConnectionError(error_message, error_code=None)[source]

Bases: BaseSparclException

exception sparcl.exceptions.TooManyRecords(error_message, error_code=None)[source]

Bases: BaseSparclException

Too many records asked for in RETRIEVE

exception sparcl.exceptions.TooManyRequests(error_message, error_code=None)[source]

Bases: BaseSparclException

exception sparcl.exceptions.UnkDr(error_message, error_code=None)[source]

Bases: BaseSparclException

The Data Release is not known or not supported.

exception sparcl.exceptions.UnknownField(error_message, error_code=None)[source]

Bases: BaseSparclException

Unknown field name for a record

exception sparcl.exceptions.UnknownServerError(error_message, error_code=None)[source]

Bases: BaseSparclException

Client got a status response from the SPARCL Server that we do not know how to decode.

exception sparcl.exceptions.UnknownSparcl(error_message, error_code=None)[source]

Bases: BaseSparclException

Unknown SPARCL error. If this is ever raised (seen in a log), create and use a new BaseSparclException exception that is more specific.

sparcl.exceptions.genSparclException(response, verbose=False)[source]

Given status from Server response.json(), which is a dict, generate a native SPARCL exception suitable for Science programs.
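Since every exception in this module derives from BaseSparclException and offers to_dict(), calling code can summarize failures in one place. The sketch below uses duck typing so that it runs without sparcl installed. describe_failure, try_retrieve, the fallback key "error_message", and the stub classes are all assumptions made for illustration, and the dictionary the real to_dict() returns may look different.

```python
def describe_failure(exc):
    """Summarize an exception, using to_dict() when the exception provides it."""
    to_dict = getattr(exc, "to_dict", None)
    if callable(to_dict):
        return to_dict()
    # Fallback for exceptions without to_dict(); the key name is an assumption.
    return {"error_message": str(exc)}


def try_retrieve(client, ids):
    """Return (result, None) on success or (None, failure_summary) on error."""
    try:
        return client.retrieve(uuid_list=ids), None
    except Exception as exc:  # with sparcl installed, catch BaseSparclException instead
        return None, describe_failure(exc)


class _StubSparclError(Exception):
    """Stand-in for a SPARCL exception; the to_dict() layout is invented."""

    def to_dict(self):
        return {"kind": "stub", "detail": str(self)}


class _StubClient:
    """Always fails, to exercise the error path; not part of sparcl."""

    def retrieve(self, uuid_list, **kwargs):
        raise _StubSparclError("too many records")


result, failure = try_retrieve(_StubClient(), ["id-1"])
```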

sparcl.Results module

Containers for results from SPARCL Server. These include results of client.retrieve() and client.find().

class sparcl.Results.Found(dict_list, client=None)[source]

Bases: Results

Holds metadata records (and header).

append(item)

S.append(value) – append value to the end of the sequence

clear()

Delete the contents of this collection.

property count

Number of records in this collection.

extend(other)

S.extend(iterable) – extend sequence by appending elements from the iterable

property ids

List of unique identifiers of matched records.

index(value[, start[, stop]]) integer -- return first index of value.

Raises ValueError if the value is not present.

Supporting start and stop arguments is optional, but recommended.

property info

Info about this collection. e.g. Warnings, parameters used to get the collection, etc.

insert(i, item)

S.insert(index, value) – insert value before index

pop([index]) item -- remove and return item at index (default last).

Raise IndexError if list is empty or index is out of range.

property records

Records in this collection. Each record is a dictionary.

remove(item)

S.remove(value) – remove first occurrence of value. Raise ValueError if the value is not present.

reorder(ids_og)

Reorder the retrieved records to be in the same order as the original IDs passed to client.retrieve().

Parameters:

ids_og (list) – List of sparcl_ids or specIDs.

Returns:

Contains header and reordered records.

Return type:

reordered (Retrieved)

reverse()

S.reverse() – reverse IN PLACE

to_pandas()

Convert results to a pandas DataFrame object.

Returns:

a pandas DataFrame object.

Return type:

to_pandas (DataFrame)

to_specutils()

Convert results to a specutils object.

Returns:

a specutils object.

Return type:

to_specutils (Spectrum)

unit_for(fieldname, data_release=None)

Look up the unit string for a specific field in a SPARCL header.

Searches the UNITS block of a SPARCL Found or Retrieved header object and returns the unit string(s) associated with the requested field, optionally filtered to a single data release.

Parameters:
  • fieldname (str) – Science field name (e.g. ‘flux’, ‘wavelength’).

  • data_release (str, optional) – Data release str (e.g. ‘BOSS-DR17’). Defaults to None, which will return units for all available data releases.

Returns:

The unit string for the requested fieldname.

Returned when data_release is provided, or when all data releases share the same unit. Returns None if the field is dimensionless or categorical (e.g. 'spectype').

dict of {str: str or None}: A dict mapping each data release name to its unit string (or None) for the requested fieldname. Returned only when data_release is None and units differ across data releases.

Return type:

str or None

Examples

>>> results.unit_for('flux')
'1e-17 erg cm-2 s-1 AA-1'
>>> results.unit_for('wave_sigma')
{'SDSS-DR17': 'pixel', 'DESI-DR1': 'AA'}
>>> results.unit_for('dec', data_release='DESI-DR1')
'deg'
class sparcl.Results.Results(dict_list, client=None)[source]

Bases: UserList

append(item)

S.append(value) – append value to the end of the sequence

clear()[source]

Delete the contents of this collection.

property count

Number of records in this collection.

extend(other)

S.extend(iterable) – extend sequence by appending elements from the iterable

index(value[, start[, stop]]) integer -- return first index of value.

Raises ValueError if the value is not present.

Supporting start and stop arguments is optional, but recommended.

property info

Info about this collection. e.g. Warnings, parameters used to get the collection, etc.

insert(i, item)

S.insert(index, value) – insert value before index

pop([index]) item -- remove and return item at index (default last).

Raise IndexError if list is empty or index is out of range.

property records

Records in this collection. Each record is a dictionary.

remove(item)

S.remove(value) – remove first occurrence of value. Raise ValueError if the value is not present.

reorder(ids_og)[source]

Reorder the retrieved records to be in the same order as the original IDs passed to client.retrieve().

Parameters:

ids_og (list) – List of sparcl_ids or specIDs.

Returns:

Contains header and reordered records.

Return type:

reordered (Retrieved)
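The idea behind reorder() can be shown on plain dictionaries. This is a sketch of the concept, not the library's implementation: reorder_records is a hypothetical function, it skips ids that have no matching record, and it raises ValueError for an empty id list where the library raises NoIDs.

```python
def reorder_records(records, ids_og, key="sparcl_id"):
    """Return records arranged in the order given by ids_og."""
    if not ids_og:
        raise ValueError("ids_og is empty")  # the library raises NoIDs here
    by_id = {rec[key]: rec for rec in records}
    # Assumption: ids without a matching record are skipped.
    return [by_id[i] for i in ids_og if i in by_id]


records = [{"sparcl_id": "b"}, {"sparcl_id": "a"}]
ordered = reorder_records(records, ["a", "b", "c"])
print(ordered)  # [{'sparcl_id': 'a'}, {'sparcl_id': 'b'}]
```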

reverse()

S.reverse() – reverse IN PLACE

to_pandas()[source]

Convert results to a pandas DataFrame object.

Returns:

a pandas DataFrame object.

Return type:

to_pandas (DataFrame)

to_specutils()[source]

Convert results to a specutils object.

Returns:

a specutils object.

Return type:

to_specutils (Spectrum)

unit_for(fieldname, data_release=None)[source]

Look up the unit string for a specific field in a SPARCL header.

Searches the UNITS block of a SPARCL Found or Retrieved header object and returns the unit string(s) associated with the requested field, optionally filtered to a single data release.

Parameters:
  • fieldname (str) – Science field name (e.g. ‘flux’, ‘wavelength’).

  • data_release (str, optional) – Data release str (e.g. ‘BOSS-DR17’). Defaults to None, which will return units for all available data releases.

Returns:

The unit string for the requested fieldname.

Returned when data_release is provided, or when all data releases share the same unit. Returns None if the field is dimensionless or categorical (e.g. 'spectype').

dict of {str: str or None}: A dict mapping each data release name to its unit string (or None) for the requested fieldname. Returned only when data_release is None and units differ across data releases.

Return type:

str or None

Examples

>>> results.unit_for('flux')
'1e-17 erg cm-2 s-1 AA-1'
>>> results.unit_for('wave_sigma')
{'SDSS-DR17': 'pixel', 'DESI-DR1': 'AA'}
>>> results.unit_for('dec', data_release='DESI-DR1')
'deg'
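The return rules described above (a single string when all data releases agree, a dict when they differ) can be sketched on a simplified units table. The nested-dict layout of UNITS and the function name resolve_unit are assumptions made for illustration; the real header structure is not documented here. The unit strings are taken from the examples above.

```python
# Assumed layout: {data release name: {field name: unit string}}.
UNITS = {
    "SDSS-DR17": {"flux": "1e-17 erg cm-2 s-1 AA-1", "wave_sigma": "pixel", "dec": "deg"},
    "DESI-DR1": {"flux": "1e-17 erg cm-2 s-1 AA-1", "wave_sigma": "AA", "dec": "deg"},
}


def resolve_unit(units_by_dr, fieldname, data_release=None):
    """Return one unit string, None, or a per-release dict when units differ."""
    if data_release is not None:
        return units_by_dr[data_release].get(fieldname)
    per_dr = {dr: units.get(fieldname) for dr, units in units_by_dr.items()}
    distinct = set(per_dr.values())
    if len(distinct) == 1:
        return distinct.pop()  # all releases agree (possibly on None)
    return per_dr


print(resolve_unit(UNITS, "flux"))        # 1e-17 erg cm-2 s-1 AA-1
print(resolve_unit(UNITS, "wave_sigma"))  # {'SDSS-DR17': 'pixel', 'DESI-DR1': 'AA'}
print(resolve_unit(UNITS, "dec", data_release="DESI-DR1"))  # deg
```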
class sparcl.Results.Retrieved(dict_list, client=None)[source]

Bases: Results

Holds spectra records (and header).

append(item)

S.append(value) – append value to the end of the sequence

clear()

Delete the contents of this collection.

property count

Number of records in this collection.

extend(other)

S.extend(iterable) – extend sequence by appending elements from the iterable

index(value[, start[, stop]]) integer -- return first index of value.

Raises ValueError if the value is not present.

Supporting start and stop arguments is optional, but recommended.

property info

Info about this collection. e.g. Warnings, parameters used to get the collection, etc.

insert(i, item)

S.insert(index, value) – insert value before index

pop([index]) item -- remove and return item at index (default last).

Raise IndexError if list is empty or index is out of range.

property records

Records in this collection. Each record is a dictionary.

remove(item)

S.remove(value) – remove first occurrence of value. Raise ValueError if the value is not present.

reorder(ids_og)

Reorder the retrieved records to be in the same order as the original IDs passed to client.retrieve().

Parameters:

ids_og (list) – List of sparcl_ids or specIDs.

Returns:

Contains header and reordered records.

Return type:

reordered (Retrieved)

reverse()

S.reverse() – reverse IN PLACE

to_pandas()

Convert results to a pandas DataFrame object.

Returns:

a pandas DataFrame object.

Return type:

to_pandas (DataFrame)

to_specutils()

Convert results to a specutils object.

Returns:

a specutils object.

Return type:

to_specutils (Spectrum)

unit_for(fieldname, data_release=None)

Look up the unit string for a specific field in a SPARCL header.

Searches the UNITS block of a SPARCL Found or Retrieved header object and returns the unit string(s) associated with the requested field, optionally filtered to a single data release.

Parameters:
  • fieldname (str) – Science field name (e.g. ‘flux’, ‘wavelength’).

  • data_release (str, optional) – Data release str (e.g. ‘BOSS-DR17’). Defaults to None, which will return units for all available data releases.

Returns:

The unit string for the requested fieldname.

Returned when data_release is provided, or when all data releases share the same unit. Returns None if the field is dimensionless or categorical (e.g. 'spectype').

dict of {str: str or None}: A dict mapping each data release name to its unit string (or None) for the requested fieldname. Returned only when data_release is None and units differ across data releases.

Return type:

str or None

Examples

>>> results.unit_for('flux')
'1e-17 erg cm-2 s-1 AA-1'
>>> results.unit_for('wave_sigma')
{'SDSS-DR17': 'pixel', 'DESI-DR1': 'AA'}
>>> results.unit_for('dec', data_release='DESI-DR1')
'deg'

sparcl.specutils module

Functions for converting SPARCL results to specutils objects.

sparcl.specutils.to_Spectrum(results, *, collection=False, flux_unit=None, wave_unit=None)[source]

Convert results to specutils.Spectrum.

Parameters:
  • results (sparcl.Results.Retrieved) – Retrieved results, or a single record from a set of results.

  • collection (bool, optional) – If True, attempt to convert to a SpectrumCollection instead.

  • flux_unit (str, optional) – Unit string for flux. If None, resolved via _get_units().

  • wave_unit (str, optional) – Unit string for wavelength. If None, resolved via _get_units().

Returns:

The requested object.

Return type:

Spectrum or SpectrumCollection

Raises:

ValueError – If results can’t be converted to a Spectrum object in a valid way. For example, if some of the spectra have a different wavelength solution.

sparcl.specutils.to_SpectrumList(results, *, flux_unit=None, wave_unit=None)[source]

Convert results to specutils.SpectrumList.

Parameters:
  • results (sparcl.Results.Retrieved) – Retrieved results.

  • flux_unit (str, optional) – Unit string for flux. If None, resolved via _get_units().

  • wave_unit (str, optional) – Unit string for wavelength. If None, resolved via _get_units().

Returns:

The requested object.

Return type:

SpectrumList

sparcl.specutils.to_specutils(results)[source]

Convert results to a specutils object.

Parameters:

results (sparcl.Results.Retrieved) – Retrieved results.

Returns:

A Spectrum, SpectrumCollection, or SpectrumList, depending on the data.

Raises:

ValueError – If no valid conversion can be performed, or if unit information is missing from the header.