CSDM

class csdmpy.csdm.CSDM(filename='', version=None, description='', **kwargs)[source]

Bases: object

Create an instance of a CSDM class.

This class is based on the root CSDM object of the core scientific dataset (CSD) model. The class is a composition of the DependentVariable and Dimension instances, where an instance of the DependentVariable class describes a \(p\)-component dependent variable, and an instance of the Dimension class describes a dimension of a \(d\)-dimensional space. Additional attributes of this class are listed below.

Attributes Summary

version

Version number of the CSD model on file.

description

Description of the dataset.

read_only

If True, the data file is serialized as read only; otherwise, False.

timestamp

Timestamp from when the file was last serialized.

geographic_coordinate

Geographic coordinate, if present, from where the file was last serialized.

dimensions

Tuple of Dimension instances.

dependent_variables

Tuple of DependentVariable instances.

tags

List of tags attached to the dataset.

application

Application metadata dictionary of the CSDM object.

data_structure

JSON-serialized string describing the CSDM class instance.

filename

Local file address of the current file.

Methods Summary

add_dimension(*args, **kwargs)

Add a new Dimension instance to the CSDM instance.

add_dependent_variable(*args, **kwargs)

Add a new DependentVariable instance to the CSDM instance.

to_dict([update_timestamp, read_only, version])

Serialize the CSDM instance as a python dictionary.

dumps([update_timestamp, read_only, version])

Serialize the CSDM instance as a JSON data-exchange string.

save([filename, read_only, version, …])

Serialize the CSDM instance as a JSON data-exchange file.

copy()

Create a copy of the current CSDM instance.

Attributes Documentation

version

Version number of the CSD model on file.

description

Description of the dataset.

The default value is an empty string, ‘’.

Example

>>> print(data.description)
A simulated sine curve.

Returns

A string of UTF-8 allowed characters describing the dataset.

Raises

TypeError – When the assigned value is not a string.

read_only

If True, the data file is serialized as read only; otherwise, False.

By default, the CSDM object loads a copy of the .csdf(e) file, irrespective of the value of the read_only attribute. The value of this attribute may be toggled at any time after the file import. When serializing the .csdf(e) file, if the value of the read_only attribute is found True, the file will be serialized as read only.

timestamp

Timestamp from when the file was last serialized.

The timestamp is a string representation of the Coordinated Universal Time (UTC), formatted according to the ISO 8601 standard.

Raises

AttributeError – When the attribute is modified.
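
As an illustration of the format, the sketch below uses only the Python standard library, not csdmpy, to parse a timestamp of this form and to format one again. The format string and the helper name utc_timestamp are assumptions made for illustration; csdmpy may format the value differently, for example with fractional seconds.

```python
from datetime import datetime, timezone

# Assumed layout: ISO 8601 with second precision and a trailing "Z" for UTC.
ISO_8601_UTC = "%Y-%m-%dT%H:%M:%SZ"


def utc_timestamp(moment=None):
    """Format a datetime (default: now) as an ISO 8601 UTC string."""
    if moment is None:
        moment = datetime.now(timezone.utc)
    return moment.astimezone(timezone.utc).strftime(ISO_8601_UTC)


# Parse the timestamp string that appears in the to_dict() example.
parsed = datetime.strptime("1994-11-05T13:15:30Z", ISO_8601_UTC).replace(
    tzinfo=timezone.utc
)
print(utc_timestamp(parsed))  # 1994-11-05T13:15:30Z
```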

geographic_coordinate

Geographic coordinate, if present, from where the file was last serialized.

The geographic coordinates correspond to the location where the file was last serialized. If present, the geographic coordinates are described with three attributes, the required latitude and longitude, and an optional altitude.

Raises

AttributeError – When the attribute is modified.

dimensions

Tuple of Dimension instances.

dependent_variables

Tuple of DependentVariable instances.

tags

List of tags attached to the dataset.

application

Application metadata dictionary of the CSDM object.

>>> print(data.application)
{}

By default, the application attribute is an empty dictionary, that is, the application metadata stored by the previous application is ignored upon file import.

The application metadata may, however, be retained with a request via the load() method. This feature may be useful to related applications where application metadata might contain additional information. The attribute may be updated with a python dictionary.

An application can store its own metadata in the application attribute as a python dictionary, using a reverse domain name notation string as the key, for example,

Example

>>> data.application = {
...     "com.example.myApp" : {
...         "myApp_key": "myApp_metadata"
...      }
... }
>>> print(data.application)
{'com.example.myApp': {'myApp_key': 'myApp_metadata'}}

Returns

Python dictionary object with the application metadata.
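
The reverse domain name keys keep the metadata of different applications apart. The sketch below shows this with plain dictionaries; the key org.example.otherApp and its contents are hypothetical and stand in for metadata retained from an earlier application.

```python
# Metadata written by an earlier (hypothetical) application.
application = {"org.example.otherApp": {"window": "gaussian"}}

# A second application adds its entry under its own reverse domain key.
# Building a new dictionary leaves the earlier entry untouched.
application = {
    **application,
    "com.example.myApp": {"myApp_key": "myApp_metadata"},
}

for key in sorted(application):
    print(key, application[key])
```

With a CSDM instance, a merged dictionary of this kind would be assigned back to the application attribute.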

data_structure

JSON-serialized string describing the CSDM class instance.

The data_structure attribute is intended only for a quick preview of the dataset. The JSON-serialized string from this attribute avoids displaying large datasets. Do not use the value of this attribute to save the data to a file; instead, use the save() method of the instance.

Raises

AttributeError – When modified.

filename

Local file address of the current file.

Methods Documentation

add_dimension(*args, **kwargs)[source]

Add a new Dimension instance to the CSDM instance.

There are three ways to add a new independent variable.

From a python dictionary containing valid keywords.

>>> import csdmpy as cp
>>> datamodel = cp.new()
>>> py_dictionary = {
...     'type': 'linear',
...     'increment': '5 G',
...     'count': 50,
...     'coordinates_offset': '-10 mT'
... }
>>> datamodel.add_dimension(py_dictionary)

From a list of valid keyword arguments.

>>> datamodel.add_dimension(
...     type = 'linear',
...     increment = '5 G',
...     count = 50,
...     coordinates_offset = '-10 mT'
... )

From a Dimension instance.

>>> from csdmpy import Dimension
>>> datamodel = cp.new()
>>> var1 = Dimension(type = 'linear',
...                  increment = '5 G',
...                  count = 50,
...                  coordinates_offset = '-10 mT')
>>> datamodel.add_dimension(var1)
>>> print(datamodel.data_structure)
{
  "csdm": {
    "version": "1.0",
    "dimensions": [
      {
        "type": "linear",
        "count": 50,
        "increment": "5.0 G",
        "coordinates_offset": "-10.0 mT",
        "quantity_name": "magnetic flux density"
      }
    ],
    "dependent_variables": []
  }
}
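
The example mixes gauss and millitesla. As a worked check, the sketch below computes the coordinates such a dimension spans, assuming the CSD model rule for a linear dimension (coordinate k equals coordinates_offset + k × increment, for k from 0 to count − 1) and the conversion 1 mT = 10 G. It uses plain Python rather than csdmpy, and the variable names are illustrative.

```python
GAUSS_PER_MILLITESLA = 10.0  # 1 T = 10^4 G, so 1 mT = 10 G

count = 50
increment_gauss = 5.0                        # '5 G'
offset_gauss = -10.0 * GAUSS_PER_MILLITESLA  # '-10 mT' expressed in gauss

# Assumed linear-dimension rule: offset + k * increment for k = 0 .. count - 1.
coordinates = [offset_gauss + k * increment_gauss for k in range(count)]

print(coordinates[0], coordinates[-1])  # -100.0 145.0
```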

For the last method, the instance var1 is added to the datamodel as a reference, i.e., if the instance var1 is destroyed, the datamodel instance will become corrupt. As a recommendation, always pass a copy of the Dimension instance to the add_dimension() method. We provide the latter alternative because it is useful for copying a Dimension instance from one CSDM instance to another.
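
The difference between storing a reference and storing a copy can be sketched with a stand-in class. FakeDimension and the two lists below are hypothetical and only mimic the aliasing behaviour described above; they are not part of csdmpy. The copy is made here with the standard-library copy.deepcopy.

```python
import copy


class FakeDimension:
    """Hypothetical stand-in for a Dimension; not part of csdmpy."""

    def __init__(self, count):
        self.count = count


var1 = FakeDimension(count=50)

by_reference = [var1]            # stores the same object
by_copy = [copy.deepcopy(var1)]  # stores an independent copy

var1.count = 10  # a later change to the original instance

print(by_reference[0].count)  # 10, the stored object changed too
print(by_copy[0].count)       # 50, the copy is unaffected
```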

add_dependent_variable(*args, **kwargs)[source]

Add a new DependentVariable instance to the CSDM instance.

There are again three ways to add a new dependent variable instance.

From a python dictionary containing valid keywords.

>>> import numpy as np
>>> import csdmpy as cp
>>> datamodel = cp.new()
>>> numpy_array = (100*np.random.rand(3,50)).astype(np.uint8)
>>> py_dictionary = {
...     'type': 'internal',
...     'components': numpy_array,
...     'name': 'star',
...     'unit': 'W s',
...     'quantity_name': 'energy',
...     'quantity_type': 'pixel_3'
... }
>>> datamodel.add_dependent_variable(py_dictionary)

From a list of valid keyword arguments.

>>> datamodel.add_dependent_variable(type='internal',
...                                  name='star',
...                                  unit='W s',
...                                  quantity_type='pixel_3',
...                                  components=numpy_array)

From a DependentVariable instance.

>>> from csdmpy import DependentVariable
>>> var1 = DependentVariable(type='internal',
...                          name='star',
...                          unit='W s',
...                          quantity_type='pixel_3',
...                          components=numpy_array)
>>> datamodel.add_dependent_variable(var1)

When passing a DependentVariable instance, as a general recommendation, always pass a copy of the instance to the add_dependent_variable() method. We provide the latter alternative because it is useful for copying a DependentVariable instance from one CSDM instance to another.

to_dict(update_timestamp=False, read_only=False, version='1.0')[source]

Serialize the CSDM instance as a python dictionary.

Parameters
  • update_timestamp (bool) – If True, timestamp is updated to current time.

  • read_only (bool) – If True, the read_only flag is set to True.

  • version (str) – Serialize the dict with the given csdm version.

Example

>>> data.to_dict()
{'csdm': {'version': '1.0', 'timestamp': '1994-11-05T13:15:30Z', 'geographic_coordinate': {'latitude': '10 deg', 'longitude': '93.2 deg', 'altitude': '10 m'}, 'description': 'A simulated sine curve.', 'dimensions': [{'type': 'linear', 'description': 'A temporal dimension.', 'count': 10, 'increment': '0.1 s', 'quantity_name': 'time', 'label': 'time', 'reciprocal': {'quantity_name': 'frequency'}}], 'dependent_variables': [{'type': 'internal', 'description': 'A response dependent variable.', 'name': 'sine curve', 'encoding': 'base64', 'numeric_type': 'float32', 'quantity_type': 'scalar', 'component_labels': ['response'], 'components': ['AAAAABh5Fj9xeHM/cXhzPxh5Fj8yMQ0lGHkWv3F4c79xeHO/GHkWvw==']}]}}
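
The components entry in this output is a base64 string with numeric_type float32. As a sketch of what that encoding holds, the snippet below decodes the string with the standard library only. The little-endian byte order is an assumption made here, and the variable names are illustrative.

```python
import base64
import struct

encoded = "AAAAABh5Fj9xeHM/cXhzPxh5Fj8yMQ0lGHkWv3F4c79xeHO/GHkWvw=="

raw = base64.b64decode(encoded)
count = len(raw) // 4  # float32 values are 4 bytes each

# Assumption: the bytes are little-endian ("<").
values = struct.unpack(f"<{count}f", raw)

print(count)  # 10, matching the 'count' of the dimension
print([round(v, 4) for v in values[:3]])
```

The first decoded values are 0, about 0.588, and about 0.951, which is consistent with a sampled sine curve.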

dumps(update_timestamp=False, read_only=False, version='1.0')[source]

Serialize the CSDM instance as a JSON data-exchange string.

Parameters
  • update_timestamp (bool) – If True, timestamp is updated to current time.

  • read_only (bool) – If true, the file is serialized as read_only.

  • version (str) – The file is serialized with the given CSD model version.

Example

>>> data.dumps()

save(filename='', read_only=False, version='1.0', output_device=None)[source]

Serialize the CSDM instance as a JSON data-exchange file.

There are two types of file serialization extensions, .csdf and .csdfe. In the CSD model, when every instance of the DependentVariable objects from a CSDM class has an internal subtype, the corresponding CSDM instance is serialized with a .csdf file extension. If any single DependentVariable instance has an external subtype, the CSDM instance is serialized with a .csdfe file extension. The two different file extensions alert the end-user to the possible deserialization error associated with .csdfe files should the external data file become inaccessible.

In csdmpy, however, irrespective of the dependent variable subtypes from the serialized JSON file, by default, all instances of the DependentVariable class are treated as internal after import. Therefore, when serialized, the CSDM object should be stored as a .csdf file.

To store a file as a .csdfe file, the user must set the value of the encoding attribute of the dependent variables to raw. In that case, a binary file named filename_i.dat will be generated, where \(i\) is the \(i^\text{th}\) dependent variable. The parameter filename is an argument of this method.

Note

Only dependent variables with encoding="raw" will be serialized to a binary file.
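
The extension rule of the CSD model described above can be sketched as a small function. The helper csd_extension is hypothetical and not part of csdmpy; it only restates the rule that a single external subtype is enough to require the .csdfe extension.

```python
def csd_extension(subtypes):
    """Return the file extension the CSD model prescribes.

    Hypothetical helper, not part of csdmpy. `subtypes` is an iterable of
    dependent-variable subtypes, each either "internal" or "external".
    """
    if any(subtype == "external" for subtype in subtypes):
        return ".csdfe"
    return ".csdf"


print(csd_extension(["internal", "internal"]))  # .csdf
print(csd_extension(["internal", "external"]))  # .csdfe
```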

Parameters
  • filename (str) – The filename of the serialized file.

  • read_only (bool) – If true, the file is serialized as read_only.

  • version (str) – The file is serialized with the given CSD model version.

  • output_device (object) – Object where the data is written. If provided, the argument filename becomes irrelevant.

Example

>>> data.save('my_file.csdf')

copy()[source]

Create a copy of the current CSDM instance.

Example

>>> data.copy()

Returns

A CSDM instance.