Labeled Dataset
The CSD model also supports labeled dimensions. In the following example, we present a two-dimensional dataset with one linear and one labeled dimension, representing the population of countries as a function of time. The dataset is obtained from The World Bank.
Import the csdmpy module and load the dataset.
import csdmpy as cp
filename = "https://www.ssnmr.org/sites/default/files/CSDM/labeled/population.csdf"
labeled_data = cp.load(filename)
The tuples of dimension and dependent variable objects from the labeled_data instance are
x = labeled_data.dimensions
y = labeled_data.dependent_variables
Since one of the dimensions is a labeled dimension, let's use the type attribute of the dimension instances to find out which dimension is labeled.
print(x[0].type)
linear
print(x[1].type)
labeled
Here, the second dimension is the labeled dimension. The number of labels along it is
print(x[1].count)
263
and the first five labels are
print(x[1].labels[:5])
['Aruba' 'Afghanistan' 'Angola' 'Albania' 'Andorra']
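The type check above extends to datasets with any number of dimensions. The following is a minimal sketch, not part of csdmpy: `labeled_indices` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins that mimic only the `type` attribute, so the snippet runs without downloading the file.

```python
from types import SimpleNamespace


def labeled_indices(dimensions):
    """Return the positions of all dimensions whose type is 'labeled'."""
    return [i for i, dim in enumerate(dimensions) if dim.type == "labeled"]


# Stand-in objects (assumption): they carry only the `type` attribute.
dims = (SimpleNamespace(type="linear"), SimpleNamespace(type="labeled"))
print(labeled_indices(dims))
```

With the loaded dataset, passing `x` to the helper should report the same dimension that the two print statements identified by hand.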
Note
For labeled dimensions, the coordinates attribute is an alias of the labels attribute.
print(x[1].coordinates[:5])
['Aruba' 'Afghanistan' 'Angola' 'Albania' 'Andorra']
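Because the labels print as an array of strings, a label can be mapped back to its index along the dimension. The sketch below illustrates one way to do this with NumPy. `label_index` is a hypothetical helper, and the array is copied from the output above rather than read from the file.

```python
import numpy as np

# The first five labels printed above, copied into a plain NumPy array.
labels = np.array(["Aruba", "Afghanistan", "Angola", "Albania", "Andorra"])


def label_index(labels, name):
    """Return the position of `name` in `labels`; raise KeyError if absent."""
    matches = np.flatnonzero(labels == name)
    if matches.size == 0:
        raise KeyError(name)
    return int(matches[0])


print(label_index(labels, "Angola"))
```

The returned index can then be used to select the matching entry along the labeled dimension.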
The coordinates along the first dimension, viewed up to the first ten points, are
print(x[0].coordinates[:10])
[1960. 1961. 1962. 1963. 1964. 1965. 1966. 1967. 1968. 1969.] yr
Plotting the dataset
You may plot this dataset however you like. Here, we use a bar graph to
represent the population of countries in the year 2017. The data
corresponding to this year is a cross-section of the dependent variable
at index 57 along the x[0] dimension.
print(x[0].coordinates[57])
2017.0 yr
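Rather than hard-coding the index 57, the position of a given year can be computed from the coordinates. The sketch below assumes 58 yearly points starting at 1960, consistent with the output above. `coordinate_index` is a hypothetical helper, and the plain NumPy array stands in for the dimension's coordinates.

```python
import numpy as np

# Assumption: 58 yearly points starting at 1960, matching the printed output.
years = 1960.0 + np.arange(58)


def coordinate_index(coords, value):
    """Return the index of the coordinate closest to `value`."""
    return int(np.argmin(np.abs(np.asarray(coords) - value)))


index = coordinate_index(years, 2017)
print(index, years[index])
```

With the loaded dataset, the numeric values would first need to be separated from the `yr` unit. If the coordinates are unit-aware quantities, as the printed unit suggests, `x[0].coordinates.value` is one likely way to get them.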
To keep the plot simple, we only plot the first 20 country labels along
the x[1] dimension.
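import matplotlib.pyplot as plt
import numpy as np
# First 20 country labels and the bar positions they are drawn at.
x_data = x[1].coordinates[:20]
x_pos = np.arange(20)
# Population of those 20 countries in 2017 (index 57 along x[0]).
y_data = y[0].components[0][:20, 57]
plt.bar(x_pos, y_data, align="center", alpha=0.5)
plt.xticks(x_pos, x_data, rotation=90)
plt.ylabel(y[0].axis_label[0])
# Populations span several orders of magnitude, so use a log scale.
plt.yscale("log")
plt.title(y[0].name)
plt.tight_layout()
plt.show()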
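The slice `[:20, 57]` used for the plot suggests that the component array is indexed with the labeled dimension first and the year dimension last, the reverse of the order in `x`. The sketch below reproduces that indexing pattern on a toy array. The sizes and values are made up for illustration.

```python
import numpy as np

n_years, n_countries = 4, 3  # toy sizes (assumption), not the real dataset

# Rows index the labeled (country) dimension, columns index the year dimension.
population = np.arange(n_countries * n_years).reshape(n_countries, n_years)

# First two countries at the last year, analogous to [:20, 57] in the plot code.
cross_section = population[:2, n_years - 1]
print(cross_section)
```

Checking the shape of `y[0].components[0]` on the loaded dataset is a quick way to confirm the axis order before slicing.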
