CPS-192: Design data store for Temporal Service
Data Model
Data model for CPS Temporal is limited to one entity to store configuration and state data with its timestamp.
The store keeps a new data instance each time a new data observation is made for a given dataspace, schema set and anchor.
Timestamp is used to partition time-series data.
Implementation
PostgreSQL database is proposed to implement CPS Temporal data store.
Timescale PostgreSQL extension is proposed for time-series data implementation.
By implementing PostgreSQL and its Timescale extension, CPS Temporal benefits from:
Time-series data features on top of standard well know Postgres database
All PostgreSQL relational database features
CPS previous experience with PostgreSQL (default reference implementation for CPS Core)
PostgreSQL JSONB data type for configuration and state data
Detailed schema information:
cpstemporaldb=# \d network_data
Table "public.network_data"
Column | Type | Collation | Nullable | Default
------------+--------------------------+-----------+----------+---------
timestamp | timestamp with time zone | | not null |
dataspace | character varying(255) | | not null |
anchor | character varying(255) | | not null |
schema_set | character varying(255) | | not null |
payload | jsonb | | not null |
version | timestamp with time zone | | not null |
Indexes:
"network_data_dataspace_anchor_timestamp_idx" UNIQUE, btree (dataspace, anchor, "timestamp" DESC)
"network_data_dataspace_schema_set_timestamp_idx" btree (dataspace, schema_set, "timestamp" DESC)
"network_data_payload_idx" gin (payload)
"network_data_timestamp_idx" btree ("timestamp" DESC)
cpstemporaldb=# SELECT * FROM timescaledb_information.dimensions;
hypertable_schema | hypertable_name | dimension_number | column_name | column_type | dimension_type | time_interval | integer_interval | integer_now_func | num_partitions
-------------------+-----------------+------------------+-------------+--------------------------+----------------+---------------+------------------+------------------+----------------
public | network_data | 1 | timestamp | timestamp with time zone | Time | 1 day | | |
(1 row)
Indexes above are designed to optimize following queries:
By dataspace, anchor. Order by timestamp desc.
By dataspace, schema set. Order by timestamp desc.
Adding a payload content condition to the 2 kinds of queries above. In this case, the gin index is used in combination with the index having the corresponding fields. See attached explain plan for an example.
The index for (timestamp) is automatically created by Timescale to support the main data partitioning by time.