A time series database is a software system optimized for storing and serving time series as associated pairs of time(s) and value(s).[1] In some fields, time series may be called profiles, curves, traces or trends.[2] Several early time series databases, also referred to as data historians, were associated with industrial applications in which they efficiently stored measured values from sensor equipment; time series databases are now used in support of a much wider range of applications. In many cases, repositories of time-series data use compression algorithms to manage the data efficiently.[3][4] Although time-series data can be stored in many different database types, systems designed with time as a key index differ distinctly from relational databases, which reduce discrete relationships through referential models.[5]
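As a rough illustration of what "time as a key index" can mean, the following minimal Python sketch keeps one series sorted by timestamp so that a time-range query becomes a binary search plus a slice. The `TimeSeries` class and its method names are hypothetical, chosen only for this example; they do not correspond to the API of any system listed in this article, and real databases use on-disk structures that are considerably more elaborate.

```python
import bisect


class TimeSeries:
    """Hypothetical in-memory series kept sorted by timestamp."""

    def __init__(self):
        self._times = []   # sorted timestamps (e.g. seconds since epoch)
        self._values = []  # values aligned with self._times

    def insert(self, timestamp, value):
        # Find the insertion point so the timestamp list stays sorted,
        # even when points arrive out of order.
        i = bisect.bisect_right(self._times, timestamp)
        self._times.insert(i, timestamp)
        self._values.insert(i, value)

    def range(self, start, end):
        """Return (time, value) pairs with start <= time < end."""
        lo = bisect.bisect_left(self._times, start)
        hi = bisect.bisect_left(self._times, end)
        return list(zip(self._times[lo:hi], self._values[lo:hi]))


series = TimeSeries()
series.insert(30, 3.0)
series.insert(10, 1.0)
series.insert(20, 2.0)
window = series.range(10, 30)  # points at t=10 and t=20
```

Because the timestamps are ordered, locating the boundaries of the query window takes logarithmic time regardless of how many points fall outside it.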
Time series datasets are relatively large and uniform compared to other datasets, usually being composed of a timestamp and associated data.[6] Time series datasets can also have fewer relationships between data entries in different tables and do not require indefinite storage of entries.[6] The unique properties of time series datasets mean that time series databases can provide significant improvements in storage space and performance over general purpose databases.[6] For instance, due to the uniformity of time series data, specialized compression algorithms can provide improvements over regular compression algorithms designed to work on less uniform data.[6] Time series databases can also be configured to regularly delete (or downsample) old data, unlike regular databases which are designed to store data indefinitely.[6] Special database indices can also provide boosts in query performance.[6]
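The ideas in this paragraph can be sketched in a few lines of Python. The example below is illustrative only: it uses simple delta encoding as a stand-in for the specialized compression the text mentions (regularly spaced timestamps collapse into small, repetitive numbers), averages values into fixed-width buckets as one possible form of downsampling, and drops points older than a cutoff as a retention policy. All function names and parameters are assumptions made for this sketch, not the method of any particular database.

```python
from statistics import mean


def delta_encode(timestamps):
    """Store the first timestamp, then the differences between neighbours."""
    if not timestamps:
        return []
    return [timestamps[0]] + [b - a for a, b in zip(timestamps, timestamps[1:])]


def delta_decode(deltas):
    """Rebuild the original timestamps by accumulating the differences."""
    out = []
    total = 0
    for d in deltas:
        total += d
        out.append(total)
    return out


def downsample(points, bucket_seconds):
    """Average (time, value) points into fixed-width time buckets."""
    buckets = {}
    for t, v in points:
        start = t - t % bucket_seconds  # start of the bucket containing t
        buckets.setdefault(start, []).append(v)
    return [(start, mean(vals)) for start, vals in sorted(buckets.items())]


def apply_retention(points, now, max_age_seconds):
    """Keep only points no older than max_age_seconds relative to now."""
    cutoff = now - max_age_seconds
    return [(t, v) for t, v in points if t >= cutoff]


encoded = delta_encode([1000, 1010, 1020, 1030])  # [1000, 10, 10, 10]
points = [(0, 1.0), (30, 3.0), (60, 5.0), (90, 7.0)]
coarse = downsample(points, 60)                   # one averaged point per minute
recent = apply_retention(points, now=100, max_age_seconds=50)
```

The repeated small deltas are what make uniform data compress well; a general-purpose compressor working on raw timestamps has no such structure to exploit directly.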
The following database systems have functionality optimized for handling time series data.
Name | License | Language | References |
---|---|---|---|
Amazon Timestream for LiveAnalytics | Commercial | Java | [7] |
Apache IoTDB | Apache License 2.0 | Java | [8] |
Apache Kudu | Apache License 2.0 | C++ | [9] |
Apache Pinot | Apache License 2.0 | Java | [10] |
CrateDB | Apache License 2.0 | Java | [11][12] |
eXtremeDB | Commercial | SQL, Python, C / C++, Java, and C# | [13] |
GreptimeDB | Apache License 2.0 | Rust | [14][15] |
InfluxDB | MIT.[16] Chronograf AGPLv3, Clustering Commercial[17] | Go (version 2), Rust (version 3)[18] | [13][19] |
Informix TimeSeries | Commercial | C / C++ | [13][20] |
Kx kdb+ | Commercial | Q | [13] |
MongoDB | Server Side Public License | C++, JavaScript, Python | [21] |
Prometheus | Apache License 2.0 | Go | [13] |
RedisTimeSeries | RSALv2/SSPLv1[22] | C | [23] |
Riak-TS | Apache License 2.0 | Erlang | [13] |
RRDtool | GPLv2 | C | [13] |
TimescaleDB | Apache License 2.0 | C | [24] |
VictoriaMetrics | Apache License 2.0 | Go | [13] |
Whisper (Graphite) | Apache License 2.0 | Python | [25] |
Definition 2: A Time Series Database (D) is an unordered set of m time series, possibly of different lengths.
Relational databases and NoSQL databases can be used for time series data, but developers will arguably get better performance from purpose-built time series databases than from applying a one-size-fits-all database to specific workloads.