Monday, 30 October 2023

Sustainability - Selecting the right database technology to optimize carbon footprint

I have written and presented a lot on time series data management in the past. Mostly, I focused on reducing storage, cost and development effort, and on improving performance. However, revisiting the topic from the angle of sustainability gives an interesting perspective.

Sustainability in IT is directly related to how efficiently you manage data: primarily, how efficiently you process data, store data and transfer data over the network. Selecting the right database technology can directly influence the first two.

In this blog, I would like to talk about handling time series data efficiently, which can help drastically reduce carbon footprint by reducing storage requirements by up to 70% and improving processing performance by 30 times.

Often, a time series database is misunderstood as a NoSQL database. Unfortunately, very few people know that time series is a specific technology that is all about how time series data can be stored and processed efficiently.

So let us understand this time series world and how selecting the right database technology can help drastically reduce carbon footprint.

Who generates time series data?

Traditionally, time series are used in statistics, signal processing, pattern recognition, econometrics, mathematical finance, weather forecasting, earthquake prediction, electroencephalography, control engineering, astronomy, communication engineering and, more broadly, in any domain of applied science and engineering that involves temporal measurements. With the emergence of the Internet of Things (IoT) and the proliferation of connected devices, more and more time series data is being generated by sensors. Hence, irrespective of the domain, time series data is generated almost everywhere: Capital Markets, Energy and Utility, Telecommunications, Manufacturing, Logistics, Scientific Research, Intelligent Transportation and many more.

What does it look like?

Time series data has an internal structure that differs from relational data. Many applications need to store data at frequent intervals, which requires massive storage capacity. For these reasons, the traditional relational approach of storing one row for each time series entry is not sufficient for managing time series data. It sharply increases storage and processing requirements, and with them the carbon footprint.
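To make the storage argument concrete, here is a minimal Python sketch that compares one row per reading with a packed layout for a regularly spaced series. The byte layouts, field widths and function names are assumptions made up for this illustration; they are not how Informix TimeSeries stores data internally. The point is only that the series identifier and the timestamps do not have to be repeated for every reading when the interval is fixed.

```python
import struct

# Hypothetical fixed-width encodings, chosen only for this illustration.
ROW_FORMAT = "<qqd"     # per reading: series id, timestamp (epoch seconds), value
HEADER_FORMAT = "<qqq"  # once per series: series id, start timestamp, interval (seconds)
VALUE_FORMAT = "<d"     # per reading: value only


def row_per_reading_bytes(series_id, start, interval, values):
    """Encode one (id, timestamp, value) row for every reading."""
    return b"".join(
        struct.pack(ROW_FORMAT, series_id, start + i * interval, v)
        for i, v in enumerate(values)
    )


def packed_series_bytes(series_id, start, interval, values):
    """Encode the id, start time and interval once, followed by the values only."""
    header = struct.pack(HEADER_FORMAT, series_id, start, interval)
    body = b"".join(struct.pack(VALUE_FORMAT, v) for v in values)
    return header + body


def decode_packed(blob):
    """Rebuild (id, timestamp, value) tuples; timestamps come from the position."""
    header_size = struct.calcsize(HEADER_FORMAT)
    series_id, start, interval = struct.unpack(HEADER_FORMAT, blob[:header_size])
    values = [v for (v,) in struct.iter_unpack(VALUE_FORMAT, blob[header_size:])]
    return [(series_id, start + i * interval, v) for i, v in enumerate(values)]


# One day of readings at 15-minute intervals for a single (made-up) meter.
readings = [float(i) for i in range(96)]
rows = row_per_reading_bytes(7, 1698624000, 900, readings)
packed = packed_series_bytes(7, 1698624000, 900, readings)
print(len(rows), len(packed))
```

Under these assumed field widths, the 96 readings take 2,304 bytes as individual rows and 792 bytes in the packed form. These figures describe only this toy layout and are not measurements of any database.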


Informix TimeSeries handles it efficiently:

IBM's Informix TimeSeries feature provides a solution to this problem with breakthrough technology. The Informix TimeSeries feature is a combination of a TimeSeries data type and a large set of built-in analytical functions. How it manages time series data can be understood from my decade-old article published in DBTA Magazine - Managing Time Series Data with Informix - Database Trends and Applications (dbta.com)

It can reduce the storage requirement by more than 50% and improve performance by orders of magnitude. With the integration of the TimeSeries feature with NoSQL/JSON and in-memory data warehouse capabilities, it can handle heterogeneous and unstructured time series data, and run real-time analytics at the speed of thought. The capability to store data at up to hertz frequency further extends its reach into different industries. The rolling window feature eases the challenge of periodically purging huge volumes of data. Moreover, because of the way TimeSeries is structured, one can keep inserting millions of records and performance will remain consistent without any database tuning, which drastically reduces processing requirements.
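The rolling window idea can be sketched in a few lines of Python. This is a conceptual illustration only: the class name, the per-day bucketing and the `max_days` parameter are my assumptions for the example and are not the Informix API. What it shows is that old data is dropped a whole bucket at a time as new data arrives, rather than being purged row by row in a separate job.

```python
from datetime import datetime, timedelta


class RollingWindowStore:
    """Keeps readings in per-day buckets and retains only the newest max_days buckets."""

    def __init__(self, max_days):
        if max_days < 1:
            raise ValueError("max_days must be at least 1")
        self.max_days = max_days
        self._buckets = {}  # date -> list of (timestamp, value)

    def insert(self, timestamp, value):
        """Add a reading, then drop whole buckets that fall out of the window."""
        day = timestamp.date()
        self._buckets.setdefault(day, []).append((timestamp, value))
        # A reading older than the window lands in the oldest bucket
        # and is discarded again by this loop.
        while len(self._buckets) > self.max_days:
            del self._buckets[min(self._buckets)]

    def days(self):
        """Return the dates currently retained, oldest first."""
        return sorted(self._buckets)

    def count(self):
        """Return the number of readings currently retained."""
        return sum(len(bucket) for bucket in self._buckets.values())


# Insert one reading per day for five days into a three-day window.
store = RollingWindowStore(max_days=3)
first = datetime(2023, 10, 26, 12, 0)
for offset in range(5):
    store.insert(first + timedelta(days=offset), float(offset))
print(store.days())  # only the three most recent dates remain
```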


For more details and more solutions, refer to the Redbook I co-authored - Solving Business Problems with Informix TimeSeries (ibm.com)

Conclusion

Time series doesn’t necessarily mean NoSQL data. It has its own structure, and if the right database technology, such as Informix TimeSeries, is chosen to handle this data, one can drastically reduce the carbon footprint.


Disclaimer: The postings on this site are my own and don't necessarily represent IBM's position, strategies or opinions