Monday, 14 March 2016

Power of Expression Based Virtual Table Interface

Its been long time I haven't posted any blog. Have been busy with customer engagements. But then thought of sharing few good timeseries use cases that I handled using Expression Based Virtual Table Interface (EVTI).

All the examples below shall be explaining different queries based of the  following table structure:
create row type series
(   ts datetime year to fraction (5),   val1 float,   val2 float,  val3 float );

create row type float4
(  ts datetime year to fraction(5),  f_val1 float,f_val2 float,f_val3 float,f_val4 float );

create table ops_data
   meter_id integer,
   data timeseries(row1)
) ;

- As simple use case, need realtime aggregate of 1 minute data to 15min. Need avg, max,first and last value of val1

execute procedure TSCreateExpressionVirtualTab ('ops_data_agg15',
'ops_data','aggregateby("avg(val1),max(val1),last(val1),first(val1)","cal_15min",series,0,$ts_begin_time, $ts_end_time)::timeseries(float4))','float4');

- Val1 might have -ve values, hence need to run aggregates on absolute value of val1

execute procedure TSCreateExpressionVirtualTab ('ops_data_agg15',
'ops_data','aggregateby("avg(val1),max(val1),last(val1),first(val1)","cal_15min",abs(series),0,$ts_begin_time, $ts_end_time)::timeseries(float4))','float4');

- Need (current - previous) value of average of val1 and along with max, last and first value.

execute procedure TSCreateExpressionVirtualTab ('ops_data_agg15',
'ops_data','apply("$f_val1 - tsprevious($f_val1), f_val2,f_val3_f_val4", aggregateby("avg(val1),max(val1),last(val1),first(val1)","cal_15min",abs(series),0,$ts_begin_time, $ts_end_time)::timeseries(float4))','float4');

Few things to be noted in above query:
- Function "apply" is used and the timeseries created by aggregateby function is used as an argument to apply function
- tsprevious function is used to keep track of previous record and then subtract it from current value
- The output of function aggregateBy type cased to timeseries(float4), hence the column names used in apply as the named available in float4 row.

Disclaimer: The postings on this site are my own and don't necessarily represent IBM's positions,strategies or opinions

Monday, 12 January 2015

Informix TimeSeries - Some Best Practices for Beginners

I have been working with quite a few partners and customers who are using Informix TimeSeries. However, I realized that there is a set of instructions which is common and very much useful for all new users to take best out of the TimeSeries implementation, we will call it as best practices. And hence, I thought of writing this blog which will be useful for those whom I am working with as well whom I don’t know but they are using Informix TimeSeries. These tips will at least get you things done right in first go and later you can tune your server or queries as per your setup. Basic assumption is that, that you are using Informix Enterprise or Advance Enterprise edition on multi-core server. If you are using lower version, fragmenting table and setting PDQPRIORITY won’t be applicable.

A)      About containers, you can create multiple containers across same dbspace but try to keep container which keeps meta information of container, different for each container.
execute procedure TSContainerCreate('ts_tab1', 'data8dbs_1','ts_row',1024, 1024,
'2012-01-01 00:00:00.00000'::datetime year to fraction(5),'month',0,0,'data8dbs1,data8dbs2',1024,1024,0)

execute procedure TSContainerCreate('ts_tab2', 'data8dbs_2','ts_row',1024, 1024,
'2012-01-01 00:00:00.00000'::datetime year to fraction(5),'month',0,0,'data8dbs3,data8dbs4',1024,1024,0)

B) Create timeseries table say ts_tab with first and next extent size 1024
create table ts_tab
    device_id varchar(32),
    meter_data timeseries(ts_row),
    primary key (device_id) 
  ) fragment by round robin in dbs1, dbs2
extent size 1024 next size 1024 lock mode row;

C) While initializing timeseries, you can assign containers in round robin fashion to each timeseries like:
insert into ts_tab values('tag1', TSCreateIrr('ts_1min','2014-10-01 00:00:00.00000',0,0,0, 'ts_tab1),
insert into ts_tab values('tag2', TSCreateIrr('ts_1min','2014-10-01 00:00:00.00000',0,0,0, 'ts_tab2'),
insert into ts_tab values('tag3', TSCreateIrss('ts_1min','2014-10-01 00:00:00.00000',0,0,0, 'ts_tab1'),
insert into ts_tab values('tag4', TSCreateIrr('ts_1min','2014-10-01 00:00:00.00000',0,0,0, 'ts_tab2'),
And so on..
This is example, use appropriate calendar, origin etc as per your setup.

Ensure you run “update statistics high for table ts_tab” after initializing all timeseries and before starting data upload.

D) As far as onconfig is concerned, I observed setting following parameters gives excellent performance. I have used onconfig which I used for 100K meter IDs. These values might be different for your setup, but this section gives you an idea about the parameters to be looked at:

Parameter             Values                    Remarks
LOGSIZE              102400                Just changing parameter in onconfig won't help. You have to create new log
 files with this size and delete older ones.
PHYSFILE            4194304             Create a dbspace of size 4GB with default page size. And use onparams
                                                          command to create new PHYSFILE of 4GB. You have to take 0 level backup
                                                          after this step. You can run ontape -s -L 0 -t /dev/null to take dummy backup.
CKPTINTVL        600
RTO_SERVER_RESTART               0
STACKSIZE         128
BUFFERPOOL     size=**,buffers=??,lrus=8,lru_min_dirty=80,lru_max_dirty=90

#?? - Set this value to 2.5 * number of meters. 
#** page size of a dbspace/s in which containers are created

VPCLASS cpu,num=??,noage      #Change ?? to the number of cores you have on your machine

E) Set PDQPRIORITY to 60 in your environment or in SQL sessions.

F) While querying data from timeseries set :
set isolation to dirty read;
set lock mode to wait 10;
<after this you can add your select query>

Disclaimer: The postings on this site are my own and don't necessarily represent IBM's positions,strategies or opinions

Friday, 3 October 2014

Informix is the only commercial database available on HP Moonshot

The new Moonshot servers ship with Canonical’s Ubuntu Linux pre-installed, as well as a stack of software from ngnix that includes memecache and other web programs. IBM’s Informix is also offered as an option - the only commercial database currently available for Moonshot..

Disclaimer: The postings on this site are my own and don't necessarily represent IBM's positions,strategies or opinions

Friday, 12 September 2014

Solving Business Problems with Informix TimeSeries - A Redbook

This IBM Redbooks® publication is for people who want to implement a solution that revolves around time-based data. It gives you the information that you need to get started and be productive with Informix TimeSeries.

Table of contents

Chapter 1. Overview of IBM Informix
Chapter 2. A use case for the Informix TimeSeries feature
Chapter 3. Defining your TimeSeries environment
Chapter 4. Implementing Informix TimeSeries
Chapter 5. Querying TimeSeries data
Chapter 6. Managing the ecosystem
Appendix A. Reference material
Appendix B. Enterprise historian database example
Appendix C. Distribution grid monitoring enabler