Database Technology Roadmap 2009 And Beyond

Wednesday, July 1st, 2009

Two major database vendors are working on the next big version of their database engines. Microsoft and Oracle are getting ready to release their best database systems yet. Well, in the near future at least. Nevertheless, here is some preliminary information that leaked from the development teams.

Microsoft SQL Server 2010

Just last August we were introduced to SQL Server 2008, which finally brought us backup compression and data compression, among many other new features. Many SQL Server customers are still recovering from the SQL Server 2005 migration and find it difficult to keep up with this breathtaking pace of new releases.

So what’s new in SQL Server 2010?

It will build on the data warehouse improvements of SQL Server 2008 and add even more support for multi-terabyte databases.

The main focus of SQL Server 2010 will be on “managed self services”. Self-tuning will be achieved by interpreting the Dynamic Management Views (DMVs). Is this the death of the DBA? Not at all; it will redefine the skills and duties of a DBA in day-to-day operations. But then again, let’s see if and how this works.

Emphasis on policies is another big change. Many policies have been available since SQL Server 2005, but in SQL Server 2010 they will be enforced by default.

The last improvement focuses on better email integration and integration into the Web 2.0 environment. Imagine: SQL Server goes Twitter.

Keep in mind that this is preliminary information gathered from the rumor mill and a little from the Microsoft website. One thing is for sure: with the release of SharePoint 2010 (beta available now), SQL Server 2010 will become even more important.

I’m pretty sure that there will be more information available soon.

Oracle 12g

Yes, you heard right. Oracle 12g is around the corner. There is not much information available on this new release. The only detail that has leaked so far is that Oracle 12g won’t support raw filesystems anymore. This is bad news for RAC environments. The OCR and the voting disk rely on raw filesystems via a CFS like OCFS.

The word is that ASM will step in and close the gap in 12g. More emphasis will also be placed on NFS.

Other than that, there’s not much information available regarding functionality enhancements. As soon as I get more details I will post them.

Sybase is not on the radar with yet another major release in the near future. The focus is on synchronizing the ASE 15.0.3 release with the Sybase ASE CE (Cluster Edition) version. There is also a new project that will replace Sybase Central with a web-based management tool. The ASE (standard and cluster edition) is already available.

Sybase released several major new versions across its product line in the middle of last year: Sybase IQ 15, Sybase Replication Server 15 and Sybase ASE Cluster Edition. New major releases are being planned, but unlike those from Microsoft and Oracle they are not due in the near future. At least, that is my understanding. One thing is remarkable about Sybase: they had their best quarter in Q1 of 2009, and I can’t wait to see the results for Q2.

One thing is always interesting to observe. This constant competition and the need to outperform drive these vendors to constantly push the envelope, and we as consumers get better, faster and cheaper products.

The downside is that we have to constantly upgrade our systems. Over time this puts enormous strain on IT staff and budgets. It seems that the pace of new major database releases has picked up noticeably, and it remains up to the IT managers to make the right call at the right time. The current cuts in staff and budgets are no help either.

Database vendors are packing more and more value-added features into their systems to gain more customers and sell their products. Hopefully we will see a speedy recovery of the economy that enables companies to bring back staff and put all these great features to work soon.

Thanks,

Peter Dobler

Sybase IQ – What’s New in Version 15

Tuesday, June 16th, 2009

It was the summer of 2000 when I first learned about Sybase IQ and its revolutionary column vector database technology. As a long-time Sybase ASE and Oracle DBA I was used to database engines that organize data row by row. For quite some time I had difficulty thinking in terms of columns rather than rows.

A column vector database requires a totally different methodology for performance tuning. Nothing is straightforward, and the message that more data volume doesn’t make a difference to query performance is not easy to understand. For example, a traditional database engine typically uses only one index per table in the same query. Sybase IQ has no such limit. If each column in the query requires a different index, it will use a different index. In fact, by default every column is an index.
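To make the "every column is an index" idea more concrete, here is a minimal Python sketch. It is a toy illustration only and says nothing about how Sybase IQ is implemented internally; the table, the column names and the helper functions `build_index` and `total_sales` are all made up for this example. It stores a small table column by column, builds a simple value-to-row-id index for each column, and answers a query with two predicates by intersecting two of those indexes.

```python
# Toy illustration only: NOT Sybase IQ's actual storage or index format.
# A tiny, made-up table in the familiar row-by-row form.
rows = [
    {"region": "east", "year": 2008, "sales": 100},
    {"region": "west", "year": 2008, "sales": 150},
    {"region": "east", "year": 2009, "sales": 120},
    {"region": "west", "year": 2009, "sales": 180},
]

# Column layout: one list per column; the list position is the row id.
columns = {name: [row[name] for row in rows] for name in rows[0]}


def build_index(values):
    """Map each distinct value to the set of row ids that hold it."""
    index = {}
    for row_id, value in enumerate(values):
        index.setdefault(value, set()).add(row_id)
    return index


# Hypothetical per-column indexes: every column gets one.
indexes = {name: build_index(values) for name, values in columns.items()}


def total_sales(region, year):
    """Use two column indexes in one query by intersecting their row-id sets."""
    matching = indexes["region"].get(region, set()) & indexes["year"].get(year, set())
    return sum(columns["sales"][row_id] for row_id in matching)


print(total_sales("east", 2009))  # prints 120
```

The point of the sketch is that each predicate is resolved against its own column, and only the `sales` values of the matching rows are ever touched.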

Getting my head around the fact that queries can perform up to 1,000 times faster on Sybase IQ than on traditional row-based RDBMS systems was no easy matter either. Of course, similar results can be achieved in an Oracle implementation with the OLAP technology. However, you are paying for the underlying OLTP engine whether you use it or not. Sybase IQ doesn’t have this overhead.

One of the key features of Sybase IQ is its data compression. I have worked with Sybase IQ systems that easily exceeded an 80% compression ratio. In the meantime every database vendor has introduced data compression into their database engines, but Sybase IQ remains the undisputed leader in compression ratio.
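Why column-wise data tends to compress so well can be shown with a small sketch. The example below uses plain run-length encoding on a made-up, sorted, low-cardinality column. This is an assumption chosen for illustration, not a description of the compression scheme Sybase IQ actually uses, and the "items stored" measure is only a rough proxy for real storage size.

```python
from itertools import groupby


def run_length_encode(values):
    """Collapse consecutive repeats into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


def run_length_decode(pairs):
    """Expand (value, count) pairs back into the original list."""
    return [value for value, count in pairs for _ in range(count)]


# Hypothetical column: 1,000 rows, sorted, only four distinct values.
status_column = ["closed"] * 700 + ["open"] * 200 + ["pending"] * 90 + ["void"] * 10

encoded = run_length_encode(status_column)
assert run_length_decode(encoded) == status_column  # lossless round trip

# Rough size proxy: count stored items (each pair counts as two items).
original_items = len(status_column)  # 1000
encoded_items = 2 * len(encoded)     # 8
saving = 1 - encoded_items / original_items
print(f"{saving:.1%}")  # prints 99.2%
```

Real data is rarely this tidy, so the number here should not be read as a benchmark. It only shows that values of a single column, stored together, repeat far more often than the mixed values of a row.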

This post is not meant to explain how Sybase IQ works and why it is so superior in analytical query processing compared to its row-based counterparts. Please click here to read more about Sybase IQ’s amazing technology.

I know that there are other data warehouse systems out there that are as fast as Sybase IQ, and some are even faster, but in this article I am focused on the Sybase IQ engine and its recent setting of a new TPC-H benchmark record. This record is all about saving money while providing blazing fast performance. Please click here to read the detailed report on this milestone.

OK, back to what’s new in version 15 of Sybase IQ.

There are two major improvements in the new release that are worth mentioning.

1. Overall query performance was once again dramatically improved, yielding an average performance gain of 20% to 50% compared to the previous Sybase IQ release.

What does this mean for your business?

Analytical queries are typically CPU-hungry monsters that can eat up all of your processing resources. Producing results faster means more queries can be processed in the same time window.
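As a back-of-the-envelope check on that throughput claim, here is a small Python calculation. It rests on an assumption the release figures do not spell out: that a "performance gain" of 20% to 50% means each query's runtime shrinks by that fraction, and that the workload is CPU-bound so queries simply run back to back. The function name `throughput_multiplier` is made up for this sketch.

```python
def throughput_multiplier(runtime_reduction):
    """Relative number of queries per time window if every query's
    runtime shrinks by the given fraction (assumed interpretation)."""
    if not 0 <= runtime_reduction < 1:
        raise ValueError("runtime_reduction must be in [0, 1)")
    return 1 / (1 - runtime_reduction)


for reduction in (0.20, 0.50):
    factor = throughput_multiplier(reduction)
    print(f"{reduction:.0%} shorter runtime -> {factor:.2f}x queries per window")
# prints:
# 20% shorter runtime -> 1.25x queries per window
# 50% shorter runtime -> 2.00x queries per window
```

If the gain is instead meant as a throughput increase, the multipliers are simply 1.2x and 1.5x, so under either reading the same time window fits noticeably more work.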

It also means the hardware upgrade can be postponed for a while. The QA effort required to move an entire production system to a new hardware platform can be very expensive and, combined with the cost of the new hardware, may not be worth the investment. By comparison, a standalone upgrade of the database engine might be worth the effort.

It further means that cheaper server hardware on Linux can be used to build Sybase IQ multiplex systems that produce high-end performance results on a slim budget. Due to Sybase IQ’s architecture there are no added network constraints either.

2. Multiple writer nodes in a single multiplex environment.

This is an enormous step forward. Previously a typical Sybase IQ multiplex was built with one big server acting as the writer node and many smaller servers as the reader nodes. The thinking was to give the best hardware to the CPU-intensive load jobs in order to minimize the load windows. The downside of this architecture was that in a failure situation one of the smaller servers would take over the writer role and would then be hopelessly overwhelmed if the writer node couldn’t be fixed in time for the next load.

It is also not economical to devote high-end, expensive server hardware to a job that only lasts for a fraction of the daily workload. Having multiple writer nodes solves this problem once and for all.
Utilizing all the available processing power in a multiplex environment ultimately leads to faster load performance, which can be achieved without upgrading the writer node server hardware over and over again.

Another data load performance improvement is the new ability to load data directly from clients. This means that data can be loaded from files using a simple SQL statement instead of copying data files onto a server and then using the bcp command.

Of course there are other major improvements in security, flexibility and integration support, but the two improvements above are the main contributors to any cost savings or cost avoidance initiative a business is taking on these days.

Sybase also improved their client applications to make it easier to manage Sybase IQ, develop applications for it and monitor it more effectively. Once the Achilles heel of Sybase, these tools are now very usable and mature.

From a cost/performance point of view, Sybase IQ is a force to be reckoned with, and due to its column vector architecture there is no other major database engine on the market like it. To match its strong technology performance, Sybase also had its best financial year ever in 2008 and its best quarter on record in Q1 2009.

I hope you enjoyed my brief introduction to Sybase’s data warehouse engine Sybase IQ and the features of its latest version 15.

Take care,
Peter