Exponentially increasing data growth, fiercely competitive global economies, and tightening compliance regulations across all industries made the challenge of data management more overwhelming than ever in 2007. In addition, there was a shift in the type of data being generated: IDC projects that by 2010, 80% of storage will hold secondary data, including backup, archive, and replication copies.
Data de-duplication was a concept relatively unknown at the beginning of 2007, yet it ended the year as a $100 million market in its own right. Also known as single-instance store and duplicate data reduction, de-duplication identifies and stores only unique data at the sub-file level. If a data string or chunk has already been stored in the system, it is referenced by a pointer rather than stored a second, third, fourth, or nth time.
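The mechanism is easier to see in code. The sketch below is purely illustrative and makes simplifying assumptions: fixed 4 KB chunks, SHA-256 fingerprints, and an in-memory chunk store, whereas shipping products typically use variable-size, content-defined chunking and persistent indexes. The names and sizes here are hypothetical, not any vendor's implementation.

```python
import hashlib

CHUNK_SIZE = 4096  # hypothetical fixed chunk size for illustration

chunk_store = {}   # fingerprint -> chunk bytes (each unique chunk stored once)

def store_file(data: bytes) -> list[str]:
    """Split data into chunks; store only chunks not already present.
    Returns the list of fingerprints ("pointers") that reconstruct the file."""
    recipe = []
    for offset in range(0, len(data), CHUNK_SIZE):
        chunk = data[offset:offset + CHUNK_SIZE]
        fingerprint = hashlib.sha256(chunk).hexdigest()
        if fingerprint not in chunk_store:   # unique data: store it once
            chunk_store[fingerprint] = chunk
        recipe.append(fingerprint)           # duplicate data: keep only a pointer
    return recipe

def restore_file(recipe: list[str]) -> bytes:
    """Reassemble a file from its chunk fingerprints."""
    return b"".join(chunk_store[fp] for fp in recipe)

# Two nearly identical backups share most chunks, so the second backup
# adds almost no new data to the store.
backup1 = b"A" * 8192 + b"B" * 4096
backup2 = b"A" * 8192 + b"C" * 4096
r1 = store_file(backup1)
r2 = store_file(backup2)
assert restore_file(r1) == backup1 and restore_file(r2) == backup2
print(f"unique chunks stored: {len(chunk_store)}")  # 3 chunks, not 6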
Reducing disk capacity needs by 20:1 or more is where de-duplication technology changes the game: all the benefits of disk-based backup and archive can be achieved with significantly less disk capacity, at a cost similar to tape. Backups and archives can also be retained for longer periods at no additional cost, supporting ever more stringent regulatory requirements and eDiscovery needs. By implementing a de-duplication solution, IT organizations can shrink backup windows and RTOs, gain quick and reliable access to archives when needed, and cost-effectively replicate data offsite for disaster protection, even in bandwidth-constrained environments.
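To make the 20:1 claim concrete, the back-of-the-envelope arithmetic below uses hypothetical numbers; the de-dupe ratio is simply logical (protected) data divided by physical data actually written to disk, and real ratios vary with data type and retention policy.

```python
# Hypothetical example: 10 TB full backup retained weekly for 20 weeks.
logical_tb = 200.0    # total logical data protected
dedupe_ratio = 20.0   # assumed ratio; actual ratios depend on data and policy

physical_tb = logical_tb / dedupe_ratio
savings_pct = 100.0 * (1 - physical_tb / logical_tb)

print(f"Physical disk needed: {physical_tb:.0f} TB")  # 10 TB instead of 200 TB
print(f"Capacity saved:       {savings_pct:.0f}%")    # 95%
```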
However, 2007 also brought a lack of education and a great deal of hype around de-duplication technology. Storage vendors touted ever larger 'de-dupe' ratios in an effort to show technological superiority. The danger is that de-dupe ratios alone do not measure, or even indicate, the real benefits of a solution. IT managers need to look at the larger picture and evaluate de-duplication-enabled storage systems on protection, performance, and scope as well.