Posted by: pauditore | January 22, 2014

Big Data: a Day in the Life of the Database Administrator


In the age of big data, the three “Vs” (Volume, Velocity and Variety) are king, making database management and data protection paramount in this millennium, but also extremely complex.  Organizations are swimming in data and struggling to manage not only multiple databases, but also new, larger volumes, velocities and varieties of data, all while providing near real-time access to business-critical data for decision-making.

Many organizations are challenged to manage, back up and secure hundreds of siloed databases.  The overall volume of data has increased dramatically, and some enterprises are now managing and moving more than 5 petabytes of data daily. More importantly, many organizations lack data governance and security policies, which is one reason we are seeing large customer data thefts, most recently at Target.

Late last fall I architected and helped conduct a primary market research survey of 200 IT professionals, of whom 182 are directly involved in the management of databases. Nearly 50 respondents indicated they were managers of the IT groups involved in enterprise database management in their organization. The survey specifically excluded IT professionals not involved in database management, but did include some IT respondents who are knowledgeable about database management in their organizations.

  • Although this was an extremely long and highly technical survey, more than 150 respondents completed it in full. An average of approximately 150 respondents answered each question, and many of the questions were extremely comprehensive and complex, much like the modern database management environment.
  • The majority of the data trended very early and was consistent with a preliminary analysis before the survey was completed. The overall results delivered an accurate and consistent assessment of current enterprise database management practices and the important challenges facing database administrators and their organizations.

Survey Architecture

The survey instrument explored a comprehensive set of questions directly related to database management, with the first half of the survey exploring the characteristics of organizational database infrastructure including:

  • Database solutions employed, number of databases and volume of data
  • Current backup strategy
  • Databases protected by HA technologies
  • Amount of data backed up in near real time
  • Disaster recovery challenges
  • Database availability infrastructures
  • Database protection solutions deployed
  • Challenges to maintaining database availability
  • SLA objectives and frequency of database update and backup

The second half of the survey explored the organizational issues and challenges currently encountered with databases including:

  • Organizational challenges in backup, data protection and recovery of databases
  • Organizational infrastructure for update and backup of databases
  • Current organizational strategy and management processes including testing
  • Success rate of backups, restores, recoveries, reasons for failure and times
  • Tools and solutions currently in use for backup and recovery and challenges
  • Planned organizational change of backup and recovery tools and reasons for change
  • Effectiveness of backup and recovery tools as a result of data volume and infrastructure change

Key Findings  

A day in the life of today’s database administrator regularly involves backing up approximately twenty databases (although many enterprises have hundreds) and moving 200 TB of data, using a combination of full backups, daily incremental backups to disk, daily full backups to disk, and incremental backups to disk or tape. Administrators are challenged by a highly complex database management infrastructure that includes onsite and offsite backups to tape, disk and appliances.
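To put those volumes in perspective, here is a rough back-of-the-envelope sketch of the sustained throughput needed to move a given amount of data in a given window. The 200 TB and 5 PB figures come from this post; the 8-hour window, the decimal units (1 TB = 10^12 bytes) and the function name are my own assumptions for illustration, and the survey did not ask about backup windows directly.

```python
# Back-of-the-envelope estimate: sustained throughput needed to move a
# given volume of data within a given time window.
# Assumptions (not from the survey): decimal units and the 8-hour window.

TB = 10**12  # bytes in a decimal terabyte
PB = 10**15  # bytes in a decimal petabyte
GB = 10**9   # bytes in a decimal gigabyte


def required_throughput_gb_per_s(data_bytes, window_hours=24.0):
    """Return the sustained rate in GB/s needed to move data_bytes
    within window_hours, ignoring compression, dedupe and overhead."""
    if window_hours <= 0:
        raise ValueError("window_hours must be positive")
    return data_bytes / (window_hours * 3600) / GB


if __name__ == "__main__":
    volumes = [("200 TB", 200 * TB), ("5 PB", 5 * PB)]
    for label, size in volumes:
        for window in (24, 8):  # the 8-hour window is hypothetical
            rate = required_throughput_gb_per_s(size, window)
            print(f"{label} in {window} h needs about {rate:.1f} GB/s")
```

Even spread evenly over a full 24 hours, 200 TB works out to roughly 2.3 GB/s of sustained throughput, which helps explain why network bandwidth keeps showing up in the challenge lists that follow.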

Major challenges also include managing increasing data volumes, lack of funding, data governance and protection of business-critical data, database performance and network bandwidth, in addition to complex and hard-to-use testing tools. Although most organizations have a formal strategy in place for database management and back up databases daily, few organizations regularly test all backups or employ HA technologies.

Net/Net: organizations perceive that backup and recovery tools will become less effective because of increases in data volume, servers and infrastructure, yet they are reluctant to change the slow, complex and hard-to-use backup tools that require constant management.

  • Recovery Manager (RMAN), RAC and Data Guard were the top three solutions in use.
  • Less than one third of organizational databases are protected by HA technologies.
  • Organizations actively manage and/or back up <50 databases, with an average of around 20 overall.
  • Organizations are managing 200 TB or less, and few are actively managing more than a PB.
  • Organizations are backing up databases using a combination of full backups, daily incremental backups to disk, daily full backups to disk, and incremental backups to disk or tape.
  • Organizations are not backing up all their data in real time; one third of the sample base indicated that <5% of their databases were backed up in real time.

Most important disaster recovery challenges included:

  • Data security/governance and protecting business critical data
  • Backing up and managing increasingly large data volumes
  • Funding and budget constraints
  • Network bandwidth
  • Testing and quality assurance
  • Meeting RTOs
  • The organizational daily churn rate for database updates and modifications across all databases varied widely; however, >20% indicated that <5% of their databases received daily updates and modifications.
  • Onsite and offsite backups are the top database infrastructures employed for maintaining service and protecting data for all enterprise databases in the event of an outage.
  • Database performance, managing data volumes and increasing query loads/usage spikes were the top three challenges to maintaining database availability.
  • 80% of IT professionals indicated that they meet organizational database SLAs either all or most of the time; however, not all backups are regularly tested.
  • Most organizations update and/or back up databases daily.
  • Nearly half of organizations (44%) back up <5 terabytes of data during a normal backup, while 30% regularly back up between 5 and 50 TB.

Organizational challenges in data protection, backup and recovery of databases are:

  • Backing up and managing increasingly large data volumes
  • Funding and budget constraints
  • Network bandwidth and reliability
  • Meeting RTOs
  • 53% of organizations have some type of integrated end-to-end solution for storing and backing up databases.
  • Most organizations have a strategy and formal process for backup testing in place or have one under development.
  • 52% of organizations regularly test <10% of their database backups.
  • Most organizations indicated that <20% of enterprise backups were not successful in the last year.
  • Hardware and software bugs and user error were cited as the main reasons for enterprise database backup failure.
  • Most organizations indicated that >10% of their restores and recoveries were not successful in the last year.
  • User error, hardware and software bugs, and corrupt backups were cited as the primary reasons for failure in restoring and recovering databases.
  • Enterprises normally back up and recover databases after an unplanned outage in four hours or less.
  • Organizations employ a wide range of solutions for backup and recovery of databases; tape and RMAN were employed most frequently, followed closely by virtual and dedicated standby servers.

Most important organizational challenges in database backup and recovery:

  • Backups are too slow
  • Backups need constant management
  • Recovery is too complex
  • Backup processes slow down production servers
  • Difficulty in coordination of backup windows across servers

Most important complexity issues associated with recovery of virtual environments:

  • Backups need constant management
  • Too many virtual servers to back up
  • Backup tools too difficult to use
  • Difficulty in backing up to tape
  • Nearly 50% of organizations are not employing dedupe appliances; the top appliances in use included Data Domain, IBM and HP.
  • Most organizations have no plans to change backup and recovery tools.
  • Increases in data volume and backup server infrastructure, along with total cost of ownership, software costs and complexity of tools, were cited as the main reasons for potentially changing current backup and recovery tools.
  • Most organizations perceive that backup and recovery tools will become less effective because of increases in data volume, servers and infrastructure.


Big data and its increasing volume, velocity and variety, along with business demands and the complexity of the database infrastructure, will certainly put significant pressure not only on backup and recovery tools and solutions, but on all aspects of the complex processes of database management. According to this dataset, more databases, more users, increased data volumes and the complexity of testing tools are likely to reduce the effectiveness of these tools and solutions. This presents an opportunity for database management vendors to address many of the key issues identified in this research with new, robust and easy-to-use tools.

This also presents a great opportunity for NoSQL database vendors and the associated semantic tools to enable management and discovery of unstructured data, which in many cases is where the business value exists. With today’s triple store and graph database technologies, it’s really not about the type of data, structured or unstructured; it’s about the relationships between the data that provide the key business insights. Until next time, the data dog wishes you great selling and marketing in this millennium!


