Posted by: pauditore | January 22, 2014

Big Data: a Day in the Life of the Database Administrator


In the age of big data, the three “Vs” (Volume, Velocity and Variety) are king, making database management and data protection paramount in this millennium, but extremely complex. Organizations are swimming in data and struggling to manage not only multiple databases, but also new data volumes, velocities and varieties, while providing near real-time access to business-critical data for decision-making.

Many organizations are challenged to manage, back up and secure hundreds of siloed databases. The overall volume of data has increased dramatically, and some enterprises are now managing and moving more than 5 petabytes of data daily. More importantly, many organizations lack data governance and security policies, which is why we are seeing large customer data thefts, most recently at Target.

Late last fall I architected and helped conduct a primary market research survey of 200 IT professionals, of whom 182 are directly involved in the management of databases. Nearly 50 respondents indicated they were managers of the IT groups involved in enterprise database management in their organization. The survey specifically excluded IT professionals not involved in database management, but did include some IT respondents who are knowledgeable about database management in their organizations.

  • Although this was an extremely long and highly technical survey, more than 150 respondents completed it in full. On average, approximately 150 respondents answered each question, and many of the questions were extremely comprehensive and complex, much like the modern database management environment.
  • The majority of the data trended very early and was consistent with a preliminary analysis before the survey was completed. The overall results delivered an accurate and consistent assessment of current enterprise database management practices and the important challenges facing database administrators and their organizations.

Survey Architecture

The survey instrument explored a comprehensive set of questions directly related to database management, with the first half of the survey exploring the characteristics of organizational database infrastructure including:

  • Database solutions employed, number of databases and volume of data
  • Current backup strategy
  • Databases protected by HA technologies
  • Amount of data backed-up in near-real time
  • Disaster recovery challenges
  • Database availability infrastructures
  • Database protection solutions deployed
  • Challenges to maintaining database availability
  • SLA objectives and frequency of database update and backup

The second half of the survey explored the organizational issues and challenges currently encountered with databases including:

  • Organizational challenges in backup, data protection and recovery of databases
  • Organizational infrastructure for update and backup of databases
  • Current organizational strategy and management processes including testing
  • Success rate of backups, restores, recoveries, reasons for failure and times
  • Tools and solutions currently in use for backup and recovery and challenges
  • Planned organizational change of backup and recovery tools and reason for change
  • Effectiveness of backup and recovery tools as a result of data volume and infrastructure change

Key Findings  

A day in the life of today’s database administrator regularly involves backing up approximately twenty databases (although many enterprises have hundreds) and moving 200 TB of data in a combination of full backups, daily incremental backups to disk, daily full backups to disk and incremental backups to disk or tape. Administrators are challenged by a highly complex database management infrastructure that includes onsite and offsite backups to tape, disk and appliances.

Major challenges also include managing increasing data volumes, lack of funding, data governance and protection of business-critical data, database performance and network bandwidth, in addition to complex, hard-to-use testing tools. Although most organizations have a formal strategy in place for database management and back up databases daily, few organizations regularly test all backups or employ HA technologies.

Net/Net: organizations perceive that backup and recovery tools will become less effective because of increases in data volume, servers and infrastructure, yet they are reluctant to change the slow, complex and hard-to-use backup tools that require constant management.

  • Recovery Manager (RMAN), RAC and Data Guard were the top three solutions in use.
  • Less than one third of organizational databases are protected by HA technologies.
  • Organizations actively manage and/or back up fewer than 50 databases, with an average of around 20 overall.
  • Organizations are managing 200 TB or less, and few are actively managing more than a PB.
  • Organizations are backing up databases in a combination of full, daily incremental to disk, daily full to disk and incremental to disk or tape.
  • Organizations are not backing up all their data in real time; one third of the sample base indicated that fewer than 5% of their databases were backed up in real time.

Most important disaster recovery challenges included:

  • Data security/governance and protecting business critical data
  • Backing up and managing increasingly large data volumes
  • Funding and budget constraints
  • Network bandwidth
  • Testing and quality assurance
  • Meeting RTOs

  • Organizational daily churn rate for database updates and modifications varied widely across all databases; however, more than 20% of respondents indicated that fewer than 5% of their databases received daily updates and modifications.
  • Onsite and offsite backups are the top database infrastructures employed for maintaining service and protecting data for all enterprise databases in the event of an outage.
  • Database performance, managing data volumes and increasing query loads/usage spikes were the top three challenges to maintaining database availability.
  • 80% of IT professionals indicated that they meet organizational database SLAs either all or most of the time; however, not all backups are regularly tested.
  • Most organizations update and/or back up databases daily.
  • Nearly half of organizations (44%) back up less than 5 terabytes of data during a normal backup, while 30% regularly back up between 5 and 50 TB.

Organizational challenges in data protection, backup and recovery of databases are:

  • Backing up and managing increasingly large data volumes
  • Funding and budget constraints
  • Network bandwidth and reliability
  • Meeting RTOs

  • 53% of organizations have some type of integrated end-to-end solution for storage and backup of databases.
  • Most organizations have a strategy and formal process for backup testing in place or have one under development.
  • 52% of organizations regularly test <10% of their database backups.
  • Most organizations indicated that <20% of enterprise backups were not successful in the last year.
  • Hardware and software bugs and user error were cited as the main reasons for enterprise database backup failure.
  • Most organizations indicated that >10% of their restores and recoveries were not successful in the last year.
  • User error, hardware and software bugs, and corrupt backups were cited as the primary reasons for failure in restoring and recovering databases.
  • Enterprises normally back up and recover databases after an unplanned outage in four hours or less.
  • Organizations employ a wide range of solutions for backup and recovery of databases; tape and RMAN were employed most frequently, followed closely by virtual and dedicated standby servers.

Most important organizational challenges in database backup and recovery:

  • Backups are too slow
  • Backups need constant management
  • Recovery is too complex
  • Backup processes slow down production servers
  • Difficulty in coordination of backup windows across servers

Most important complexity issues associated with recovery of virtual environments:

  • Backups need constant management
  • Too many virtual servers to backup
  • Backup tools too difficult to use
  • Difficulty in backing up to tape

  • Nearly 50% of organizations are not employing dedupe appliances; the top appliances in use included Data Domain, IBM and HP.
  • Most organizations have no plans to change backup and recovery tools.
  • Increase of data volume, backup server infrastructure, along with total cost of ownership, software costs, and complexity of tools were cited as the main reasons for potentially changing current backup and recovery tools.
  • Most organizations perceive that backup and recovery tools will become less effective because of increases in data volume, servers and infrastructure.


Big data and its increasing volume, velocity and variety, along with business demands and the complexity of the database infrastructure, will certainly put significant pressure not only on backup and recovery tools and solutions, but on all aspects of the complex processes of database management. According to this dataset, more databases, more users, increased data volumes and the complexity of testing tools are likely to reduce the effectiveness of these tools and solutions. This presents an opportunity for database management vendors to address many of the key issues identified in this research with new, robust and easy-to-use tools.

This also presents a great opportunity for NoSQL database vendors and the associated semantic tools to enable management and discovery of unstructured data, which in many cases is where the business value exists. With today’s triple store and graph database technologies, it’s really not about the type of data, structured or unstructured; it’s about the relationships between the data that provide the key business insights. Until next time, the data dog wishes you great selling and marketing in this millennium!


Posted by: pauditore | January 16, 2014

Big Data Variety: Tables-Text & Triples and Oceanography

Check out my latest blog on Big Data Variety at
Killer Whale Baja California

Posted by: pauditore | December 13, 2013

Sematology and Modern Information Management

Arthropoda, one of the largest groups of animals on earth!


Check out the data dog’s first foray into the world of taxonomy, semantics and discovery of big data value at

Posted by: pauditore | November 18, 2013

Enterprise Software Predictions for 2014

The Data Dog Swims In the Russian River

The Data Dog

Welcome to the twilight zone of information management, where every second of every day terabytes and petabytes of structured and unstructured information are relentlessly created by machines, humans and now their devices. The digital exhaust from social media, BYOD and telecommunications represents an almost unmanageable tsunami of information that is challenging nearly every organization, business and individual globally. Many organizations have hundreds of databases and information archives, often in silos, making it nearly impossible for anyone to find the right and related information for accurate and business-critical decision making.

Read The Anatomy of Big Data article at the SandHill link listed above!

Posted by: pauditore | July 23, 2013

Big Data & The Future of Fish

Captain Kids Woods Hole

This month’s blog is about big data and fisheries. The problem is that there is so much statistical big data available on the dismal management of fisheries, yet no one even knows about it except for the National Marine Fisheries Service (NMFS), the fishing industry and the ineffective egalitarian groups that think they can facilitate change. Believe it or not, during the 1800s and 1900s the majority of fish eaten in the United States came from our Great Lakes. Enter the steam engine, and off we went to demolish many of the fisheries in our oceans. The fishing pressure has been so tremendous these last 100 years that we have nearly irreversibly damaged many species that will never return to commercial harvesting levels. One of the only fisheries in the Northwest Atlantic Ocean to survive this pressure has been the lobster industry. Lobsta men now are regulated, but they have to run hundreds of traps to make a living, when just 40 years ago there was a huge offshore lobstering industry. I remember it well because my grandfather was part of it; however, it no longer exists thanks to the NMFS and the fishing industry. Egalitarian groups had no play in the 1940s and 1950s, and from my perspective they have had little to no impact to date.

It’s been forty-two years since the beginning of Greenpeace (1971), and along with it we have had the Sierra Club (1892), the NRDC, Save the Fish and now the Future of Fish. Let’s take a look in the rear view mirror: the Japanese are still harvesting whales, the Monterey Bay Aquarium is telling people not to eat farmed fish, and there have been few to no effective changes in the NMFS that have led to sustainable fisheries management. In fact, when you look at the big data picture, the only thing the NMFS has done right was the Marine Mammal Protection Act; at least that worked. Just recently I was in Morro Bay, California, where there was once a great abalone fishery. During a discussion, a local restaurant owner blamed the otters for wiping out the abalone, not the fishermen and diving bells. Wait a minute, weren’t the otters and abalone there before the fishermen?

As a young man growing up in Gloucester, Massachusetts, I watched species after species of fish get wiped out because of overfishing and a lack of effective sustainable fisheries management from our government and the fisheries industry. I can still distinctly remember one early morning in 1971, around 4 am, greasing the pumps at the Kennebec Fish Factory in preparation for the herring fleet’s landings. As the boats moved into position to offload their catches, I noticed that the majority of the Gulf of Maine herring being pumped into the trucks were less than a finger length long. I felt that something was wrong, but then again two representatives of the National Marine Fisheries Service were there to observe, and I thought that if the NMFS is here then it is ok. Less than ten years later the entire herring fishery was irreversibly damaged, and it has not recovered to this date. The good news is that some species will come back if they are given the chance; others, like the herring and the Red Fish, will probably never recover.

I became a marine biologist in an effort to create more sustainable fisheries management, and through tremendous personal and professional sacrifice made significant contributions to fisheries science. Nonprofit organizations whose mission is challenging current practices in world fisheries management and providing new and innovative methods for management are not working. As a scientist I spent seven years of my life studying the early life history and development of two species of fish, the Atlantic cod and haddock, primarily on Georges Bank, about 100 miles east of Cape Cod. Georges Bank, once the world’s most productive fishing ground, made my hometown the largest fishing port in the world for over two hundred years. As a result of poor management, the fishermen of Gloucester are no more.


The future of fish will not be affected by the Future of Fish, nor by Greenpeace (which now has ex-employees working for big oil) and many of these other organizations that take money from industry and in effect protect some of the industry’s agenda. It is more than obvious that nonprofit orgs devoted to fisheries and their protection are not making any difference in sustainable fisheries management, for sure not in this country. The Future of Fish is just another pipe dream, strewn with ex-Peace Corps veterans who have no experience whatsoever in the fisheries industry. Do you really think the fishing industry is going to pay attention to anyone who has no experience in their industry? They are not Clayton Christensen of Harvard Business School, and they are not going to study an industry and make change without any experience in it or cachet. So enjoy your salmon, halibut and cod before they too are eliminated from commercial harvesting. The next time you see a nonprofit asking you for resources, think twice about it; in my view they bring wanted media awareness, but beyond that their track record is actually quite dismal.

Posted by: pauditore | June 3, 2013

Tales From the Big Data Boot Camp: Scene One

The Data Dog: Swimming in Big Data-Russian River

Conducted in NYC at the Hilton, the first Big Data Boot Camp was well attended, primarily by IT professionals. However, in what has become a continuing saga of the odd couple, information technology professionals for the most part are still not connected to and working closely with line of business (LOB): only two out of 50 attendees in the room indicated that they were working with LOB when asked this question during a session.

The Big Data Opportunities Survey

Prior to the conference, Information Today, the show organizer, conducted a survey of 300 data managers, again primarily IT professionals. Although the sample size is extremely small, there are some nuggets to be found in the data that are consistent with other research initiatives.

Key Findings

Top Industries With Big Data Initiatives

  • Retail
  • Financial services
  • Technology
  • Manufacturing
  • Government
  • Education

Net/Net: This is consistent with other research and with what the major analyst firms are seeing in the market. However, this survey missed one of the most important industries that generates tons of big data: telecommunications! According to other research surveys, media, services, healthcare and the energy industry also have huge big data initiatives underway, and North American companies are again way ahead of the pack when it comes to big data.

Big Data Business Initiatives

  • Customer analysis
  • Historical data analysis
  • Machine data, production system monitoring
  • Website monitoring and analysis
  • IT systems log monitoring and analysis
  • Competitive market analysis
  • Content management
  • Social media analysis

Net/Net: It is very surprising to see social media analysis listed last in order of importance. Then again, the respondents are IT professionals and are probably not aware of what business and marketing are doing, especially when it comes to the customer. Social media is an easy and fast way to reach the customer base, and this ranking is inconsistent with other social media and business intelligence research.

Big Data Types

  • Production or transactional data
  • Real time data feeds
  • Textual data
  • ERP data
  • CRM data
  • Historical data
  • Web logs
  • Social media data
  • Multimedia data
  • Machine-to-machine data
  • Sensor data
  • Spatial data

Net/Net: According to this IT sample base these are the primary big data sources that their organizations are dealing with in order of importance. We all know that the majority of ERP data is transactional in nature especially in the supply chain, but what is interesting here are the challenges these data types present. Integration, inconsistent semantics, metadata and applications demanding data along with the silos that most data exist in are major issues in managing big data especially for business initiatives moving forward.

In the next blog I will elaborate on the business challenges and the volumes of big data currently being managed in this data set. Until next time I wish you great selling and marketing in this millennium.

Posted by: pauditore | April 22, 2013

The Odd Couple: IT & LOB Challenged by Big Data

The Data Dog

At the heart of this dilemma is the odd couple: Felix Unger (Tony Randall), the IT guy who is motivated by structure and order, and Oscar Madison (Jack Klugman), the LOB manager who is somewhat chaotic but business driven. For those who don’t remember, Felix was a TV writer and Oscar a sports editor, both recently divorced. They shared a common goal: they didn’t want to live alone, and in spite of their extreme idiosyncrasies they lived together many years in the same house. Does this sound familiar, like the cultural rift between IT and LOB managers and the fact that most IT guys don’t understand their business? Well, today, with the advent of big data, the odd couple must do more than just live together. Big data represents an opportunity for competitive advantage, and because data is information it is of seminal importance to any company or organization.

Alignment of IT and LOB has been a major challenge facing nearly all organizations for many years, especially in large enterprises where archaic and arthritic systems inhibit the modernization of core administrative and operational systems, and IT and business continuously point fingers at each other. Large organizations are facing significant organizational, cultural and technology challenges that are impairing their ability to maintain competitiveness, go global and increase profitability. Competition is driving down prices, shrinking margins are requiring greater efficiencies in operations, and consolidation, while offering greater economies of scale, can actually lower efficiency if incompatibilities in systems are not addressed by IT. Small and medium businesses, on the other hand, are not as impaired by antiquated systems and platforms and are getting IT without the IT guys in the form of SaaS or utility computing.

Enter the Cloud & Software as a Service

New enabling and disrupting technologies are entering the market. Software as a service (SaaS) and cloud-based environments will play a key role in big data analysis and the optimization of many business processes. Enabling technologies such as Web 2.0 and service-oriented architectures (SOAs) are facilitating the modernization of legacy systems in some large enterprises, especially databases and data warehouses. Disrupters like telematics and predictive analytics are enabling the concept of pay-as-you-go insurance through quantifiable risk and big data. These disrupting technologies will play a seminal role as early adopters deploy them to gain competitive advantage in the market, and in Thomas Friedman’s The World Is Flat, these technologies are considered flatteners in the global economy along with big data.

Expecting business and IT cultures to suddenly march to the same drummer and become a bastion of IT-LOB collaboration is unrealistic. However, organizations that make this happen will reap the benefits of big data. The world is flat, and now IT and LOB must work together more closely than ever. One of the greatest barriers to optimizing and automating business process is not technology itself, but organizational culture, and a major question facing organizations today is how Felix and Oscar will work together to change a reactive culture into one that is proactive in the age of big data.

Business and IT Alignment

Like the chorus in a Greek tragedy, industry cognoscenti have been engaging in a continuous chant of “alignment of business and IT strategy” for several years now. “Alignment,” according to the OED, originates from the old English word “alinement,” and is about putting soldiers in a straight line. Felix and Oscar must be more than aligned: they need to be in lock step with a common set of goals that leads to an Emmy for the business performance improvements that are possible through system modernization and/or replacement.

Firstly, IT and LOB need to clearly understand the tactical and strategic business initiatives critical to the company’s success and the business value of their big data. Once these are defined and understood, it’s up to IT to help LOB understand what IT can deliver. This is what I call business-driven IT, or big data enabled IT.

Secondly, IT and LOB must view their business processes as assets or investments, not as the cost centers they were treated as during most of the 1980s and 1990s. As assets they will generate financial returns, and they should be managed accordingly and deliver measurable returns. Accurate assessment and modeling of processes ensures successful process optimization that delivers a definable range of returns before the optimization and investment occurs; this is called predictable ROI. The ROI of big data is just beginning to be discovered.

Modernization vs. Replacement of Databases & Data Warehouses

According to the analyst community, one of the questions most commonly asked by IT today is “how can I renovate my legacy environment?” Perhaps the first issue organizations should evaluate is the cost of total system replacement vs. modernization, and how they can leverage today’s enabling technologies to put a new face on old systems and employ a surround strategy to pull together disparate platforms. To mitigate the risks inherent in a traditional replacement strategy, and to maximize the near-term value delivered to the business, organizations should evaluate alternative strategies to total replacement based on the following principles:

  • To the greatest extent possible, the existing legacy environment should be leveraged, rather than discarded.
  • Required/desired business functionality that the current systems are incapable of supporting should be introduced in manageable phases using easily integrated and extensible software components.
  • Project decision-making should be driven by a focus on risks and returns associated with each incremental investment, rather than on meeting some pre-determined list of technical milestones.

Business Benefits of IT Modernization

Overall, any modernization or replacement solution must be mapped out by IT and LOB and significantly improve the ease of doing business for internal staff and customers. IT must work very closely with LOB managers across several departments within an organization to deliver a systems environment that:

  • Is customer-focused, and extends the enterprise to customers via a self-service model.
  • Provides business flexibility by easily evolving and extending as customer requirements and/or business conditions evolve.
  • Provides consolidated views of information related to both core and adjunct activities by integrating information from multiple sources.
  • Supports compliance, business flexibility, and the integration of disparate information sources across business units.

You might think that in today’s economy business success will be dependent on IT and LOB working closely together to achieve organizational goals and initiatives such as customer service. However, the rapid evolution of cloud-based technologies and big data has enabled many organizations, large and small, to deploy IT without IT to optimize specific business processes such as HR or sales force automation, and now big data analytics. Organizations and companies in emerging markets such as India, Brazil and China have a technology advantage in the global economy because they don’t have legacy systems. They can jump right into business process utility computing, cloud and big data as long as they comply with national regulations on financial services and private big data. There will always be a place for Felix when it comes to migrating data from legacy systems or integrating disparate systems, but when it comes right down to it, cloud and big data initiatives directly address the IT-LOB disconnect that has plagued businesses and organizations for decades. The big data tsunami is a major challenge to both IT and line of business professionals. Until next time I wish you great selling and marketing in this millennium.

Today everything is big: big government, big oil (or energy as they call themselves now) and big data. The problem with big data is that finding your smart data is like looking for a needle in a haystack. This is the challenge: finding the smart data that has business value and applying it in real time. In my last major research initiative around business intelligence and data warehousing (circa 1999-2000), the surveys found that utopia would be real-time data mining of OLTPs. This is now possible.

The Data Dog Smokes

Big data encompasses a myriad of information technologies that are difficult to manage and very expensive, in addition to being difficult to use. They range from the traditional data warehouse infrastructure to new enabling open source technologies like Hadoop, and new high performance NoSQL and columnar databases. One of the newest enabling technologies to come to market and swell the big data tsunami is In-Memory processing and In-Memory databases. In-Memory technologies will provide many organizations with real-time access to transactional databases and ERP information never before accessed in real time. The business value in this is potentially tremendous when it comes to financials, retail, supply chain and many other business processes.

We asked a random set of 100 information technology professionals involved in their companies’ data management and data warehousing what was most relevant in their big data world. The results are as follows:

  1. Big data analytics in general including a comparison of current technologies like data warehousing versus a newer generation of integration and analytic tools.
  2. Data integration including issues associated with implementing real-time integration and analytics.
  3. Managing and integrating unstructured data.
  4. Understanding and managing social media data.
  5. Education about Hadoop and NoSQL database technologies.
  6. Education about advanced analytical technologies such as In-Memory and predictive analytics with a focus on big data.
  7. Database security and data governance.
  8. Open source database and data integration solutions.

CEOs and their BODs are always on the hunt for competitive advantage and technology is often one of the only weapons available. Until next time I wish you great selling and marketing in this millennium.

Posted by: pauditore | March 29, 2013

Social Media, Big Data and Business Intelligence

The Data Dog

Data, data everywhere, but not many companies know how to get it, use it and leverage it for competitive advantage. Research from board of director surveys all points to one major initiative: how to gain competitive advantage. Although this data is a year old, it is the first ever and largest of its kind in the realm of Big Data.

Social media networks and peer groups are creating large data sets that are now enabling companies and organizations to gain competitive advantage and improve performance. These data sets provide important insights into customer behavior, brand reputation and the overall customer experience. Intelligent early adopter organizations are beginning to monitor and collect this data from proprietary and open social media networks. This benchmark survey, perhaps the first of its kind, explores how small, medium and large organizations are leveraging business intelligence tools and platforms to harvest, manage and analyze social media data, and what their future plans are to leverage it for business initiatives.

Survey questions 1-22 are primarily technical in nature and have been designed to provide insights into the current status of social media network (SMN) and business intelligence (BI) data collection and management, and a preliminary understanding of how organizations are employing BI tools and platforms. Questions 22-33 examine the business initiatives that are motivating the new discipline of social media data collection and what business processes are supported. Questions 34-38 are primarily demographic in nature.

Executive Summary

The monitoring, collection and building of social media data stores is a very new information technology and business discipline that is all about the customer. Organizations in North America and Europe participated in the survey, and the data trended consistently and normally through all questions, with more than 450 respondents completing the survey.

Only 30% of organizations are currently monitoring proprietary social media networks (SMNs), and roughly half indicated that they are monitoring open SMNs. Most importantly, 75% of the overall sample base are not collecting data from proprietary and/or open social media networks. Plans to monitor, collect, stage and analyze SMN data in the next year, 1-2 years and 3-5 years clearly reflect a nascent or early adopter market, with early majority, late majority and laggards following a classic bell curve in evolution. This represents a great market opportunity for entrenched large enterprise BI vendors, and for new Cloud and SaaS vendors, to create and deliver products and services, thought leadership and best practices for conducting social media and business intelligence processes and analysis.

  • Slightly more than half of the organizations plan to increase SMN monitoring, data collection and analysis in the next 1-2 years.
  • More than half expect to increase investment in SMN BI tools in the next 1-2 years.

Executive Summary Continued

  • Answers to Questions 9-14 about data volume, type, hosting, storage, and frequency of data movement were somewhat inconclusive because of the early market.

Top Legacy BI Vendors in Use

  1. IBM
  2. Oracle
  3. SAS Institute
  4. SAP Business Objects

Top New Social Media BI Vendors

  1. Google
  2. SAS
  3. IBM

  • Only 15% of organizations plan to invest in Cloud and SaaS based social media tools in the next year, but that share increases significantly in the next 2-3 years.
  • PCs were the primary device employed to view and analyze social media data followed by tablets and smart phones.

Sales/marketing, PR-Communications and customer service are the top business functions employing SMN data monitoring, collection and business intelligence analysis.

Top Business Initiatives Supported

  1. Brand-Reputation Management
  2. Marketing Communications
  3. Customer Service
  4. Customer Experience Management
  5. Sales
  6. CRM

Top Metrics Employed

  1. Customer Satisfaction
  2. Overall Buzz
  3. Brand Experience
  4. Advertising Campaign Performance


  • The majority of respondents are information technology professionals.
  • The sample base is primarily North American (62%), followed by EMEA (30%); Germany is the largest EMEA country represented.
  • Financial services is the largest industry segment represented.

Question (1.) Monitoring of proprietary or third party built social media networks- Overall 711 respondents, LOB = 105 EMEA=172

Only (30%) of the sample base indicated that they are currently monitoring proprietary SMNs, and nearly 500 respondents (70%) are not monitoring proprietary SMNs. This implies that the majority of companies in the sample base have not built a proprietary social media network, or engaged with a third party to create one, in order to interact with customers, partners and suppliers. Responses were similar for LOB respondents and the overall EMEA sample set.
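
The headcounts behind these shares can be approximated from the response total. Here is a minimal Python sketch, assuming the rounded percentages quoted above (30% and 70%) and the 711 overall responses to Question 1. The function name `approx_count` is purely illustrative and was not part of the survey tooling.

```python
def approx_count(total_responses: int, share_percent: float) -> int:
    """Approximate the number of respondents behind a rounded percentage."""
    return round(total_responses * share_percent / 100)


OVERALL_Q1 = 711  # overall responses to Question 1, from the question header

monitoring = approx_count(OVERALL_Q1, 30)      # monitoring proprietary SMNs
not_monitoring = approx_count(OVERALL_Q1, 70)  # not monitoring proprietary SMNs

print(monitoring, not_monitoring)
```

Because the published percentages are rounded, these counts are estimates only. The second figure is in line with the "nearly 500 respondents" quoted above.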

Question (2.) Monitoring of open online social media communities and networks- 712 respondents overall, LOB =106 and EMEA =173

In contrast, more than half of the sample base (52%) are currently monitoring open SMNs and (47%) are currently not engaged with SMNs. (72%) of LOB respondents indicated that their organization was monitoring SMNs, while only (27%) were not. By comparison, only (47%) of EMEA based respondents were monitoring SMNs and (52%) were not.

This indicates that many companies and organizations understand the importance of social media networks and information flow in open SMNs, as it directly impacts brand perception and potentially demand generation. Social media networks are rapidly becoming the new word of mouth marketing platform in the millennium. LOB respondents are closer to the customers and understand their growing importance, while EMEA respondents are split on the importance of SMNs and are significantly behind North American organizations.

Question (3.) Organizational Plans to Collect and Monitor SMN Data

Overall respondents 647, LOB = 106 EMEA=172

Approximately (60%) of the overall sample base indicated that they had future plans to monitor SMNs and (40%) currently did not. LOB respondents contrasted with this significantly: (80%) had future plans to monitor and collect SMN data and only (20%) did not. EMEA respondents were split, with (54%) having plans and (45%) having no future plans.

Again, LOB respondents understand the business importance of SMN data significantly more than North American and EMEA based IT professionals. This is an important distinction, as LOB professionals and C levels are now intimately involved in the purchase decision making for IT products and services, especially SaaS and Cloud based solutions.
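
The LOB and EMEA contrast in Questions 2 and 3 can be expressed as percentage-point gaps against the overall sample. The Python sketch below is illustrative only. It uses the rounded shares quoted in the text above, and the helper name `gap_vs_overall` is an assumption made for this example.

```python
# Shares (percent) reported for Questions 2 and 3, copied from the text above.
shares = {
    "Q2 monitoring open SMNs": {"overall": 52, "LOB": 72, "EMEA": 47},
    "Q3 plans to monitor SMNs": {"overall": 60, "LOB": 80, "EMEA": 54},
}


def gap_vs_overall(row: dict) -> dict:
    """Percentage-point difference of each segment against the overall sample."""
    return {seg: pct - row["overall"] for seg, pct in row.items() if seg != "overall"}


gaps = {question: gap_vs_overall(row) for question, row in shares.items()}

for question, gap in gaps.items():
    print(question, gap)
```

On these rounded figures, LOB respondents sit 20 points above the overall sample on both questions, while EMEA respondents sit 5 to 6 points below it.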

Question (4.) Building Social Media Data Stores from Proprietary SMNs, Overall respondents 646, LOB=106, EMEA= 172

The vast majority of the sample base (75%) is not currently collecting and/or building data stores from proprietary SMNs, and only (25%) are. The data trends the same way for LOB and EMEA. This is consistent with Q1, and reflects that some companies are early adopters of SMN data collection for specific business initiatives that may provide competitive advantage in their market segments or industries.

Question (5.) Building Social Media Data Stores from Open Online SMNs, Overall Respondents 646, LOB = 106, EMEA= 173

Only (27%) of the overall sample base indicated that they are currently building social media data stores from open online SMNs, while (73%) are not. LOB respondents were slightly higher, with (36%) indicating that they are and (64%) that they are not, and EMEA responses were very similar to the overall sample base. Again, this reflects an early or nascent market that will be driven by the need to know the customer and gain competitive advantage.

Question (6.) Future Plans for Social Media Monitoring, Overall Responses 613, LOB= 107, EMEA=174

Nearly (60%) of respondents indicated that they expect to increase social media monitoring over the next 1-2 years, while (20%) indicated it would be 3-5 years and only (19%) did not expect it to increase at all. One quarter of the sample expect to significantly increase monitoring in the next year. LOB responses are consistent with previous questions, and more than (75%) of LOB respondents indicated an increase of social media monitoring in the next 1-2 years. EMEA respondents indicated lower increases, at (15%) in the next year and (40%) in the next 1-2 years.

Question (7.) Social Media Data Collection and Business Intelligence Use, Overall Respondents 612, LOB=106, EMEA=173

Slightly more than (56%) of the overall sample base expect to increase social media data collection for business intelligence analysis in the next 1-2 years, only (23%) indicated it would be 3-5 years away and (20%) did not expect it to increase.

(73%) of LOB respondents indicated they would begin or increase SMN data collection in the next 1-2 years, and nearly (35%) indicated that it would increase significantly in the next year. This again reflects the business priority around SMN data and customer knowledge. EMEA respondents were less likely to significantly increase data collection in the next year; (40%) expected increases in the next 1-2 years and (30%) over the next 3-5 years. Targeting LOB professionals with turnkey SMN data collection products and services is the opportunity ahead for vendors.

Question (8.) Future Organizational Investment in Business Intelligence Tools and Services Installed In-House, Overall Respondents 609, LOB=106 EMEA=173

Only (15%) of the sample base expects to increase investment in the next year, however, nearly (40%) expect to increase purchasing in the next 1-2 years, and a late majority indicated 3-5 years. Slightly more than (25%) did not expect to increase investment and 107 respondents skipped this question, perhaps indicating that they don’t know. (65%) of LOB respondents indicated they expect to increase investment in the next 1-2 years, and only (46%) of EMEA respondents expect to increase their investment in the next 1-2 years.

Again this reflects the conditions of a nascent market with great opportunity for vendors to create and deliver thought leadership and best practices for conducting social media and business intelligence processes and analysis.

Question (9.) Extraction Transformation and Loading of SMN Data, Overall Respondents 565, LOB=104, EMEA=171

Only (20%) of the sample base is collecting data continuously or on a daily basis, (11%) are collecting data weekly, and/or monthly. More than (60%) of the sample found this question not applicable and probably did not know if or even when data was collected. (33%) of LOB professionals indicated that they were collecting data, continuously, daily, weekly or multiple times during a week. Results were very similar for EMEA respondents, and (64%) found this question not applicable.

Question (10.) Average Volume of Social Media Data Moved, Overall Respondents 559, LOB=103 EMEA=170

Overall, nearly (80%) of the respondents did not know how much data was actually being moved, again reinforcing the nascent nature of this market. Data sizes of 99MB, 100-499MB and 1GB-9.9GB received the most responses, and only 3 organizations were moving data in the 50GB to 100GB range. We are in the very early stages of this market, and because of the newness of this business process, this data is not reflective of future storage and data collection needs. Nor does it mean that collecting 1GB or 1TB is sufficient to understand social media data trends and perform analysis on SMN data. Responses for LOB and EMEA respondents are consistent with overall findings.

Question (11.) SMN Data Collection Hosting and Analysis Platforms, Overall Responses 554, LOB=105 EMEA=174

The important insight here is that (68%) of the sample base indicated that they don’t know how SMN data is hosted and/or found this question not applicable. Nearly (20%) of the overall sample base is hosting SMN data on an internal data mart, or data warehouse. Only (5%) of respondents are currently employing cloud based platforms with only (1%) using an external cloud database. The findings are consistent with LOB respondents, however, there was a slight increase in use of cloud based database and/or third party. EMEA based findings are consistent with North America.

Question (12.) Percentage of SMN Data, Structured vs. Unstructured, Overall Responses 554, LOB=105, EMEA=174

Three quarters of the sample base (77%) responded that the question was not applicable or they did not know whether the SMN data collected was unstructured or structured. This is not surprising and consistent with other responses, indicating that SMN data collection and analysis is a very new business intelligence practice. Of those that responded (6%) indicated the data was 50-50 structured to unstructured, (10%) also indicated the data was mostly unstructured.  These results were also consistent with LOB and EMEA based respondents.

Question (13.) Current SMN Data Storage Requirements, Overall Responses 544, LOB=104, EMEA=174

Nearly all of the respondents indicated that they did not know what the current SMN data storage requirements were for hosting in their organization. This is consistent with LOB and EMEA respondents and again reflects the nascent market for Social Media Analytics. This is a great opportunity for storage vendors, cloud based solutions and traditional BI platform vendors.

Question (14.) Legacy Business Intelligence and Analytic Platforms in Use to Perform Analysis on SMN Data, Overall Responses 544, LOB=107 EMEA=172

Legacy BI Vendors In Use (Overall Responses)

  • IBM (12%)
  • Oracle (11%)
  • SAS (8%)
  • SAP Business Objects (8%)

Legacy BI Vendors In Use (LOB)

  • Oracle (9%)
  • SAS (6%)
  • IBM (5%)
  • SAP Business Objects (4%)

Legacy BI Vendors In Use (EMEA)

  • IBM (15%)
  • SAS (13%)
  • Oracle (9%)
  • SAP Business Objects (9%)

Nearly (75%) of the overall sample base indicated that they did not know which legacy vendor platform was in use, or that the question was not applicable. This is consistent with LOB and EMEA responses, however, LOB respondents ranked Oracle as the number one vendor in use.

Net/Net: IBM, SAS, Oracle and SAP are the entrenched BI platform market leaders with the largest market share. IBM’s ranking is not surprising considering that the majority of respondents were from the SHARE consortium. There is no clear market leader in this space, and SAP’s low ranking in EMEA is somewhat surprising, but its low ranking with LOB is not. Informatica, Information Builders and MicroStrategy would appear to be losing market share according to this data set.

Question (15.) New Social Media Business Intelligence Tools In Use, Overall Responses 511, LOB=106, EMEA=165

Nearly (70%) of the overall sample base selected none of the above. This dropped to (60%) for LOB professionals, and EMEA results were consistent with the overall data set. Evolve 24, Netbase, Radian 6, Kapow and Sysomos were relatively unknown to respondents, suggesting that these companies need to do a better job in marketing communications and drive thought leadership in this area.

Top New Social Media BI Vendors

  1. Google
  2. SAS
  3. IBM

Google was the overall leader with (16%), but climbed to (30%) according to LOB professionals, and dropped to (12%) in EMEA.  Approximately (6%) of the sample base selected other, and Radian 6 garnered nearly (5%) just behind SAS at (6%) as indicated by LOB professionals. The Radian 6 increase is most likely the result of the tremendous success of with LOB professionals, however, only one EMEA respondent indicated they were using the product.

Net/Net: This data reflects an almost “pre-early adopter market,” with the early majority expected to begin selecting and implementing SMN BI tools in the next 1-2 years, the late majority over the next 3-5 years, and laggards beyond that, if they survive. If Google could penetrate the enterprise, they would have an excellent chance of achieving commanding market share beyond SMB. However, the existing enterprise legacy vendors would appear to have the upper hand in the enterprise if they can provide the thought leadership, tools, products, services and best practices for leveraging SMN data to facilitate business initiatives around the customer.

Question (16.) Use of Third Party Services, APIs, and BI Tools to Analyze SMN data streams, Overall Responses 524, LOB=106, EMEA=171

The majority of the sample base indicated either that they are not using third party BI tools (50%) or that they did not know (38%). Only (12%) indicated that they are using third party tools, and this was consistent with EMEA responses. Interestingly, however, (23%) of LOB professionals indicated that they were employing third party tools. LOB, as we know, are now buying IT products and services in many companies without the involvement of IT departments.

Question (17.) Planned Investment in Cloud and Software as a Service based Social Media Tools, Overall Responses 521, LOB=107, EMEA=174

Only (15%) of the sample base plans to invest in Cloud based solutions in the next year, in contrast to more than (50%) who plan to invest in the next 2-3 years. Approximately (30%) had no plans to invest, and this was consistent with LOB and EMEA responses, although (20%) of LOB professionals indicated they plan to invest in the next year. This data is consistent with characteristics of an early-early adopter market, and I suspect that those who stated not at all are large enterprise IT staff.

Question (18.) Use of In-Memory Computing, Overall Responses 520, LOB=107, EMEA=174

Only (10%) of the sample base is currently employing In-Memory databases and/or appliances to analyze SMN data. (91%) indicated that they are not and this data was consistent with LOB and EMEA responses.

In-Memory computing is a relatively new technology in the business intelligence market; however, its ability to execute real time data mining and analysis could enable new and innovative analytic processes that will provide competitive advantage.

Question (19.) Current In-Memory Products In Use, Overall Responses 36, LOB=7, EMEA=12

IBM’s Netezza (20 responses) and SAP HANA (12) are the top two In-Memory computing platforms in current use, although the sample size for this question is small. HP’s Vertica is relatively unknown to LOB and in EMEA; however, it garnered (8) responses in the overall small sample set.

In-Memory computing and analysis of SMN data in real time would appear to be a perfect marriage and could deliver significant competitive advantage to those companies with CRM, SCRM and customer experience management business initiatives. IBM, SAP and HP need to provide thought leadership on the business value of In-Memory computing and real time business as it relates to SMNs and data.

Question (20.) Is the Collection, and Management of SMN data into existing BI Platforms a Challenge to Your Organization, Overall Responses 492, LOB=106, EMEA=173

The majority of respondents indicated that collection, integration and management of SMN data into existing BI platforms was not a challenge to the organization.  Only (16%) overall, (21%) LOB and (12%) EMEA indicated it is a challenge to the organization.

Net/Net: In many ways this data “does not compute,” so to speak, as many organizations are currently in their infancy when it comes to monitoring, collecting, building data stores and analyzing SMN data, according to previous answers. Social Media Network data is in many ways the biggest data on the planet and it is growing every day. In my view it is too early to tell what the challenges will be in this discipline.

Question (21.) Use of a Third Party Social Media Business Intelligence Integration Platform or System, Overall Responses 496, LOB= 107, EMEA=173

The majority of respondents indicated that they currently do not use a third party system or platform, only (12%) indicated they did and (18%) don’t know. This remains consistent for LOB, however, (40%) of EMEA respondents don’t know if they use a third party system. Again responses to this question reflect data consistent with a nascent or early adopter market.

Question (22.) Devices Used to View and Analyze Social Media Data, Overall Responses 420, LOB=95, EMEA=136

Social media data viewing and analysis is primarily conducted on a PC (79%), although a tablet or iPad garnered (24%), followed closely by smart phone. (87%) of LOB professionals use a PC and (35%) use a smart phone, followed by (30%) who use a tablet. EMEA responses are very similar to overall sample results.

Net/Net: Crunching of social media data is most likely done on the PC and viewing on smart phones and tablets. Vendors need to create semantic layers and easy to use interfaces on mobile devices for future data viewing and analysis.

Question (23.) Business Functions of the Organization Using BI Tools to Analyze Social Media Data, Overall Responses 366, LOB=92, EMEA=125

Top Three Business Functions

  1. Sales and marketing
  2. PR & Communications
  3. Customer Service

This is not surprising considering that social media networks are now the new word of mouth marketing platforms of the millennium. This is consistent across LOB and EMEA respondents and sales and marketing is more pronounced for LOB professionals. A significant number of IT professionals are also using BI tools to analyze SMN data in (36%) of overall organizations; it is not clear exactly what they are doing.

Social media has become a new channel of influence for many experts and thought leaders, and PR-communications professionals are following and identifying influencers in social media networks.

Customer service is of growing importance to many organizations, and SMNs are rapidly becoming the voice of the customer, both happy and unhappy. Many companies are now following any tweets related to their name, products and services in an effort to improve customer service and protect brand reputation.

Question (24.) Business Initiatives Supported Through Analysis of SMN Data, Overall Responses 382, LOB=94 EMEA=121

Top Business Initiatives

  1. Brand-Reputation Management
  2. Marketing Communications
  3. Customer Service
  4. Customer Experience Management
  5. Sales
  6. CRM

Brand-reputation management (46%) was the number one business initiative supported, followed closely by marketing communications (45%) and customer service (44%). Customers are king in the SMN world, and this is reflected in this data set, as customer experience management and CRM related business processes are now being supported by a significant number of organizations.

The data set trends very closely for LOB and EMEA based respondents, although LOB responses for brand, customer service and marketing communications were slightly higher.

Fewer organizations overall are using SMN data collection and analysis for competitive intelligence, identifying influencers, product testing and price testing. However, LOB respondents also indicated that they are using SMN data analysis to support identification of influencers and competitive analysis.

Question (25.) Brand & Reputation Monitoring of SMNs, Overall Responses 465, LOB=107 EMEA=168

Approximately (62%) of the sample base are not currently monitoring brand reputation in social media channels, this drops to (50%) for LOB professionals and increases to (71%) for EMEA respondents. Nearly one third of the overall sample base indicated that they are employing in-house IT systems to monitor social media channels and only (12%) overall are using third party systems. (17%) of LOB professionals indicated that they are using a third party service.

The results for EMEA are consistent with the overall sample base but a slightly lower use of In-house IT systems is indicated, and (70%) of the EMEA organizations are not monitoring brand reputation. This reflects that a smaller number of EMEA organizations are monitoring brand reputation.

Net/Net: Organizations are slowly recognizing the power of social media networks and the speed of information flow along with the importance of word of mouth marketing in this environment. There would appear to be a great opportunity here for entrenched enterprise BI vendors and cloud-based BI SaaS solutions.

Question (26.) Metrics Employed in Measuring SMN Monitoring, Overall Responses 452, LOB=104, EMEA=158

Top Metrics Employed

  1. Customer Satisfaction
  2. Overall Buzz
  3. Brand Experience
  4. Advertising Campaign Performance

Nearly half of the overall sample base indicated none of the above for this question, which is consistent with the overall data indicating that this is an early adopter market. Influencer identification ranked fifth in overall importance followed by net sentiment analysis and Geo-tracking. These findings are consistent for LOB professionals and EMEA based respondents.

Question (27.) Organizational Plans to Leverage Social Media Metrics Into Business Processes, Overall Responses 459, LOB=107 EMEA=167

One quarter of the overall sample base has plans to leverage social media data collection and analysis into business processes, while nearly one third indicated they are giving preliminary consideration to this. Almost half of overall respondents indicated that they have no plans to leverage social media into business processes.

(35%) of LOB professionals indicated that they have plans and (32%) are giving preliminary consideration. Responses from EMEA are consistent and (20%) indicated yes, (60%) had no plans and (22%) are giving preliminary consideration to leveraging social media.

Question (28.) Monitoring of SMNs Changing Business Process, Overall Respondents 455, LOB=105, EMEA=167

The majority of the overall sample base indicated SMN monitoring and data collection has not changed business processes, MBOs and KPIs in business units. One quarter of the sample base indicated that preliminary consideration is being given to this and only (105) of the sample indicated that SMN data collection and analysis was changing business processes and measurement of them. This is consistent with LOB professionals and EMEA based respondents. Again this reflects an early adopter nascent market that is ripe for innovation, products, services and best practices by industry.

Question (29.) Customer Engagement Through Social Media Channels, Overall Responses 458, LOB=106 EMEA=169

One third of the overall sample base indicated that they are actively engaging with customers through SMN channels and the remainder (68%) indicated that they are not. This trend is also consistent with EMEA based respondents. However, LOB professionals contrasted with this significantly, as (50%) indicated that they were engaging and (50%) were currently not engaging customers directly.

LOB professionals are generally closer to the customer than IT; therefore I would be cautious about extrapolating the data from this question. Because it comes mostly from IT professionals, it may not reflect an accurate view of the overall market. However, the data has trended very consistently from the beginning, reflecting an early market, even with LOB professionals.

Question (30.) Employment of BI Platforms to Respond to Brand Crisis, Overall Respondents 456, LOB=106, EMEA=167

The majority of the sample base indicated that they do not employ social media business intelligence platforms in responding to a brand crisis, and only (25%) indicated that they currently do. The results for LOB professionals were slightly higher, with (33%) indicating that they leverage Social Media BI platforms to respond, and EMEA based responses were very similar to the overall sample base.

Net/Net: Many organizations are not taking social media seriously and are not prepared to deal with a crisis in social media channels, where the only barrier limiting information flow is time zone.

Question (31.) The Impact of Social Media Data Collection on Customer Engagement, Overall Respondents 450, LOB=106 EMEA=171

Only (7%) of the overall sample base indicated that social media data collection had significantly changed customer engagement models and (32%) indicated that it had changed customer engagement slightly. Nearly (62%) of the overall respondents indicated that social media monitoring and data collection had no impact on customer engagement in their organization. This is consistent with EMEA responses and LOB professional responses were higher than overall with (11%) indicating there was significant change and (37%) indicated there was a slight change in customer engagement.

Many organizations still do not realize the power of social media networks and don’t know how to engage customers and influencers in the social media space. There is great opportunity to grow market share, enhance brand and customer service through social media engagement.

Question (32.) Organizational Concern About International Data Security Laws, Overall Responses 453, LOB=106, EMEA=172

Only half of the overall sample base is concerned about international data security laws, (24%) indicated they are not and (23%) don’t know. This is consistent with EMEA responses and LOB professionals. International data security laws vary by region and country and are probably not well known, especially in organizations that are not global.

Question (33.) Integration of Social Media Metrics Into Business Processes, Overall Responses 300, LOB=79, EMEA=105

Top Business Processes Leveraging Social Media Data

  1. Understanding Customers
  2. Enhance the Customer Experience
  3. Innovation of Services
  4. Innovation of Product and Service Delivery
  5. Implementation of Social CRM

These findings are consistent with LOB professionals and EMEA based respondents, as organizations are leveraging social media monitoring, collection and analysis to better understand customer needs and requirements. Some organizations are taking the next steps and practicing Social CRM and innovating customer service, product and other services. Social media data and business intelligence tools enable organizations to profile and understand customers in social media networks.

Question (34.) Primary Job Title

The majority of the sample base consists of information technology professionals, including 19 CIOs. However, 262 respondents skipped this question, probably because of survey length. Respondents began to drop off after the first five questions, but the count remained consistent at or around 450 throughout the survey.

Sample Base

  • 296 IT Professionals
  • 107 LOB-CEO-Marketing- Sales Professionals
  • 51 Other
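
The drop-off described above can be traced from the overall response counts in the question headers. Below is a minimal Python sketch of that calculation, using a sample of questions rather than all 38. Retention is measured against Question 1, and the variable names are illustrative assumptions.

```python
# Overall response counts for selected questions, copied from the question headers.
responses_by_question = {1: 711, 6: 613, 11: 554, 16: 524, 21: 496, 26: 452, 31: 450}

baseline = responses_by_question[1]  # Question 1 had 711 overall responses

# Retention: percentage of Question 1 respondents still answering at each point.
retention = {
    q: round(100 * n / baseline, 1) for q, n in responses_by_question.items()
}

for q, pct in retention.items():
    print(f"Q{q}: {pct}% of Q1 respondents")
```

The sketch only restates the published counts as percentages. It does not explain why respondents dropped off.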

Question (35-36.) Organization Size and Revenue

The sample base includes a good range of small, medium and large organizations, with 123 organizations of >10,000 employees.

  • 165 Small Organizations: 1-500 Employees
  • 126 Medium Organizations: 500-5000 Employees
  • 162 Large Organizations: 5000 or more Employees
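
The size bands above can be converted to shares of the respondents who answered the question. The following is a small Python sketch under that assumption, with the counts copied from the list and the dictionary keys used as illustrative labels.

```python
# Organization counts by size band, copied from the list above.
size_counts = {
    "Small (1-500 employees)": 165,
    "Medium (500-5000 employees)": 126,
    "Large (5000 or more employees)": 162,
}

total = sum(size_counts.values())  # respondents who answered the size question

# Share of each band as a percentage of those who answered, to one decimal place.
size_shares = {band: round(100 * n / total, 1) for band, n in size_counts.items()}

for band, share in size_shares.items():
    print(f"{band}: {share}%")
```

On these counts the small and large bands are nearly equal in size, with medium organizations making up a little over a quarter of the sample.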

Revenues ranged from <$50M (23%) to >$5B (13%), with the majority of the sample base ranging from $50M-$5B.

Question (37.) Industries

Financial services, including Insurance represented the largest industry segment of the sample base at (29%), followed by High Technology at (14%). A broad spectrum of industries is represented in the sample base, although natural resources, process manufacturing, life sciences, engineering and construction and chemicals have light to no representation.

Question (38.) Geographic Representation

The majority of the sample base is North American organizations (62%). Germany represented the largest sample from EMEA, and approximately (8%) of the sample is from ROW.

  • (62%) North America
  • (30%) EMEA
  • (8%) Rest of World


Technology is one of the few weapons that companies have at their disposal to gain competitive advantage, and you can see clearly in this research how many companies last year had no clue about how to leverage social media. If we conducted the same research again this year, I don’t think we would see much change in the data sets, for several reasons. For one, IT departments are too disconnected from the business guys, and social media and BI are not easy to implement. Until next time I wish you great selling and marketing in this millennium. (This blog and research was originally published on in 2012)
