Roles

Building Your Data Governance Team (Key Roles and Their Responsibilities)

Key roles and responsibilities can belong to many people, or one person can have many roles; it really depends on the size of your company as well as its culture. The ideal person to lead the Mindful Data Governance initiative is the Chief Data Officer (CDO), if one exists in your business; otherwise, select the senior-level employee who will be the ideal data evangelist to represent Data Governance and implement the Mindful Data Governance Initiative. Remember, Data Governance is a team effort: the roles of the other members of the data governance team are different but interdependent. If you think of the roles of data governance as positions on a soccer team, it helps to know who the strikers, midfielders, defenders, and goalkeeper are; the team is unified, but everyone has a different role to play. When implementing the Mindful Data Governance Initiative, remember that there are no title changes for the employees assigned to these roles. These responsibilities should not take up much of the employee’s time, and they should become part of everyday life and culture.

Data Steward
Data stewardship is a functional role in data management and governance, with responsibility for ensuring that data policies and standards are put into practice within the steward’s domain. (Domain = the data that is collected within their subject area.)

Specific Accountabilities:

  • Implement data standards.
  • Ensure that staff who maintain data are trained to follow standards.
  • Monitor data quality.
  • Work with technical and operational staff to create a process for identifying data entry errors and correcting the data to match business standards.
  • Report to the data owner any issues that may require larger action on behalf of the business’s data governance structure.
  • Handle inquiries about data.
  • Receive and respond to any inquiries related to data that originate from the area they oversee; e.g., questions regarding access, standardization, organization, definition, usage, etc.

Data Owner
A Data Owner is concerned with risk and appropriate access to data. Comparing the two roles: the data steward often doesn’t care who uses the data as long as they use it correctly; in fact, the steward often wants a lot of people to use the data! An owner, however, is concerned with who can access data, and tends to be more conservative about granting access. There is a natural conflict between these two roles, but in some organizations the same person plays both.

Specific Accountabilities:

  • Approve data glossaries and other data definitions.
  • Ensure the accuracy of information as it is used across the enterprise.
  • Direct data quality activities.
  • Review and approve the Master Data Management approach, outcomes, and activities.
  • Work with other Data Owners to resolve data issues and lack of harmony across business units.
  • Provide second-level review for issues identified by Data Stewards.
  • Provide input to the Data Governance team on software solutions, policies, or regulatory requirements that impact their data domain.

Data Custodian
A Data Custodian manages the actual data. This role manages servers, backups, or networks; it may provision access per the data owner’s rules, and it has mastery of the data schema and lineage. Compared with the steward and owner, a custodian has little knowledge of the types of decisions that are made using the data. In other words, a custodian knows exactly where the data is located but does not know how to correctly use it.

Specific Accountabilities:

  • Provide a secure infrastructure in support of the data.
  • This includes, but is not limited to, physical security, backup and recovery processes, and secure transmission of the data.
  • Implement data access policies.
  • Grant access privileges to authorized system users, documenting those with access and controlling level of access to ensure that individuals have access only to that information for which they have been authorized and that access is removed in a timely fashion when no longer needed.
  • Ensure system availability and adequate response time.
  • Install, configure, patch, and upgrade hardware and software used for data management, ensuring that system availability and response time are maintained in accordance with business policies and/or service level agreements.
  • Participate in setting data governance priorities.
  • Provide details on technical, systems, and staffing requirements related to data governance initiatives.

We used the above labels to identify the roles and responsibilities of the members of the data governance team, but these labels can be changed to fit your business better. The important part here is understanding that there is a specific responsibility for each of the roles, no matter how you label it. These three roles take up the majority of the work for data governance, so having a clear definition will help the person assigned to a role understand exactly what their responsibility is. In smaller businesses, the same person may play all three roles. Even in large businesses, sometimes the steward and the owner are the same person. Because of the particular nature of each role, it is helpful to articulate each role even if all of them are assigned to a single person. Each role makes particular types of decisions and brings a particular perspective and skill set to governance work.

Putting each of these role descriptions down on paper and personally communicating each role’s responsibilities to the individual will help that individual perform the role successfully. Formally assigning roles also makes it easier for colleagues to approach an individual playing a particular role and ask for assistance.

5 Levels of Mindful Data Governance Initiative

The Mindful Data Governance Levels. What level are you?

Quick Overview of Mindful Data Governance:
In my previous blog I went over why I decided to create Mindful Data Governance and what it means. Now I would like to go over the different levels of Mindful Data Governance as well as the first step. Within Mindful Data Governance, the first thing a business does is a self-evaluation questionnaire, completed prior to the initiative kickoff meeting, to know exactly where the business is starting from. This self-evaluation will allow the business, as long as the questions are answered honestly, to see what level (or state) its departments and/or the business as a whole are in before implementing the Mindful Data Governance Initiative. Remember, implementing Data Governance does not have to be a hindrance, a distraction, or even time consuming. Let’s get to the levels.

The Levels of the Mindful Data Governance Initiative:
So after much thought I came up with five levels: Unknowing, Acknowledge, Acceptance, Mindful, and Enlightenment. I will go over each level in detail so you know exactly what each one means.

1 – Unknowing: There is no data governance, security, accountability, or ownership in place. There are not even informal standards; everyone in the business is doing their own thing. There are no business glossaries, no metadata management, and no data models in the business. Information is fragmented and inconsistent throughout the business systems. Business decisions are made with inadequate information or with no information at all. Your business does not treat its data as an asset, and the business is undisciplined and very reactive. There is most likely duplicate and inconsistent data being stored.

2 – Acknowledge: The business starts to become mindful of the need to control inconsistent information and do something about poor data quality. The lack of data ownership and lack of executive support has become evident. The need for tools, processes, policies, and standards has been acknowledged. The business starts to understand the value of quality data that can be shared and used across the business, and recognizes that there is a cost to entering data into multiple systems. Employees are still being used to manage and move data. The business has also acknowledged that there is redundant data.

3 – Acceptance: The business understands the value of quality data that can be shared and used across the business. Data is starting to be shared across transactional systems and departments. Data governance policies and standards are being created, but adherence to them is almost nonexistent. The majority of the Data Governance work that has been done is around the retention of data. The business has processes in place, but some departments remain separate from others. Formal data management documentation is building. Vision and data strategies have been defined and implemented. Metrics and standards are emerging around the use of the data.

4 – Mindful: Data is viewed by the business as a top asset. Data governance policies and standards are developed, circulated, and well understood throughout the business. A governance body is in place to resolve cross-departmental data issues, and it is identifying best practices to implement throughout the business. Roles and responsibilities are assigned and being followed, and data quality, security, usability, and accessibility are increasing throughout the business. Formal training for on-boarding new employees is in place to ensure quality and standards are met from day one.

5 – Enlightenment: The business recognizes that the data it collects gives it a competitive advantage, and that data is used to create value and efficiencies throughout the organization. Data Management and Data Governance are seen as part of daily life throughout the business. Service level agreements are in place and are enforced. The business has achieved its goals in Data Management and Data Governance. Overall, data management is fully aligned and in place and supports the business’s performance. All business processes are automated and repeatable. Data management roles are well established. Monitoring of data is in place, and metrics and audits are used to continuously improve data quality.

Below is the questionnaire that you can take to see where your business is and where you can move up to. To get your score, simply sum up the values of your answers and divide that sum by 13 (the number of questions) to get your average; then match your average score to the number next to one of the levels of the Mindful Data Governance Initiative above. For example, answers totaling 39 give 39 ÷ 13 = 3, which places you at level 3, Acceptance.

Mindful Data Governance Initiative Questionnaire

How does your business feel about your data as an asset?
1 – Non-existent
2 – Poor
3 – Fair
4 – Good
5 – Excellent

How accessible is the data that is required to make decisions for your business?
1 – Non-existent
2 – Poor
3 – Fair
4 – Good
5 – Excellent

How is your data quality (e.g., duplicate data, completeness of your data, etc.)?
1 – Non-existent
2 – Poor
3 – Fair
4 – Good
5 – Excellent

How integrated are your data sources?
1 – Non-existent
2 – Poor
3 – Fair
4 – Good
5 – Excellent

Is your data storage replicated, and is your data security adequate for your business?
1 – Non-existent
2 – Poor
3 – Fair
4 – Good
5 – Excellent

Is there a data warehouse in place?
1 – Non-existent
2 – Some apps have their own database that is accessible
3 – Data is pushed manually into the warehouse
4 – Some of the apps are pushing data to the warehouse (automated)
5 – All data collection applications are pushing data to the data warehouse (automated)

Is each of your transactional data systems documented (processes and procedures)?
1 – Non-existent
2 – Poor
3 – Fair
4 – Good
5 – Excellent

Is your data accessible within the departments that collect it?
1 – Non-existent
2 – Poor
3 – Fair
4 – Good
5 – Excellent

Is the data that is collected in each department accessible to other departments?
1 – Non-existent
2 – Poor
3 – Fair
4 – Good
5 – Excellent

Are there policies in place around who can use data, how they can use it, which parts they can use, and for what purposes?
1 – Non-existent
2 – Poor
3 – Fair
4 – Good
5 – Excellent

Are security policies and considerations in place for each of your data sources (HIPAA and SOC are just examples)?
1 – Non-existent
2 – Poor
3 – Fair
4 – Good
5 – Excellent

What is the attitude of your C-level executives or organizational leaders toward Data Governance?
1 – Non-existent
2 – Poor
3 – Fair
4 – Good
5 – Excellent

Do your C-level executives or leaders make decisions based on the data collected by the business?
1 – Non-existent
2 – Poor
3 – Fair
4 – Good
5 – Excellent

Unfocused Data

Why aren’t businesses focusing on Data Governance, and what does that mean for them?

There are so many reasons why businesses are not focusing on Data Governance, but I think the main one is that executive teams really do not understand what Data Governance is and what its benefits are. Based on my experience and what I have been observing, most businesses are focused on getting their (separate) transactional systems up and running so that the business is making money and processing customers through some type of sales life cycle. I truly understand that mentality, but your business does have to grow down the road. If businesses cannot fully understand the data they are collecting from those transactional systems and how it ties back into their business strategy, I believe they are building a black hole within their business.

Another big reason I hear that businesses are not doing Data Governance is cost. Let me say plainly and simply: that statement is just not true. Data Governance can be implemented in phases, and the ROI can be seen much quicker as long as you have executive buy-in, do not underestimate the total amount of work, and do not get bogged down with meetings and process discussions without any hands-on work. Lastly, make sure you understand that within a business, definitions of terms may vary based on which department is defining the business term.

With so many businesses under competitive pressure, and with the desire for financial transparency and regulatory compliance with HIPAA and Sarbanes-Oxley, you would think those businesses would drive Data Governance into their culture.

Below are some things that businesses will not be able to do without Data Governance:

  • Trust. Data Governance can give the business more trust in the data it is collecting and in what that data is telling the business. Be transparent about why and how you are collecting customer information, how you are sharing it, and especially how you are protecting it.
  • Identify areas that can be automated. You can reduce manual effort or completely eliminate time-consuming processes performed by your employees. A lot of companies have no idea that employees are spending much of their time manually working with data instead of doing what they were actually hired for.
  • Use data to retain more of their customers. According to an acquisition and retention study, existing customers are 51% more likely to try new products and spend 31% more than new customers.
  • Create consistency and make better decisions throughout the business. When different parts of the business are using different data sets, problems will ensue. Data Governance can enforce a process where all employees are using the same consistent information.
  • Strengthen your ROI. The impact Data Governance can have on a business’s ROI is tremendous, for all the above reasons and more. When you apply Data Governance to the collection of information about what your customers are doing and what motivates them, you then have knowledge of what they want to buy.
  • Remember, Data Governance is not about technology; it is about defining, implementing, and enforcing policies and processes for how information is generated, stored, maintained, and used across the entire business.

Inner and Outer Joins

Outer and Inner SQL Joins

SQL is such a wonderful language, and one question I get from people I talk with is: what is the difference between outer and inner joins? Before we talk about that, let’s talk about what a join is. A join is used to compare, combine (literally join), and return specific rows of data from two or more tables in a database. It really is that simple. An inner join finds and returns matching data from tables, while an outer join finds and returns matching data along with some dissimilar data from tables.

Inner Join
An inner join focuses on the commonality between two tables. When using an inner join, there must be at least some matching data between the two (or more) tables being compared. An inner join searches the tables for matching or overlapping data and, upon finding it, combines and returns the matching rows as one result set.

Outer Join
An outer join returns a set of records (or rows) that includes what an inner join would return, but also includes rows for which no corresponding match is found in the other table.

There are three types of outer joins:

  • Left Outer Join (or Left Join) – returns every row from the left table, plus the matching rows from the right table.
  • Right Outer Join (or Right Join) – returns every row from the right table, plus the matching rows from the left table.
  • Full Outer Join (or Full Join) – returns all rows from both tables, matched up wherever possible.

Each of these outer joins refers to the part of the data that is being compared, combined, and returned. Sometimes NULLs will be produced in this process, since some rows have a match while others do not.
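To make the three types concrete, here is a quick sketch using two hypothetical tables, customers and orders (the table and column names are illustrative only, not from an earlier post):

-- INNER JOIN: only customers that have at least one matching order.
SELECT c.name, o.total
FROM customers c
INNER JOIN orders o ON o.customer_id = c.id;

-- LEFT OUTER JOIN: every customer; order columns are NULL where no order matches.
SELECT c.name, o.total
FROM customers c
LEFT JOIN orders o ON o.customer_id = c.id;

-- FULL OUTER JOIN: all customers and all orders, matched up wherever possible.
SELECT c.name, o.total
FROM customers c
FULL OUTER JOIN orders o ON o.customer_id = c.id;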

For more examples of LEFT OUTER and RIGHT OUTER JOINs, look back at this blog: RIGHT JOIN and LEFT JOIN.

Data Lakes vs Data Warehouses

Data Lakes versus Data Warehouses

What is a data lake?
A data lake is a central repository that allows you to collect and hold all of your unstructured and structured data at any scale. What that sentence means is that you do not have to transform your unstructured data in order to store it and then run analytics on it. You can store all your data as-is.

Do I need a data lake?
An Aberdeen survey reported that companies that have put data lakes in place outperformed similar companies by 9%. What the survey found is that these companies were able to do new types of analytics (and even create new products) from sources of data like log files, social media, IoT, etc. being stored inside of data lakes. Being able to parse and learn from this data enabled them to react faster to what their data was telling them. Looking at unstructured and structured data allowed these companies to attract and retain customers and make better-informed decisions.

Data Lakes versus Data Warehouses
You have to approach this decision with facts and requirements. Depending on the needs of the business, you might need a data warehouse, a data lake, or even both. Allow the needs of the business, and the data the business collects, to drive the decision organically. Before we go any further, let’s define each of these.

A Data Warehouse is a system that pulls together data from many different sources within the business, usually the transactional systems the business needs to conduct day-to-day operations. This data is collected for reporting and analysis.

A Data Lake is a storage repository that holds a vast amount of raw data in its native format until it is needed. By allowing the data to remain in its native format, more data can be held and pulled in a timelier manner, which gives the business quicker insight into what is going on.

Let’s compare Data Warehouses to Data Lakes side by side to help see what is best for your business to use for its analytic needs.

  • Schema: Data Lake – written at the time of analysis; Data Warehouse – architected prior to loading the data.
  • Data: Data Lake – both non-relational and relational data from web sites, social media, IoT, and business applications; Data Warehouse – data from transactional systems and operational databases.
  • Performance and Price: Data Lake – low-cost storage, and queries are getting faster; Data Warehouse – higher-cost storage with fast queries.
  • Data Quality: Data Lake – raw data that may or may not be curated; Data Warehouse – heavily curated data that serves as a single source of truth.
  • Users: Data Lake – data scientists, business analysts, and data developers; Data Warehouse – business analysts.
  • Analytics: Data Lake – predictive analytics, data discovery, and machine learning; Data Warehouse – business intelligence and batch reporting.
Slow SQL

Speeding up Slow Queries inside of Postgres

If your tables are large and you are suffering from slow queries, the number one place to look is at the indexes on the tables; they are usually the culprit. There are several tools that can help you investigate and dissect the problems you are having with your queries within your Postgres database. Other databases have similar tools, but in this blog we will only focus on the Postgres components.

PG_STAT_STATEMENTS
The pg_stat_statements component is an awesome place to start: it not only gathers statistics on the SQL being executed against your database, it also makes it easy to identify the slow-performing SQL statements.

Once you have this component installed, a system view named pg_stat_statements will be available with all kinds of great information. Once it has had a chance to consume a good amount of data, look for queries that have a relatively high total_time value.

Example: SELECT * FROM pg_stat_statements ORDER BY total_time DESC;
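If the extension is not installed yet, a minimal setup sketch looks like the following (pg_stat_statements must be preloaded, which requires a server restart; also note that in PostgreSQL 13 and later total_time was split into total_exec_time and total_plan_time):

-- Preload the library at server start (takes effect after a restart).
-- If other libraries are already preloaded, append to the existing list instead.
ALTER SYSTEM SET shared_preload_libraries = 'pg_stat_statements';

-- After the restart, create the extension in the target database.
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

-- Then find the statements consuming the most total time.
SELECT query, calls, total_time, total_time / calls AS avg_time_ms
FROM pg_stat_statements
ORDER BY total_time DESC
LIMIT 10;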

AUTO_EXPLAIN
auto_explain is very useful for finding slow queries and has its own advantages: it logs the actual execution plan, and it supports logging nested statements via the log_nested_statements option. Nested statements are statements that are executed inside a function; if your application uses many functions, auto_explain is invaluable for getting detailed execution plans. The log_min_duration setting controls which query execution plans are logged, based on how long they run. For instance, if you set it to 1000, all SQL statements that run longer than one second will be logged.
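As a sketch, you can load auto_explain for just your current session (typically as a superuser) to experiment before enabling it server-wide in postgresql.conf; these are real auto_explain parameters:

-- Load the module for the current session only.
LOAD 'auto_explain';

-- Log the plan of any statement that runs longer than one second (value in milliseconds).
SET auto_explain.log_min_duration = 1000;

-- Also log plans for statements executed inside functions.
SET auto_explain.log_nested_statements = on;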

Statistics Collector
The statistics collector is a great subsystem that collects all sorts of performance statistics that are extremely helpful when analyzing your slow-performing SQL.

Turning the statistics collector on gives you tons of pg_stat_… views which contain all kinds of handy information. In particular, I have found it to be extremely helpful for finding missing and unused indexes.
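For example, here is a small sketch against the pg_stat_user_indexes view that surfaces indexes that have never been scanned; these are candidates for removal, though you should verify how the tables are queried before dropping anything:

-- Indexes that have never been used since the statistics were last reset.
SELECT schemaname, relname AS table_name, indexrelname AS index_name, idx_scan
FROM pg_stat_user_indexes
WHERE idx_scan = 0
ORDER BY relname, indexrelname;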

EXPLAIN
The EXPLAIN command is a great tool when it comes to tuning queries. It tells you exactly what is really going on within your SQL. To execute it, simply put the EXPLAIN command in front of your statement (e.g., EXPLAIN SELECT ……) and run it. PostgreSQL will show you the execution plan it used.

When using EXPLAIN to increase the performance of your SQL, I suggest using the ANALYZE option (EXPLAIN ANALYZE), as it gives you more accurate results. The ANALYZE option actually executes the statement (rather than just estimating it) and then explains it. In a later blog I will go into great depth on the output of EXPLAIN.
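As a small sketch against a hypothetical orders table:

-- Shows the plan the optimizer chose, using estimates only; the query is not run.
EXPLAIN SELECT * FROM orders WHERE customer_id = 42;

-- Actually executes the query and reports real row counts and timings alongside the plan.
EXPLAIN ANALYZE SELECT * FROM orders WHERE customer_id = 42;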

Reading CSV and inserting data with Python

One of the easiest things that you can do in Python is read a CSV file and insert that data into a Postgres database. In this blog I will show you how to open up and read a CSV file and then loop through that CSV, creating an insert statement to move the data into the database.

Table structure in the Postgres database (note that user is a reserved word in Postgres, so the table is named users here):
CREATE TABLE users
(
    business CHARACTER VARYING(155),
    firstname CHARACTER VARYING(75),
    lastname CHARACTER VARYING(100),
    title CHARACTER VARYING(50)
);

First thing we want to do is import the libraries we need, then create a connection and a cursor:
import csv
import sys

import psycopg2

conn = psycopg2.connect(host="HOSTNAME", database="DATABASE NAME", user="USER", password="PASSWORD")
cur = conn.cursor()

Second, let’s create an insert statement (it targets the users table created above):
insert_stmt = "INSERT INTO users (business, firstname, lastname, title) VALUES (%s, %s, %s, %s)"

Third, we are now going to open the CSV file, read it, build the insert statements, and insert the rows. Sample file lines:

"BUSINESS","FIRSTNAME","LASTNAME","TITLE"
"Jakes Landscape","Jake","Switch","President"
"Lamps etc","Robert","Jump","Sales"
"PMP Light Bulbs","David","Downes","Warehouse"

file_name = "PATH TO CSV FILE"
reader = csv.DictReader(open(file_name))
# Count how many records have been processed
recordcount = 0
for row in reader:
    try:
        cur.execute(insert_stmt, (row['BUSINESS'],
                                  row['FIRSTNAME'],
                                  row['LASTNAME'],
                                  row['TITLE']))
        conn.commit()
        recordcount = recordcount + 1
    except (Exception, psycopg2.DatabaseError) as error:
        # Report the failed row to stderr, roll back the aborted
        # transaction, and keep processing the rest of the file.
        print(error, file=sys.stderr)
        conn.rollback()

cur.close()
conn.close()
print('Processed ' + str(recordcount) + ' rows from ' + file_name)

It is as simple as that to read a CSV file and insert its data into a database.

Analyzing unstructured data

Before we start to talk about unstructured data (or unstructured information), let’s define it. Wikipedia defines it as information that either does not have a pre-defined data model or is not organized in a pre-defined manner. Unstructured information is typically text-heavy, but may contain data such as dates, numbers, and facts as well. This results in irregularities and ambiguities that make it difficult to understand using traditional programs, as compared to data stored in fielded form in databases or annotated (semantically tagged) in documents. I think that is pretty spot on.

Today more than ever, businesses are collecting more and more unstructured data, whether it is from the vast social media sites or the enormous volumes of emails that are sent and received every day. There is a lot of great data just within social media sites and emails that is being collected and desperately needs to be analyzed. This data can be stored in a relational database as well as a NoSQL database, but my suggestion to you is to store it where you are storing all your other data sources: your warehouse. For example, if you are using Salesforce’s Pardot as your marketing tool, you will have all your business’s campaign data, visitor data, and prospect data streaming from Pardot to your warehouse. Now say you created a post on Facebook, set up a campaign in Pardot, and then pushed an email out to all your prospects with a link to that post. You now know who opened that email and who clicked on the link. How important would it be for your business to know the sentiment of all the comments that prospects have left on the Facebook post? If you have not thought about that, trust me, it is powerful and it will make your data actionable.

One of the best ways to analyze this unstructured data is Natural Language Processing (NLP). NLP is a form of artificial intelligence that focuses on analyzing human language to draw insights, create advertisements, and more. NLP is being used more and more and is driving many forms of Artificial Intelligence (AI). Think about it: you can decipher the sentiment of all the comments left on a post, and based on that sentiment you may need to pivot because the post has a negative sentiment towards it, or better yet, you do not have to do anything because the post’s results have a positive sentiment around them. Not only is that information extremely important to your marketing teams, but you will also know how your product’s message is being received by the public.

There are several important data points that you can get from NLP, including sentiment analysis, keyword extraction, syntax analysis, entity recognition, and topic modeling, to name a few. We are going to touch a little bit on each of these, to show you not only how important the information can be for you and your business, but also to make sure you have a general understanding of each of the topics. I utilize AWS Comprehend, a natural language processing (NLP) service that uses machine learning to discover insights from text; it provides all the above functionality and returns results in JSON format to either store in a database or display in real time inside an application.

Who and what does having no business intelligence solution impact?

Having no business intelligence solution in place impacts many things and many people. It impacts your bottom line, your customers, your employees, and your understanding of your business, both right now and in the future.

Your bottom line

Without a business intelligence solution in place, a business cannot see the full picture of what is happening. It cannot see where all its money is being spent, which areas are most inefficient, or which areas are most efficient and why, so it cannot drive those efficiencies to other areas of the business. These gaps slow down and prevent the business from making real-time, data-driven decisions. Without the big picture, the business can be limited, or just wrong, in the actions it takes believing they are good for the business. All of these reasons, as well as many more, impact your business’s bottom line.

Your customers

Having no business intelligence solution does impact your customers. I get a lot of feedback on this one, and it is usually from the people who do have silos of data running within their business. They want to feel better by giving me excuses for why not having a business intelligence solution does not impact their customers. I always tell people that if you cannot get a full picture, a full understanding of the journey a customer takes from the start of the relationship to the end of their relationship with your business, you, sir or ma’am, will never see its full potential. With so many touch points that a customer can have with a business, you have the potential of creating many silos of data. These silos can involve marketing, selling customers additional services, interactions with customer support, and even the customer on-boarding process. All these systems, and any others, should be integrated in order for you to get a clear understanding of the experiences your customers will go through. The silos of data and the lack of a single source of truth will hamper that understanding, and the business will never understand the customer’s journey.

Picture this: a stacked chart, where the X axis has date values representing the last two weeks and the Y axis has a numeric representation of hours. Within the chart you have multiple columns (in different colors) representing the different touch points a new customer had to go through in order to become a customer and then receive their service from your business. Within seconds you can see where you can improve the process and where the process is working. That is just one powerful example of how business intelligence can impact your customers and how having a single source of truth can help your business.

Your employees

How does having no business intelligence solution impact your employees? Without a business intelligence solution, you have silos that can affect employees when other departments within your business do not wish to share their data. Think about how many times you have heard of, or been involved in, a situation where a department within a business has identified a problem but cannot do anything about it. I have seen a business identify a problem and be unable to take the appropriate measures to correct it because of silos of data. This can be corrected by doing several things. First, leadership must create a unified front and be creative and tactical in their approach. Work towards a common goal: I know each department has its own responsibilities, but the business should have one shared vision, the business’s mission statement. Create the data governance group, because it will encourage collaboration, build repeatable processes, share measures across the business, and act as one team pushing toward that common goal. Most humans instinctively get behind a common goal and feel more united when they can share the same measure of excellence with the person next to them.

Descriptive, Predictive and Prescriptive Analytics

What I have been seeing with all my clients over the last three years is them trying to get their arms around their data: cleaning it and gathering it into a central location, from which they typically create dashboards and reports to see how their business did in the past, though some are looking at how they are doing right now. So, the way most of my clients are looking at their data is called descriptive. Descriptive data analysis gives businesses insight into the past. It looks at the data, summarizes it, and then interprets it into a human-readable format to give us analytics of the past. The vast majority of the statistics we use fall into this category (think basic arithmetic like sums, averages, and percent changes). Most often, the underlying data is an aggregate or count of a filtered column of data to which basic math is applied. For all practical purposes, there are an infinite number of these statistics. Descriptive statistics are useful for showing things like total stock in inventory, average dollars spent per customer, or year-over-year change in sales.
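As a minimal sketch in SQL, assuming a hypothetical sales table with customer_id, sale_date, and amount columns, descriptive statistics like these are simple aggregates over historical data:

-- Total and average dollars spent per customer.
SELECT customer_id, SUM(amount) AS total_spent, AVG(amount) AS avg_sale
FROM sales
GROUP BY customer_id;

-- Yearly sales totals, the raw material for a year-over-year percent-change calculation.
SELECT EXTRACT(YEAR FROM sale_date) AS sale_year, SUM(amount) AS yearly_sales
FROM sales
GROUP BY sale_year
ORDER BY sale_year;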

When I talk about predictive data analysis, I am looking to understand the future. Predictive analytics looks at the data and then predicts what can happen in the future; it aims to give its owner actionable information about what could be coming. Currently there is no predictive data analysis that can tell you with 100 percent accuracy what the future holds. A business should read the results of what might happen in the future and decide on a path based on that knowledge.

These two kinds of statistics, descriptive and predictive, try to take the data that you have and fill in the missing data with best guesses. They combine historical data found in CRM, ERP, HR and POS systems to identify patterns in the data and apply statistical models and algorithms to capture relationships between various data sets. Businesses use predictive statistics and analytics anytime they want to look into the future. Predictive analytics can be used throughout the organization, from forecasting customer behavior and purchasing patterns to identifying trends in sales activities. These statistics also help to forecast demand for inputs from the supply chain, operations and inventory.

The last analytic option we will talk about is prescriptive data analytics. Prescriptive data analytics is for when you want to be guided on all the possible outcomes. The relatively new field of prescriptive analytics allows users to “prescribe” a number of possible actions and guides them towards a solution. These analytics are all about providing direction. Prescriptive analytics attempts to quantify the effect of future decisions in order to advise on all the possible outcomes before the decisions are actually made. At its best, prescriptive analytics helps predict not only what will happen, but also why it will happen, providing recommendations regarding actions that will take advantage of the predictions. With this type of decision support, the business should feel comfortable with the actions it needs to take, whether staying the course or pivoting to right the ship.

Which analytics does your business need? Does your business need descriptive, predictive, and prescriptive data analytics? I believe that in order to answer that question, the business needs to know how advanced a business intelligence solution it needs in order to be successful. Understanding how descriptive, predictive, and prescriptive analytics each work, and what questions they can answer for the business, will drive the business to implement a simple or a more complex business intelligence solution. One piece of advice I would like to give here: start off with the simple solution, and once that solution is providing the information you need, then enhance your business intelligence into a more and more complex solution. I believe taking this approach will give you a much higher success rate when implementing your business intelligence solution, as well as higher user adoption.

To quickly summarize the last three paragraphs: descriptive, as we know, answers questions about the past by looking at historical data. We also reviewed predictive, which answers questions about what might happen. And lastly, prescriptive gives you answers to questions about what actions to take. Depending on your business goals and what answers you need from your data, the decision on whether you need descriptive, predictive, or prescriptive data analytics is very personal to you and the business.

I think it is important to show you the different levels of human input needed to draw conclusions from descriptive, predictive, and prescriptive analytics, as well as which questions each analytic area answers. This will give you a good sense of the employee time that will be needed depending on the way you will be looking at your data.