How to Design an Analytics Center of Excellence
ABSTRACT: An analytics center of excellence is the cornerstone of every data strategy, yet few data leaders know how to design one that works effectively.
An analytics center of excellence is an enterprise group that provides data and analysis services to business units and departments. Its mission is to enable the business to ask questions and get answers quickly using data.
But many organizations struggle to implement an analytics center of excellence. Their enterprise data and analytics teams become a bottleneck, frustrating business leaders to the point where they hire their own data specialists and build data shadow systems. To restore the relationship, many data leaders embrace self-service analytics, but this often makes things worse because they don’t know how to empower business users without creating data chaos, or how to implement governance without choking off insights.
To succeed, an analytics center of excellence needs to balance conflicting demands:
Speed and standards
Agility and governance
Centralized and decentralized development
Business expertise and technical expertise
Skilled and unskilled data users
Reconciling these requirements is not easy, but that’s where creativity and innovation occur. An effective analytics center of excellence uses federation techniques to balance conflicting demands. It draws on both centralized and decentralized resources to deliver standards-based solutions quickly. It resists the temptation to either centralize or decentralize data and analytics and strives to do both at the same time.
Let’s first look at the organization behind the analytics center of excellence. It consists of a data management team, an analytics team, and a data science team that collectively meet the data and analytics needs of the organization. (See figure 1.) Let’s work from the inside out, starting with the analytics team and data science team, to describe each of the organizational entities in an analytics center of excellence.
Figure 1. The Analytics Center of Excellence Organization
Analyst Services. The analytics team sits between the data management team and business departments and their users. Its primary function is to manage the organization’s data analysts, whether they are centralized or embedded. The analyst services sub-team has one or more managers who are responsible for hiring, evaluating, training, and coaching data analysts. This is easy to do when the analysts are centralized and sit together, harder when they are distributed in each department. In the latter case, analytics managers need to convince a department head that it’s in their best interest to matrix their analysts to the analytics center of excellence.
In either case, data analysts should be aligned with an individual department where they do all their work under the direction of the department head. This is part of the federation strategy. Centralized analysts spend the majority of their time in their assigned department, while embedded analysts periodically meet with the central team to review their work. Ideally, data analysts rotate departments every 12 to 24 months to gain a better understanding of the business and its data. This keeps data analysts engaged and increases their value to the organization. (See “Career Paths for Data Analysts: Building the Future”.)
Strategic Partnering. Another important function of the analytics team is to partner with all business units and departments and help them proactively use data and analytics to achieve their goals. To do this, analytics leaders should meet quarterly with department heads to understand their strategic objectives and brainstorm how to use data to meet them. (See “Data Analysts are the Key to Your Success”.) Another option is to assign senior business analysts (or relationship managers) to each department who work closely with key stakeholders to gather requirements and proactively suggest solutions.
Data Literacy. The analytics team also runs the data literacy program, which teaches business users the core data and analytics skills they need to succeed in their roles. Data literacy coordinators assess skills, devise training programs, and provide ongoing support to all business users. (See “Building a Data Literacy Program: What, Why, How”.)
Data Science Team
The data science team is an adjunct to the analytics team. Most companies centralize data science to foster collaboration and provide mentoring, workshops, and training, which are essential for data scientists to learn new techniques and avoid fatal statistical flaws. It usually takes data scientists 6 to 12 months to become productive in a new domain.
Like data analysts, data scientists are assigned to individual departments where they work with the department head or product owners to define statistical projects. The data science team should also have liaisons with the business, either via quarterly planning meetings or business analysts, as indicated in figure 1. Data scientists also need to be surrounded by a team of technical and business specialists to ensure their models align with business needs and are successfully deployed.
Project Management Office
The project management office (PMO) reviews, prioritizes, and manages project requests from across the business, including strategic projects, such as ERP/CRM upgrades and digital transformation initiatives. Although the PMO is not part of the data and analytics program, many of its projects have a data or analytics component. If left unchecked, PMO projects can consume all the resources of the data and analytics team.
Ticketing. It’s important that the PMO has a data-savvy person on its team who can review, scope, and triage all project requests. This person can also handle data requests that come in via the company’s ticketing system, funneling these requests to the appropriate channel, whether it’s an individual analyst or data engineer, a tiger team, or the PMO. The triage analyst should provide transparency around requests by maintaining a web page that shows the status of all tickets and project requests.
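The triage step can be pictured as a simple routing rule. The sketch below is illustrative only: the request fields and the effort thresholds are assumptions made for the example, not rules from any particular ticketing system, and a real triage analyst would apply judgment rather than fixed cutoffs.

```python
from dataclasses import dataclass

# Channels named in the text above.
ANALYST = "analyst"
DATA_ENGINEER = "data_engineer"
TIGER_TEAM = "tiger_team"
PMO = "pmo"


@dataclass
class Request:
    title: str
    estimated_days: float     # assumed: rough effort estimate from the triage analyst
    needs_new_pipeline: bool  # assumed: request requires new data engineering work
    cross_department: bool    # assumed: request spans more than one department


def route(request: Request) -> str:
    """Pick a delivery channel using hypothetical thresholds."""
    # Large or cross-department work is treated as a project for the PMO.
    if request.cross_department or request.estimated_days > 60:
        return PMO
    # Multi-week departmental work goes to a tiger team.
    if request.estimated_days > 10:
        return TIGER_TEAM
    # Small requests go to an individual engineer or analyst.
    if request.needs_new_pipeline:
        return DATA_ENGINEER
    return ANALYST


# Usage: a small ad hoc question lands with an individual analyst.
channel = route(Request("Monthly churn by region", 2, False, False))
```

Recording the chosen channel next to each ticket would also give the triage analyst the data needed for the status page described above.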
Data Governance Team
The data governance team consists of a handful of full-time data governance experts who work with data owners, stewards, and executives to govern data as a business asset. They facilitate the work of a data governance council that sets data policies, standards, and processes. They are instrumental in helping the organization define and document key terms and metrics and enforce their usage. They also help curate a data catalog that enables business users to find relevant data sets, queries, reports, schemas, and models. The catalog is a critical driver of self-service activity.
Data Management Team
The data management team is an enterprise group that works closely with the analytics and data science teams, the PMO, and high-demand departments and business units. The data management team centralizes data architects, data engineers, systems engineers, and platform engineers so they learn and apply the same methods, processes, and standards. This improves data consistency, accelerates delivery, and creates a collaborative, career-advancing culture that retains top talent.
Centralized data teams often lack the agility to meet business needs quickly. To avoid spawning data shadow systems that wreak havoc on data consistency and increase risk, we recommend that the data management team form tiger teams and assign each to an individual delivery channel (i.e., the PMO, the analytics team, or data science team) or high-demand department. (See figure 1.)
Tiger teams help data management teams balance the dueling demands of speed and standards. They deliver the best of both centralized and decentralized development. Although they build departmental solutions, they adhere to the enterprise development methodology, architectural design, and governance processes. Rather than handing off tasks among technical specialists who each have their own backlog of projects, a tiger team works on only one project at a time, so there are no handoffs between specialists to create delays. And tiger team members immerse themselves in the business domain and its data, gaining valuable context that enables them to create better solutions more quickly.
Tiger teams are self-organized, cross-functional, and cross-trained agile teams of three individuals from the data management team who work full-time to build a solution for a department or business unit. A typical tiger team consists of a data engineer, BI developer, and technical analyst, but the composition varies based on the requirements. For instance, a data science tiger team might include a software developer and/or an MLOps engineer. Ideally, each member of a tiger team can perform the work of the other two, if required, so there is no delivery delay due to absences or other issues. (See figure 2.)
Figure 2. Tiger Team Composition
Agile Techniques. Tiger teams follow the agile methodology. They work closely with a product owner from the business to define and prioritize stories that comprise each sprint. They are guided by a Scrum master who trains the team on agile techniques and helps them review their work in periodic “retrospectives” so the team can improve its ability to scope, estimate, and deliver projects. The team might also call on specialists to help with certain aspects of development during one or more sprints, such as a data architect to create or review a data model or a software developer to build a web page or integration.
Estimating Capacity. For tiger teams to work, organizations need to hire enough data management experts to meet demand across all delivery channels. Otherwise, the business will relapse into creating data shadow systems. To fully staff a data management team, organizations should examine the volume and complexity of requests from all delivery channels (analytics, data science, PMO, tickets) and calculate the number and type of tiger teams and specialists required to support the demand. The organization can start slowly with a handful of tiger teams and a Scrum master to guide them. Once the tiger teams prove their worth, it can add more.
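As a rough illustration of this calculation, the sketch below converts request volume and complexity into a tiger team count per delivery channel. Every figure in it is hypothetical: the request counts, the average team-weeks per request (a stand-in for complexity), the 13-week quarter, and the 80% utilization factor are assumptions chosen for the example, not benchmarks.

```python
import math


def tiger_teams_needed(requests_per_quarter, team_weeks_per_request,
                       weeks_per_quarter=13, utilization=0.8):
    """Estimate tiger teams per channel from volume and complexity.

    requests_per_quarter: channel -> expected number of requests
    team_weeks_per_request: channel -> average team-weeks one request takes
    utilization: assumed share of a team's time spent on delivery work
    """
    capacity_per_team = weeks_per_quarter * utilization
    teams = {}
    for channel, volume in requests_per_quarter.items():
        demand = volume * team_weeks_per_request[channel]
        # Round up: a fraction of a team still requires a whole team.
        teams[channel] = math.ceil(demand / capacity_per_team)
    return teams


# Hypothetical demand for one quarter.
volume = {"analytics": 6, "data_science": 3, "pmo": 2, "tickets": 40}
effort = {"analytics": 4, "data_science": 6, "pmo": 10, "tickets": 0.25}

estimate = tiger_teams_needed(volume, effort)
total_teams = sum(estimate.values())
```

Rounding up per channel matches the idea of dedicating each tiger team to a single delivery channel. Pooling demand across channels before rounding would produce a lower total, at the cost of teams splitting their attention.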
DataOps. After a while, each tiger team will improve its productivity, doing more work in less time, especially if it adopts DataOps tools and techniques, such as continuous integration, continuous delivery, and continuous testing (CI/CD/CT). With automation and governance built into their development processes, tiger teams will deliver better solutions faster with fewer errors and defects. Thus, the organization might never need as many tiger teams as it initially estimated.
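To make continuous testing concrete, here is a minimal sketch of an automated data check that a pipeline could run on every batch before publishing it. The `check_orders` function, the `order_id` and `amount` fields, and the rules themselves are hypothetical and stand in for whatever quality rules a team actually agrees on.

```python
def check_orders(rows):
    """Return a list of failure messages; an empty list means the batch passes."""
    failures = []
    seen_ids = set()
    for i, row in enumerate(rows):
        order_id = row.get("order_id")
        if order_id is None:
            failures.append(f"row {i}: missing order_id")
        elif order_id in seen_ids:
            failures.append(f"row {i}: duplicate order_id {order_id}")
        else:
            seen_ids.add(order_id)

        amount = row.get("amount")
        if amount is None or amount < 0:
            failures.append(f"row {i}: invalid amount {amount!r}")
    return failures


# Usage: the second row repeats an id and the third has a negative amount.
batch = [
    {"order_id": 1, "amount": 25.0},
    {"order_id": 1, "amount": 10.0},
    {"order_id": 2, "amount": -5.0},
]
problems = check_orders(batch)
```

In a CI/CD setup, a non-empty result would typically fail the build or block the deployment, so defects are caught before they reach business users.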
Back to Self-Service
So, what do tiger teams build? The tiger team(s) assigned to the analytics team and high-demand departments should build data structures and applications that help departments service their own data needs, which further reduces the need for centralized services.
Data Marts. For example, every department needs a data set that contains 80% of the data that data analysts might want to query on a regular basis. The data set could be a data mart, a cube, a semantic layer, a database view, or a set of tables in a cloud data platform.
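As one possible shape for such a data set, the sketch below builds a departmental data mart as a database view using Python's built-in `sqlite3` module. The `orders` and `customers` tables, their columns, the sample rows, and the `sales_mart` view name are all invented for illustration; a production mart would live in the organization's own data platform.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical source tables maintained by the data management team.
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT, region TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER,
                         amount REAL, order_date TEXT);

    INSERT INTO customers VALUES (1, 'Acme', 'East'), (2, 'Globex', 'West');
    INSERT INTO orders VALUES (10, 1, 250.0, '2024-01-05'),
                              (11, 2, 100.0, '2024-01-06'),
                              (12, 1,  50.0, '2024-02-01');

    -- The mart: one pre-joined view that analysts can query directly.
    CREATE VIEW sales_mart AS
    SELECT o.order_id, o.order_date, o.amount, c.name AS customer, c.region
    FROM orders AS o
    JOIN customers AS c ON c.customer_id = o.customer_id;
""")

# An analyst's typical question, answered without knowing the source schema.
totals = conn.execute(
    "SELECT region, SUM(amount) FROM sales_mart GROUP BY region ORDER BY region"
).fetchall()
```

The design point is that analysts query one curated structure with business-friendly column names instead of joining source tables themselves.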
Core Dashboard. Each department also needs a core parameterized dashboard that contains basic operational information that answers 80% of the questions that businesspeople ask on a daily or weekly basis. The team can design parameters that are tailored to people’s roles so they see only the information that’s relevant to them, avoiding confusion that dampens usage.
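Role-tailored parameters can be sketched as a filter applied to the dashboard's data before it is rendered. The roles, user attributes, and row fields below are hypothetical; real BI tools usually provide this through their own row-level security or parameter features.

```python
def _same_region(row, user):
    return row["region"] == user["region"]


def _own_accounts(row, user):
    return row["rep_id"] == user["id"]


def _everything(row, user):
    return True


# Hypothetical mapping from role to the rows that role should see.
ROLE_FILTERS = {
    "regional_manager": _same_region,
    "sales_rep": _own_accounts,
    "executive": _everything,
}


def rows_for(user, rows):
    """Return only the dashboard rows relevant to this user's role."""
    predicate = ROLE_FILTERS.get(user["role"])
    if predicate is None:
        return []  # unknown roles see nothing by default
    return [row for row in rows if predicate(row, user)]


data = [
    {"region": "East", "rep_id": 1, "sales": 300},
    {"region": "East", "rep_id": 2, "sales": 120},
    {"region": "West", "rep_id": 3, "sales": 200},
]
manager_view = rows_for({"id": 9, "role": "regional_manager", "region": "East"}, data)
rep_view = rows_for({"id": 3, "role": "sales_rep", "region": "West"}, data)
```

Defaulting unknown roles to an empty view is a conservative choice for the sketch; a team might instead fall back to a summary-only view.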
Data Discovery. Although a departmental data mart and operational dashboard are great places to start, they are not enough. Tiger teams will also need to deliver advanced analytic functionality. For instance, data analysts periodically hit the boundaries of a data mart. To meet these outlier requests, organizations need to implement a data catalog and access framework so data analysts can query other data structures in the organization they have permission to use. The organization also needs to give data analysts a self-service workbench so they can easily query, prepare, visualize, and share the data. (See “Self-Service Triumvirate: The New Data Analyst Workbench”.)
Sandboxes. Tiger teams can also build sandboxes that data scientists use to provision data and compute resources. This enables them to prepare data and features and test and run models without having to download data to desktop machines.
Embedded Analytics. Finally, each department might need one or more data-driven applications to improve customer service, optimize cross-sell and upsell initiatives, or streamline marketing campaigns. Here, data and analytics might be part of a larger software installation that has real-time and application integration requirements. Tiger teams will need software developers, integration architects, and data scientists to embed analytics into go-to-market applications that require near real-time data.
An analytics center of excellence has distinct data management, analytics, data science, and governance teams that contribute unique skills and resources to help an organization become data-driven. When designed around federated techniques, such as tiger teams, agile methodologies, and DataOps tools and techniques, these teams can achieve the holy grail of data development: “faster, better, cheaper.” When this happens, the business views the data and analytics team as a strategic partner integral to its success.