Considerations for a Planned Democratizing Data Framework for Valid and Trusted Data

Abstract

A key requirement of today’s fast-changing business and innovation environment is the ability of organizations to adapt dynamically, effectively, and efficiently. Becoming a data-driven decision-making organization plays a crucial role in meeting this requirement. The notion of data democratization has emerged as a mechanism with which organizations can address issues in data-driven decision-making processes and cross-pollinate data in ways that uncover actionable insights. We define data democratization as an attitude focused on curiosity, learning, and experimentation for delivering trusted data for trusted insights to a broad range of authorized stakeholders. In this paper, we propose a general indicator framework for data democratization by highlighting success factors that should not be overlooked in today’s data-driven economy. In this practice-based research, these enablers are grouped into six broad building blocks: 1) “ethical guidelines, business context and value”, 2) “data leadership and data culture”, 3) “data literacy and business knowledge”, 4) “data wrangling, trustworthy & standardization”, 5) “sustainable data platform, access, & analytical tool”, 6) “intelligent data governance and privacy”. As an attitude, data democratization needs to be maintained once it has been planned and built. The utility of the approach is demonstrated through a case study of a Cameroon-based start-up company that has ongoing data analytics projects. Our findings advance the concept of data democratization and contribute to data free flow with trust.

Share and Cite:

Takang, T. and Amaechi, A. (2023) Considerations for a Planned Democratizing Data Framework for Valid and Trusted Data. Journal of Data Analysis and Information Processing, 11, 240-261. doi: 10.4236/jdaip.2023.113013.

1. Introduction

Valid and trusted data is a vital asset for any organization. Organizations using data as the fuel to drive large-scale decision-making must intentionally develop their data capability. For many people, “democratizing data” or “data democratization” is a vague term that encompasses various meanings, issues, and visions. In this paper, data democratization is treated as an IT capability, a method, and an attitude. The development of methods and guidelines for use in industry has remained an important strand of practice-led academic research. As Gericke et al. [1] said: “the description of a method should cover its core idea, the representations in which design information is described, the procedure to be followed, its intended use, and the tools it uses.” The literature (e.g., [2] [3] [4]) shows that organizations that wish to benefit from a data democratization method will have to design it intentionally, regardless of the strategy. Data democratization processes are needed to unleash the benefits of more open and trusted data flows. Democratization of domain-specific knowledge has become essential; for instance, large government-funded projects have been launched to encourage research investigators to collaborate and to conduct academic training and workshops that promote the field of data science [3]. This perspective can be likened to Baird and Schuller’s [5] algorithmic explainability (i.e., the extent to which one can understand and explain how an algorithm reached some conclusion). Awasthi and George [6] viewed the concept of data democratization as a “strategy for organisations with a proactive plan to prepare both technical and non-technical data users for effective use of data to cultivate a competitive advantage over others operating in the same strategic group”.

The need for a holistic data democratization implementation method, guideline, and capability lays the foundation for this research. In this exploratory paper, we unpack the concept of data democratization in order to: 1) distil the characteristics that give the concept its meaning, and 2) determine a construction approach that addresses the identified characteristics and that organizations can refer to when planning their data-driven business journey. This theoretical-empirical article aims to understand how data democratization is enacted. To that end, we employ Actor-Network Theory (ANT) [7] as a theoretical-methodological approach to identify and describe the practices and knowledge of a professional organization that works with data. From this perspective, democratizing data is an innovation focused on construction. ANT examines the interconnections of human and nonhuman entities. The objective is to understand how these things come together and manage to hold together, assembling collectives or networks that produce force and other effects. Accordingly, the research objectives are summarized as follows:

1) Create a review of the literature on the topic. We will perform a structured literature review to create an overview of the literature and establish a foundation for tackling our other research objectives.

2) Establish a unified definition of data democratization. Using the existing literature, we will analyze how scholars define the term, either explicitly or implicitly.

3) Enact a human-centric framework for data democratization. We will use the insights gathered through the previous steps to establish a conceptual creative framework.

Against this background, the overarching research question of this study is: “What are the resources and capabilities that organizations need to acquire to build data democratization capabilities?”

As will be discussed in subsequent sections of this study, a primary criterion for the selection of the reviewed material is their direct experience and subject matter expertise in either the field of “data democratization” or data analytics innovation analysis, or both. Hence, this study discusses the issues presented above by conducting the following tasks: Section 2 introduces our research methodology and theoretical foundation. Section 3 comprises the literature analysis and the preliminary phases we identified provide our first contribution—characterization of data democratization in data driven innovation. Section 4 presents a human-centric framework for data democratization. Next, we describe how the framework was operationalized into a questionnaire to measure data democratization behavior within the domain of data-driven analytics and present the results of a small empirical test of the framework and instrument. Section 5 discusses conclusions and opportunities for further research.

2. Theoretical Foundations and Methods

2.1. Actor-Network Theory (ANT)

In studying enabling factors for the data democratization process, this paper argues that lessons from Actor-Network Theory (ANT) can be extended to frame a sustainable data democratization model. According to Gherardi [7], ANT forms part of the umbrella approach known as Practice Theory or Practice-Based Research (PBR). ANT is both a theory and a method. PBR approaches are holistic and qualitative, and view practices as sets of activities that acquire sense and form a unit. According to Farias et al., ANT is a way of doing and engaging in the world [8]. Callon et al. [9] also discussed ANT and its contributions to understanding the configurations of human and non-human elements that enact organizations and realities. From an ANT practice perspective, knowledge, technology, and abstract concepts can be usefully interrogated by examining the human-technology relationships that produce them [10]. The enabling factors correspond to four key concepts of ANT: spokespersonship, matters of fact/matters of concern, obligatory passage points, and hybrid forums. The core theoretical perspective adopted in this research fits the ANT practice that seeks to retrieve the complexity and heterogeneity that constitute reality as enacted in Law [11], suggesting that it occurs in a particular prescriptive, and non-problematic way [12]. For ANT, all things are understood as enactments. In the next section, we define data democratization and propose a framework for visualizing data democratization in the context of study design.

2.2. Research Methodology

Because the main aim of this research is to advance, refine, and expand the body of knowledge on data democratization in Cameroon, establish facts, and construct an explanatory framework, the design science approach of Vaishnavi and Kuechler [13] was found most suitable. Our awareness of the problem and our initial proposal are based on a qualitative meta-synthesis of data democratization projects in research and practice, which reconciles findings across studies identified through a structured literature review. Following Beck et al. [14], we extend Vaishnavi and Kuechler’s [13] procedure with theory-building elements from the interpretative research method of grounded theory [15]. Grounded theory (GT) is a structured yet flexible methodology. The process is iterative and recursive. GT is performed through systematic data collection, identification of categories (themes), linking of these categories, and formation of theories that explain the process. According to Chun Tie et al. [15], theory is not discovered; rather, theory is constructed by the researcher who views the world through their own lens. Because limited research has been carried out on data democratization in Cameroon, this methodology is appropriate. It enables the construction of an explanatory theory that uncovers a process inherent to the substantive area of inquiry. The adapted approach consists of four distinct phases: awareness, data collection and suggestion, development, and evaluation and conclusion. See Figure 1 for a summary.

Although the phases are sequential, they were iterated until a coherent framework emerged. Specifically, we performed three design iterations of data collection and suggestion, development, and evaluation and conclusion. Iteration 1: structured literature review to enhance theoretical sensitivity, meta-synthesis, and initial framework. Iteration 2: expert interview study, data coding, consolidated framework, demonstration, and expert feedback. Iteration 3: framework refinement, evaluation workshops, and final framework.

• Awareness. The artifact of this research is a framework to facilitate and guide the introduction of data democratization in companies to aid the systematic design, development, and evolution of implementations.

• Data collection and suggestion. To examine the current state of the art and provide our first contribution, we conducted a rapid evidence assessment of the literature on data democratization. Data democratization is an evolving topic and, as Tate et al. [16] suggested, conducting an in-depth literature review aids in understanding the current body of knowledge and identifying the research gaps. The systematic literature review model suggested by Watson [17] and Levy and Ellis [18] and Webster was followed. A literature review, according to Cronin et al. [19], must comprise several sources of data to be relevant and effective. This study considered more than two databases (i.e., AIS virtual library, Wiley online library, Taylor & Francis online library, Emerald insight, Springer, Science Direct, IEEE, ACM digital library, and Scopus), industrial initiatives, and business case reports, along with grey literature, using the search terms “data democratiz(s)ation”, “democratiz(s)ation of data” and “democratiz(s)ed data”. While we were aware of the limited scientific research on data democratization, we still considered a literature review important for structuring the domain. From the analysis of the articles, we designed a first iteration of the framework. In the second iteration, we conducted semi-structured interviews to verify the first iteration of the framework and adapt it according to the input we received from the interviewees. The interviews were recorded, anonymized, and transcribed. To extract information from the interviews, we coded the transcripts iteratively (using open and axial coding) and analyzed them with the grounded theory approach. Some of the publications were included at a later stage, when we conducted a “snowball” search by following up references cited in the literature included at the initial stage. No restriction was placed on the discipline of the articles, as data democratization is perceived as a multidisciplinary field. However, only papers written in English were considered for this study.

Figure 1. Research methodology based on Vaishnavi and Kuechler [13] and Chun Tie et al. [15].

• Development. Based on the evaluation of the structured literature analysis through the interview study, we combined the identified stages and phases from both analyses. The first version of the framework emerged from the results of the literature analysis, which were adapted and supplemented by the expert interviews. The abduction-logic causal mapping determination suggested by Nandi et al. [20] and Narayanan and Armstrong [21] was followed.

• Evaluation and conclusion. Our evaluation was informed by Venable’s FEDS Framework [22] to “demonstrate the utility, quality, and efficacy” [23] of our design artifact. Given the nascent nature of our research and the absence of general recommendations for data democratization implementation, we decided to implement a two-staged naturalistic, summative evaluation based on a human risk and effectiveness strategy [22]. First, we presented the framework to the interviewed experts again and collected their feedback. Second, we evaluated the applicability of the revised and refined version of our framework in multiple workshops using real-life cases. We conducted online meetings with multiple companies and applied the framework to their respective situations. We used their input to finalize our framework as outlined above. After final consensus was reached between the enablers proposed by the team and the subject-matter experts’ feedback, an outside case study found in the literature was used to partially validate the framework. The learning from the case study workshops constitutes our third contribution.

3. Literature Review

This paper builds on previous efforts to identify the data democratization enabling factors and definitions. The following sections summarize those efforts.

3.1. Data Democratization Characterization

This study builds on other recent work in the data democratization discipline. The literature shows that democratizing data is a multi-faceted, complex phenomenon with many sub-concepts, and its definition is thus in flux. In this study, we did not select one definition of democratization as a normative baseline against which to critically assess our material, but strove to induce various definitions of “democratization” from different authors’ writings. As the selected definitions in Table 1 show, the term data democratization is used in different ways and is not tied to any particular practice.

The different definitions imply the need for interoperability, ease of integration, openness and inclusiveness, and trust between data producers and data consumers. Establishing a unified definition of data democratization is not the goal of this paper; creating a flexible and robust architecture for data democratization is. By combining the insights acquired through analysis of the various definitions, this study defines the agnostic term “data democratization” as: “a holistic attitude of willing organizations focused on curiosity, learning, and experimentation for delivering trusted data for trusted insights to a broad range of authorized data stakeholders”.

3.2. Enablers of Data Democratization—The Past and the Present Control Enablers

In this section, we analyze prominent scientific research papers, widely used data democratization success factors, and best practices to study the mechanisms and factors that can be used to gauge and benchmark an organization’s data democratization architecture. Applications of data democratization success factors can be found in many different fields, such as healthcare (Eichler et al. [38]; Lewis et al. [39]; Kuiler & McNeely [40]; Minielly et al. [41]), energy resources (Yoder [42]; DiChristopher [43]; Husseini [44]), education (Fay [45]), the housing market (McLaughlin & Young [46]; Grey [47]), and agriculture (Chandra et al. [48]). The many recent publications cited show that data democratization is gaining ground and acceptance in both theory and practice. While data democratization is recognized as an important concept in research and practice, it is still unclear what it comprises and how it is built (Labadie et al. [49]; Lefebvre et al. [50]). Therefore, identifying existing data democratization guides and success factors is considered an important part of this study. Much of the cited literature describes data democratization success factors as multifaceted. A factor could be an enabler, an obstacle, or both. For example, one of the factors was data leadership, which the literature presented as an obstacle if it was lacking and an enabler if it was present. Table 2 summarizes the empirical results for the most important data democratization success factors or enablers.

Table 1. Sample definitions of data democratization as used in literature.

4. Constructing the Human-Centric Data Democratization Framework (HCDDF)

One of the key challenges of outlining a data democratization indicator framework is selecting which factors to track. It is generally better to align the elements with the strategies or goals of the organization. This means selecting the fundamental drivers of performance that truly drive desirable outcomes. The development of the HCDDF presented here was greatly influenced by Ittner and Larcker’s [58] process for discovering which factors have the most powerful effects on long-term performance.

4.1. Data Democratization Conceptual Architecture

A principal tenet of the HCDDF is the common-sense idea that, at any given time, a data democratization implementation attitude is an IT capability. The framework shows the questions to be asked and the issues to be taken care of. Consistent with the reasoning in our analysis of previous literature and lived experience, critical success indicators were identified using manual coding and grouped into six explainable data democratization building indicators: 1) “Organization structure, ethical guidelines, business context and stakeholders’ value”, 2) “Data leadership and data culture”, 3) “Data literacy, continuous training and capacity building”, 4) “Data Observability, trustworthy, standardized data sets, interoperability”, 5) “Sustainable data technologies/platforms, access to data and tools”, 6) “Intelligent Data Governance and Privacy”. These six barrier-enabler couples describe the general prerequisites of a sustainable, flexible, adaptable, and practical framework for data democratization solutions.

Figure 2 shows a representation of the data democratization framework, outlining how the different components relate to each other. The framework is an aggregation of indicators describing the different aspects of a robust data democratization attitude and capability. A consideration of all these factors in the development of human-centric data democratization approaches can lead to automated functions. Data democratization capabilities are critical resources for any organization that sees business data as an organizational asset.

Table 2. Summarization of most important propositions and empirical studies.

Figure 2. Overarching data democratization framework.

Although each perspective prompts different types of research questions, they should be thought of as overlapping rather than mutually exclusive. Data democratization must start with human-centric design and continue with human-centric workflows. The initial validation efforts for the model (interviews and focus groups) were a valuable way to capture the perspectives of subject-matter experts (SMEs) who have worked in this field.

4.1.1. Building Block 1: Organization Structure, Ethics, Business Context and Stakeholders’ Value

This first building block is a key concept in data democratization as an innovation. The organizational structure through which data democratization is pursued is important to its success. The enactment and promotion of a community of practice (CoP) “focused on developing skills around tools and methods, specific data object or data domain, and on spreading general data awareness” is reported to be essential to data democratization success. Harland et al. [52] discussed the importance of cross-functional problem solving in a data democratization approach. A key concept of ANT that we found highly associated with this dimension is the idea of hybrid forums: the ability of principal actors to share, evaluate, and modify information, shaped by the technological, bureaucratic, and physical spaces where information is exchanged and debated. Making data democratic requires an approach in which interested parties can examine, recalculate, debate, and expand upon the meaning of data. Understanding the business value at risk should be part of the holistic data democratization approach.

Access to data alone is not enough. Every data democratization project must include a primer on ethics alongside the guide. It is essential that ethical risk and complexity are considered in use case prioritization and in the project approach. It has been argued, rightly so, that data is shaped by the philosophies, prejudices, and purposes of each person who interacts with it. Paraphrasing Leslie’s [59] definition, data democratization ethics is a set of values, principles, and techniques that employ widely accepted standards of right and wrong to guide moral conduct in the development and use of data technologies.

For effective decision making, the context should always come before other relevant questions such as methods. The question of purpose should be answered at the very beginning. An attempt must be made to provide situational assessment, scenario generation, and decision-making at various levels of scalability. Each will have a different emphasis and time scale. Vidgen et al. [60] argue that becoming data-driven is not merely a technical issue but requires that firms organize their business analytics departments and align their analytics capability with their business strategy. Solving the wrong problem is one of the causes of data project failure. To mitigate this risk, an organization must start data democratization by asking the right questions. For an organization to use data democratization processes effectively, it is vital that the aims are clear and realistically attainable. The organization must determine what defines success. Data democratization must clearly show what value is generated for all stakeholders. Generating value for data users is generally rather easy to motivate, as it correlates strongly with the data’s analytical value [35]. Data democratization must also ensure that data preparation, data quality compliance, and data infrastructure employees have sufficient strategic and business knowledge of the project.

4.1.2. Building Block 2: Data Leadership and Data Culture

As Ravindran [54] notes, the foremost data democratization success factor is good data leadership. There are three contributing factors to good data leadership: People, Process, and Technology. Data democratization implementations require all three to be successful. People includes roles for data stewards and data owners. The data owner is responsible for the data, such as customer, product, or organization attributes. The data steward is responsible for the day-to-day maintenance of the attributes and acts as the gatekeeper for any changes to the data. The second part, Process, requires well-defined processes for entering data and for maintaining data and details for specific attributes. As part of that process, we agree with Harland et al. [52], who advocated that democratizing data should start with making data-based decision making mandatory and with executives of the organization serving as role models. Part three, Technology, is leveraged to implement processes for data transformation, integration, and cleansing. When organizations want to embrace data democratization, there can be impediments to the free flow of information. The organizational structure should empower employees to proactively improve their routines and to initiate and implement improvements on their own. The explainable data democratization process should foster a data-informed mindset among team members and promote collaboration across teams and business units.

According to Davenport and Mittal [61], culture depends in large part on the orientation of senior leaders, and a key impediment holding many businesses back from profiting from data and analytics is the lack of a culture that truly values data and analytics capability and the superior decision making that can flow from it. Brown [62] states that data culture is one of the keys to building a data-driven organization. Fostering this ethical, data-driven culture means having frequent discussions about the implications of the data work, products, and data usage. It requires data producers and data consumers to have “conversations that emphasize facts, credibility, and responsibility over opinion”. For an organization engaging in big data projects, a data-driven culture has been noted as a key factor in determining overall success and continuation [62]. Part of creating a data culture is having data-literate people in the organization. As ANT scholars have argued through the concept of spokespersonship (i.e., the ability to represent, or speak on behalf of, groups and individuals), the data democratization process will succeed where there is good data leadership in an organization and where enabling structures such as communities of practice are well focused.

4.1.3. Building Block 3: Data Literacy, Continuous Training, and Capacity Building

Having high-quality, accessible data is not sufficient for data democratization. People with the right skills are needed to use that data. Awasthi and George note that skills gaps and data awareness gaps are significant challenges in implementing data democratization [6]; it is therefore necessary to build the capacity of staff to work with the data through training programs. Data democratization capability requires data literacy. Kasey Panetta, brand content manager at Gartner, defined data literacy as the ability to read, write and communicate data in context, including an understanding of data sources and constructs, analytical methods and techniques applied, and the ability to describe the use case, application and resulting value. Because of the constantly evolving technological landscape, [63] argues that it is important that a logic of continuous learning is infused in organizations that invest in big data. Data consumers must be continually educated to make sure they are aware of data security risks and how to prevent them. They must be empowered and encouraged to create value with the data and to use data in a responsible way. Belli et al.’s [24] definition of data democratization implies that, besides access to data and technological aspects such as easily used analytic patterns, the users themselves need to be capable. Democratizing data means making data technologies accessible to non-domain specialists.

4.1.4. Building Block 4: Data Wrangling, Observability, Trustworthy, Standardization, Interoperability

To be successful, a holistic data democratization approach must prioritize data observability, data set standardization, interoperability, and data quality compliance. Well-defined data observability is an essential capability for keeping data fit for use and for ensuring the continued availability, reliability, efficiency, and performance of the enterprise operational data pipeline. Democratizing data must ensure that organizations continuously track, assess, manage, and optimize the health of their data. Data needs to be catalogued, connected, and put in context, within all lines of business as well as enterprise-wide, to get full value from it. The process must describe the data appropriately (interpretable, so that everyone can understand what the data mean). The data should be joinable, shareable, and queryable, as this greatly enhances their analytical value for users. Data discoverability and interoperability are key components of data democratization. Data should be quality compliant. Harland et al. [52] argue that data must be of sufficient quality and that the type and scope of the data must be sufficient. Democratizing data should include adequate mechanisms to verify data accuracy and quality and to encourage the orchestration of semantic interoperability. Core data quality and pipeline metrics must be defined. We think this is related to the ANT concept of “matters of fact and matters of concern”, a concept that encourages the capability to determine what is settled fact and what is up for debate. We believe that developing standards and encouraging interoperability of datasets and the necessary analytical tools are key components of data democratization. Improving the understanding of a dataset’s composition and data model is very important to the success of data democratization.
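As an illustration of what defining core data quality and pipeline metrics might look like in practice, the following minimal Python sketch computes three commonly used indicators over a small in-memory dataset. The choice of metrics (completeness, uniqueness, freshness), the function names, and the sample records are assumptions made for illustration; the framework itself does not prescribe specific metrics or tooling.

```python
from datetime import datetime, timedelta, timezone


def completeness(rows, column):
    """Share of rows whose value for `column` is present (not None or empty)."""
    if not rows:
        return 0.0
    filled = sum(1 for row in rows if row.get(column) not in (None, ""))
    return filled / len(rows)


def uniqueness(rows, key):
    """Share of distinct values of `key` across all rows (1.0 means no duplicates)."""
    if not rows:
        return 0.0
    values = [row.get(key) for row in rows]
    return len(set(values)) / len(values)


def is_fresh(last_updated, max_age, now=None):
    """True if the dataset was updated no longer than `max_age` before `now`."""
    now = now or datetime.now(timezone.utc)
    return now - last_updated <= max_age


# Hypothetical customer records: one duplicate id and two missing emails.
customers = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": ""},
    {"id": 2, "email": "c@example.com"},
    {"id": 4, "email": None},
]

email_completeness = completeness(customers, "email")
id_uniqueness = uniqueness(customers, "id")
```

Tracking such indicators over time, rather than checking them once, is closer to the continuous health monitoring that observability implies; thresholds for what counts as acceptable would need to be agreed per dataset.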

4.1.5. Building Block 5: Sustainable Data Platform, Access, and Analytical Tool

Data democratization requires that an organization have on-board tools and platforms that help everyone in the organization make sense of the data at their disposal. Data analytics tools help data consumers develop their analytical skills and perform data analysis by themselves ([6] [30]). The organization also needs to identify and pick a storage option, such as cloud storage, that makes it easier for everyone to access the data anytime from anywhere. A capable data platform is one that reduces the time required to extract insights from data, which accelerates decision-making, and that makes larger processing workloads and quantities of relevant data available. As Belli et al. state in their definition, data democratization is the “ability of users to access all data using well-defined and easily used analytic patterns to answer unexpected questions”. Ensuring open access to data is one of the key components of data democratization. From the literature we have identified several different ways of managing data in a sustainable, open fashion, such as creating a decentralised marketplace where the datasets are aggregated and ready for consumption. Data exploration and visualisation tools can play an important role in understanding a dataset’s composition and in making it easier to detect biases in a set of data by providing insights into one’s data [64]. Good data democratization is based on the availability of data technologies to obtain and process data.

4.1.6. Building Block 6: Intelligent Data Governance and Privacy

Effective data democratization implementations require rigorously enforced security and privacy protocols. The holistic approach must ensure that data can be easily accessed from safe, secure, organized repositories at any time by anyone, with due respect for legal confidentiality and privacy issues. As a continual process, an intelligently governed approach must ensure data stakeholders’ trust and at the same time make sure that the organization is strictly in compliance with external regulatory mandates. The governance mechanisms should emphasize rapid consensus and consent rather than absolute authority in decision making. Access to data can be disseminated to trusted parties (bringing data to the analysis) or users can analyze the data within a secure trusted research environment (bringing the analysis to the data). Related to that is the ANT concept of obligatory passage points: the capacity to share specific kinds of information more widely. As advocated by previous researchers (e.g., [65] [66]), promoting fair access to data irrespective of the users’ or actors’ domain expertise and technical know-how is important when democratizing data within the organization. Valid datasets used for nefarious purposes can be damaging for the organization. Therefore, ethical assessments are needed in the planning, development, and use of datasets, as well as regulation that makes it possible to hold those who misuse data accountable for their actions.
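The two access routes mentioned above (bringing data to the analysis versus bringing the analysis to the data) can be made concrete with a small decision rule. The Python sketch below is purely illustrative: the `User` and `Dataset` types, their fields, and the policy itself are hypothetical assumptions, not rules taken from the framework or from any regulation.

```python
from dataclasses import dataclass
from enum import Enum


class AccessMode(Enum):
    DENIED = "no access"
    SECURE_ENVIRONMENT = "analysis is brought to the data"
    EXPORT = "data is brought to the analysis"


@dataclass(frozen=True)
class Dataset:
    name: str
    contains_personal_data: bool  # hypothetical sensitivity flag


@dataclass(frozen=True)
class User:
    name: str
    authorized: bool      # has passed the organization's authorization step
    trusted_party: bool   # hypothetical flag for parties cleared to receive data


def decide_access(user, dataset):
    """Assumed policy: unauthorized users get nothing; sensitive data leaves
    the secure environment only for trusted parties; everything else may be
    exported to the analyst."""
    if not user.authorized:
        return AccessMode.DENIED
    if dataset.contains_personal_data and not user.trusted_party:
        return AccessMode.SECURE_ENVIRONMENT
    return AccessMode.EXPORT


# Hypothetical example: an authorized analyst who is not a trusted party.
analyst = User(name="analyst", authorized=True, trusted_party=False)
sales = Dataset(name="customer_sales", contains_personal_data=True)
mode = decide_access(analyst, sales)
```

Keeping the rule in one small function makes the policy easy to review and debate, which fits the emphasis on consensus and consent over absolute authority; a real deployment would also need logging and auditing so that misuse can be traced.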

5. Data Democratization Holistic Approach at B’SSADI GALLERIES

5.1. Case Study

Data will talk to you only if you are willing to listen. But not all enterprises are well positioned to leverage value from their data assets. In the following section, we first present the evaluation results in the context of the data management workshop conducted at B’SSADI GALLERIES. External triggers such as the recent COVID-19 pandemic and the ongoing Anglophone Crisis have exerted multiple pressures on the organization to digitize its business data. B’SSADI GALLERIES has embarked on a journey towards sweeping digital transformation with the intention of providing modern data analytic platforms for the “IT Team”. B’SSADI GALLERIES management indicated a strong desire to become a data-driven organization and was interested in understanding how best to structure their data analytic project. The researcher explained how a holistic approach would help in creating a more solid objectives and key results system for different departments and not just the “IT Team”. The B’SSADI GALLERIES workshop consisted of two primary activities: presentations and three focus group discussions. In the first part of the workshop (two full working days), we presented the data democratization holistic approach framework. The second part engaged participants in discussions about the state of adoption of data democratization principles in their own institutional contexts, followed by completion of a questionnaire.

The framework (Figure 1) proposes that a willing organization must intentionally develop its maturity across the six structuring elements in a synchronized way to unfold its data democratization potential. To assess the dimensions of the framework in a standardized and assessor-independent way, we formulated hypotheses and a questionnaire building on the definitions of each of the data democratization building blocks (see Section 3.3). The questionnaire describes scenarios for each maturity level and each assessed dimension. The questions aimed to discover where B’SSADI GALLERIES’s employees believed they stood in incorporating the various principles of democratizing data towards achieving their desired goal of becoming a data-driven organization. Each dimension is loaded with check questions reflecting its constituents. For example, the first dimension has the following required capabilities: “data ethics,” “business context” and “stakeholder value”.

Across the six dimensions, respondents rated B’SSADI GALLERIES data implementation on a scale of 1 (low or absent/ad hoc: the capability is addressed in an improvised, irregular way) to 5 (high: capability performance is regularly assessed to improve practice and manage risks) as defined in Table 3. The test questionnaire poses 20 questions separated into five different categories. That is, we applied the principles in Figure 1 and evaluated these using the criteria in Table 3. When assessing a project, one gives the project a score from one to five for each question, sums the scores, and divides the total by 100. Because 20 questions rated from one to five yield a total between 20 and 100, the result is a score between 0.2 and 1 that represents how data democratic the data analytics process is.
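
The scoring rule described above can be sketched in a few lines of Python. This is a minimal illustration rather than the instrument used in the workshop: the function name `democratization_score`, the constants, and the example ratings are hypothetical, and only the 20 questions, the 1 to 5 scale, and the division by 100 are taken from the text.

```python
from typing import Sequence

# Parameters taken from the text: 20 questions, each rated from 1 to 5,
# and the summed total divided by 100.
N_QUESTIONS = 20
MIN_RATING, MAX_RATING = 1, 5
DIVISOR = 100


def democratization_score(ratings: Sequence[int]) -> float:
    """Sum the per-question ratings and divide the total by 100."""
    if len(ratings) != N_QUESTIONS:
        raise ValueError(f"expected {N_QUESTIONS} ratings, got {len(ratings)}")
    for rating in ratings:
        if not MIN_RATING <= rating <= MAX_RATING:
            raise ValueError(f"rating {rating} is outside the 1 to 5 scale")
    return sum(ratings) / DIVISOR


# Hypothetical ratings for illustration only (not data from the case study):
# ten questions rated 1 and ten questions rated 2, so the total is 30.
example_ratings = [1, 2, 2, 1, 2, 1, 1, 2, 2, 1, 2, 2, 1, 1, 2, 1, 2, 2, 1, 1]
example_score = democratization_score(example_ratings)  # 30 / 100
```

With these parameters the lowest attainable value is 20/100 and the highest is 100/100, so a project rated mostly at maturity levels 1 and 2 lands near the bottom of the range.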

5.2. Case Study Analysis/Discussion

Maturity levels 1 (Absent/ad hoc) and 2 (Repeatable), as defined above, are mostly where the organization stands on its data-driven requirements. On the question of whether their approach ensured that data usage is optimized to effectively realize business objectives without running afoul of ethical data usage guidelines and regulations, the measured maturity is registered at level 1. We had likewise concluded from the previous focus group discussion that employees lack an understanding of what data ethics should consist of. Where systems are available, employees are not able to operate them independently beyond their standard functionalities to satisfy their information needs, e.g., analytical applications required for their daily work. The participants show from their maturity level selection that they do not have standardized capability or documented procedures for organizing, educating, and creating tools that enable non-experts to become involved in the process of governing with data. Yes, data democratization demands data access to and for everyone, but the process also requires intelligent checks and balances in the form of governance mechanisms. In addition, an important part of changing systems, which the data democratization approach advocates, is changing attitudes. Over 80% of the participants indicated that they are at either maturity level 1 (Absent/ad hoc) or 2 (Repeatable).

Table 3. Evaluation Dimensions (Source: Yanosky & Arroway [67] ).

Another hypothesis that we tested is that almost all data analytics projects require a collaborative, multidisciplinary team approach, which in turn needs training programmes to equip all data stakeholders with the multidisciplinary networks and skills required to develop and/or use data technologies. Participants indicated that the current training provision is not fit for purpose, with over 60% of the participants indicating that they are at maturity level 1 (Absent/ad hoc) and 25% at maturity level 2 (Repeatable).

Another enabling factor that the hypothesis questions tested is data culture. Besides formal processes, organizational structure and assigned responsibilities, the culture among the associates has a paramount influence on how the concept of data democratization is embraced in a data-driven decision-making organization. As with most of the other questions, B’SSADI GALLERIES as an organization is operating at maturity level 2, with most “decisions adapted case by case on the basis of personal knowledge or the intuition of the manager”. It must be emphasised that this is a preliminary analysis, and for formal recommendations, a much fuller and more systematic analysis would be required using the principles set out in Figure 1 and Table 3.

5.3. Strengths and Limitations

This study used a qualitative approach. Qualitative methods allowed us to explore and map the idea of data democratization in detail. Our mapping of the characteristics and enabling factors also has limitations, which need to be considered to qualify our results. First, the use of peer-reviewed articles and documents in English limited our attention to comparatively privileged voices in the discourse on data democratization in the digital economy. A more thorough mapping, which could capture alternative visions of democratizing data, would need to extend to other languages and materials. Second, research into human evaluation of the framework is piecemeal and non-systematic: it relies heavily on self-reported behaviors and beliefs, which are not always reliable.

6. Conclusion

In this paper, we raised the concept of data democratization as a holistic attitude of willing organizations focused on delivering trusted data for trusted insights to a broad range of authorized data stakeholders. An overall conceptual layered framework was proposed which aims to enable such a vision. The intention has been to create a framework that helps those wishing to understand, design, analyze or evaluate real-world sustainable data democratization by highlighting elements that should not be overlooked. It is not intended to be a step-by-step, how-to guide. The elements of the framework, outlined in Figure 2, were developed through the analysis of lived experience, interviews, workshops, and purposeful literature reviews. The framework proposed in this study supports a people-first approach, emphasizing the need for an organizational structure that encourages continual learning and democratizes access to and oversight of data. The framework can be considered a prescription that can narrow the gap between theory and practice, as all experts saw value in it. It provides clear methodological guidance on how to approach data democratization implementation projects comprehensively, and it is of practical value for organizations, as confirmed by the interviews and workshops. While it does not guarantee the success of a data democratization project, it provides a means to track progress and conduct data-driven innovation projects in a structured manner. It reinforces the conclusion that the problem is rarely a lack of data, but rather the availability of, and trust in, the data. As Lane [68] notes, insufficient protection of data privacy and confidentiality could lead to a loss of trust and a consequent disincentive to participate in sharing knowledge. For adequate maturity, further studies are needed to continue integrating the proposed conceptual framework with technical and procedural solutions that can promote democratic, trusted data sharing.

Conflicts of Interest

The authors declare no conflicts of interest regarding the publication of this paper.

References

[1] Gericke, K., Eckert, C. and Stacey, M. (2022) Elements of a Design Method—A Basis for Describing and Evaluating Design Methods. Design Science, 8, E29.
https://doi.org/10.1017/dsj.2022.23
[2] Bhattacharya, S., Hu, Z. and Butte, A.J. (2021) Opportunities and Challenges in Democratizing Immunology Datasets. Frontiers in Immunology, 12, Article ID: 647536.
https://doi.org/10.3389/fimmu.2021.647536
[3] Van Horn, J.D., Fierro, L., Kamdar, J., Gordon, J., Stewart, C., Bhattrai, A., et al. (2018) Democratizing Data Science through Data Science Training. Pacific Symposium on Biocomputing, 23, 292-303.
[4] Batarseh, F.A. and Yang, R. (2020) Data Democracy: At the Nexus of Artificial Intelligence, Software Development, and Knowledge Engineering. Academic Press, Cambridge.
[5] Baird, A. and Schuller, B. (2020) Considerations for a More Ethical Approach to Data in AI: On Data Representation and Infrastructure. Frontiers in Big Data, 3, Article No. 25.
https://doi.org/10.3389/fdata.2020.00025
[6] Awasthi, P. and George, J.J. (2020) A Case for Data Democratization. Proceedings of the Americas Conference on Information Systems (AMCIS), Salt Lake City, 10-14 August 2020, 23.
https://aisel.aisnet.org/amcis2020/data_science_analytics_for_decision_support/data_science_analytics_for_decision_support/23
[7] Gherardi, S. (2012) How to Conduct a Practice-Based Study: Problems and Methods. Edward Elgar Publishing Limited, Cheltenham.
https://doi.org/10.4337/9780857933386
[8] Farias, I., Blok, A. and Roberts, C. (2020) Actor Network as a Companion: An Inquiry into Intellectual Practices. In: Farias, I., Blok, A. and Roberts, C., Eds., The Routledge Companion to Actor Network Theory, Routledge, London, 20-35.
[9] Callon, M., Lascoumes, P. and Barthe, Y. (2009) Acting in an Uncertain World: An Essay on Technical Democracy (Graham Burchell, Transl.). The MIT Press, Cambridge.
[10] Latour, B. (2004) Politics of Nature: How to Bring the Sciences into Democracy. Harvard University Press, Cambridge.
[11] Law, J. (1999) After ANT: Complexity, Naming and Topology. The Sociological Review, 47, 1-14.
https://doi.org/10.1111/j.1467-954X.1999.tb03479.x
[12] Alcadipani, R. and Hassard, J. (2010) Actor-Network Theory, Organizations and Critique: Towards a Critique of Organizing. Organization, 17, 419-435.
https://doi.org/10.1177/1350508410364441
[13] Vaishnavi, V.K. and Kuechler, W. (2015) Design Science Research Methods and Patterns: Innovating Information and Communication Technology. CRC Press, Hoboken.
https://doi.org/10.1201/b18448
[14] Beck, R., Weber, S. and Gregory, R.W. (2013) Theory-Generating Design Science Research. Information Systems Frontiers, 15, 637-651.
https://doi.org/10.1007/s10796-012-9342-4
[15] Chun Tie, Y., Birks, M. and Francis, K. (2019) Grounded Theory Research: A Design Framework for Novice Researchers. SAGE Open Medicine, 7.
https://doi.org/10.1177/2050312118822927
[16] Tate, M., Furtmueller, E., Evermann, J. and Bandara, W. (2015) Introduction to the Special Issue: The Literature Review in Information Systems. Communications of the Association for Information Systems, 37, 1.
https://doi.org/10.17705/1CAIS.03705
[17] Webster, J. and Watson, R.T. (2002) Analyzing the Past to Prepare for the Future: Writing a Literature Review. MIS Quarterly, 26, 13-23.
[18] Levy, Y. and Ellis, T.J. (2006) A Systems Approach to Conduct an Effective Literature Review in Support of Information Systems Research. Informing Science, 9, 181-212.
https://doi.org/10.28945/479
[19] Cronin, P., Ryan, F. and Coughlan, M. (2008) Undertaking a Literature Review: A Step-by-Step Approach. British Journal of Nursing, 17, 38-43.
https://doi.org/10.12968/bjon.2008.17.1.28059
[20] Nandi, S., Hervani, A.A. and Helms, M.M. (2020) Circular Economy Business Models—Supply Chain Perspectives. IEEE Engineering Management Review, 48, 193-201.
https://doi.org/10.1109/EMR.2020.2991388
[21] Narayanan, V. and Armstrong, D.J. (2004) Causal Mapping for Research in Information Technology. IGI Global, Hershey.
https://doi.org/10.4018/978-1-59140-396-8
[22] Venable, J., Pries-Heje, J. and Baskerville, R. (2016) FEDS: A Framework for Evaluation in Design Science Research. European Journal of Information Systems, 25, 77-89.
https://doi.org/10.1057/ejis.2014.36
[23] Treuhaft, S. (2006) The Democratization of Data: How the Internet Is Shaping the Work of Data Intermediaries, Working Paper, No. 2006,03, University of California, Institute of Urban and Regional Development (IURD), Berkeley.
https://escholarship.org/uc/item/32961226
[24] Bellin, E., Fletcher, D.D., Geberer, N., Islam, S. and Srivastava, N. (2010) Democratizing Information Creation from Health Care Data for Quality Improvement, Research, and Education—The Montefiore Medical Center Experience. Academic Medicine, 85, 1362-1368.
https://doi.org/10.1097/ACM.0b013e3181df0f3b
[25] Marr, B. (2017, July 24) What Is Data Democratization? A Super Simple Explanation and the Key Pros and Cons. Forbes.
https://www.forbes.com/sites/bernardmarr/2017/07/24/what-is-data-democratization-a-super-simple-explanation-and-the-key-pros-and-cons/?sh=1a0241eb6013
[26] Cornelissen, J. (2018) The Democratization of Data Science. Harvard Business Review.
https://hbr.org/2018/07/the-democratization-of-data-science
[27] Zeng, J. and Glaister, K.W. (2018) Value Creation from Big Data: Looking inside the Black Box. Strategic Organization, 16, 105-140.
https://doi.org/10.1177/1476127017697510
[28] Hyun, Y., Hosoya, R. and Kamioka, T. (2019) The Moderating Role of Democratization Culture: Improving Agility through the Use of Big Data Analytics. Pacific Asia Conference on Information Systems, Xi’an, 8-12 July 2019.
[29] Mallik, P. (2019, July 18) Data Democratization. Towards Data Science.
https://towardsdatascience.com
[30] Pires, D.M. (2020, April 15) A Data Engineer’s Perspective on Data Democratization.
https://towardsdatascience.com/a-data-engineers-perspective-on-data-democratization-a8aed10f4253
[31] Lefebvre, H., Legner, C. and Fadler, M. (2021) Data Democratization: Toward a Deeper Understanding. Proceedings of the International Conference on Information Systems (ICIS), Austin, 12-15 December 2021.
[32] Hertzano, R. and Mahurkar, A. (2022) Advancing Discovery in Hearing Research via Biologist-Friendly Access to Multi-Omic Data. Human Genetics, 141, 319-322.
https://doi.org/10.1007/s00439-022-02445-w
[33] Choudhgurry, A. (2022) What Is Data Democratization? Definition and Principles.
https://amplitude.com/blog/data-democratization
[34] Hinds, T.L., Floyd, N.D. and Ueland, J.S. (2021) Policy and Praxis in Data Democratization Efforts: A Case Study of Minnesota State’s Equity 2030. New Directions for Institutional Research, 2021, 53-70.
https://doi.org/10.1002/ir.20352
[35] Samarasinghe, S., Lokuge, S. and Snell, L. (2022) Exploring Tenets of Data Democratization.
[36] Marinakis, V., Koutsellis, T., Nikas, A. and Doukas, H. (2021) AI and Data Democratisation for Intelligent Energy Management. Energies, 14, Article No. 4341.
https://doi.org/10.3390/en14144341
[37] Eichler, R., Gröger, C., Hoos, E., Schwarz, H. and Mitschang, B. (2022) Data Shopping—How an Enterprise Data Marketplace Supports Data Democratization in Companies. International Conference on Advanced Information Systems Engineering, Leuven, 6-10 June 2022, 19-26.
https://doi.org/10.1007/978-3-031-07481-3_3
[38] Eichler, G.S., Imbert, G., Branson, J., Balibey, R. and Laramie, J.M. (2022) Democratizing Data at Novartis through Clinical Trial Data Access. Drug Discovery Today, 27, 1533-1537.
https://doi.org/10.1016/j.drudis.2022.02.019
[39] Lewis, K., Pham, C. and Batarseh, F.A. (2020) Data Openness and Democratization in Healthcare: An Evaluation of Hospital Ranking Methods. In: Batarseh, F.A. and Yang, R.X., Eds., Data Democracy, Academic Press, Cambridge, 109-126.
https://doi.org/10.1016/B978-0-12-818366-3.00006-X
[40] Kuiler, E.W. and McNeely, C.L. (2020) Knowledge Formulation in the Health Domain: A Semiotics-Powered Approach to Data Analytics and Democratization. In: Batarseh, F.A. and Yang, R.X., Eds., Data Democracy, Academic Press, Cambridge, 127-146.
https://doi.org/10.1016/B978-0-12-818366-3.00007-1
[41] Minielly, N., Hrincu, V. and Illes, J. (2020) Privacy Challenges to the Democratization of Brain Data. iScience, 23, Article ID: 101134.
https://doi.org/10.1016/j.isci.2020.101134
[42] Yoder, R.T. (2019) Digitalization and Data Democratization in Offshore Drilling. Offshore Technology Conference (OTC), Houston, May 2019, OTC-29381-MS.
https://doi.org/10.4043/29381-MS
[43] DiChristopher, T. (2015) Oil Firms Are Swimming in Data They Don’t Use. CNBC.
https://www.cnbc.com/2015/03/05/us-energy-industry-collects-a-lot-of-operational-data-but-doesnt-use-it.html
[44] Husseini, T. (2018) Big Data in Oil and Gas Operations and Other Tech Advancements: Seven Expert Opinions. Offshore Technology.
https://www.offshore-technology.com/features/big-data-in-oil-and-gas-tech
[45] Fay, C. (2020) Perceptions of Community College Institutional Research Leaders on Data Democratization. Doctoral Dissertation, Northern Arizona University, Flagstaff.
[46] McLaughlin, R. and Young, C. (2018) Data Democratization and Spatial Heterogeneity in the Housing Market. In: Herbert, C., Spader, J., Molinsky, J. and Rieger, S., Eds., A Shared Future: Fostering Communities of Inclusion in an Era of Inequality, Harvard Joint Center for Housing Studies, Cambridge, 126-139.
[47] Grey, J. (2017) The Democratization of Data. Housing Wire.
https://www.housingwire.com/articles/40946-the-democratization-of-data
[48] Chandra, R., Swaminathan, M., Chakraborty, T., Ding, J., Kapetanovic, Z., Kumar, P. and Vasisht, D. (2022) Democratizing Data-Driven Agriculture Using Affordable Hardware. IEEE Micro, 42, 69-77.
https://doi.org/10.1109/MM.2021.3134743
[49] Labadie, C., Legner, C., Eurich, M. and Fadler, M. (2020) Fair Enough? Enhancing the Usage of Enterprise Data with Data Catalogs. 2020 IEEE 22nd Conference on Business Informatics (CBI), Antwerp, 22-24 June 2020, 201-210.
https://doi.org/10.1109/CBI49978.2020.00029
[50] Janssen, M., Van Der Voort, H. and Wahyudi, A. (2017) Factors Influencing Big Data Decision-Making Quality. Journal of Business Research, 70, 338-345.
https://doi.org/10.1016/j.jbusres.2016.08.007
[51] Wenger, E. (1998) Communities of Practice: Learning, Meaning, and Identity. Cambridge University Press, Cambridge.
https://doi.org/10.1017/CBO9780511803932
[52] Harland, T., Hocken, C., Schröer, T. and Stich, V. (2022) Towards a Democratization of Data in the Context of Industry 4.0. Sci, 4, Article No. 29.
https://doi.org/10.3390/sci4030029
[53] Ravindran, K. (2022) Microsoft. Scaling Digital Innovation with Responsible Data Democratisation.
https://www.youtube.com/watch?v=Vv8TRvGCOZc
[54] Rubeis, G., Dubbala, K. and Metzler, I. (2022) “Democratizing” Artificial Intelligence in Medicine and Healthcare: Mapping the Uses of an Elusive Term. Frontiers in Genetics, 13, Article ID: 902542.
https://doi.org/10.3389/fgene.2022.902542
[55] Shamim, S., Yang, Y., Zia, N.U. and Shah, M.H. (2021) Big Data Management Capabilities in the Hospitality Sector: Service Innovation and Customer Generated Online Quality Ratings. Computers in Human Behavior, 121, Article ID: 106777.
https://doi.org/10.1016/j.chb.2021.106777
[56] Lefebvre, H. and Legner, C. (2022) How Communities of Practice Enable Data Democratization inside the Enterprise. European Conference on Information Systems (ECIS 2022), Timisoara, 18-24 June 2022.
[57] Samarasinghe, S. and Lokuge, S. (2022) Exploring the Critical Success Factors for Data Democratization. Australasian Conference on Information Systems, Melbourne, 4-7 December 2022, 1-8.
https://arxiv.org/ftp/arxiv/papers/2212/2212.03059.pdf
[58] Ittner, C.D. and Larcker, D.F. (2003) Coming up Short on Nonfinancial Performance Measurement. Harvard Business Review, 81, 88-95.
[59] Leslie, D. (2019) Understanding Artificial Intelligence Ethics and Safety: A Guide for the Responsible Design and Implementation of AI Systems in the Public Sector. The Alan Turing Institute, London.
https://doi.org/10.2139/ssrn.3403301
[60] Vidgen, R., Shaw, S. and Grant, D.B. (2017) Management Challenges in Creating Value from Business Analytics. European Journal of Operational Research, 261, 626-639.
https://doi.org/10.1016/j.ejor.2017.02.023
[61] Davenport, T.H. and Mittal, N. (2020) How CEOs Can Lead a Data-Driven Culture. Harvard Business Review.
https://hbr.org/2020/03/how-ceos-can-lead-a-data-driven-culture
[62] Brown, S. (2020) How to Build a Data-Driven Company.
https://mitsloan.mit.edu/ideas-made-to-matter/how-to-build-a-data-driven-company
[63] LaValle, S., Lesser, E., Shockley, R., Hopkins, M.S. and Kruschwitz, N. (2011) Big Data, Analytics and the Path from Insights to Value. MIT Sloan Management Review, 52, 21-22.
[64] Gao, J., Wang, W., Zhang, M., Chen, G., Jagadish, H.V., Li, G., Ng, T.K., Ooi, B.C., Wang, S. and Zhou, J. (2018) PANDA: Facilitating Usable AI Development.
http://arxiv.org/abs/1804.09997
[65] Nagahawatta, R., Warren, M., Lokuge, S. and Salzman, S. (2021) Security Concerns Influencing the Adoption of Cloud Computing of SMEs: A Literature Review. Proceedings of the 27th Annual Americas Conference on Information Systems (AMCIS 2021), Montreal, 9-13 August 2021, 1-10.
[66] Wang, Y., Blobel, B. and Yang, B. (2022) Reinforcing Health Data Sharing through Data Democratization. Journal of Personalized Medicine, 12, Article No. 1380.
https://doi.org/10.3390/jpm12091380
[67] Arroway, P., Morgan, G., O’Keefe, M. and Yanosky, R. (2015) Learning Analytics in Higher Education. Research Report, ECAR, Louisville, CO, 17.
[68] Lane, J. (2022) A Vision for Democratizing Government Data. Issues in Science and Technology, 39, 84-88.

Copyright © 2024 by authors and Scientific Research Publishing Inc.

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.