Research Ethics

While personal preference can introduce bias, collaborating with stakeholders and incorporating their perspectives can help identify and mitigate potential biases in the research variables. Collaboration is one route to objectivity, curbing biases that may not be apparent to the researcher. It is also important to state that complete neutrality is impossible, because every researcher carries inherent biases; research methodologies serve the crucial purpose of mitigating those biases through a range of techniques.

To ensure a robust research design and analytical approach, researchers must not let personal preferences or preexisting notions influence their work, as these can introduce flaws. Although engagement and collaboration throughout a study are essential to ensure that those providing the data also benefit from it, communities are often excluded during the early phases, such as when hypothesizing about significant causal, nonlinear, time-delayed, or other systemic effects (Martin et al., 2020).

Researchers have an ethical responsibility to avoid situations that could compromise their impartiality. Communities should be engaged through interactive workshops, where their unique context, understanding, and lived experience help generate hypotheses that can then pass through cognitive modeling and serve as a framework for the study (Penn et al., 2013).

Examining an individual’s positionality (personality, identity, and experience), technological infrastructure, and institutions is a standard research practice for understanding how these factors influence results. Positivism, a philosophical approach emphasizing objective data collection and analysis through rigorous methods, is a valuable lens for research; however, it is essential to acknowledge that complete objectivity is difficult to achieve because of the inherent biases in any researcher (Torres Rodríguez et al., 2023). The quantitative method describes, measures, predicts, or explains numerical or nonnumerical data or phenomena through causal and statistical analysis.

Quantitative analysis draws on techniques such as statistical analysis (frequency, impact, regression models, and correlation), comparative analysis, surveys, and questionnaires. A further set of quantitative tools spans methods of measurement (such as index measures versus single indicators), research designs (including experimental setups), and data science methods (notably machine learning).

Assessment of Cybersecurity Risks for Mitigating Data Breaches in Business Systems

(Algarni et al., 2021)

This review evaluates a model that estimates the cost and the likelihood of a data breach within a year in the business sector. The data for this research come from an industrial business report, and the estimates were produced by stakeholders in data and cybersecurity using a mathematical model. The model shows the cost impact of security breaches and how to mitigate potential data breaches.
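To make the idea of a likelihood-and-cost model concrete, the sketch below uses the standard annualized loss expectancy (ALE) formulation from quantitative risk analysis, where expected yearly loss is likelihood multiplied by impact. This is a generic textbook construction, not the specific model from Algarni et al., and all figures are hypothetical.

```python
# Generic quantitative risk sketch (annualized loss expectancy), NOT the
# specific model from Algarni et al. All figures are hypothetical.
single_loss_expectancy = 150_000.0   # assumed cost of one breach, in USD
annual_rate_of_occurrence = 0.3      # assumed breaches expected per year

# Expected yearly loss from this risk before any mitigation.
annualized_loss_expectancy = single_loss_expectancy * annual_rate_of_occurrence

# A mitigation is worthwhile when its cost is below the loss it prevents.
mitigation_cost = 20_000.0           # assumed yearly cost of the control
residual_aro = 0.1                   # assumed breach rate after the control
net_benefit = (annualized_loss_expectancy
               - single_loss_expectancy * residual_aro
               - mitigation_cost)

print(annualized_loss_expectancy)    # 45000.0
print(net_benefit)                   # 10000.0: the control pays for itself
```

This style of calculation is what allows tangible breach costs to be compared directly against mitigation spending, which is the practical motivation the paper gives for quantitative estimation.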

Earlier research identified potential vulnerabilities by applying vulnerability discovery models to data from different databases (Alhazmi & Malaiya, 2008). Other researchers have evaluated the main vulnerability and breach discovery models using field data from different technology systems and infrastructures (Alhazmi & Malaiya, 2008). The varied benchmarks used in estimation processes by different bodies have created confusion, motivating the authors to develop a standardized, consolidated model.

A systematic and comprehensive estimation model of this kind should allow a reliable determination of the quantitative estimates. The hypothesis is to determine whether the model accurately reflects actual field data and whether the total cost of a data breach can be assessed based not only on the number of records involved but also on other factors, including tangible and intangible costs and the overall impact of the breach. This research used R-squared values to quantify the model’s performance, indicating how well the regression model fits the observed data.
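The R-squared check mentioned above is straightforward to compute by hand: it is the share of variance in the observed values explained by the model's predictions. A minimal sketch, with invented observed and predicted values:

```python
# Minimal sketch of R-squared: 1 - (residual sum of squares / total sum of
# squares). Observed and predicted values are invented for illustration.
observed  = [10.0, 12.0, 15.0, 18.0, 25.0]
predicted = [ 9.5, 12.5, 15.0, 19.0, 24.0]

mean_obs = sum(observed) / len(observed)
ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
ss_tot = sum((o - mean_obs) ** 2 for o in observed)

r_squared = 1.0 - ss_res / ss_tot
print(round(r_squared, 3))  # near 1.0 means the model fits closely
```

An R-squared close to 1.0 indicates the regression captures most of the variation in the field data, which is the validation criterion the paper relies on.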

Big data monetization throughout Big Data Value Chain: a comprehensive review

(Faroukhi et al., 2020)

This paper explores the benefits of data within the value chain of business processes. The value chain (VC) implies the end-to-end digitization and use of data from the organization down to the consumer level to provide actionable, valuable insight. Applying the value chain to big data, yielding the big data value chain (BDVC), comes with evolving challenges. The BDVC provides valuable diagnostic information about a business’s health from production to the consumer level, which helps predict losses and generate income.

Income generation is the fundamental goal of any business organization; however, finding a sustainable strategy to use data to harness and maximize business gains can be very challenging. The increasing pervasiveness of digital technology in our lives fuels the exponential growth of big data; this data originates from our computers, mobile phones, payment systems, and the myriad Internet of Things (IoT) in equipment and devices we use daily.

The hypothesis in this research aims to investigate the efficacy of value chain models in overcoming the challenges of value creation processes, especially in digitalization and Big Data, and to explore how these models facilitate the sustainable generation of new value and the improvement of operations within organizations.

The data collected from various devices and IoT sources are pre-transformed and loaded into a model that identifies trends, patterns, and correlations and creates actionable insights. The data analytics approach includes descriptive, diagnostic, and predictive analysis, followed by data visualization.
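The descriptive-to-predictive progression described above can be sketched in a few lines: summarize the raw readings, smooth them with a moving average to expose a trend, then extrapolate one step ahead as a naive forecast. The sensor readings here are invented for illustration.

```python
# Minimal sketch of the analytics pipeline described above: descriptive
# statistics over raw readings, a moving-average trend, and a naive
# one-step forecast. Readings are hypothetical IoT values.
from statistics import mean

readings = [20.1, 20.4, 21.0, 21.3, 22.0, 22.6]

# Descriptive: summarize the series.
overall_mean = mean(readings)

# Diagnostic/trend: a 3-point moving average smooths noise.
window = 3
moving_avg = [mean(readings[i:i + window])
              for i in range(len(readings) - window + 1)]

# Naive predictive step: extend the last smoothed slope one step ahead.
forecast = moving_avg[-1] + (moving_avg[-1] - moving_avg[-2])
print(round(forecast, 2))
```

Real BDVC pipelines replace the naive extrapolation with proper forecasting or machine-learning models, but the stages, describe, diagnose, predict, are the same ones the paper identifies.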

Evaluating Privacy-Preserving Machine Learning in Critical Infrastructures: A Case Study on Time-Series Classification

(Mercier et al., 2022)

This paper explores safeguarding machine learning models from hacking and cyberattacks. The applicability of machine learning in these contexts is limited by the need to protect privacy, and sensitive time-series data are under-represented in research because of privacy concerns. The paper aims to develop efficient privacy-preserving machine learning (PPML) methods suitable for time-series data, mitigating the privacy risks of deploying machine learning models in critical infrastructure domains.

The paper validates the methods on real time-series data to assess the effectiveness of existing encryption schemes for deep learning, shows how strongly differential-privacy guarantees depend on the dataset, and explores the wide-ranging applicability of federated methods. The growing need for automated decision-making emphasizes real-time protection of personal data and of the models and infrastructure that support it; building stakeholder trust and protecting infrastructure are essential to promoting business and avoiding costly data breaches.
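To give a feel for the differential-privacy mechanism mentioned above, the sketch below shows the classic Laplace mechanism, where noise scaled to sensitivity/epsilon is added to a query result before release. This illustrates the general building block only; the paper applies differential privacy during model training rather than to simple numeric queries.

```python
# Minimal sketch of the Laplace mechanism, a standard differential-privacy
# building block. This is illustrative only; Mercier et al. apply
# differential privacy during model training, not to simple queries.
import math
import random

def laplace_mechanism(true_value, sensitivity, epsilon, rng=random):
    """Return a differentially private version of a numeric query result."""
    scale = sensitivity / epsilon
    # Sample Laplace(0, scale) noise via inverse transform sampling.
    u = rng.random() - 0.5
    noise = -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))
    return true_value + noise

random.seed(0)  # fixed seed so the sketch is reproducible
# Hypothetical query: a count of 42 records, released with epsilon = 0.5.
private_count = laplace_mechanism(true_value=42.0, sensitivity=1.0, epsilon=0.5)
print(round(private_count, 2))
```

Smaller epsilon means larger noise and stronger privacy; the paper's observation that guarantees depend heavily on the dataset reflects exactly this privacy-utility trade-off.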

Providers and developers of contemporary AI systems are also legally obligated to maintain the confidentiality of user data (Al-Rubaie et al., 2019). Additionally, the evolving field of Privacy-Preserving Machine Learning (PPML) addresses issues related to the exposure and transfer of sensitive information during machine learning models’ training and inference phases (Fredrikson et al., 2015).

This research employs PPML to address data breaches that occur during the machine learning process, focusing specifically on safeguarding against the reconstruction of training data from a model’s weights through further model training. The University of California, Riverside (UCR) and the University of East Anglia (UEA), pioneers in classification archives and repositories for time-series datasets, provided the data for this research.

Conclusion

This paper has shown an approach to quantitative methods in the technology field and underscored the importance of objectivity and bias-free approaches, an integral aspect of every research endeavor.

1. Algarni, A. M., Thayananthan, V., & Malaiya, Y. K. (2021). Quantitative assessment of cybersecurity risks for mitigating data breaches in business systems. Applied Sciences, 11(8). MDPI AG. https://doi.org/10.3390/app11083678

2. Alhazmi, O. H., & Malaiya, Y. K. (2008). Application of vulnerability discovery models to major operating systems. IEEE Transactions on Reliability, 57(1), 14–22.

3. Al-Rubaie, M., & Chang, J. M. (2019). Privacy-preserving machine learning: Threats and solutions. IEEE Security & Privacy, 17(2), 49–58.

4. Faroukhi, A. Z., El Alaoui, I., Gahi, Y., & Amine, A. (2020). Big data monetization throughout Big Data Value Chain: A comprehensive review. Journal of Big Data, 7(1). Springer Science and Business Media LLC. https://doi.org/10.1186/s40537-019-0281-5

5. Fredrikson, M., Jha, S., & Ristenpart, T. (2015). Model inversion attacks that exploit confidence information and basic countermeasures. In Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security (pp. 1322–1333).

6. Martin Jr., D., Prabhakaran, V., Kuhlberg, J., Smart, A., & Isaac, W. S. (2020). Participatory problem formulation for fairer machine learning through community-based system dynamics. arXiv: 2005.07572. https://doi.org/10.48550/arXiv.2005.07572

7. Mercier, D., Lucieri, A., Munir, M., Dengel, A., & Ahmed, S. (2022). Evaluating privacy-preserving machine learning in critical infrastructures: A case study on time-series classification.

--

Issah Musah

My experience ranges across Engineering, Process Control, and Data Science. It’s my pleasure to explore quantitative approaches to resolving enterprise issues.