
Almost any organization must handle massive volumes of data carefully when tapping its value for decision-making. This paper explores how the big data found in organizations should be used so that it becomes an essential input to decision-making (Sill 2016). It explains how data should be handled within the shortest time possible to support real-time, intelligent decisions, and describes a framework that the top managers of America's Community Bankers (ACB), in collaboration with their junior employees, should implement when handling data so that the resulting decisions give the firm an added competitive advantage.

Data Gap Analysis

Background

America's Community Bankers is a community initiative that represents the grievances of community banks and savings institutions to the relevant authorities. It was formed in 1992 when the National Council of Community Bankers merged with the U.S. League of Savings Institutions (America's Community Bankers 2007). Initially known as Savings and Community Bankers of America, it adopted the name America's Community Bankers in 1995 (America's Community Bankers 2007). It mainly operates through community banks owned by American citizens. The primary issue the organization must now address is understanding the financial needs of American residents and businesspeople in the areas where its member banks operate.

Data sources and datasets available for America's Community Bankers

Research Information Systems (RIS), which hosts many digital libraries, is a significant data source for America's Community Bankers. It has software that enables data exchange through programs such as the ACM portal and SpringerLink. Another source of data, known as SOD, simplifies the tedious task of selecting data and employs a seismology-based method of data selection (García et al. 2017). CALC is a dataset available to America's Community Bankers that allows data to be combined.

Table 1. Key financial and non-financial types of data in the case study

No | Data source | Associated dataset | Financial or non-financial | Business units or departments using the data
1 | NAMEHCR | RIS | Financial | Bank holding companies
2 | CB | CALC | Financial | Community banks
3 | Max deposits | SOD | Financial | Bank headquarters

Data integrity and current or potential gaps in data analytics and data protection

Data integrity is a mandatory requirement for America's Community Bankers in adhering to its ethical obligations. For any organization to maintain its integrity and standing, it must have the right business model for data protection and privacy (Pullonen et al. 2019). Staff are required to respect the privacy of American bankers and never use their information without consent.

Figure 1. Mapping between business functions and data sources (Organization → Data Sources → Business Functions)

Table 2. The identified gaps in data management

No | Data protection/ethics requirement | Procedures to be implemented in America's Community Bankers | Relevant data protection standards | References to literature
1 | Consent | Seek consent to use an individual's data. | Seeking explicit consent before using someone's data. | Principles of data management: facilitating information sharing, Edinburgh, Scotland
2 | Transparency | State how the data will be used and who will use it. | Publishing a statement covering the transparency information. | Cloud, data, and business process standards for manufacturing, Standards Now column, IEEE Computer Society (Sill 2016)
3 | Physical and IT security | Take security measures to prevent data from reaching unauthorized hands. | Installation of powerful, secure software. | Cloud, data, and business process standards for manufacturing, Standards Now column, IEEE Computer Society (Sill 2016)

Recommendations for America's Community Bankers' data analytics processes

Reorganization of the current data-driven strategies to streamline and enhance the data analytics and decision-making.

Data analysis and decision-making can be enhanced by first defining the questions and ensuring they are the right ones. The organization should ensure the questions are concise, clear, and measurable (Cleff 2021), and designed to give a high chance of finding solutions within the shortest time possible. The organization should also establish appropriate methods for analyzing and interpreting the data (Janssen et al. 2017). After the analysis, the results should be interpreted so that they answer the original questions and withstand any objections that may arise.

Table 3. Proposed data analytics

No | Data source | Specific organizational decisions | Decision type
1 | General public | Election of officials; policymaking | Operational
2 | Meeting minutes | How the organization will be governed | Strategic
3 | Program operations | Devising suitable production processes | Tactical

Roadmap to the development or enhancement of extensive data infrastructure

Firstly, America's Community Bankers should evaluate the nature and form in which its data exists and identify which elements it needs for its system to work properly. The organization should then diagnose the actions and decisions it plans to take (Chen et al. 2021), evaluating whether those planned decisions and steps are effective and what barriers might exist. The last step is to strategize by identifying the plans, actions, and new approaches it will adopt.

Table 4. Data analytics implementation process

No | Phase of the extensive data analytics process | Activities to be implemented in America's Community Bankers
1 | Contextual | Improving services to customers
2 | Diagnostic | Creating a good perception of the organization in the community
3 | Evaluative | Identifying ways to achieve objectives
4 | Strategic | Identifying more efficient and effective ways to achieve objectives

Compliance aspects of the proposed changes in data analytics

America's Community Bankers should adopt data protection policies and maintain ethical standards when handling data so that the organization complies with the law. It should ensure that it adheres to the rules set out in the General Data Protection Regulation (GDPR).

Table 5. Data protection and ethical compliance

No | Data protection/ethics requirement | Procedures to be implemented in the chosen organization or project | Relevant data protection standard | References to literature
1 | Accountability and transparency | Provide information to individuals in plain language; the organization should ask the individual whose data is being used for permission. | | Cloud, data, and business process standards for manufacturing, Standards Now column, IEEE Computer Society (Sill 2016)
2 | Lawful grounds for processing | Personal data should only be used on lawful grounds (Jin and Kim 2018). | GDPR | Cloud, data, and business process standards for manufacturing, Standards Now column, IEEE Computer Society (Sill 2016)

How the processed data can be used in the organizational decision-making

Table 6. Supported business decisions

Business decision | Decision type
Top management will identify all the requirements of each task so that the cost of production is well managed; identifying the most suitable and relevant software. | Strategic
Selecting the most suitable equipment to be used within the organization; routing and batching. | Tactical
Devising ways to improve employees' skills and to train new employees; implementing the most suitable and cost-effective production methods (Shamim et al. 2019); establishing the most effective and efficient working hours for each level of employee (Shi et al. 2021). | Operational

For data management to be effectively achieved, an organization can apply the operational strategy. This can be accomplished by devising ways to improve employees' skills and by training new employees. In this way, the company ensures that its established data-management methods are maintained and updated in line with technological advances, and training brings new employees' big data management skills up to the company's standards.

Executive Summary

Big data has proliferated in the contemporary business environment. In the past, this information did not help organizations much; with technological advancement, however, and with the growth of online databases and information processing software, big data now drives crucial decisions and innovation. Notably, a firm's ability to use big data effectively to understand current trends and change its operations depends on that company's willingness to examine the existing data and use corporate performance management (CPM) software to organize, filter, analyze, and interpret it, among other things (Pulakos, 2004; Corporate performance management, n.d.). CPM is essential in this regard because it is cheap, accurate, and customizable. In this paper, the author discusses the Independent Community Bankers of America's use of data about commercial bank loans to American businesses and citizens to streamline its operations through effective decision-making. The paper also critically discusses the confidence interval, the Chi-square test, and ANOVA as specific examples of inferential statistics that corporate organizations can use for decision-making (Tabachnick and Fidell, 2007; Sharpe, 2015). The decision the author chose pertains to the Independent Community Bankers of America's business expansion. Since the outcomes show a positive trend in loan-taking, the recommendation is that the company focus on loan-related services to individual and institutional customers.

Data Preparation Process

For an organization to succeed in today's fast-paced environment, it must make strategic decisions in a timely and efficient way. Leaders need reliable, actionable data to make these decisions and inspire organizational growth and development (Nakagawa and Cuthill, 2007). Financial institutions hold significant customer data, but most do not use this information properly or to their advantage (Nick, 2007). Companies' struggle to convert data into critical business leads lies in their inability to organize and prepare the existing data. Fortunately, a variety of software tools on the market can help companies organize their data; most contain procedures for collecting, filtering, and preparing datasets.

Notably, corporate performance management (CPM) is the umbrella term describing all pieces of software that companies can use to automate their business operations. Specifically, CPM automates such core business functions as budgeting and forecasting, thereby enabling organizations to execute strategic goals promptly (Integrated business intelligence and performance management: What you need to know, n.d.; Corporate performance management: A tool for formulating organizational strategies, n.d.). It allows companies to define and analyze strategic options before implementation to ensure that the recommended course of action agrees with company finances, business model, operational approaches, and key performance indicators. Any existing CPM software will help the user collect, filter, and prepare data for analysis (Corporate performance management, n.d.). CPM saves the company time and money by preventing employees from entering information manually and tracking it using less specialized software.

The continued use of CPM can reduce budgeting, analytics, forecasting, and planning time by 50 to 70 percent. It also reduces the time needed for data preparation, report production, and performance assessment. Understanding descriptive statistics adds value to business processes and makes big data more meaningful to the company (Fisher and Marshall, 2009; Du Prel et al., 2009). Once the CPM software is up and running, no major changes are needed to keep it functioning: users input daily figures during their usual working hours, and the software organizes the data and generates graphs or reports whenever needed. Companies such as the Independent Community Bankers of America hold information about banks and their clients; using CPM, the company can easily manipulate the available data to see trends over the years and predict possible future performance.

Rapid consolidation of data is also possible within any organization that has integrated CPM into its operations. During the life of a company, specific changes may occur that significantly disrupt its data. With CPM, however, such transitions are seamless, and the company need only focus on delivering on its promise to customers and maintaining its bottom line. The powerful analytics of enhanced CPM software promote data-driven decision-making, which was once a difficult and costly undertaking (Data modeling: Conceptual, logical, physical, data model types, n.d.; Data-driven decision-making: Succeed in the digital era, n.d.). Companies in the past kept disorganized data and hardly used it for business improvement because no CPM existed to simplify the job. Rather than spend significant time and resources cleaning historical data and analyzing it with less sophisticated tools to find trends, these companies preferred trial and error and based forward decisions on the CEO's or top managers' recommendations.

Big data has helped the Independent Community Bankers of America understand much important financial data in the United States. For example, the organization knows the amounts of business loans extended by American commercial banks to the public from as far back as the Great Depression of the 1930s. Its CPM has organized this data so carefully that one can see specific trends in how Americans have borrowed over the years. A useful feature of this arrangement is that one can select any data points and examine them as a focused case study. For example, the data on business loans in the United States by month between 2010 and 2020 shows a general upward trend in loans extended to the public, as illustrated in Figure 1. This graph makes it evident that more American people and companies are taking loans, and financial institutions could leverage the same data to improve performance.

Figure 1: Changes in loaned amount (in billion dollars) between 2010 and 2020

The data is representative of the American population because it is drawn from all American commercial banks and therefore includes information about all loans extended to individuals and businesses throughout the United States. Its main disadvantage is that it does not identify the specific individuals or groups that took the most loans, so it is difficult to target particular borrowers with good credit histories; that decision would require another dataset on individual creditworthiness. However, the data presented here is highly generalizable because it is derived from a very large sample; indeed, the sample is almost the same size as the population.

Data Modeling

Inferential Statistics

Data is useful only when knowledgeable, skilled individuals analyze it and interpret the results meaningfully. One way the Independent Community Bankers of America can add value to its data about the amounts of loans borrowed by American citizens and businesses is to compute a confidence interval. Doing so will help the company gain insight into probable loaning trends in the coming years (Marshall and Jonker, 2010; Larson, 2006). The confidence interval gives a clear picture of how much confidence one can place in a given estimate. It is obtained from the mean and the standard deviation, which describe the general characteristics of a given dataset. Mathematically, the confidence interval CI is given by:

CI = x̄ ± z × (s / √n)

where x̄ is the sample mean, z the critical value for the chosen confidence level, s the sample standard deviation, and n the sample size. Advanced CPM software can calculate the confidence interval without the user needing to know this formula (Hazra, 2017; Bonett, 2006). In Microsoft Excel, one can determine the 95 percent confidence interval using the alpha value (0.05), the standard deviation, and the number of observations. For example, the data that the Independent Community Bankers of America holds about loans borrowed in each of the 12 months of 2020 (Figure 2) yields the confidence interval shown in Figure 3.
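The same calculation can be sketched in a few lines of Python. The monthly figures below are hypothetical stand-ins, not the actual values plotted in Figure 2; the formula is the standard 95 percent confidence interval with z = 1.96.

```python
import math

# Hypothetical monthly loan totals (billion USD) -- illustrative only,
# not the actual figures shown in Figure 2.
loans = [2550, 2580, 2610, 2700, 2850, 2790, 2740, 2720, 2705, 2695, 2680, 2670]

n = len(loans)
mean = sum(loans) / n
# Sample standard deviation (n - 1 in the denominator)
sd = math.sqrt(sum((x - mean) ** 2 for x in loans) / (n - 1))

z = 1.96  # critical value for a 95 percent confidence level
margin = z * sd / math.sqrt(n)

print(f"mean = {mean:.1f}, 95% CI = ({mean - margin:.1f}, {mean + margin:.1f})")
```

Excel's CONFIDENCE function returns the same margin term, which is then added to and subtracted from the sample mean.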

Figure 2: Loan amounts in each month of 2020 (in billion dollars)
Figure 3: Confidence interval of loan range

The implication of the confidence interval is that American commercial banks could lend out about 2716 billion dollars in the following months of 2021, plus or minus 141 billion dollars. The interval is expected to contain the true mean, but there is a 5 percent chance that the upcoming loan amounts will not fall within this range.

The Chi-square test is another inferential statistic that one could derive from the data presented by the Independent Community Bankers of America. It is effective and reliable for testing categorical variables. According to McHugh (2013) and Franke, Ho, and Christie (2012), the mathematical notation of the Chi-square statistic χ² is:

χ² = Σ (O − E)² / E

where O is the observed value and E the expected value. Existing CPMs can calculate the Chi-square value and determine how categorical variables relate, which is useful in hypothesis testing and formulation (Satorra and Bentler, 2010; Franke, Ho, and Christie, 2012). In Excel, a longer process is needed to arrive at the same answer, since one has to calculate the expected and observed values manually. For example, one can hypothesize that the loans that Americans received in each month of 2019 differed significantly from their 2020 receipts. The data from the Independent Community Bankers of America provides this information for each month of 2019 and 2020.

Figure 4: Monthly loan amounts for 2020 and 2019

The expected loan amount for each month is obtained by multiplying the category column total by the result of dividing the category row total by the total sample size. Since the category row total and the total sample size are the same in this scenario, the expected loan amount equals the category column total multiplied by one. As seen in Figure 4, the expected value for January 2020 (based on the 2019 results) was 2328.1491, while the observed value was 2359.0658. Each month's contribution to the Chi-square statistic is found by subtracting the expected value from the observed value, squaring the result, and dividing by the expected value; summing these contributions gives χ². The statistic suggests that some association exists between the 2019 and 2020 monthly loan amounts extended by US commercial banks to individuals and organizations.
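The per-cell calculation described above can be sketched in Python. Only the January figures quoted in the text are used here; a full test would sum the contributions for all twelve months from Figure 4.

```python
def chi_square(observed, expected):
    """Sum of (O - E)^2 / E over all category cells."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# January values quoted in the text; the other months would come from Figure 4.
obs_jan = 2359.0658   # observed 2020 loan amount (billion USD)
exp_jan = 2328.1491   # expected value based on the 2019 figures

jan_contribution = (obs_jan - exp_jan) ** 2 / exp_jan
print(f"January contribution to chi-square: {jan_contribution:.4f}")
```

The resulting statistic would then be compared against a chi-square critical value with the appropriate degrees of freedom to decide whether the 2019 and 2020 distributions differ significantly.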

Lastly, the analysis of variance (ANOVA) is also an important inferential statistic that could be calculated from the Independent Community Bankers of America's data about the amounts of loans commercial banks have extended to Americans over the years. ANOVA is important in statistical analysis because it splits variability into systematic and random factors (Rutherford, 2001; Blanca et al., 2018). Since it calculates the treatment-level means and examines their similarity to the overall mean, it provides crucial information about the statistical differences between them (Plonsky and Oswald, 2017; Breitsohl, 2019). Figures 5 and 6 below summarize the analysis of variance for the 2020 data on loans offered by US commercial banks to individuals: Figure 5 shows the single-factor ANOVA, and Figure 6 the two-factor ANOVA without replication.

Figure 5: ANOVA single-factor
Figure 6: ANOVA two factors without replication

Since the F-value is significantly large in both the single-factor and two-factor ANOVA, the variability between group means in the loan data between January and December 2020 is large relative to the variability within groups. Usually, the P-value indicates the strength of the evidence for or against the null hypothesis: if it is low, strong evidence exists against the null hypothesis. However, this information was not captured in the ANOVA here because the author did not state the null and alternative hypotheses before examining the data.
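The one-way F statistic behind Figure 5 can be computed directly: the mean square between groups is divided by the mean square within groups. The quarterly figures below are hypothetical stand-ins for the 2020 monthly data, grouped three months per quarter.

```python
def anova_f(groups):
    """One-way ANOVA F statistic: between-group vs within-group variance."""
    k = len(groups)                              # number of groups
    n = sum(len(g) for g in groups)              # total observations
    grand_mean = sum(sum(g) for g in groups) / n
    # Sum of squares between groups, weighted by group size
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Sum of squares within groups
    ss_within = sum(sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within

# Hypothetical quarterly loan totals (billion USD) -- illustrative only,
# not the actual 2020 figures summarized in Figure 5.
q1 = [2350, 2360, 2370]
q2 = [2500, 2520, 2510]
q3 = [2700, 2710, 2690]
q4 = [2650, 2660, 2640]

f = anova_f([q1, q2, q3, q4])
print(f"F = {f:.2f}")
```

A large F relative to the critical value for (k − 1, n − k) degrees of freedom indicates that at least one group mean differs from the others.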

Justification of the Inferential Statistical Model

The use of the confidence interval, Chi-square, and ANOVA is justified by their significance to the study. It is difficult or impossible to make a statistical comparison without what these inferential statistics provide. In big data analysis, knowledge of the important inferential statistics ensures that one makes the right decision based on the available data (Mishra et al., 2019). In smaller datasets such as the one provided in this case study, the confidence interval illustrates the spread of the values, making it easy for the user to approximate the range of the results and the likelihood that a given value will fall within the desired range; it also helps predict future values in forecasting scenarios. The Chi-square test, on the other hand, provides information about the relationship between categorical variables. It is one of the most important inferential statistics for testing or formulating a hypothesis (Kaur, Stoltzfus, and Yellapu, 2018), showing the relationship between the different categorical values in a given dataset. Lastly, the analysis of variance (ANOVA) examines how the dataset's variables, both dependent and independent, relate to each other (Faraway, 2002). This examination of the provided dataset shows that critical decisions become possible when the corporation understands what its big data means and how it is changing. The Independent Community Bankers of America can use its data to determine better ways of extending loans and to profit from its huge data. Table 6 summarizes this information.

Table 6. Comparison of different inferential statistical models

No | Model or test | Advantages | Limitations
1 | Confidence interval | Provides information about the range wherein the true value lies, with a given degree of probability and direction. | Does not generate a single mean estimate; instead it produces lower and upper limits around the average to describe the range.
2 | Chi-square | Easy to compute and usable with categorical data. | As with many statistical measures, it is meaningless unless the person understands how to interpret it.
3 | ANOVA | Provides an overall test of the equality of group means and can control false-positive findings. | One-way ANOVA is usable only with a single factor and a single dependent variable; although it can tell whether significant differences between means exist, it cannot pinpoint which pairs differ.

Initial Outcomes of the Inferential Analysis

The initial outcomes of the inferential analysis show that the Independent Community Bankers of America can use the data in its proposal about the amounts of loans extended to American people and businesses by American commercial banks to organize its services and benefit from them. For example, the trends show that the amount of loans American people and businesses have taken over the years has increased significantly and will likely continue to do so in the coming years. Theoretically, therefore, companies offering financial services such as loans are likely to continue enjoying good business well into the future (Asadoorian and Kantarelis, 2005; Big data: How could it improve decision making within your company? 2017). The only uncertainty is that there is no confirmation of the most effective or reliable lender; that information could only be obtained from other datasets on the market.

The data used in this presentation indicates that Americans will continue to take loans, even though it does not specify the preferred amounts or the people most likely to borrow. Because the data is highly representative, it is highly generalizable back to the population from which it is drawn (Allua and Thompson, 2009; Amrhein, Trafimow, and Greenland, 2019). This generalizability means the information applies to almost all Americans, though in slightly different ways, so it is the user's responsibility to determine how best to apply it for maximum benefit. For example, although the data shows that Americans have been taking increasing amounts of loans yearly, it does not specify which Americans pose the least risk through timely repayments. The company will thus need to examine individual loan applicants critically before approving any payments to them.

Further Visualisation and Interpretation of Results as Expected in Business Reporting

Further visualization of the data is possible depending on the user's needs. It can take the form of tables, diagrams, and graphs that provide the most accurate representation of the data. Visualization of statistical data is an important part of understanding big data and making sense of it: it simplifies and summarizes the information for easy consumption and decision-making. Most importantly, visualized data is easy to consume and engaging to use. The data used in this example do not provide all possible scenarios for the application o
