Disclaimer: This is an example of a student written essay.


Big Data Analytics in Healthcare, Its Potential and Challenges

Published: 18th May 2020 in Information Technology



Objective/Purpose: This paper describes the promises, potential and challenges of big data analytics in healthcare. As healthcare generates more and more data, big data analytics opens up vast possibilities.
Method: I conducted a literature review using thematic analysis, focusing on categorising the primary use cases of big data analytics in healthcare and the main blocking factors for those use cases.
Results: The paper shows that the applications of big data analytics, its architectural framework and its implementation in healthcare offer considerable opportunities to improve the quality of patient care by reducing errors and the cost of healthcare.
Conclusion: The prospects and opportunities of big data analytics in healthcare are tremendous; however, challenges remain to be resolved.

Keywords: Big data; Analytics; patient care; applications; methodology; healthcare; personalised medicine


Big Data Analytics in Healthcare, Its Potential and Challenges


The term big data is new to the 21st century, and a variety of definitions are associated with the term. Most definitions view it as data that is complex, large, fast and changing, necessitating new mechanisms for storage, analysis and visualisation. Healthcare is one of the sectors to which the four Vs of big data, that is, volume, variety, velocity and veracity, apply because of the data it produces. Healthcare data is widely distributed across the system, among insurers, researchers, government agencies and other interest groups. These entities store the data in databases in a variety of ways, which makes the data difficult to categorise.


Notwithstanding the inherent complexities, siloed healthcare data holds potential and promise. Big data techniques can be applied to the data to derive critical knowledge that is essential to the development of the healthcare sector. According to research by the McKinsey Global Institute, if the U.S. were to harness the potential of big data, it would generate value of over $300 billion every year (Manyika et al., 2011). One of the key benefits that the U.S. has missed is the reduction of healthcare expenditure. Traditional healthcare has been based on physiological changes observed through a single modality of data, which can be limiting and flawed. Although the approach is fundamental to the understanding of diseases, it does not recognise the interconnectedness and variations that play a role in healthcare (Celi, Mark, Stone, and Montgomery, 2013). The medical field has only started to integrate digital capabilities in data processing, storage and communication in the last decade. The new era has made it possible for extensive data to be processed and stored. However, despite the implementation and use of digital equipment in healthcare, the data collected have not been utilised to yield real benefits.

Problem Statement

There is a large body of evidence-based research on the application of big data analytics to various healthcare processes. Each of these studies offers fundamental insight into the potential and challenges of big data analytics in its distinct context of application. Their findings are specific to the conditions under which the research was conducted. Also, while each study identifies the potential of its big data application, it fails to place that potential in the context of the healthcare system as a whole. Therefore, they all fail to point out the overall promises of big data analytics in healthcare. Hence, there is a need for research focusing on the overall promises, potential and challenges of big data analytics in healthcare.

Purpose of the Study

The purpose of the systematic analysis is to establish common findings on the promises, potential and challenges of big data analytics across the whole healthcare system. It will summarise the results of evidence-based research in a single reference document. The research will also contribute to knowledge of the implementation and application of big data analytics in disease detection, patient care and treatment.

Objectives of Research

The paper seeks to identify the promises, potential and challenges of big data analytics in the healthcare system.

Research questions

  1. How has big data analytics been implemented and applied in disease treatment and prevention, patient care, as well as in research and development?
  2. What are the challenges that face big data analytics in healthcare?
  3. What are the promises and potential of the application and implementation of big data analytics in healthcare?


The paper conducted a literature review using thematic analysis, focusing on the categorisation of the primary use cases of big data analytics in healthcare and the main blocking factors for those use cases. Each case was analysed with regard to the application of big data analytics, the promises it presents, its potential and the challenges likely to be faced. The case categories were medical image analysis, cancer, sickle cell anaemia, telemedicine and HIV/AIDS. The findings were analysed based on the results of the articles in each case. Each finding was presented as a promise, a potential or a challenge. The findings informed the conclusions and recommendations about the application of big data analytics in healthcare.

Literature Review

Big Data Analytics

Figure 1 presents big data analytics as the science of collecting, analysing and presenting extensive data sets to reveal the trends, patterns and relationships in the data (Pouyanfar, Yang, Chen, Shyu, & Iyengar, 2018). The overall goal of big data analytics in healthcare, as shown in figure 2, is to improve treatment and care and lower their cost. Conventional databases have become inefficient at managing data because of the massive volumes generated by advances in technology. According to Kubick (2012), such databases lack the capacity and capability to capture and process data in real time. With a few databases holding dozens of petabytes of data that is increasing exponentially, limited storage space makes the data challenging to process. Baseman et al. (2017) observed that storing, searching and analysing data in databases is difficult using conventional techniques. Therefore, extensive data analytics requires advanced methods such as real-time analysis and visualisation.

Figure 1: The workflow of big data analytics

Figure 2: Primary objectives of big data analytics in healthcare

Features of Big Data

Big data is data that is large in amount and diverse. Advanced architectures, tools and analytics create significant value, which makes big data analytics essential to modern data analysis. A paper by Raghupathi and Raghupathi (2014) reveals the importance of data integrity because of its impact on decision making. Big data has four commonly identified features: volume, variety, velocity and veracity, as depicted in figure 3 below. According to McAfee and Brynjolfsson (2012), digital technologies have contributed extensive data, and the amount is continuously increasing. They also noted that the data come from many sources and are therefore of different types and uses. Digital technologies and the variety of data sources combine to increase the speed at which data is created, and fast techniques are therefore needed to gain meaningful insight from it (Oussous, Benjelloun, Ait Lahcen, & Belfkih, 2017). These features are an essential consideration in big data analytics for decision making.

Figure 3: The Four Vs of big data analytics

Big Data Analytics in Healthcare

  1. Medical Image Analysis

Aiello, Cavaliere, D’Albore, and Salvatore (2019) reviewed the challenges of medical imaging. Medical imaging is essential for anthropometry and simulation, which depend on big data analytics. They found that diagnostic imaging data is not managed well enough to realise its full potential. Figure 4 below shows the bioimaging workflow needed for beneficial big data analytics. Notably, there is a large amount of uncontrolled imaging data from autonomous and monolithic sources, which reduces the potential of big data analytics. According to Lambin et al. (2017), images lack the high-level representation needed for the fields of radiomics and anthropometrics. It then becomes difficult to apply predictive and pattern recognition models. As a result, there is not enough credible bioimaging data, which can lead to misleading decisions. The authors identified the challenge as the lack of a centralised data sharing centre. They suggested a need for controlled decision protocols that aim to contribute to and improve big data science.

Figure 4: Adapted from Aiello, Cavaliere, D’Albore, and Salvatore, (2019)

  2. Cancer and Big Data Analytics

An article by Chang (2018), reporting for Weill Cornell Medicine Research Center, shows how Olivier Elemento uses big data analytics and high-performance computing for cancer prevention, diagnosis, treatment and cure. According to Chang (2018), there are over 100 billion cells in a tumour, each with a distinct mutation process. The ability of these cells to mutate means the disease constantly changes, evolves and adapts. It is fundamental for researchers to know the genetic make-up of cancerous cells. The more frequently these tumours are analysed, the closer researchers come to understanding them. The continuous measurement of tumours produces data that is critical to Elemento’s research. The researcher uses big data generated over time to identify cancer genomes, understand their changes and use the information to discover treatments.

Building on this, the lab has developed a diagnostic model for thyroid cancer (Chang, 2018). It used available data to identify patterns in cancer cells, applying machine learning algorithms to find the patterns. Elemento’s lab is also developing a database that will host cancer genome mutations from its research. The database offers future potential for applying big data analytics in cancer treatment. Capobianco (2017) also saw big data analytics as a promising area of research and development because of its application in precision oncology. Data on regions of interest and genomics has advanced the understanding of cancer like never before. It has increased the possibility of early cancer detection and personalised treatment, and hence makes cancer more treatable.

Despite the benefits, promises and potential of big data analytics, there are inherent challenges. Large volumes of genome data are involved in the research (Bates, Saria, Ohno-Machado, Shah, and Escobar, 2014). It becomes a challenge to analyse about 3 billion base pairs per person in a study that needs thousands of samples. With such massive data, a single analysis takes many days, weeks or months to complete. If the researcher reduces the data to shorten the analysis time, information loss is high and may result in misinterpretation. The whole process can be frustrating for researchers.
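The scale behind this challenge can be made concrete with a rough calculation. The sketch below is illustrative only. A human genome has roughly 3 billion base pairs, but the 2-bit encoding per base, the cohort size of 1,000 samples and the function name are assumptions made for this example. Real sequencing files are far larger, because they store many overlapping reads together with quality scores.

```python
BASE_PAIRS_PER_GENOME = 3_000_000_000  # approximate size of one human genome
BITS_PER_BASE = 2                      # assumption: A, C, G, T packed into 2 bits


def raw_genome_bytes(samples: int) -> int:
    """Lower-bound storage, in bytes, for `samples` genomes with no metadata."""
    total_bits = samples * BASE_PAIRS_PER_GENOME * BITS_PER_BASE
    return total_bits // 8


if __name__ == "__main__":
    cohort = 1_000  # hypothetical study size, not taken from any cited study
    size_bytes = raw_genome_bytes(cohort)
    print(f"{cohort} genomes need at least {size_bytes / 1e9:.0f} GB")
```

Even this lower bound reaches hundreds of gigabytes for a cohort of a thousand, which helps explain why a single analysis can run for days and why reducing the data is tempting despite the information loss.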

Additionally, an article by Paniagua (2019) in Cancer World explains that although artificial intelligence and big data analytics are promising in tackling cancer, their use and application call for caution. New developments require rigorous study and publication, as well as clinical validation, before they are assumed to be useful and implemented in patient care. Also, statistical limitations have slowed the development of precision cancer prognosis, and more analysis of genomic and clinical data is needed, as explained by Ow and Kuznetsov (2016). The challenges result from the substantial variety of data, which makes stratification difficult. Topol (2019) concluded that although the fields of AI and big data analytics are promising, there is relatively little data and proof of their usefulness. As such, faulty algorithms are possible, yet the promise of reducing errors and inefficiencies in future remains.

  3. Big Data Analytics and HIV

Moreover, big data analytics has proved valuable in HIV treatment. According to research by Olatosi et al. (2018), HIV medical care has remained a challenge because of low linkage to and retention in care. According to the research, using big data science to model HIV care is critical in establishing patterns that inform decisions on antiretroviral medication. The authors describe the creation of a data centre for all South Carolina persons living with HIV (SC PLWH) and the application of big data analytics to that data for new and better insights into offering HIV care. Figure 5 below shows the revenue and fiscal affairs integrated data system. Figure 6 shows the linkage of the variables that constitute cascaded HIV care. Together, figures 5 and 6 form the HIV care utilisation predictive model shown in figure 7.

Figure 5: Adopted from Olatosi et al. (2018)

Figure 6: Gelberg-Andersen Model variables and data source

Figure 7: Predictive model of HIV care utilisation

While this study was recognised as fundamental in profiling and compiling data on PLWH in South Carolina to create a predictive model of HIV care, it had inherent difficulties. Olatosi et al. (2018) encountered challenges in obtaining agreements and in collecting and merging data from many sources. Addressing data privacy and confidentiality took effort and time, which proved to be limiting factors. Further, the use of machine learning in predictive modelling yields information that is difficult to interpret because of the complexity of the raw data. Nonetheless, big data analytics has proven to be the most promising development in the treatment, care and cure of HIV.

Young (2015) examined the impact of big data analytics on HIV treatment and prevention. According to Young, digital technologies, including social media and smartphones, are a promising development in addressing the HIV epidemic through bioinformatics, digital epidemiology and disease modelling. The approach uses social media text mining, which filters HIV-related keywords and phrases that are then used to model the prevalence of HIV. The high-risk social interactions tracked include sexual and drug use behaviours. Young found a correlation between online HIV-related posts and CDC reports on HIV cases. Thus, social media has been used as a cost-effective method of monitoring and surveillance of HIV risk. The data is useful in HIV prevention and intervention in that it can guide the provision of need-based home HIV testing kits. In another study, Young and Jaganath (2013) found that those who use social platforms to post about HIV prevention and testing are more likely to use an HIV self-testing kit. The data from social media analysis is of critical importance to health departments for planning and for real-time, need-based responses in HIV prevention and treatment.
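The keyword filtering and correlation steps described above can be sketched in a few lines. This is a minimal illustration and not the method used by Young (2015). The keyword list, the region labels, the posts and the case counts are all invented for the example, and a real study would rely on a validated lexicon and official surveillance data.

```python
import re
from math import sqrt

# Hypothetical keyword list; a real study would use a validated lexicon.
KEYWORDS = ("hiv", "prep", "condom")
PATTERN = re.compile(r"\b(" + "|".join(KEYWORDS) + r")\b", re.IGNORECASE)


def count_matching_posts(posts):
    """Count posts that mention at least one keyword as a whole word."""
    return sum(1 for text in posts if PATTERN.search(text))


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Invented toy data: posts grouped by region, and made-up case counts.
    posts_by_region = {
        "A": ["Got my HIV test today", "Nice weather"],
        "B": ["Lunch was great"],
        "C": ["Ask about PrEP", "HIV self-test kits", "Use a condom"],
    }
    reported_cases = {"A": 10, "B": 4, "C": 18}
    regions = sorted(posts_by_region)
    post_counts = [count_matching_posts(posts_by_region[r]) for r in regions]
    case_counts = [reported_cases[r] for r in regions]
    print("matching posts per region:", post_counts)
    print("correlation with cases:", round(pearson(post_counts, case_counts), 3))
```

A high coefficient on such data shows only that posts and reported cases move together across regions. It does not establish that the posts predict new cases.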


However, the use of social media big data analytics has drawn criticism. Lazer, Kennedy, King and Vespignani (2014) highlighted that the reliability and validity of the data and methods are questionable. These challenges need to be addressed before big data analytics is applied to decision making. The problem is aggravated because government agencies, research institutes and health departments have limited capacity to handle the big data generated from social media and digital technologies (Murdoch & Detsky, 2013). Data collected from online posts also needs to be updated frequently for big data analytics to be applied. Real-time HIV data from databases is not usually accessible instantly, which limits the power of social media for monitoring and predicting HIV cases. Instead, it is used to show relationships between posts and HIV prevalence.

  4. Big Data Analytics and Sickle Cell Anaemia

According to a systematic review conducted by Badawy et al. (2018), eHealth interventions improve the self-management outcomes of patients with sickle cell anaemia. The research relied on literature evaluations that focused on the application of technologies in the management of sickle cell disease. Despite the finding that eHealth supports proper control of the illness, all the studies reviewed raised concern about the availability of data large enough for the studies to be feasible and acceptable. Therefore, continued research is needed to evaluate the efficacy of self-management in patients with sickle cell disease.

Moreover, the use of big data analytics and artificial intelligence is affected by bias. A study by Lacy et al. (2017) established the existence of bias in haemoglobin A1c levels in African Americans with sickle cell trait (SCT). They observed that “Among African Americans from 2 large, well-established cohorts, participants with SCT had lower levels of HbA1c at any given concentration of fasting or 2-hour glucose compared with participants without SCT.” They concluded that “HbA1c can underestimate past glycemia in African American patients with the sickle cell disease.” It was found that when variables in a large dataset are sampled repeatedly, selective reporting bias is possible, which can show differences in typical values that do not otherwise exist.
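The risk of selective reporting can be illustrated with a small simulation. This is a generic sketch of the multiple-comparison effect and does not reproduce the analysis of Lacy et al. (2017). The function name, the group sizes and the use of a standard normal distribution are assumptions made for the example. Both groups are drawn from the same distribution, so any gap between their means is pure chance.

```python
import random
from statistics import mean


def spurious_gaps(n_variables, n_per_group, seed=0):
    """Compare two groups drawn from the SAME distribution on many variables.

    Returns (average absolute gap, largest absolute gap) between group means.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    gaps = []
    for _ in range(n_variables):
        group_a = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        group_b = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        gaps.append(abs(mean(group_a) - mean(group_b)))
    return mean(gaps), max(gaps)


if __name__ == "__main__":
    for n_variables in (1, 10, 100):
        average_gap, largest_gap = spurious_gaps(n_variables, n_per_group=50)
        print(n_variables, round(average_gap, 3), round(largest_gap, 3))
```

The more variables are compared, the larger the biggest chance gap tends to be. An analyst who reports only that largest gap would therefore present a difference that does not exist in the underlying population.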

  5. Big Data Analytics and Telemedicine

People have always imagined a cure for all diseases, a longer life span and better general health for the world’s population. These hopes are more or less realistic in an era in which advances in big data analytics, coupled with other health technologies, have given rise to telemedicine (Bairagi, 2017). Telemedicine has made healthcare better by harnessing the extensive patient data available. Bairagi established that the development of telemedicine has been made a reality by the application of big data analytics in healthcare. Telemedicine has improved patient care, diagnosis and treatment and reduced the cost of accessing healthcare. Healthcare providers use data analytics for diagnosis that draws on vast information beyond personal experience and local resources. The process promotes accurate diagnosis and the efficient use of evidence-based practice. The future growth of telemedicine is highly dependent on the future of data analytics; that is, advances in data analytics are a potential driver of development in telemedicine.


This systematic review used a total of 14 articles. The study was divided into distinct categories of application of big data analytics in healthcare. The table below shows the types of cases used in the systematic review and the number of articles used as well as their conclusions. The articles’ findings are tabulated further according to their conclusions.

Cases of big data analytics

Cancer
  • Genome analysis for cancer cure
  • Discovery of treatments for cancer diseases
  • A larger amount of genomic data for efficient application of big data analytics and predictive models
  • Large volumes of genome data per patient sample, which make the process tedious and time-consuming
  • There is not enough data to reach conclusive analytics for cancer treatment
  • Difficulty in interpreting machine learning outcomes due to their large size

Sickle cell anaemia
  • Big data analytics will continue to develop self-management
  • Model variations in patients to predict the amount of haemoglobin A1c
  • No data available that is large enough for the studies to be feasible and acceptable
  • Presence of selective reporting biases

Medical image analysis
  • Development of anthropometry and simulation
  • More non-incision therapies in future
  • Understanding of diseases
  • Centralised image data sharing centre
  • There is a need to have controlled decision protocols
  • Numerous uncontrolled imaging data from autonomous and monolithic sources limit the credibility of biomedical images

Telemedicine
  • The new era of telemedicine where patients do not have to visit a health facility
  • Free e-Medical sites that diagnose and recommend treatments
  • Extensive patient data that offers efficacy after the application of big data analytics
  • Promotion of accuracy in diagnosis
  • Lack of EHR interoperability, which can cause problems for medical practitioners
  • Legal concerns, such as government regulation by bodies like the FDA, which block the licensing of telemedicine platforms

Undoubtedly, the use of big data analytics will unlock the potential of healthcare. All the articles agree that the amount of data generated in healthcare is growing exponentially. The challenge will therefore be the ability to extract meaningful information from pools of data. With the growth and utilisation of big data, the healthcare sector is booming: the number of patients has increased, and innovative treatments have increased correspondingly. The interest that health practitioners have taken in implementing big data analytics in various cases shows that the sector has high potential to grow. Research by Market Watch (2018) predicts that the healthcare market will grow by up to $34.27 billion by 2022 at a CAGR of 22.07 percent. These data will be stored in electronic health records, government agency data, private research institutions and other databases.
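A compound annual growth rate (CAGR) links a starting value to an ending value through end = start × (1 + rate)^years. The sketch below shows how the quoted forecast could be unpacked. It assumes that $34.27 billion is the market size at the end of the period and that the period is five years. The text states neither, so the implied starting size is illustrative only, and the function names are hypothetical.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate implied by growth from start to end."""
    return (end_value / start_value) ** (1 / years) - 1


def implied_start_value(end_value, rate, years):
    """Starting value that reaches end_value after compounding at rate."""
    return end_value / (1 + rate) ** years


if __name__ == "__main__":
    end_billion = 34.27  # figure quoted in the text
    rate = 0.2207        # 22.07 percent, as quoted
    years = 5            # assumption: the text does not give the base year
    start_billion = implied_start_value(end_billion, rate, years)
    print(f"Implied starting size: ${start_billion:.2f} billion")
    print(f"Round-trip CAGR: {cagr(start_billion, end_billion, years):.4f}")
```

A different horizon changes the implied starting size considerably, which is why a CAGR figure is hard to interpret without its base year.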

Generally, the findings from the articles can be summarised into five main promises and potentials of big data in healthcare, as listed below.

  1. Promote health tracking
  2. Reduce healthcare cost
  3. Reduce cases of high-risk diseases
  4. Prevent human errors
  5. Advancement in healthcare sector

On the other hand, the review has revealed inherent challenges in the use of big data analytics in healthcare including.


  • Manyika, J., et al. (2011). Big Data: The Next Frontier for Innovation, Competition, and Productivity. McKinsey Global Institute.
  • Lacy, M. E., Wellenius, G. A., Sumner, A. E., Correa, A., Carnethon, M. R., Liem, R. I., … & Luo, X. (2017). Association of sickle cell trait with haemoglobin A1c in African Americans. JAMA, 317(5), 507-515.
  • Celi, L. A., Mark, R. G., Stone, D. J., & Montgomery, R. A. (2013). “Big data” in the intensive care unit. Closing the data loop. American journal of respiratory and critical care medicine, 187(11), 1157.
  • Kruse, C. S., Goswamy, R., Raval, Y., & Marawi, S. (2016). Challenges and Opportunities of Big Data in Health Care: Systematic Review. JMIR Medical Informatics, 4(4), e38.
  • Aiello, M., Cavaliere, C., D’Albore, A., & Salvatore, M. (2019). The challenges of diagnostic imaging in the era of big data. Journal of clinical medicine, 8(3), 316.
  • Lambin, P., Leijenaar, R. T., Deist, T. M., Peerlings, J., De Jong, E. E., Van Timmeren, J., … & van Wijk, Y. (2017). Radiomics: the bridge between medical imaging and personalised medicine. Nature reviews Clinical oncology, 14(12), 749.
  • Bates, D. W., Saria, S., Ohno-Machado, L., Shah, A., & Escobar, G. (2014). Big data in health care: using analytics to identify and manage high-risk and high-cost patients. Health Affairs, 33(7), 1123-1131.
  • Capobianco, E. (2017). Precision Oncology: The Promise of Big Data and the Legacy of Small Data. Frontiers in ICT, 4, 22.
  • Ow, G. S., & Kuznetsov, V. A. (2016). Big genomics and clinical data analytics strategies for precision cancer prognosis. Scientific reports, 6, 36493.
  • Topol, E. (2019). High-performance medicine: the convergence of human and artificial intelligence. Nature Medicine, 25, 44–56.
  • Bairagi, V. K. (2017). Big data analytics in telemedicine: a role of medical image compression. In Big Data Management (pp. 123-160). Springer, Cham.
  • Paniagua, E. (2019). Big data and precision medicine in cancer: challenges to face. Cancer World. Retrieved from https://cancerworld.net/cancerworld-plus/big-data-and-precision-medicine-in-cancer-challenges-to-face/
  • Olatosi, B., Zhang, J., Weissman, S., Hu, J., Haider, R. M., & Li, X. (2018). Using big data analytics to improve HIV medical care utilisation in South Carolina: A study protocol. BMJ Open, 9(7), 1-11. doi:10.1136/bmjopen-2018-027688.
  • Young, S. D. (2015). A “big data” approach to HIV epidemiology and prevention. Preventive medicine, 70, 17–18. doi:10.1016/j.ypmed.2014.11.002.
  • Murdoch, T. B., & Detsky, A. S. (2013). The inevitable application of big data to health care. JAMA, 309(13), 1351-1352. doi:10.1001/jama.2013.393.
  • Lazer, D., Kennedy, R., King, G., & Vespignani, A. (2014). Big data. The parable of Google Flu: traps in big data analysis. Science, 343(6176), 1203-1205. doi:10.1126/science.1248506.
  • Young, S. D., & Jaganath, D. (2013). Online social networking for HIV education and prevention: a mixed-methods analysis. Sexually Transmitted Diseases, 40(2), 162-167. doi:10.1097/OLQ.0b013e318278bd12.
  • Badawy, S. M., Cronin, R. M., Hankins, J., Crosby, L., DeBaun, M., Thompson, A. A., & Shah, N. (2018). Patient-centered eHealth interventions for children, adolescents, and adults with sickle cell disease: systematic review. Journal of medical Internet research, 20(7), e10940.

