Using Tableau as a Data Visualization Tool to Explore Reoccurring Cancer Trends
Georgia State University
Key Words: cancer, cancer data, cancer research, cancer trends, data visualization
Cancer has touched the lives of millions for over a century. Numerous organizations have cooperated in research and fundraising to find a cure. Investigating the decline of cancer in recent years requires an analysis of its various trends. To provide a clearer understanding of these trends, the data visualization software Tableau was used to visualize the data presented.
Cancer is a group of diseases "characterized by the uncontrolled growth and spread of abnormal cells" (American Cancer Society, 2015a). If the spread of these cells is not controlled, death is one of the possible results. Cancer results from numerous external and internal factors. "These [external or internal] factors may act together or in sequence" to cause cancer (2015a). Many treatment options are possible, including surgery, radiation, chemotherapy, hormone therapy, immune therapy, and targeted therapy (2015a).
According to the World Health Organization (2007), an estimated eight million people died from cancer in 2010, and the number of deaths is projected to climb to twelve million by 2030. These figures alone make a persuasive case for bringing cancer research to the forefront of medical investigation. Although it is often overlooked, it is beneficial to investigate where trends occur in cancer rates and to examine the factors that cause these patterns to occur, become prominent, or disappear over time.
During the investigation, numerous models were considered and analyzed. The trends were investigated and consolidated into multiple groups. These groups include cancer diagnosis and death rates in adults, children, and adolescents; cancer cases by state; and cancer deaths by country. Colored data visualization images accompany each trend. These images were produced using the software Tableau to depict the data collected on the different patterns.
Comparing Men's and Women's Chances with Cancer
The first major trend investigated is the cancer diagnosis rate in men and women. Both sexes are estimated to have high numbers of new cancer cases and cancer deaths. For 2014, new cases across all cancer sites were estimated at 855,220 for men and 810,320 for women. Estimated cancer deaths were 310,010 for men and 275,710 for women (American Cancer Society, 2014).
Analyzing cancer in men. The first visualization (see Figure 1) compares diagnosis and death percentages in men (2014).
This figure displays the relationship between diagnosis and death rates. The color of each circle represents the diagnosis rate for the cancer type analyzed: the darker the color, the higher the diagnosis rate. The graph shows that prostate cancer is the most frequently diagnosed cancer in men.
Prostate cancer. In the late 1980s and early 1990s, incidence rates for prostate cancer spiked dramatically because of increased use of the prostate-specific antigen (PSA) blood test for screening (American Cancer Society, 2015a). Rates have since been declining. Cases of prostate cancer are significantly higher in men sixty-five and older than in any other age group.
Prostate cancer death rates have been decreasing since the early 1990s in men of all races/ethnicities, though they remain more than twice as high in blacks as in any other group (2015a). Catching these cases early can be difficult because early prostate cancer usually has no symptoms. The only well-established risk factors for prostate cancer are increasing age, African ancestry, a family history of the disease, and certain inherited genetic conditions (2015a).
The second variable graphed in Figure 1 is the death rate for each cancer type. The size of each circle represents the death rate for the cancer type depicted: the larger the circle, the higher the mortality rate. In this case, lung cancer causes the most cancer deaths in men today. Cancer deaths are analyzed further in the section titled "Cancer Deaths in the Past 70 Years."
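The two encodings described for Figure 1 (color for diagnosis rate, circle size for death rate) can be sketched in a few lines of Python. This is only an illustration of the idea, not a reproduction of how Tableau scaled the marks: the function name `bubble_encoding`, the `max_radius` parameter, the choice to scale circle area rather than radius, and the placeholder inputs are all assumptions introduced here.

```python
from math import sqrt


def bubble_encoding(rates, max_radius=40.0):
    """Map (diagnosis_pct, death_pct) pairs to a color shade and a radius.

    rates: dict of label -> (diagnosis_pct, death_pct).
    Returns dict of label -> (shade, radius). Shade runs from 0.0 (lightest)
    to 1.0 (darkest); radius is scaled so that circle AREA tracks death rate.
    """
    max_diag = max(diag for diag, _ in rates.values())
    max_death = max(death for _, death in rates.values())
    encoded = {}
    for label, (diag, death) in rates.items():
        shade = diag / max_diag                          # darker = more diagnoses
        radius = max_radius * sqrt(death / max_death)    # larger = more deaths
        encoded[label] = (round(shade, 3), round(radius, 1))
    return encoded


# Placeholder inputs only; these are NOT values from the paper's tables.
example = {"site_a": (20.0, 5.0), "site_b": (10.0, 20.0), "site_c": (5.0, 10.0)}
encoded = bubble_encoding(example)
```

With these placeholder inputs, `site_a` receives the darkest shade while `site_b` receives the largest circle, which mirrors the way one cancer type can lead in diagnoses while another leads in deaths.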
Analyzing cancer in women. As with the rates for men, diagnosis and death rates for women were graphed using a bubble chart, shown in Figure 2 (American Cancer Society, 2014).
The first variable graphed is the diagnosis rate, depicted using the color of the circles. The darker the red, the higher the diagnosis rate. In this visualization, breast cancer has the highest recorded diagnosis rate in women.
Breast cancer. Potentially modifiable factors associated with increased breast cancer risk include weight gain after the age of 18 or being overweight or obese (for postmenopausal breast cancer), physical inactivity, and alcohol consumption. In addition, recent research indicates that long-term, heavy smoking may also increase breast cancer risk, particularly among women who start smoking before their first pregnancy (American Cancer Society, 2015a). The International Agency for Research on Cancer (IARC) has concluded that shift work, particularly at night, may be associated with an increased risk of breast cancer (IARC, 2012).
Mammography can often detect breast cancer at an early stage, when treatment is more effective. Numerous studies have shown that early detection with mammography saves lives and increases treatment options. Mammography will detect most breast cancers in women without symptoms, "though the sensitivity is lower for younger women" (American Cancer Society, 2015a). Mammography also results in some detection of cancer that would neither have caused harm nor been diagnosed in the absence of screening.
The second variable depicted in Figure 2 is the death rate, represented by the size of the circles. Lung cancer caused the most cancer deaths not only in men but also in women in 2014 and 2015. To understand why lung cancer causes the highest death rate, cancer deaths are investigated next.
Cancer Deaths in the Past 70 Years
The following graph, Figure 3, represents the total number of cancer fatalities in the United States from 1930 to 2010 (American Cancer Society, 2014). A legend in the figure distinguishes the cancer locations.
The data show that, over time, lung cancer has climbed past the other types to the highest cancer death rate. Understanding the reason for this requires a further analysis of lung cancer.
Lung cancer. Over the past century, lung cancer rates grew to an all-time high as a result of the tobacco epidemic (American Cancer Society, 2011). Cigarette smoking is the largest cause of lung cancer, and the risk increases with both the number of cigarettes smoked and the frequency of smoking. Cigarette consumption increased over time under the influence of many factors. For this reason, diagnosis and death rates increased.
Events influencing cigarette use. The timeline of events affecting cigarette consumption spans more than a century. It began in 1884 and 1889, when the invention of a machine to manufacture cigarettes and the invention of safety matches, respectively, ignited the mass production of cigarettes (Whelan, 1984). As mass production began to flourish, mass marketing of cigarettes drove consumption higher as well. As Burrough and Helyar (1990) state, the "introduction and mass marketing of Camel brand cigarettes by R.J. Reynolds Tobacco Company in 1913" played one of these influential roles. Fifteen years after this introduction, in 1928, companies began "cigarette advertisements targeting women, including the 'Reach for a Lucky Instead of a Sweet' campaign" (Burns, 1994; Health Advocacy Center, 1986). Marketing of filtered cigarettes followed in 1955 (U.S. Department of Health and Human Services, 1981).
After decades of companies attracting customers to cigarettes, actions were taken to counteract the rise in cigarette consumption and death. The first publications covered the "retrospective studies linking tobacco and disease [in 1950] and the prospective mortality studies linking cigarettes and lung cancer [in 1954]" (U.S. Department of Health and Human Services, 1989). After these publications, according to Freedman and Cohen (1993), the Council for Tobacco Research was founded. In 1964, the Surgeon General released a report on smoking and health (U.S. Department of Health, Education, and Welfare, 1964). Counter-advertising on television between 1967 and 1970 aided the efforts against cigarette consumption (Warner, 1977). Despite all these efforts to reduce cigarette use, "the introduction of Virginia Slims" and other brands of cigarettes targeting women began in 1968 (Burns, 1994).
From the 1970s onward, further initiatives were taken to combat this epidemic. Examples include the ban on cigarette advertisements (U.S. Department of Health and Human Services, 1989), the start of the nonsmokers' rights movement (Steinfeld, 1972), and the release of reports from the U.S. Surgeon General in 1986 on "involuntary smoking" (U.S. Department of Health and Human Services, 1986) and from the U.S. Environmental Protection Agency on "environmental tobacco smoke" (U.S. Environmental Protection Agency, 1992).
These historical events show that efforts have been made to lower the chances of tobacco-related sickness and disease; however, the graph demonstrates that, as time passed, the number of deaths from lung cancer has increased. This correlation shows the dramatic effects of tobacco on the numerous diseases it causes, including lung cancer.
Cancer Chances in the Youth Population
Although cancer is much less common among children than among older adults, "1 in 285 children in the US will be diagnosed with the disease before the age of 20" (American Cancer Society, 2014). The types of cancers that develop in children and adolescents differ from those that develop in adults. The predominant types of pediatric cancers in children aged 0-14 and adolescents aged 15-19 are pictured in the next visualization (American Cancer Society, 2014).
Some of the cancers that develop in children are rarely seen in older individuals, notably those cancers that arise from embryonic cells, which include "neuroblastoma (sympathetic peripheral nervous system), Wilms tumor or nephroblastoma (developing kidney), medulloblastomas (brain), rhabdomyosarcomas (muscle), and retinoblastoma (retina of the eye)" (American Cancer Society, 2014). Because these cancers occur during stages of rapid growth and development in youth, most experts strongly recommend that they be treated by multidisciplinary teams at medical centers that specialize in childhood cancer.
From 1975 to 2010, the overall incidence of pediatric cancer in the US increased slightly, by an average of 0.6% per year (Howlader, Noone, & Krapcho, 2013). Specifically, incidence rates increased for four cancer types: acute lymphocytic leukemia, acute myeloid leukemia, non-Hodgkin lymphoma, and testicular germ cell tumors. Reasons for increases in incidence rates are largely unknown. It is possible that some of this increase may be due to changes in environmental factors. Improved diagnosis and access to medical care over time may also have contributed, because without medical attention some children may die of infections or other complications of their cancers without ever being diagnosed. Death rates for all childhood and adolescent cancers combined, on the other hand, declined steadily from 1975 to 2010 by an average of 2.1% per year, resulting in an overall reduction of more than 50% (2013). Mortality declines were observed for all sites, with the steepest declines in Hodgkin lymphoma, non-Hodgkin lymphoma, and acute lymphocytic leukemia.
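The link between an average annual change and the overall change from 1975 to 2010 can be checked with a short compounding calculation. The sketch below assumes the annual averages compound over 35 yearly steps, which is a simplification of how the source statistics were derived; the function name `cumulative_change` is introduced here for illustration.

```python
def cumulative_change(annual_rate_pct, years):
    """Total percent change after compounding an annual percent change."""
    factor = (1 + annual_rate_pct / 100) ** years
    return (factor - 1) * 100


YEARS = 2010 - 1975  # 35 annual steps (assumed compounding period)

# Average annual changes quoted in the text above.
death_change = cumulative_change(-2.1, YEARS)       # death rates
incidence_change = cumulative_change(0.6, YEARS)    # incidence rates

print(f"Death rates: {death_change:.1f}%  Incidence: {incidence_change:.1f}%")
```

Under this assumption, a 2.1% average annual decline compounds to a drop of roughly 52%, which is consistent with the reported overall reduction of more than 50%, while a 0.6% average annual increase compounds to a rise of roughly 23%.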
Cancer Cases by State
The investigation now broadens in perspective, looking at cancer trends by location rather than by individual. The next visualization, Figure 5, maps the United States and compares the number of cancer cases in each state to the state's population, expressed as a percentage. A divergent color scale was used to distinguish lower percentages of cancer cases per population from higher ones. Green represents low percentages. As the green changes from dark to light, the percentage of cases increases. The color then diverges from light green to light red, and the red darkens as the percentage continues to increase.
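The percentage calculation and the divergent color scale described above can be sketched as follows. The District of Columbia figures come from the "Cancer Cases by State" table in the appendix; the RGB color values, the function names, and the `low`, `mid`, and `high` break points are assumptions chosen for illustration and are not the settings used in Tableau.

```python
# Assumed RGB endpoints for the green-to-red divergent scale.
DARK_GREEN, LIGHT_GREEN = (0, 100, 0), (144, 238, 144)
LIGHT_RED, DARK_RED = (255, 160, 160), (139, 0, 0)


def percent_of_population(cases, population):
    """Cancer cases as a percentage of population, to four decimals."""
    return round(cases / population * 100, 4)


def _blend(start, end, t):
    """Linear interpolation between two RGB colors, with t in [0, 1]."""
    return tuple(round(s + (e - s) * t) for s, e in zip(start, end))


def diverging_color(value, low, mid, high):
    """Dark green -> light green up to `mid`, then light red -> dark red."""
    value = min(max(value, low), high)  # clamp to the scale's range
    if value <= mid:
        return _blend(DARK_GREEN, LIGHT_GREEN, (value - low) / (mid - low))
    return _blend(LIGHT_RED, DARK_RED, (value - mid) / (high - mid))


# District of Columbia row from the appendix table.
dc_percent = percent_of_population(2_800, 658_893)

# Hypothetical break points for the scale (not taken from the paper).
dc_color = diverging_color(dc_percent, low=0.3, mid=0.5, high=0.7)
```

Splitting the scale at a midpoint, rather than blending green directly into red, keeps the low and high groups visually distinct, which matches the description of the color diverging from light green to light red.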
With this in mind, the map shows that most cancer cases occur in the northeastern region of the United States. Cancer most commonly develops in older people; "78% of all cancer diagnoses are in people 55 years of age or older" (American Cancer Society, 2015a). People who smoke, eat an unhealthy diet, or are physically inactive also have a higher risk of cancer. These same risks plague American culture today. Because the population is most heavily concentrated in the northeast, these data support the idea of a greater incidence of cancer cases in densely populated areas.
Another location experiencing higher cancer rates in the United States is the southeast. As depicted, Georgia is the only state with a low percentage of cases in comparison to other states in the southeast due to the number of recorded melanoma (skin cancer) cases. For melanoma, one of the major risk factors is sun sensitivity, the skin's reaction to excessive sunlight. Many people use tanning beds or tan on the beach, and many experience sunburns; the UVA and UVB rays that penetrate the skin can cause cells to mutate and become cancer cells. As a result, melanoma and all other skin cancer types are the most common form of cancer (2015).
Cancer Deaths by Country
As the final visualization of the investigation, Figure 6 (below) brings together data from every country. The analysis shows the number of people who have died of cancer in comparison to the population size to indicate the percentage of people who died of cancer by country. A blue-red divergent scale was used for this trend as well; blue indicates a lower percentage and red points to higher percentages.
This map illustrates that developed countries have higher mortality rates than developing nations. The differences between regions reflect contributing factors that include the age groups of the population, the prevalence of risk factors, the availability and use of diagnostic tests, and the availability and quality of treatment. In addition, "approximately 16% of all incident cancers worldwide are attributable to infections. This percentage is about three times higher in developing countries (23%) than in developed countries (7%)" (Cancer Genome Atlas Network, 2012).
The estimated number of cases and deaths in economically developing countries will probably grow, however, due to the adoption of lifestyles that are known to increase cancer risk – such as smoking, poor diet, physical inactivity, and fewer pregnancies. Cancers related to these factors, such as lung, breast, and colorectal cancers, are already on the rise in economically transitioning countries.
According to estimates from the International Agency for Research on Cancer, there were 14.1 million new cancer cases worldwide in 2012, of which 8 million occurred in economically developing countries, which contain about 82% of the world's population (IARC, 2012). By 2030, the global burden is expected to grow to 21.7 million new cancer cases and 13 million cancer deaths simply due to the growth and aging of the population (American Cancer Society, 2015b).
Cancer knows no boundaries, yet over the years, scientists and advocates have united to find the reasons behind cancer and why so many are affected by it. This investigation analyzed the reasons behind the most common cancer cases in all individuals, then compared these cases across the fifty states and finally across countries. One constant remained: cancer has affected the lives of people by the millions and has done so for over a century. But there is hope. Many have joined the fight to stop these trends and to create new ones: trends that eradicate cancer from the world permanently.
American Cancer Society. (2011). Global Cancer Facts and Figures 2nd Edition. Atlanta, GA: American Cancer Society.
American Cancer Society. (2014). Cancer Facts and Figures 2014. Atlanta, GA: American Cancer Society.
American Cancer Society. (2015a). Cancer Facts and Figures 2015. Atlanta, GA: American Cancer Society.
American Cancer Society. (2015b). Global Cancer Facts & Figures 3rd Edition. Atlanta, GA: American Cancer Society.
Burns, D.M. (1991). The scientific rationale for comprehensive, community-based, smoking control strategies. Strategies To Control Tobacco Use In the United States: A Blueprint for Public Health Action in the 1990's, 1-32. Shopland, D.R., Burns, D.M., Samet, J.M., & Gritz, E.R. (Eds.). (Smoking and Tobacco Control Monographs—1. NIH Publication No. 92-3316). Bethesda, MD: U.S. Department of Health and Human Services, Public Health Service, National Institutes of Health, National Cancer Institute.
Burns, D.M. (1994). Overview of office-based smoking cessation assistance. Tobacco and the Clinician: Interventions for Medical and Dental Practice, 3-11. Shopland, D.R., Burns, D.M., Cohen, S.J., Kottke, T.E., & Gritz, E.R. (Eds.). (Smoking and Tobacco Control Monograph No. 5. NIH Publication No. 94-3693). Bethesda, MD: U.S. Department of Health and Human Services, Public Health Service, National Institutes of Health, National Cancer Institute.
Burrough, B., & Helyar, J. (1990). Barbarians at the Gate. New York, NY: Harper & Row.
Cancer Genome Atlas Network. (2012). Comprehensive molecular portraits of human breast tumors. Nature, 490, 61-70.
Freedman, A.M., & Cohen, L.P. (1993, February 11). Smoke and mirrors. Wall Street Journal, p. A1.
Health Advocacy Center. (1986). Sixty Years of Deception: An Analysis and Compilation of Cigarette Ads in Time Magazine (Vol. 1). Palo Alto, CA: Health Advocacy Center.
Howlader, N., Noone, A.M., & Krapcho, M. (2013). SEER Cancer Statistics Review, 1975-2010. Retrieved from http://seer.cancer.gov/csr/1975_2010/
International Agency for Research on Cancer. (2012). IARC Monographs on the Evaluation of Carcinogenic Risks to Humans: Volume 100E-Tobacco Smoking. Lyon, France: IARC Press.
Steinfeld, J.L. (1972). The public's responsibility: A bill of rights for the non-smoker. Rhode Island Medical Journal, 55, 124-126.
U.S. Department of Health and Human Services. (1981). The Health Consequences of Smoking: The Changing Cigarette. A Report of the Surgeon General (DHHS Publication No. PHS 81-50156). Rockville, MD: U.S. Department of Health and Human Services, Public Health Service, Office of the Surgeon General, Office on Smoking and Health.
U.S. Department of Health and Human Services. (1986). The Health Consequences of Involuntary Smoking: A Report of the Surgeon General (DHHS Publication No. CDC 87-8398). Rockville, MD: U.S. Department of Health and Human Services, Public Health Service, Centers for Disease Control, Center for Health Promotion and Education, Office on Smoking and Health.
U.S. Department of Health and Human Services. (1989). Reducing the Health Consequences of Smoking: 25 Years of Progress. A Report of the Surgeon General, 1989 (DHHS Publication No. CDC 89-8411). Rockville, MD: U.S. Department of Health and Human Services, Public Health Service, Centers for Disease Control, Center for Chronic Disease Prevention and Health Promotion, Office on Smoking and Health.
U.S. Department of Health, Education, and Welfare. (1964). Smoking and Health: Report of the Advisory Committee to the Surgeon General of the Public Health Service (PHS Publication No. 1103). Rockville, MD: U.S. Department of Health, Education, and Welfare, Public Health Service.
U.S. Environmental Protection Agency. (1992). Respiratory Health Effects of Passive Smoking: Lung Cancer and Other Disorders (EPA/600/6-90/006F). Washington, DC: Office of Research and Development, Office of Health and Environmental Assessment.
Warner, K.E. (1977). The effect of the anti-smoking campaign on cigarette consumption. American Journal of Public Health, 67, 645-650.
Whelan, E.W. (1984). A smoking gun: How the tobacco industry gets away with murder. Philadelphia, PA: George F. Stickley.
World Health Organization. (2007). Ten statistical highlights in global public health. World Health Statistics 2007. Geneva: World Health Organization.
World Health Organization. (2011). [Map illustration of all cancer death rates per 100,000 people standardized by age provided by World Life Expectancy]. All Cancers Death Rate per 100,000. Retrieved from http://www.worldlifeexpectancy.com/cause-of-death/all-cancers/by-country/
Male Diagnosis and Death Rates
|Cancer Type||Percent Males Diagnosed||Percent Male Deaths|
Female Diagnosis and Death Rates
|Cancer Type||Percent Females Diagnosed||Percent Female Deaths|
Children and Adolescent Diagnosis and Death Rates
|Cancer Type||Percent Children Diagnosed||Percent Adolescents Diagnosed||Description of Cancer Type|
|Acute lymphocytic leukemia||26||8||Cancer from white blood cells|
|Acute myeloid leukemia||5||4||Cancer that starts inside bone marrow, the soft tissue inside bones that helps form blood cells|
|Bone tumor||4||7||Cancer within the bone|
|Brain and CNS||21||10||Cancer of the brain and central nervous system|
|Hodgkin lymphoma||4||15||Cancer of lymph tissue. Lymph tissue is found in the lymph nodes, spleen, liver, bone marrow, and other sites|
|Melanoma||1||6||Most dangerous type of skin cancer|
|Neuroblastoma||7||1||Cancer tumor that develops in the nerve tissue|
|Non-Hodgkin lymphoma||6||8||Cancer of lymphocytes (white blood cells)|
|Ovarian germ cell tumors||1||2||An abnormal mass of tissue that forms in germ (egg) cells in the ovary (female reproductive gland in which the eggs are formed)|
|Retinoblastoma||3||1||Cancer of the retina|
|Rhabdomyosarcoma||3||1||Cancer that forms in the soft tissues in a type of muscle called striated muscle anywhere in the body.|
|Testicular germ cell tumors||1||8||Cancer that forms in tissues of one or both testicles|
|Thyroid carcinoma||1||11||Cancer that forms in the thyroid gland. Carcinoma is a cancer arising in the epithelial tissue of the skin or of the lining of the internal organs.|
|Wilms tumor||5||1||A disease in which malignant (cancer) cells are found in the kidney, and may spread to the lungs, liver, or nearby lymph nodes|
Death Rates in the Past 70 Years
Cancer Cases by State
|State||Number of cases||State Population||Percent of Population with Cancer Cases|
|District of Columbia||2,800||658,893||0.4250|
Cancer Deaths by Country
|Country Name||2013 Population by Country||Number of Deaths by Country in 2013||Percent of Population Who Died of Cancer|
|Antigua and Barbuda||89,985||118||0.1307|
|Bosnia and Herzegovina||3,829,307||4,097||0.1070|
|Congo, Dem. Rep.||10,521,468||16,634||0.1581|
|Egypt, Arab Rep.||82,056,378||74,261||0.0905|
|Iran, Islamic Rep.||77,447,168||71,794||0.0927|
|Korea, Dem. Rep.||24,895,480||26,489||0.1064|
|Micronesia, Fed. Sts.||103,549||87||0.0844|
|Papua New Guinea||7,321,262||9,379||0.1281|
|Sao Tome and Principe||192,993||293||0.1516|
|St. Kitts and Nevis||54,191||78||0.1443|
|St. Vincent and the Grenadines||109,373||128||0.1167|
|Syrian Arab Republic||22,845,550||12,634||0.0553|
|Trinidad and Tobago||1,341,151||1,529||0.1140|
|United Arab Emirates||9,346,129||5,533||0.0592|
This project would not have been possible without the support of the various colleagues and resources I have worked with and used in the process of completing this manuscript.
First, I would like to thank Joe Hurley and my colleagues at the Georgia State University Collaborative University Research and Visualization Environment (CURVE) for their help in the use of the equipment and the program Tableau provided at CURVE. Without their help, the images presented in this article would not have been possible.
Second, I would like to thank Preston Berger, a student at the University of Georgia whom I have known for almost ten years, for his recommendations, revisions, and additions, which strengthened the language of the report.
Third, I would like to thank the co-author of this manuscript, Maroun Sassine, for his efforts and dedication to this project from the very beginning. His planning and attention to detail are admirable and beneficial to the success of the report.
Finally, and most importantly, I want to thank my family for believing in me, for always supporting me in everything that I do, and for encouraging me to reach the unreachable boundaries of life and to persevere with ambition and integrity.
– Charbel Aoun