INTRODUCTION
The age-standardized incidence rate of cancer among children and adolescents aged 0-14 years old in South Korea was 125.1 per million in 2005, 141.8 per million in 2010, and 146.9 per million in 2018, showing a steadily increasing trend. The 5-year cancer survival rate among children and adolescents aged 0-14 years old also increased significantly in the last 10 years, from 72.4% in 2005 to 77.3% in 2010 and 85.2% in 2018 [1].
The survival period for children and adolescents with cancer is longer than that of adults with cancer. After cancer treatment, an effective care system is required for social adjustment as well as physical, emotional, and spiritual health management [2,3]. Compared to adult cancer, pediatric cancer involves a wider variety of cancer types and necessitates more intense treatment methods. Furthermore, since childhood is a period of growth, the effects of treatment can have a significant impact on post-treatment life depending on the patient's developmental state [4]. In particular, adolescence is an important transition period during the developmental process that lays the foundation for one's healthy adulthood; thus, balanced growth and development based on one's holistic needs is important during this period [5].
Childhood and adolescent cancer survivors (CACS) experience various neurological and cognitive complications including physical weakness, developmental disabilities, stunted growth, and fears of post-treatment recurrence. Consequently, it has also been reported that they have trouble with regular school attendance and experience various obstacles to participating in activities, decreased self-esteem, and confusion about self-identity [2,4-15]. In addition, previous studies on the needs of CACS found that many of them had unmet needs related to continuing health management, healthy living and self-actualization, emotional well-being, social interaction, academic performance and future careers [6-12].
Research on CACS in Korea published in the past 10 years mainly consists of surveys related to the health needs of CACS [2,5,7,8,10-15], systematic reviews [16], and qualitative research studies [6,9,17]. However, these studies address only specific topics related to CACS, which limits the ability to grasp overall research trends on CACS comprehensively. A systematic and extensive analysis of recent research trends on CACS will be useful for improving quality of life among CACS. With the rapid development of social networking since 2010, various studies and findings can be shared around the world through academic databases. Bibliometric networks, which are a type of social network, can reveal relational structures between knowledge entities included in bodies of academic literature, which makes it possible to analyze the connective structure of main concepts from the literature, central themes, and cooperative structures between topics [18,19]. The three types of bibliometric networks are co-authorship networks, citation networks, and word co-occurrence networks (or keyword networks) [18-20].
An analytical study using a word co-occurrence network, which can reveal overall research trends and the integrated knowledge structure of related topics, has not yet been conducted for research related to CACS. Understanding recent domestic research trends related to CACS is crucial for identifying the potential future directions of nursing studies about care recipients. The purpose of this study was to analyze research trends related to CACS by conducting a word co-occurrence network analysis on studies registered in the Korean Citation Index (KCI).
METHODS
Ethics statement: This study was approved by the Institutional Review Board of Sahmyook University (No. 2021032HR).
1. Study Design
This was a descriptive study that aimed to analyze major research trends related to CACS using word co-occurrence network analysis on the abstracts of articles registered in the KCI that were published in the last 10 years.
2. Data Collection and the Study Process
Data collection and the analysis process for this study are summarized in Figure 1.
1) Keyword selection
The range of data selection included the English-language abstracts of articles published in KCI journals. Articles without English-language abstracts were excluded. The main search terms were "cancer survivors", "adolescent", "child", "pediatrics", "adolescent cancer", and "childhood neoplasm". In this study, the keywords suggested by the author for each abstract are referred to as "author keywords", while semantic morphemes extracted from abstracts for network analysis are referred to as "keywords".
2) Abstract collection
Abstracts of KCI-registered articles, including information about the author, publication year, and title, were extracted from articles published between January 2010 and February 2021 using the Biblio Data Collector tool included in NetMiner Program version 4 (NetMiner 4, Cyram Inc., Seongnam, South Korea), a software program for conducting network analysis. A total of 45 articles were extracted, and after reviewing the titles and abstracts, all 45 abstracts were included in the analysis. Abstracts included in the analysis were each entered into one row of an Excel file.
3) Pre-processing
(1) Node filtering
Node filtering is the process of extracting semantic morphemes (the smallest linguistic units with meaning) from unstructured text and discriminating the word class of the extracted morphemes. In this study, only nouns were analyzed as keywords for morpheme analysis. Nouns alone are typically used as morphemes in social network analysis [18,20].
· Word refinement and selection
After importing the Excel file using the unstructured data function in NetMiner, 644 words were initially identified in the main node set. An initial dictionary (thesaurus, exception list, and defined words) that needed to be refined was developed after reviewing the entire keyword list. The thesaurus was placed into the same row of the Excel sheet, and the exception list and defined words were entered in a single row per word using a .txt file. To refine keywords, four researchers checked the frequency of words in descending order and then set the cut-off criterion of word frequency to fewer than 3. There were no specific standards to define cut-off criteria. Therefore, upon reviewing low-frequency words, the researchers chose fewer than 3 instances as an appropriate cut-off criterion to eliminate words that were not very relevant to CACS. In this process, words that were related to research methods and other unnecessary words were removed.
· Developing dictionaries
When filtering for words that met the research purpose, three representative types of dictionaries (a thesaurus, defined words, and an exception list) were developed. For the thesaurus, several words with similar meanings were specified to be extracted as a single word. A dictionary of defined words was developed so that unique words, compound nouns, and trade names would be extracted. An exception list was developed to exclude certain words [18,21].
After registering the first organized dictionary in the NetMiner Program, keywords with a frequency of 3 or more instances were exported using the Query composer function. Words with little relevance to the research topic were added to the exception list (e.g., "total", "sample"), and compound nouns consisting of two or more words were added to the list of defined words (e.g., "quality of life", "behavior problem"). One representative word from the thesaurus (e.g., "caregiver", "family", "parent") was selected and placed at the beginning of the same row. The dictionary refinement process for the final analysis was repeated five times. Finally, the main words ("cancer survivor", "cancer", "adolescent", and "child") included in the research titles were excluded.
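The dictionary step described above can be illustrated with a minimal sketch. It is not the NetMiner procedure itself: the data structures, the function names, and the choice of "caregiver" as the representative thesaurus word are assumptions made for illustration, and the sketch omits the noun filtering that a morphological analyzer would perform.

```python
from collections import Counter

# Hypothetical dictionaries. The entries echo examples from the text, but the
# structures and the choice of "caregiver" as representative are assumptions.
THESAURUS = {"family": "caregiver", "parent": "caregiver"}  # variant -> representative
DEFINED_WORDS = ["quality of life", "behavior problem"]      # compound nouns kept whole
EXCEPTION_LIST = {"total", "sample"}                         # words to exclude
MIN_FREQUENCY = 3                                            # keep words seen 3+ times


def extract_keywords(text):
    """Return the keyword tokens of one abstract after applying the dictionaries."""
    text = text.lower()
    # Join each defined compound with underscores so it survives whitespace splitting.
    for phrase in DEFINED_WORDS:
        text = text.replace(phrase, phrase.replace(" ", "_"))
    keywords = []
    for tok in text.split():
        tok = tok.strip(".,;:()\"'")
        if not tok:
            continue
        # Map thesaurus variants to the representative word, then restore spaces.
        tok = THESAURUS.get(tok, tok).replace("_", " ")
        if tok not in EXCEPTION_LIST:
            keywords.append(tok)
    return keywords


def frequent_keywords(abstracts, min_frequency=MIN_FREQUENCY):
    """Count keywords over all abstracts and keep those at or above the cut-off."""
    counts = Counter()
    for abstract in abstracts:
        counts.update(extract_keywords(abstract))
    return {word: n for word, n in counts.items() if n >= min_frequency}


# Toy abstracts (invented for illustration only).
abstracts = [
    "parent support quality of life",
    "family support total sample",
    "caregiver support quality of life",
    "quality of life",
]
result = frequent_keywords(abstracts)
```

In this toy input, "parent" and "family" are counted under "caregiver", "quality of life" is kept as one keyword, and "total" and "sample" are dropped before the frequency cut-off is applied.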
(2) Developing a co-occurrence matrix for keywords (from a 2-mode to 1-mode network)
The basic structure of a text network consists of nodes (keywords) and links (connections between nodes) [18,20]. In this study, a word-document network (2-mode network) was converted into a word-word network (1-mode network) for word co-occurrence analysis. The degree value was standardized to a range between 0 and 1 using the "Word-Document Network (Term Frequency-Inverse Document Frequency [TF-IDF])" of the document link node set. TF-IDF is an evaluation index of word importance in documents with which it is possible to assign a value that reflects the frequency of word usage in both entire documents and individual sentences [21]. In the current study, since the co-occurrence of words appearing in abstracts served as the unit of analysis, a co-occurrence word matrix was created using the 1-mode network with standardized TF-IDF values. This action was performed using the inner product function in the software.
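The conversion from a 2-mode to a 1-mode network can be sketched as follows. The sketch assumes the textbook TF-IDF weight (raw term count multiplied by log(N/df)) and omits the standardization step; the exact weighting and scaling applied by the software are not specified in this section, and the function names are hypothetical.

```python
import math


def tfidf_matrix(docs):
    """docs: list of keyword lists. Returns (vocab, rows), rows[w][d] = TF-IDF weight."""
    vocab = sorted({word for doc in docs for word in doc})
    n_docs = len(docs)
    rows = []
    for word in vocab:
        df = sum(1 for doc in docs if word in doc)  # number of documents containing the word
        idf = math.log(n_docs / df)                 # textbook IDF (an assumption)
        rows.append([doc.count(word) * idf for doc in docs])
    return vocab, rows


def cooccurrence(rows):
    """1-mode word-word matrix: inner product of word rows over documents."""
    n = len(rows)
    return [
        [
            sum(a * b for a, b in zip(rows[i], rows[j])) if i != j else 0.0
            for j in range(n)
        ]
        for i in range(n)
    ]


# Toy keyword lists (invented for illustration only).
docs = [["treatment", "risk"], ["treatment", "support"], ["support", "need"]]
vocab, rows = tfidf_matrix(docs)
matrix = cooccurrence(rows)
```

Two words receive a positive link weight only when they appear in at least one common document; the diagonal is set to zero because a word's link to itself is not of interest here.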
(3) Link filtering
Link filtering is a step during which relationships with low frequencies are excluded from analysis based on the number of documents (abstracts, in this study) in which the word appears multiple times [18,20]. Converting the 2-mode network (number of links: 499) to a 1-mode network increased the number of links to 1,545. After link filtering, the cut-off criterion for the number of links that could be visualized was determined while checking the link frequency distribution. The number of links in the 1-mode TF-IDF network was 639 at a link frequency of 2 or more, 287 at a link frequency of 3 or more, and 134 at a link frequency of 4 or more.
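Checking how many links survive each candidate cut-off can be sketched as below. This is a minimal illustration with an invented 4-word matrix, not the study's data; the function names are hypothetical.

```python
def filter_links(matrix, threshold):
    """Return undirected links (i, j, weight) with i < j and weight >= threshold."""
    n = len(matrix)
    return [
        (i, j, matrix[i][j])
        for i in range(n)
        for j in range(i + 1, n)
        if matrix[i][j] >= threshold
    ]


def link_counts(matrix, thresholds):
    """Number of surviving links for each candidate cut-off."""
    return {t: len(filter_links(matrix, t)) for t in thresholds}


# Toy symmetric word-word matrix (invented for illustration only).
toy = [
    [0, 4, 2, 0],
    [4, 0, 3, 1],
    [2, 3, 0, 5],
    [0, 1, 5, 0],
]
counts = link_counts(toy, thresholds=[2, 3, 4])
```

Raising the cut-off removes weak links first, so the counts can only stay the same or fall as the threshold grows, which is the pattern seen in the three link counts reported above.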
3. Data Analysis
Data were analyzed using the NetMiner Program version 4.
1) Network visualization
Network visualization refers to the stylized presentation of nodes and links so that the connecting structure can be represented visually [18,20]. After visualizing the 1-mode TF-IDF link frequencies of 2 or more instances, 3 or more instances, and 4 or more instances by applying Spring 2D, network analysis was ultimately performed with a link frequency of 3 or more instances based on the consensus of the authors that it best reflected the trends in childhood cancer research.
2) Network analysis
(1) Centrality
Centrality analysis determines which node is the most important and identifies the degree of centralization (i.e., how the network structure is concentrated in a few important nodes) [18,20].
Degree centrality and eigenvector centrality were used to analyze the co-occurrence of words. Centrality is an index that shows the degree to which a keyword (node) is centered in a network based on relative ranking rather than absolute size. Keywords with high centrality are considered to be core keywords. Degree centrality is a measure of how many connections the nodes of a network have, and a high degree centrality for a keyword indicates that it is a central issue located at the core of the network. Eigenvector centrality is useful for finding the most influential central node in the network. High eigenvector centrality means that there are many nodes with high centrality in a particular area. Centrality values are relative values standardized from 0 to 1 and are used to identify the co-occurrence of keywords [12].
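The two centrality measures can be illustrated with a short sketch. It assumes an undirected, connected network given as an adjacency matrix, normalizes degree by the maximum possible number of neighbors (n - 1), and scales eigenvector scores so that the largest equals 1; the normalization used by the software may differ, and the function names are hypothetical.

```python
def degree_centrality(adj):
    """Share of the other nodes that each node is directly linked to."""
    n = len(adj)
    return [sum(1 for w in row if w > 0) / (n - 1) for row in adj]


def eigenvector_centrality(adj, iterations=200, tol=1e-10):
    """Power iteration; scores are scaled so the most central node has score 1."""
    n = len(adj)
    scores = [1.0] * n
    for _ in range(iterations):
        # Multiplying by (A + I) instead of A keeps the same leading eigenvector
        # but avoids oscillation on bipartite networks.
        new = [
            scores[i] + sum(adj[i][j] * scores[j] for j in range(n))
            for i in range(n)
        ]
        top = max(new)
        new = [v / top for v in new]
        if max(abs(a - b) for a, b in zip(new, scores)) < tol:
            return new
        scores = new
    return scores


# Toy path network 0 - 1 - 2 (invented for illustration only).
path = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
degree = degree_centrality(path)
eigen = eigenvector_centrality(path)
```

In the toy path, the middle node is linked to both others and therefore ranks first on both measures; the end nodes score lower because their only neighbor is the middle node.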
(2) Cohesion
Cohesion refers to the identification of which subgroups make up the entire network and the groups of subtopics belonging to the co-occurrence word network [18,20]. As the first step, the giant (i.e., largest) component was extracted and community analysis was performed on it. The largest modularity (best cut) was regarded as the most optimized value, with values of 1.25-2.75 being considered normal and values of 2.75-3.50 being considered good [20,22]. Higher modularity with a positive value means that the community is significantly divided, and, thus, that the link density is high within the group and low between the groups [20].
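The idea that modularity rewards dense links within groups and sparse links between them can be sketched with the standard Newman-Girvan formula, Q = (1/2m) Σ_ij [A_ij - k_i k_j / 2m] δ(c_i, c_j). This is an assumption about the general concept only: this formula yields values between -0.5 and 1, whereas the values quoted in the text are on a different scale, so the sketch should not be read as the statistic reported by the software.

```python
def modularity(adj, communities):
    """Newman-Girvan modularity of a partition of an undirected network.

    adj: symmetric adjacency matrix; communities[i]: community label of node i.
    """
    n = len(adj)
    degrees = [sum(row) for row in adj]
    two_m = sum(degrees)  # twice the number of links
    q = 0.0
    for i in range(n):
        for j in range(n):
            if communities[i] == communities[j]:
                # Observed link minus the link expected by chance given the degrees.
                q += adj[i][j] - degrees[i] * degrees[j] / two_m
    return q / two_m


# Toy network: two triangles joined by a single bridge (2 - 3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adj = [[0] * 6 for _ in range(6)]
for a, b in edges:
    adj[a][b] = adj[b][a] = 1

q_split = modularity(adj, [0, 0, 0, 1, 1, 1])  # one community per triangle
q_single = modularity(adj, [0, 0, 0, 0, 0, 0])  # everything in one community
```

Splitting the toy network along the bridge gives a clearly positive value, while placing all nodes in one community gives zero, which mirrors the interpretation that a higher positive value indicates a more meaningful division.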
(3) Topic modeling
Topic modeling is a text mining technique that interprets a document's word distribution based on the words in the document that were identified [22,23]. In the current study, topic modeling using the latent Dirichlet allocation (LDA) model was conducted to validate the results of the cohesion analysis. LDA is a method mainly used in topic modeling for identifying relevant topics in documents [20,23]. The parameter values set for LDA were: α=.01, β=.01, and number of iterations=1,000 [20]. The most appropriate number of topics was determined to be 3, at which the distances and boundaries between topics were clearest. Each topic was given a name denoting the single theme that best represented the meaning of the keywords included in that topic by consensus among the authors.
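To make the role of the α, β, and iteration settings concrete, the following is a minimal sketch of LDA fitted by collapsed Gibbs sampling. Collapsed Gibbs sampling is one common way to fit LDA and is used here as an assumption; this section does not state which inference method the software applies, and the function name and toy documents are hypothetical.

```python
import random


def lda_gibbs(docs, n_topics, alpha=0.01, beta=0.01, iterations=1000, seed=0):
    """Fit LDA by collapsed Gibbs sampling.

    docs: list of keyword lists. Returns (vocab, phi, doc_topic), where
    phi[k][v] is the probability of word v under topic k and doc_topic[d][k]
    counts the tokens of document d assigned to topic k.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)
    doc_topic = [[0] * n_topics for _ in docs]
    topic_word = [[0] * n_words for _ in range(n_topics)]
    topic_total = [0] * n_topics
    assignments = []
    # Random initial topic for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_doc)
    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for pos, w in enumerate(doc):
                v = word_id[w]
                k = assignments[d][pos]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # Conditional probability of each topic for this token.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + n_words * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][pos] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1
    phi = [
        [(topic_word[k][v] + beta) / (topic_total[k] + n_words * beta)
         for v in range(n_words)]
        for k in range(n_topics)
    ]
    return vocab, phi, doc_topic


# Toy keyword lists (invented for illustration only).
docs = [
    ["treatment", "risk", "treatment"],
    ["support", "need", "support"],
    ["measurement", "quality of life"],
]
vocab, phi, doc_topic = lda_gibbs(docs, n_topics=3, iterations=200)
```

Small values of α and β, such as the .01 used in the study, push each document toward few topics and each topic toward few words, which tends to produce sharper topic boundaries. Sorting each row of `phi` in descending order gives the keyword ranking per topic.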
4. Ethical Considerations
We received approval to conduct this study from the S University institutional review board (IRB No. 2021032HR). The data were collected from published articles in the KCI, and the study therefore posed no harm or risk to the participants. In addition, the collected data were used solely for the purposes of this study.
RESULTS
1. Characteristics of the Included Studies
Among the 45 studies included in the analysis, an average of 3.75±2.14 articles were published per year from January 2010 to February 2021. The study designs included descriptive studies (17 studies), qualitative studies (13 studies), and experimental/observational studies (10 studies). In total, 35 studies (80%) focused on CACS, and 9 articles (20%) focused on children and parents.
2. Network Analysis
1) Word frequency, degree centrality, and eigenvector centrality of co-occurrence keywords in abstracts
The keywords from the abstracts of research papers on CACS published within the past 10 years in Korea were ranked based on the analysis criteria, which included: 1) word frequency, 2) degree centrality, and 3) eigenvector centrality (Table 1, Figure 2-A, Figure 2-B). The top 30 core keywords per analysis criterion are shown in Table 1. The top-ranking words in terms of frequency included "treatment" (#75), "support" (#47), "measurement" (#46), "group" (#39), "adaptation" (#35), and "quality of life" (#33). The top 10 keywords in terms of degree and eigenvector centrality included "treatment," "factor," "intervention," "group," "radiotherapy," "health," "risk," "measurement," "outcome," and "quality of life." The degree centrality index, at 49.3%, was desirable. A centrality index ranges from 0 to 1; a value closer to 1 indicates that the link is focused on one node, whereas a value closer to 0 indicates that the link is evenly distributed across all nodes [20].
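The interpretation of the centralization index can be illustrated with Freeman's degree centralization, C = Σ (d_max - d_i) / ((n - 1)(n - 2)). Treating the reported index as Freeman's measure is an assumption; the sketch only shows why a value near 1 corresponds to links concentrated on one node and a value near 0 to evenly distributed links. The function name and toy networks are hypothetical.

```python
def degree_centralization(adj):
    """Freeman degree centralization of an undirected network with n >= 3 nodes."""
    n = len(adj)
    degrees = [sum(1 for w in row if w > 0) for row in adj]
    d_max = max(degrees)
    # The denominator is the largest possible sum, reached by a star network.
    return sum(d_max - d for d in degrees) / ((n - 1) * (n - 2))


# Star: node 0 is linked to every other node.
star = [
    [0, 1, 1, 1],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
]
# Ring: every node has exactly two neighbors.
ring = [
    [0, 1, 0, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 0, 1, 0],
]
c_star = degree_centralization(star)
c_ring = degree_centralization(ring)
```

The star network reaches the maximum of 1 and the ring reaches the minimum of 0, so a value such as the 49.3% reported above lies between these two extremes.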
2) Cohesion
In the current study, the giant component for cohesion analysis was a network of 287 keywords at a 1-mode TF-IDF link frequency of 3 or more. A community composed of three clusters was identified. The maximum modularity value was +21.0, much higher than 3.5, which is considered good. This means that the link density within the community was significantly high and the link density between the communities was significantly low. Keywords classified by the community are shown according to their clusters in Table 2 and Figure 2-C. Cluster 1 had the highest number of keywords (n=29).
3. Topic Modeling
The keyword order of topic modeling refers to the keywords that best represent the identified topics [20,23]. The keywords with the highest probability of being found under topic 1 were "measurement" (17.5%) and "quality of life" (9.3%). For topic 2, they were "treatment" (11.0%) and "risk" (5.6%). For topic 3, they were "support" (13.8%) and "need" (8.0%) (Table 2). The highest number of documents related to topic 2 (n=20) showed a similar classification to cluster 1 from the cohesion analysis. Topic 3, which included 16 documents, showed similar results to cluster 2. Topic 1 had the lowest number of documents (n=9), which was most similar to cluster 3. The results of topic modeling are depicted intuitively using a word cloud in Figure 2-D.
Given the above results, most of the studies on CACS in Korea were associated with topic 2 ("treatment and complications"). The second-highest number of studies were related to topic 3 ("adaptation and support needs"). The topic with the lowest frequency was topic 1 ("management and quality of life").
DISCUSSION
In the last 10 years, the number of studies on CACS published in KCI journals was quite small, with an average of fewer than four new articles published annually. This research trend reflects the need for multidisciplinary interest and efforts to improve adaptation and quality of life among CACS. In addition, further studies are needed to examine research trends in other international journals.
In this study, the major topics identified in studies related to CACS published in KCI journals included "treatment", "factor", "intervention", "group", "radiotherapy", "health", "risk", "measurement", "outcome", and "quality of life". In a systematic review of the health-related needs of CACS by Lim [15], the following major needs were identified: psychological support, information and education, educational support, help with the health care system, physical support, and financial support. The main topics included in Lim's study were the treatment that the patient received, the role of the attending physician, precautions and complications after treatment, follow-up management plans, diagnosis and prognosis, treatment side effects, life education, family stress, school problems, emotional support, and community support. These topics mostly related to treatment and management of complications, followed by psychosocial needs, education and information, and help navigating the medical system. Thus, Lim's study [15] showed similar patterns to the present study in terms of the main concepts and research trends it identified.
The predominant research keyword identified through cohesion and topic modeling was "treatment and complications". Related keywords included "treatment", "diagnosis", "examination", "function", and "risk factors". Word co-occurrence network analysis is a method that can be used to identify research trends [18,20]. Our results suggest that many studies have been conducted on CACS related to treatment and complications. Furthermore, the literature related to this subject mainly consists of medical research [24-26]. Such studies were conducted on topics such as the management of various complications and infectious diseases that can affect pediatric cancer survivors after completion of treatment and reflect the particular concerns of the medical science field.
Cluster 2 and topic 3 concerned "adaptation and support needs". The main keywords were "adaptation", "caregiver", "deficiency", "growth", "development", "identity", "information", "self-help", "need", and "support". In one study that included individual in-depth interviews on the psychosocial service experiences of 30 adolescents and young adults with cancer [27], a strong demand for self-help activities and performances was identified in the areas of financial services, psychological counseling, returning to school and education support, mentoring, and family support. Additionally, a qualitative study [28] that examined the needs of CACS in South Korea emphasized the need for CACS to make positive changes in life via a program on the power of positivity and hope. Furthermore, the major attributes of social adjustment for CACS, identified via concept analysis, were relationships with schoolmates, academic performance, and future job plans [9]. CACS have a longer survival period than adults, and because of their developmental characteristics, support needs, and service development traits, psychosocial adaptation is critical among this group [26,29]. In the present study, keywords related to the various needs and the adaptation of CACS after ending treatment were identified. Therefore, the results of this study can be used to provide scientific data suggesting potential directions for future research on CACS. In addition, this study collected all KCI-registered studies related to CACS. Therefore, the results have the advantage of revealing multidisciplinary research trends, particularly related to adaptation and support needs. Nursing studies were primarily related to cluster 2 and topic 3, but other studies on adaptation and social identity have also been published in the field of social welfare. Accordingly, it was possible to confirm the flow of research based on the identities of different disciplines.
Research trends related to assessing the health-related needs of CACS were identified in cluster 3 and topic 1. These results suggest a need to develop a tool to measure the health-related needs of CACS in a Korean context [30]. If a tool that can screen health-related quality of life is developed, it will be possible to evaluate patients' deficiencies and support needs-oriented services for CACS, making interventions more effective [5,15]. Consequently, this may lead to improvements in quality of life for CACS and their families. Furthermore, intervention studies on information, education, and psychosocial support for CACS are necessary.
This study only analyzed the abstracts of articles related to CACS in KCI journals within the last 10 years. Further studies are needed to compare and analyze research trends on CACS internationally. The study was limited because it could only analyze differences according to interdisciplinary and yearly distributions due to the small number of studies. Some studies were excluded due to restricted keyword selection during the initial search. Keywords used in the initial search strategy mainly focused on cancer survivors, which did not necessarily account for specific types of malignancies such as leukemia, one of the major cancers that affects CACS. Recent studies [26] that included the terms "cancer treatment completed," "cancer experienced," and "cancer cured" therefore might have been excluded in the current study, although it is not expected that this limitation led to the exclusion of many studies.
The results of this study, which aimed to identify research trends related to CACS in nursing and other fields through word co-occurrence network analysis, reflect the unique characteristics of nursing at the boundaries between disciplines. These findings can be used as a basis for determining the direction of future research related to CACS.
This study uncovered recent research trends and identified potential future research directions based on the findings of word co-occurrence network analysis on data from the abstracts of articles published in Korea within the last 10 years. These results can contribute to the field by providing a scientific basis for the applied social network approach in further research on CACS.
CONCLUSION
The primary research trends in Korea related to CACS based on word co-occurrence network analysis over the past 10 years were identified as "treatment and complications", "adaptation and support needs", and "management and quality of life". The predominant keywords were related to CACS treatment and complications occurring after treatment. The keyword groups with the next highest co-occurrence frequency were topics related to the adaptation of CACS after cancer treatment, needs and adaptation in terms of physical, psychological, social, and spiritual factors, and management and evaluation of the quality of life of CACS. The keywords identified in the three main categories reflected interdisciplinary identification. Many studies with keywords related to "adaptation and support needs" were identified in the nursing literature. The results indicate that research on managing and evaluating the quality of life of CACS must be expanded.