
Associations between clinical neck symptoms and various evaluations of cervical intervertebral disc degeneration by magnetic resonance imaging

Haruka Otaki, Koji Otani, Takehiro Watanabe, Miho Sekiguchi, Shin-ichi Konno

Author information

Haruka Otaki
Department of Orthopaedic Surgery, Fukushima Medical University School of Medicine

Koji Otani
Department of Orthopaedic Surgery, Fukushima Medical University School of Medicine

Takehiro Watanabe
Department of Orthopaedic Surgery, Fukushima Medical University School of Medicine

Miho Sekiguchi
Department of Orthopaedic Surgery, Fukushima Medical University School of Medicine

Shin-ichi Konno
Department of Orthopaedic Surgery, Fukushima Medical University School of Medicine

Introduction

Intervertebral disc degeneration is thought to be related to neck pain1-3). When it progresses, it causes radiculopathy and myelopathy4). Early detection of disc degeneration is important for choosing suitable treatment and for preventing its progression.

T2-weighted magnetic resonance imaging (MRI) is widely used to evaluate intervertebral disc degeneration1,4-7). Cervical vertebral disc degeneration is based histologically on loss of water and proteoglycan content in the intervertebral disc, and it is seen as a decrease in intervertebral disc height and disc protrusion1,4,7). MRI is useful for observing these findings.

Recently, various studies using MRI have been conducted, and classifications of cervical disc degeneration have been established1,4-8). However, few papers have described the reproducibility of the classifications for evaluating cervical disc degeneration. In addition, no study has compared cervical disc degeneration using various MRI classifications and clinical symptoms. In this study, therefore, the aims were to compare cervical disc degeneration on MR images, as assessed by five different classification systems, and to evaluate their associations with neck symptoms in a community-based cohort.

Participants and methods

This was a cross-sectional study based on epidemiologic data from public health screening conducted in 2005 by local governments in Tadami Town, Ina Village, and Tateiwa Village of Fukushima Prefecture, Japan9). From 3,236 participants (1,326 men, 1,910 women; age range 19-94 years; average age, 65.5 years), a total of 582 agreed to undergo MRI of the cervical spine and answered a questionnaire about the presence of neck pain or shoulder stiffness. They were asked “Do you have neck pain which needs medical care?” and “Do you have shoulder stiffness which needs medical care?” separately. Neck symptoms were defined as the presence of either neck pain or shoulder stiffness. Participants were excluded if they were unable to walk independently, were unable to fill out questionnaires due to visual impairment, or had ever undergone brain or spinal surgery. Cases with MRI results insufficient for all classification systems, and those with missing questionnaire data, were also excluded.

Municipality-based public health screening is part of Japan’s system of universal health care. Participation is voluntary. This supplemental study was approved by the ethics committee of Fukushima Medical University (No.1880). Informed consent was documented in writing for all study participants.

MRI assessment

Disc degeneration, Modic change, and Schmorl’s nodes were evaluated on MRI images. The detailed imaging conditions of the MRI scanners are shown in the supplemental data.

Disc degeneration was assessed using five classifications: Matsumoto’s grading system8), Miyazaki’s grading system4), Nakashima’s grading system7), Jacobs’ grading system1), and Suzuki’s grading system5). A midsagittal T2-weighted image (WI) was obtained at each level of the intervertebral discs from C2 to Th1. Matsumoto’s grading system consists of four parts: disc degeneration, posterior disc protrusion, anterior disc protrusion, and narrowing of the disc space. Grades 0 to 2 were chosen using the criteria shown in Table 1. Miyazaki’s grading system evaluates disc degeneration by nucleus signal intensity, nucleus structure, distinction between the nucleus and annulus, and disc height4). Grades 1 to 5 were chosen using the criteria shown in Table 1. Nakashima’s grading system evaluates disc degeneration by nucleus structure, the border of the nucleus, and disc height with a flow chart (Fig. 1)7). Jacobs’ grading system uses nucleus signal intensity and disc height1). The grades are grade 0 (normal disc height, with or without a cleft in the nucleus pulposus), grade 1 (dark disc, with normal height), grade 2 (collapsed disc, little or no osteophytes), and grade 3 (collapsed disc, with many osteophytes) (Table 1). Suzuki’s grading system uses disc height, nucleus signal intensity, the border of the nucleus, and disc bulge with a flow chart (Fig. 2)5,10). Disc degeneration of the entire cervical spine was assessed using the degenerative disc disease (DDD) score, which is the sum of the grades at each cervical disc level (C2/3-C7/T1) in each of the five classifications11).
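As a minimal sketch of the DDD score described above, the following Python snippet sums the per-level grades of one classification over the six levels from C2/3 to C7/T1. The function name, the dictionary layout, and the example grades are illustrative assumptions and are not taken from the study data.

```python
# Disc levels assessed in the study, C2/3 through C7/T1.
DISC_LEVELS = ("C2/3", "C3/4", "C4/5", "C5/6", "C6/7", "C7/T1")


def ddd_score(grades_by_level):
    """Sum the grades of one classification over all six disc levels.

    grades_by_level maps a level name (e.g. "C5/6") to its integer grade.
    A missing level raises ValueError, because a partial sum would not be
    comparable between participants.
    """
    missing = [level for level in DISC_LEVELS if level not in grades_by_level]
    if missing:
        raise ValueError(f"missing grade for level(s): {missing}")
    return sum(grades_by_level[level] for level in DISC_LEVELS)


# Hypothetical Jacobs grades (0 to 3) for one participant; not study data.
example_grades = {
    "C2/3": 0, "C3/4": 1, "C4/5": 1, "C5/6": 3, "C6/7": 2, "C7/T1": 0,
}
print(ddd_score(example_grades))  # 7
```

Because the grade ranges differ between classifications (for example, 0 to 3 for Jacobs’ and 1 to 5 for Miyazaki’s), DDD scores computed this way are comparable only within one classification.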

To evaluate the intra/inter-observer reliabilities of each classification, a sample size calculation was performed for ρ = 0.8 with a 95% confidence interval of 0.4, rated by two examiners; it indicated that at least 20 subjects would be needed. Therefore, 30 subjects were randomly selected for kappa analysis. Each intervertebral level from C2/3 to C7/T1 was measured by two orthopedic surgeons (HO & TW). The five classifications of disc degeneration were measured four times each in each subject. The second and third measurements were performed one week and one month after the first measurement. The fourth measurement was performed one week after the third assessment. Finally, one orthopedic surgeon (HO) examined all images without any participants’ information, including their symptoms. Vertebral endplate changes, namely Modic changes and Schmorl’s nodes, were also assessed. Modic changes were scored type I (hypointense on T1-WI and hyperintense on T2-WI), type II (hyperintense on T1- and T2-WI), and type III (hypointense on T1- and T2-WI) (Table 1)12). Schmorl’s nodes were defined as a deficit of more than 2 mm on T2-WI at each vertebral body level. The presence of a Schmorl’s node was defined as observation of a node at at least one vertebral endplate level13).

Fig. 1. Algorithm for Nakashima’s classification7)

Nakashima’s classification evaluates disc degeneration in the order of nucleus structure, border of the nucleus, and disc height. If nucleus structure is inhomogeneously white, it is Grade 1. If the border of the nucleus is clear, it is Grade 2. If the disc is collapsed, it is Grade 4.

Table 1. Classifications of MRI images

Abbreviations: CSF, cerebrospinal fluid; VB, vertebral body; WI, weighted image

Fig. 2. Algorithm for Suzuki’s classification5)

Suzuki’s classification evaluates disc degeneration in the order of disc height, nucleus signal intensity, border of the nucleus, and disc bulge. If the disc height decreases by more than 25%, it is grade 3. If the nucleus is high intensity and homogeneous, it is grade 0. If the nucleus is high intensity and inhomogeneous, or if the border of the nucleus is clear, it is grade 1. If the border is not clear, but it does not have disc bulge, it is also grade 1. If the border is not clear, and it has disc bulge, it is grade 2.
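The decision order in this caption can be sketched as a small Python function. This is an illustration based only on the caption text, not on the flow chart itself; encoding each observer judgement as a number or a boolean, and the parameter names, are assumptions made for the example.

```python
def suzuki_grade(height_loss_pct, nucleus_high, nucleus_homogeneous,
                 border_clear, disc_bulge):
    """Return a grade (0 to 3) following the order given in the caption.

    height_loss_pct     -- decrease in disc height, in percent
    nucleus_high        -- True if the nucleus shows high signal intensity
    nucleus_homogeneous -- True if the nucleus signal is homogeneous
    border_clear        -- True if the border of the nucleus is clear
    disc_bulge          -- True if a disc bulge is present
    """
    # Step 1: disc height is checked first.
    if height_loss_pct > 25:
        return 3
    # Step 2: nucleus signal intensity.
    if nucleus_high and nucleus_homogeneous:
        return 0
    # Step 3: inhomogeneous high signal, or a clear nucleus border.
    if (nucleus_high and not nucleus_homogeneous) or border_clear:
        return 1
    # Step 4: the border is not clear, so disc bulge decides the grade.
    return 2 if disc_bulge else 1


# Hypothetical disc: preserved height, low signal, unclear border, bulge.
print(suzuki_grade(10, False, False, False, True))  # 2
```

The subjective inputs (signal intensity and border clarity) remain judgement calls by the observer; the function only fixes the order in which those judgements are combined.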

Data analysis

Intra-observer and inter-observer agreements were assessed by κ values for each classification. First, the κ values of intra-observer agreement were calculated between the first and second measurement results of each intervertebral disc level for each of the two observers. Second, the κ values were calculated between the third and fourth measurements in the same way. Finally, the average of these four κ values was used to evaluate intra-observer reliability. The inter-observer agreement was calculated between the first measurement results of each observer. Similarly, the κ values of the second to fourth measurements were calculated. The average of these four κ values was used to evaluate inter-observer reliability. Interpretations were performed in accordance with the guidelines suggested by Landis and Koch14). Agreement was rated as follows: poor, κ 0 to 0.20; fair, κ 0.21 to 0.40; moderate, κ 0.41 to 0.60; substantial, κ 0.61 to 0.80; and excellent, κ 0.81 to 1.00. A value of 1 indicated absolute agreement, whereas a value of 0 indicated agreement no better than chance. In addition, comparison between groups was performed using Tukey’s test and the Games-Howell test. Odds ratios (ORs) were estimated using a logistic regression model. ORs were adjusted for age, sex, and other explanatory variables to evaluate associations between the presence of neck pain, shoulder stiffness, or neck symptoms (either neck pain or shoulder stiffness) and the MRI findings. Baseline characteristics are described using appropriate summary statistics with the chi-squared test, Mann-Whitney U test, and Cochran-Armitage trend test. Statistical analyses were performed using SPSS (version 13, SPSS, Chicago, IL). A p value of less than 0.05 (two-sided for the logistic regression model) was considered significant.
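For readers unfamiliar with the κ statistic, the following Python sketch computes an unweighted Cohen’s κ for two sets of grades and maps it to the agreement bands listed above. The text does not state whether a weighted or unweighted κ was used, so the unweighted form is an assumption, as are the function names, the handling of values that fall between the stated bands, and the example ratings (which are not study data).

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two equal-length lists of grades."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Observed proportion of discs given the same grade in both readings.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance from each reading's grade frequencies.
    count_a, count_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(count_a[g] * count_b[g] for g in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both readings used one identical grade throughout
    return (observed - expected) / (1.0 - expected)


def agreement_band(kappa):
    """Map kappa to the bands in the text (negative values count as poor)."""
    if kappa <= 0.20:
        return "poor"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "excellent"


# Hypothetical grades for ten discs from two readings; not study data.
first = [0, 1, 1, 2, 2, 3, 0, 1, 2, 3]
second = [0, 1, 2, 2, 2, 3, 0, 1, 1, 3]
kappa = cohen_kappa(first, second)
print(round(kappa, 2), agreement_band(kappa))  # 0.73 substantial
```

In this example, eight of ten grades agree (observed agreement 0.80) and chance agreement is 0.26, giving κ = 0.54 / 0.74. Averaging four such κ values, as described above, would then be a plain arithmetic mean.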

Results

Intra/inter-observer agreements for each classification

The κ values for intra-observer and inter-observer agreements of each classification are shown in Table 2. The κ values for intra-observer agreement of Matsumoto’s, Nakashima’s, and Jacobs’ classifications were substantial. The κ values of Miyazaki’s and Suzuki’s classifications were moderate. The κ value of intra-observer agreement was significantly higher for Jacobs’ classification than for Miyazaki’s classification. There were no significant differences for the other classifications.

The κ values for inter-observer agreement of disc degeneration, posterior disc protrusion, and narrowing of the disc space in Matsumoto’s classification were moderate. The κ values of Nakashima’s and Jacobs’ classifications were moderate. The κ values for anterior disc protrusion in Matsumoto’s classification and for Miyazaki’s and Suzuki’s classifications were fair. The κ value of inter-observer agreement was significantly higher for Nakashima’s and Jacobs’ classifications than for Suzuki’s classification. There were no significant differences for other classifications.

Table 2. Kappa values of intra- and inter-observer agreements for each classification

*: Games-Howell test, p < 0.05

Associations of disc degeneration grading on MRI findings and clinical symptoms

The 497 participants consisted of 155 men and 342 women. Their mean age was 64 years (range 25 to 93 years), and most participants were aged over 70 years (Fig. 3).

A comparison of the characteristics of participants with and without neck pain is shown in Table 3. The prevalence of neck pain was 27.4% (136 of 497 participants). There were no significant differences in age and sex between participants with and without neck pain. In all classifications of disc degeneration, there was no difference in DDD scores between participants with and without neck pain. Only in Miyazaki’s classification was the distribution of the highest-severity grade significantly higher in participants with neck pain than in those without it (p < 0.001). The prevalence of Modic change was 5.4% (27 of 497 participants). There was no significant difference in Modic types between participants with and without neck pain. The prevalence of Schmorl’s nodes was 32.4% (161 of 497 participants). Fifty-seven participants (41.9%) with neck pain were found to have Schmorl’s nodes, while 28.8% of those without neck pain had Schmorl’s nodes. The prevalence of Schmorl’s nodes was significantly higher in participants with neck pain than in those without it (p = 0.005).
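To make the Schmorl’s node comparison concrete, the following Python sketch computes a crude (unadjusted) odds ratio with a Wald 95% confidence interval from a 2x2 table. The counts 136, 57, 161, and 497 come from the text; the remaining cells (79, 104, and 257) are derived here by subtraction, and 104 of 361 reproduces the reported 28.8%. This is not the adjusted odds ratio reported in Table 5, and the function name is an illustrative assumption.

```python
import math


def crude_odds_ratio(exposed_cases, unexposed_cases,
                     exposed_controls, unexposed_controls):
    """Unadjusted odds ratio with a Wald 95% confidence interval."""
    a, b, c, d = (exposed_cases, unexposed_cases,
                  exposed_controls, unexposed_controls)
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) for a 2x2 table.
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - 1.96 * se_log)
    upper = math.exp(math.log(odds_ratio) + 1.96 * se_log)
    return odds_ratio, lower, upper


# Neck pain: 136 participants, 57 with Schmorl's nodes, so 79 without.
# No neck pain: 497 - 136 = 361 participants, 161 - 57 = 104 with nodes,
# so 257 without (derived by subtraction, not stated in the text).
or_value, ci_low, ci_high = crude_odds_ratio(57, 79, 104, 257)
print(round(or_value, 2))  # 1.78
```

The crude estimate ignores age, sex, and the other explanatory variables, so it should be read only as an arithmetic illustration of how the two prevalences relate.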

Comparisons of characteristics with and without shoulder stiffness are shown in Table 4. The prevalence of shoulder stiffness was 48.7% (242 of 497 participants). Participants with shoulder stiffness were younger, in both mean age and age distribution, than those without it (p < 0.001). DDD scores were significantly lower in the participants with than in those without shoulder stiffness in all classifications. However, these differences were very small (range, 0.56 to 1.19), and their clinical meaning is unclear. There were no significant differences in sex, Modic types, and the prevalence of Schmorl’s nodes between the participants with and without shoulder stiffness.

The associations of the MRI findings in each classification with the presence of neck pain, shoulder stiffness, and neck symptoms, expressed as adjusted odds ratios from multivariate analysis, are shown in Table 5. There were no significant associations for any clinical symptoms with DDD scores in the five classifications. In addition, there was no association between Modic change and any clinical symptoms. The presence of Schmorl’s nodes was associated only with neck pain and neck symptoms.

Fig. 3. Flow chart of subject selection

Table 3. Comparison of characteristics between participants with and without neck pain

Abbreviations: DDD, degenerative disc disease.

Table 4. Comparison of characteristics between participants with and without shoulder stiffness

Abbreviations: DDD, degenerative disc disease.

Table 5. Associations of MRI findings with neck pain and shoulder stiffness in multivariate regression analysis

Abbreviations: DDD, degenerative disc disease.

Discussion

MRI is a useful method for evaluating disc degeneration1,4,6), but there is no gold standard for its evaluation. Several classifications of disc degeneration using MRI to evaluate signal intensity, bulge, and height of intervertebral discs have already been reported1,4–8). Since morphological assessment is subjective and affected by observer bias, the reproducibility of the evaluation method is important. In addition, the associations between morphological findings and clinical symptoms are still controversial. Previous studies have each used a different single assessment of morphological findings to evaluate associations with neck symptoms15). It is not clear whether the methodology for assessing disc degeneration on MRI itself affects the results for associations with symptoms, or whether morphological findings are truly not associated with clinical symptoms. In the present study, five different assessments of cervical intervertebral disc degeneration were analyzed for both their reproducibilities and their associations with symptoms.

In the original papers of the five classifications, both intra-observer and inter-observer agreements of each classification were reported as moderate to almost complete. In the present study, intra-observer agreement was moderate to substantial, and inter-observer agreement was fair to moderate. The present results show that Jacobs’ classification has relatively high reproducibility. This classification was established for routine clinical use, and its criteria are simple; each criterion is therefore easy to apply, which may explain its relatively high intra-observer and inter-observer reproducibilities. One cause of disagreement is thought to be the difficulty of defining the criteria for degeneration. The criteria for judging signal intensity and disc protrusion must be inferred from the written descriptions in the papers. Therefore, reproducibility may decrease due to differences in how those descriptions are interpreted. In particular, the criteria for determining nucleus signal intensity seem likely to cause confusion. For example, in Matsumoto’s and Suzuki’s classifications, disc height is evaluated with rough numerical criteria, such as a 25% or 50% reduction, so it is easier to classify than nucleus signal intensity. In contrast, nucleus signal intensity is evaluated in these two classifications by comparison with cerebrospinal fluid, which is not a clear standard. In the other classifications, signal intensity is categorized as high, low, or absent, but high and low signal intensity are not defined. It is difficult to evaluate signal intensity quantitatively, and the thresholds for high and low signal intensity depend on the evaluator. Therefore, the determination of nucleus signal intensity is subjective and tends to vary.

Based on the present results, quantitative measurement of nucleus signal intensity is proposed as a criterion for degeneration. More quantitative MRI has previously been used to study intervertebral disc degeneration in vivo16-19). It might improve the inter-observer agreement of the evaluation of disc degeneration and remove the influence of the observer’s experience, although quantitative MRI measurement might not be convenient under routine clinical conditions. Currently, research using artificial intelligence (AI) for image evaluation is being conducted, and its accuracy is considered to be high20,21). AI could solve the problem of reproducibility and is likely to become a method of image evaluation in the near future.

A degenerative cervical disc is considered to be a source of neck pain22), and the prevalence of disc degeneration was 67% in patients with neck pain23). In the present study, disc degeneration was not more severe in participants with shoulder stiffness than in those without it, because those with shoulder stiffness were younger on average. After adjustment using a logistic regression model, the severity of disc degeneration in all five classifications on MRI was not associated with neck pain, shoulder stiffness, or neck symptoms (either neck pain or shoulder stiffness). The Bone and Joint Decade 2000-2010 Task Force on Neck Pain reported that they did not identify any evidence demonstrating that disc degeneration is a risk factor for neck pain24). The results of the present study suggest that disc degeneration of the cervical spine might not be the single key factor for the presence of neck-related symptoms. In addition, according to previous studies, the prevalence of Modic change in the cervical spine varied from 5% to 40%; type 2 was predominant, and type 3 was the least prevalent15,25). In the present study, the prevalence of Modic change was 5.4%, type 2 was the most common, and type 3 was the least common. These morphological findings were similar to those reported previously. In a cohort study, neck pain was independently associated with all types of Modic changes (odds ratio 2.7, 95% confidence interval 1.08-6.8), but shoulder stiffness was not26). On the other hand, there were no differences in neck pain intensity, Neck Disability Index, or the physical and mental component summaries of the 36-Item Short Form Health Survey between participants with and without Modic change27). In the present study, there was no association between Modic change and clinical neck-related symptoms. The relationships of disc degeneration and Modic change with neck symptoms are still controversial and may depend on the participant population in each study.

Schmorl’s nodes can be located at any level of the spine; in a previous report, they were located in the cervical spine in 5.9%, in the thoracic spine in 47.7%, and in the lumbar spine in 46.3%13). In the present study, Schmorl’s nodes were associated with both neck pain and neck symptoms, but not with shoulder stiffness. Schmorl’s nodes are seen in asymptomatic cases, but they can be a source of pain28). There are not enough reports on the association between Schmorl’s nodes in the cervical spine and clinical symptoms. In addition, the mechanism underlying their different locations is not known; factors associated with Schmorl’s nodes should be investigated. In a previous study, severe neck pain of more than 5 points on the pain numerical rating scale was independently associated with both cervical curvature and spondylolisthesis, but not with MRI findings of disc degeneration or Modic change29). On the other hand, it was reported that cervical disc degeneration was found in 60% of asymptomatic subjects30). In the present study, there was no relationship between disc degeneration in each MRI classification and neck symptoms; however, the degree of neck symptoms was not evaluated. Therefore, the association between disc degeneration and severity of clinical symptoms is still uncertain.

The first strength of this study is that five different kinds of MRI evaluations for disc degeneration were performed in individual participants. In addition, the distribution of each MRI item with and without clinical symptoms was evaluated, and associations among them were analyzed using a logistic regression model. Therefore, various morphological findings were compared to determine the possible pathogenesis of symptoms. Regardless of the classification used, there was no association between disc degeneration and symptoms. The second strength is that the data were obtained from a large community-based population, and various analyses have been performed, including the present study. Therefore, compared to a hospital-based survey, the results of the present study come from a real-world setting and are relevant for establishing the pathogenesis of cervical spine degeneration.

There were some limitations to the present study. First, only relatively healthy subjects were enrolled, and usually only those with any symptoms or more severe symptoms would undergo MRI. However, each possible MRI finding was distributed among all grades, and non-symptomatic participants agreed to undergo MRI. Therefore, compared to a hospital-based study, the benefit of this study was that all grades of morphological changes, including mild and unchanged cases, would be evaluated. Second, since the research location was in a rural and mountainous area, one may not be completely able to extrapolate the findings to the typical Japanese population. Third, the severity of neck symptoms was not examined. Therefore, the relationship between MRI findings and severity of clinical symptoms cannot be evaluated. Finally, this was a cross-sectional study; therefore, causal relationships between morphological changes and symptoms related to the cervical spine could not be determined.

Conclusion

In the present study, there was no difference among the five classifications in reproducibility; therefore, the simple evaluation method with higher accuracy is useful for routine clinical use. In addition, there was no specific classification for evaluating cervical disc degeneration by MRI that showed associations with clinical symptoms. Only the presence of Schmorl’s nodes, and not disc degeneration, was strongly related to neck symptoms. In the future, it may be necessary to create a new classification system with simple and objective criteria for image evaluation to investigate cervical disc degeneration.

Acknowledgments

The authors would like to thank Dr. Akira Onda, Dr. Kazuya Yamauchi, Dr. Yoshiaki Takeyachi, Dr. Ichiro Takahashi, Dr. Hisayoshi Tachihara, and Dr. Bunji Takayama for participating in the data collection. The authors would also like to thank five public health nurses (Nobuko Fujita, Nakako Hoshi, Misako Hoshi, Naoko Imada, and Seiko Kanno) for their support in carrying out this study.

Disclosure

The authors declare no conflicts of interest associated with this manuscript.
