610 Medicine and Health
Task-based measures that capture neurocognitive processes can help bridge the gap between brain and behavior. To transfer tasks to clinical application, reliability is a crucial benchmark because it imposes an upper bound to potential correlations with other variables (e.g., symptom or brain data). However, the reliability of many task readouts is low. In this study, we scrutinized the retest reliability of a probabilistic reversal learning task (PRLT) that is frequently used to characterize cognitive flexibility in psychiatric populations. We analyzed data from N = 40 healthy subjects, who completed the PRLT twice. We focused on how individual metrics are derived, i.e., whether data were partially pooled across participants and whether priors were used to inform estimates. We compared the reliability of the resulting indices across sessions, as well as the internal consistency of a selection of indices. We found good to excellent reliability for behavioral indices as derived from mixed-effects models that included data from both sessions. The internal consistency was good to excellent. For indices derived from computational modeling, we found excellent reliability when using hierarchical estimation with empirical priors and including data from both sessions. Our results indicate that the PRLT is well equipped to measure individual differences in cognitive flexibility in reinforcement learning. However, this depends heavily on hierarchical modeling of the longitudinal data (whether sessions are modeled separately or jointly), on estimation methods, and on the combination of parameters included in computational models. We discuss implications for the applicability of PRLT indices in psychiatric research and as diagnostic tools.
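The upper-bound claim can be made concrete with the attenuation relation from classical test theory. This is a standard textbook result, not a formula taken from the study itself. Writing \(r_{xx'}\) and \(r_{yy'}\) for the reliabilities of two measures, the observable correlation between them satisfies

\[
|r_{xy}| \le \sqrt{r_{xx'} \, r_{yy'}}
\]

For example, a task readout with reliability 0.5 correlated with a symptom scale of reliability 0.8 cannot show an observed correlation above \(\sqrt{0.5 \cdot 0.8} \approx 0.63\), even if the underlying true scores were perfectly related.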
Background
Morphology and glenoid involvement determine the necessity of surgical management in scapula fractures. Although scapula fractures occur in only a small share of patients with shoulder trauma, numerous classification systems have been used over the years to categorize them. The purpose of this study was to evaluate the established AO/OTA classification in comparison to the classification system of Euler and Rüedi (ER) with regard to interobserver reliability and confidence in clinical practice.
Methods
Based on CT imaging, 149 patients with scapula fractures were retrospectively categorized by two trauma surgeons and two radiologists using the classification systems of ER and AO/OTA. To measure the interrater reliability, Fleiss kappa (κ) was calculated independently for both fracture classifications. Rater confidence was stated subjectively on a five-point scale and compared with Wilcoxon signed rank tests. Additionally, we computed the intraclass correlation coefficient (ICC) based on absolute agreement in a two-way random effects model to assess the diagnostic confidence agreement between observers.
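As an illustration of the interrater statistic named above, the following is a minimal sketch of Fleiss' kappa in plain Python. It is not the authors' analysis code. The function name `fleiss_kappa` and the toy counts are hypothetical, and the sketch assumes every case is rated by the same number of raters (four here, matching the two surgeons and two radiologists).

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a table counts[i][j] = number of raters who
    assigned case i to category j. Assumes an equal number of raters per case."""
    n_cases = len(counts)
    n_raters = sum(counts[0])
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every case must have the same number of ratings")
    n_categories = len(counts[0])
    total = n_cases * n_raters

    # Overall proportion of ratings falling into each category.
    p_j = [sum(row[j] for row in counts) / total for j in range(n_categories)]

    # Per-case agreement: share of rater pairs that agree on the case.
    p_i = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]

    observed = sum(p_i) / n_cases          # mean observed agreement
    expected = sum(p * p for p in p_j)     # agreement expected by chance
    return (observed - expected) / (1 - expected)


# Hypothetical toy data: 3 cases, 2 fracture categories, 4 raters each.
perfect = [[4, 0], [0, 4], [4, 0]]   # all raters agree on every case
split = [[2, 2], [2, 2]]             # raters split evenly on every case
print(fleiss_kappa(perfect))
print(fleiss_kappa(split))
```

Values near 1 indicate agreement well beyond chance, while values at or below 0 indicate agreement no better than chance.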
Results
In scapula fractures involving the glenoid fossa, interrater reliability was substantial (κ = 0.722; 95% confidence interval [CI] 0.676–0.769) for the AO/OTA classification in contrast to moderate agreement (κ = 0.579; 95% CI 0.525–0.634) for the ER classification system. Diagnostic confidence for intra-articular fracture patterns was superior using the AO/OTA classification compared to ER (p < 0.001) with higher confidence agreement (ICC: 0.882 versus 0.831). For extra-articular fractures, ER (κ = 0.817; 95% CI 0.771–0.863) provided better interrater reliability compared to AO/OTA (κ = 0.734; 95% CI 0.692–0.776) with higher diagnostic confidence (p < 0.001) and superior agreement between confidence ratings (ICC: 0.881 versus 0.912).
Conclusions
The AO/OTA classification is most suitable to categorize intra-articular scapula fractures with glenoid involvement, whereas the classification system of Euler and Rüedi appears to be superior in extra-articular injury patterns with fractures involving only the scapula body, spine, acromion and coracoid process.
Background:
Employees covered by pension insurance who are incapable of working due to ill health are entitled to a disability pension. To assess whether an individual meets the medical requirements to be considered as disabled, a work capacity evaluation is conducted. However, there are no official guidelines on how to perform an external quality assurance for this evaluation process. Furthermore, the quality of medical reports in the field of insurance medicine can vary substantially, and systematic evaluations are scarce. Reliability studies using peer review have repeatedly shown insufficient ability to distinguish between high, moderate and low quality. Considering literature recommendations, we developed an instrument to examine the quality of medical experts’ reports.
Methods:
The peer review manual developed contains six quality domains (formal structure, clarity, transparency, completeness, medical-scientific principles, and efficiency) comprising 22 items. In addition, a superordinate criterion (survey confirmability) ranks the overall quality and usefulness of a report. This criterion evaluates problems of inner logic and reasoning. Development of the manual was assisted by experienced physicians in a pre-test. We examined the observable variance in peer judgements and reliability as the most important outcome criteria. To evaluate inter-rater reliability, 20 anonymous experts’ reports detailing the work capacity evaluation were reviewed by 19 trained raters (peers). Percentage agreement and Kendall’s W, a reliability measure of concordance between two or more peers, were calculated. A total of 325 reviews were conducted.
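To show what Kendall’s W measures, here is a minimal sketch in plain Python. It assumes that every rater ranks every report and that there are no tied ranks. Neither assumption necessarily holds for the study data (325 reviews is fewer than 19 × 20), so this is not a reconstruction of the authors' computation. The name `kendalls_w` and the example ranks are hypothetical.

```python
def kendalls_w(ranks):
    """Kendall's coefficient of concordance W.

    ranks[i][j] is the rank (1..n) that rater i gave to item j.
    Assumes complete rankings and no ties (no tie correction applied).
    """
    m = len(ranks)        # number of raters
    n = len(ranks[0])     # number of items being ranked
    rank_sums = [sum(rater[j] for rater in ranks) for j in range(n)]
    mean_sum = m * (n + 1) / 2
    # Squared spread of the item rank sums around their mean.
    s = sum((r - mean_sum) ** 2 for r in rank_sums)
    return 12 * s / (m ** 2 * (n ** 3 - n))


# Hypothetical example: three raters ranking four reports identically.
agree = [[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]]
# Two raters with exactly opposite rankings of three reports.
oppose = [[1, 2, 3], [3, 2, 1]]
print(kendalls_w(agree))
print(kendalls_w(oppose))
```

W ranges from 0 (no concordance among raters) to 1 (identical rankings).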
Results:
Agreement of peer judgements with respect to the superordinate criterion ranged from 29.2 to 87.5%. Kendall’s W for the quality domain items varied greatly, ranging from 0.09 to 0.88. With respect to the superordinate criterion, Kendall’s W was 0.39, which indicates fair agreement. The results of the percentage agreement revealed systemic peer preferences for certain deficit scale categories.
Conclusion:
The superordinate criterion was not sufficiently reliable. However, in comparison to other reliability studies, this criterion showed an equivalent reliability value. This report aims to encourage further efforts to improve evaluation instruments. To reduce disagreement between peer judgments, we propose the revision of the peer review instrument and the development and implementation of a standardized rater training to improve reliability.
Background
Pneumonia frequently complicates stroke and has a major impact on outcome. We derived and internally validated a simple clinical risk score for predicting stroke-associated pneumonia (SAP), and compared the performance with an existing score (A\(^{2}\)DS\(^{2}\)).
Methods and Results
We extracted data for patients with ischemic stroke or intracerebral hemorrhage from the Sentinel Stroke National Audit Programme multicenter UK registry. The data were randomly allocated into derivation (n=11 551) and validation (n=11 648) samples. A multivariable logistic regression model was fitted to the derivation data to predict SAP in the first 7 days of admission. The characteristics of the score were evaluated using receiver operating characteristics (discrimination) and by plotting predicted versus observed SAP frequency in deciles of risk (calibration). Prevalence of SAP was 6.7% overall. The final 22-point score (ISAN: prestroke Independence [modified Rankin scale], Sex, Age, National Institutes of Health Stroke Scale) exhibited good discrimination in the ischemic stroke derivation (C-statistic 0.79; 95% CI 0.77 to 0.81) and validation (C-statistic 0.78; 95% CI 0.76 to 0.80) samples. It was well calibrated in ischemic stroke and was further classified into meaningful risk groups (low 0 to 5, medium 6 to 10, high 11 to 14, and very high ≥15) associated with SAP frequencies of 1.6%, 4.9%, 12.6%, and 26.4%, respectively, in the validation sample. Discrimination for both scores was similar, although they performed less well in the intracerebral hemorrhage patients with an apparent ceiling effect.
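The risk-group cut-offs reported above can be written as a small lookup. This is an illustrative sketch only: the function name `isan_risk_group` is hypothetical, the point values for the individual ISAN components are not given in this abstract and are therefore not modeled, and the frequencies are those quoted for the validation sample.

```python
# Cut-offs and validation-sample SAP frequencies as quoted in the abstract:
# (inclusive upper bound of the band, group label, SAP frequency in %).
ISAN_GROUPS = [
    (5, "low", 1.6),
    (10, "medium", 4.9),
    (14, "high", 12.6),
]
VERY_HIGH = ("very high", 26.4)   # scores of 15 and above


def isan_risk_group(score):
    """Map a total ISAN score to (risk group, reported SAP frequency in %)."""
    if score < 0:
        raise ValueError("ISAN score cannot be negative")
    for upper, label, frequency in ISAN_GROUPS:
        if score <= upper:
            return label, frequency
    return VERY_HIGH


for s in (3, 8, 12, 17):
    print(s, isan_risk_group(s))
```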
Conclusions
The ISAN score is a simple tool for predicting SAP in clinical practice. External validation is required in ischemic and hemorrhagic stroke cohorts.
Objective: The assessment of response to lithium maintenance treatment in bipolar disorder (BD) is complicated by variable length of treatment, unpredictable clinical course, and often inconsistent compliance. Prospective and retrospective methods of assessment of lithium response have been proposed in the literature. In this study we report the key phenotypic measures of the "Retrospective Criteria of Long-Term Treatment Response in Research Subjects with Bipolar Disorder" scale currently used in the Consortium on Lithium Genetics (ConLiGen) study.
Materials and Methods: Twenty-nine ConLiGen sites took part in a two-stage case-vignette rating procedure to examine inter-rater agreement [Kappa (\(\kappa\))] and reliability [intra-class correlation coefficient (ICC)] of lithium response. Annotated first-round vignettes and rating guidelines were circulated to expert research clinicians for training purposes between the two stages. Further, we analyzed the distributional properties of the treatment response scores available for 1,308 patients using mixture modeling.
Results: Substantial and moderate agreement was shown across sites in the first and second sets of vignettes (\(\kappa\) = 0.66 and \(\kappa\) = 0.54, respectively), without significant improvement from training. However, definition of response using the A score as a quantitative trait and selecting cases with B criteria of 4 or less showed an improvement between the two stages (\(ICC_1 = 0.71\) and \(ICC_2 = 0.75\), respectively). Mixture modeling of score distribution indicated three subpopulations (full responders, partial responders, non-responders).
Conclusions: We identified two definitions of lithium response, one dichotomous and the other continuous, with moderate to substantial inter-rater agreement and reliability. Accurate phenotypic measurement of lithium response is crucial for the ongoing ConLiGen pharmacogenomic study.
Since the first description of a systematic mis-reaching by Balint in 1909, a reasonable number of patients showing a similar phenomenology, later termed optic ataxia (OA), has been described. However, there is surprising inconsistency regarding the behavioral measures that are used to detect OA in experimental and clinical reports, if the respective measures are reported at all. A typical screening method that was presumably used by most researchers and clinicians, reaching for a target object in the peripheral visual space, has never been evaluated. We developed a set of instructions and evaluation criteria for the scoring of a semi-standardized version of this reaching task. We tested 36 healthy participants, a group of 52 acute and chronic stroke patients, and 24 patients suffering from cerebellar ataxia. We found a high interrater reliability and a moderate test-retest reliability comparable to other clinical instruments in the stroke sample. The calculation of cut-off thresholds based on healthy control and cerebellar patient data showed an unexpectedly high number of false positives in these samples due to individual outliers who made a considerable number of errors in peripheral reaching. This study provides the first empirical data from large control and patient groups for a screening procedure that seems to be widely used but rarely explicitly reported. It prepares the ground for its use as a standard tool for describing patients included in single-case or group studies addressing optic ataxia, similar to the use of neglect, extinction, or apraxia screening tools.
Minimally invasive surgical techniques for the treatment of primary hyperparathyroidism (pHPT) compete with the previous standard procedure, bilateral cervical exploration, provided that localized single-gland disease is present. However, coexisting thyroid nodules that require surgery can preclude a minimally invasive approach. The aim of the present study was to analyze prospectively, in our own patient population, whether minimally invasive open parathyroidectomy (MIOP) could be performed safely for the treatment of pHPT in single-gland disease, whether the postoperative results matched those of the conventional approach, and how accurately the intraoperative rapid parathyroid hormone assay (PTH quick test) could predict the patient's biochemical cure or rule out multigland disease.
This thesis deals with the evaluation of music therapy. To this end, scales were developed that are intended to represent music-therapeutic improvisation. To assess these scales, the interrater reliability was calculated. Different styles of playing showed markedly different levels of agreement. These differences were identified, and proposals for further optimization of the scales were developed.