TY - JOUR
A1 - Strahl, André
A1 - Gerlich, Christian
A1 - Alpers, Georg W.
A1 - Gehrke, Jörg
A1 - Müller-Garnn, Annette
A1 - Vogel, Heiner
T1 - An instrument for quality assurance in work capacity evaluation: development, evaluation, and inter-rater reliability
JF - BMC Health Services Research
N2 - Background: Employees covered by pension insurance who are incapable of working due to ill health are entitled to a disability pension. To assess whether an individual meets the medical requirements to be considered disabled, a work capacity evaluation is conducted. However, there are no official guidelines on how to perform external quality assurance for this evaluation process. Furthermore, the quality of medical reports in the field of insurance medicine can vary substantially, and systematic evaluations are scarce. Reliability studies using peer review have repeatedly shown an insufficient ability to distinguish between high, moderate, and low quality. Considering recommendations from the literature, we developed an instrument to examine the quality of medical experts’ reports. Methods: The peer review manual developed contains six quality domains (formal structure, clarity, transparency, completeness, medical-scientific principles, and efficiency) comprising 22 items. In addition, a superordinate criterion (survey confirmability) ranks the overall quality and usefulness of a report. This criterion evaluates problems of inner logic and reasoning. Development of the manual was assisted by experienced physicians in a pre-test. We examined the observable variance in peer judgements and reliability as the most important outcome criteria. To evaluate inter-rater reliability, 20 anonymous experts’ reports detailing the work capacity evaluation were reviewed by 19 trained raters (peers). Percentage agreement and Kendall’s W, a reliability measure of concordance between two or more peers, were calculated. A total of 325 reviews were conducted. Results: Agreement of peer judgements with respect to the superordinate criterion ranged from 29.2 to 87.5%. Kendall’s W for the quality domain items varied greatly, ranging from 0.09 to 0.88. For the superordinate criterion, Kendall’s W was 0.39, which indicates fair agreement. The percentage agreement results revealed systematic peer preferences for certain deficit scale categories. Conclusion: The superordinate criterion was not sufficiently reliable. However, in comparison to other reliability studies, this criterion showed an equivalent reliability value. This report aims to encourage further efforts to improve evaluation instruments. To reduce disagreement between peer judgements, we propose revising the peer review instrument and developing and implementing a standardized rater training to improve reliability.
KW - work capacity evaluation
KW - insurance medicine
KW - quality assurance
KW - peer review
KW - reliability
Y1 - 2019
U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-200289
VL - 19
ER -
TY - JOUR
A1 - Strahl, André
A1 - Gerlich, Christian
A1 - Alpers, Georg W.
A1 - Ehrmann, Katja
A1 - Gehrke, Jörg
A1 - Müller-Garnn, Annette
A1 - Vogel, Heiner
T1 - Development and evaluation of a standardized peer-training in the context of peer review for quality assurance in work capacity evaluation
JF - BMC Medical Education
N2 - Background: The German quality assurance programme for evaluating work capacity is based on a peer review that evaluates the quality of medical experts’ reports. Low reliability is thought to be due to systematic differences among peers. To address this, we developed a curriculum for a standardized peer-training (SPT). This study investigates whether the SPT increases the inter-rater reliability of social medicine physicians participating in a cross-institutional peer review. Methods: Forty physicians from 16 regional German Pension Insurances participated in the SPT. The three-day training course consists of nine educational objectives recorded in a training manual. The SPT is split into a basic module providing basic information about the peer review and an advanced module in which small groups of up to 12 peers practise peer review using medical reports. Feasibility was tested by assessing the selection, comprehensibility, and subjective usefulness of the content delivered, the trainers’ delivery, and the design of the training materials. The effectiveness of the SPT was determined by evaluating peer concordance using three anonymised medical reports assessed by each peer. Percentage agreement and Fleiss’ kappa (κ\(_m\)) were calculated. Concordance was compared with review results from a previous unstructured, non-standardized peer-training programme (control condition) performed by 19 peers from 12 German Pension Insurance departments. The control condition focused exclusively on the application of the peer review in small groups; no specific training materials, methods, or trainer instructions were used. Results: The peer-training was shown to be feasible. The level of subjective confidence in handling the peer review instrument varied between 70 and 90%. Average percentage agreement for the main outcome criterion was 60.2%, corresponding to a κ\(_m\) of 0.39. By comparison, the average percentage agreement was 40.2% and the κ\(_m\) was 0.12 for the control condition. Conclusion: Concordance on the main outcome criterion was higher for SPT than for the control condition, although the difference was not significant (p = 0.2). Fleiss’ kappa showed that peer concordance under SPT was higher than expected by chance. Nevertheless, a value of 0.39 for the main criterion indicates only fair inter-rater reliability, considerably lower than the conventional standard of 0.7 for adequate reliability.
KW - inter-rater reliability
KW - peer review
KW - quality assurance
KW - training curriculum
KW - work capacity evaluation
Y1 - 2018
U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-175738
VL - 18
IS - 135
ER -