Impact of the surgical approach to thymectomy upon complete stable remission rates in myasthenia gravis: a meta-analysis

Cite this dataset

Solis Pazmino, Andrea Paola et al. (2021). Impact of the surgical approach to thymectomy upon complete stable remission rates in myasthenia gravis: a meta-analysis [Dataset]. Dryad.


Objectives: To determine whether any of the available operative techniques confer variable chances for complete stable remission (CSR) in myasthenia gravis (MG), we performed a meta-analysis of all comparative studies of surgical approaches.

Methods: Meta-analysis of all studies providing comparative data on thymectomy approaches, with CSR reported and minimum 3 years mean follow-up.

Results: Twelve cohort studies and one randomized clinical trial, containing 1598 patients, met entry criteria. At 3 years, CSR from MG was similar following VATS extended vs. both basic (RR 1.00, p=1.00, 95% CI: 0.39-2.58) and extended (RR 0.96, p=0.74, CI: 0.72-1.27) transsternal approaches. CSR at 3 years was also similar following extended transsternal vs. combined transcervical-subxiphoid (RR 1.08, p=0.62, CI: 0.8-1.44) approaches. VATS extended approaches remained statistically equivalent to extended transsternal approaches through 9 years of follow-up (RR 1.51, p=0.05, CI: 0.99-2.30). The only significant difference in CSR rate between a traditionally open and a minimally invasive approach was seen at 10 years when comparing the now-abandoned basic (non-sternum-lifting) transcervical approach and the extended transsternal approach (RR 0.4, p=0.01, CI: 0.2-0.8).

Conclusions: A significant difference in the rate of CSR among various surgical approaches for thymectomy in MG was identified only at long-term follow-up, and only between what might be considered the most aggressive approach (extended transsternal thymectomy) and the least aggressive approach (basic transcervical thymectomy).  Extended minimally invasive approaches appear to have equivalent CSR rates to extended transsternal approaches and are therefore appropriate in the hands of experienced surgeons.


The protocol for this study is registered at PROSPERO (CRD42020166827)21. The manuscript is written using the Preferred Reporting Items for Systematic Reviews and Meta-analysis (PRISMA) guidelines22.

Eligibility criteria

We included studies comparing at least two thymectomy techniques in patients older than 18 years with MG. Because the data are unclear as to whether the presence of a thymoma impacts MG outcomes following thymectomy23, we included studies of patients both with and without thymoma. The primary outcome measure for our study was Complete Stable Remission (CSR). Our definition of CSR included patients who were asymptomatic with no medications for at least 6 months, or asymptomatic and taking only single-drug immunosuppression for at least 6 months (slightly broader than the strict MGFA definition, in order to include more studies in the analysis). Because remissions following thymectomy in all studies accumulate progressively with time, and the steepest upward slope of this curve is in the first several years, we only included studies with a minimum of 3 years of follow-up. We classified the method of thymectomy according to the published, MGFA-modified classification for thymectomy approaches24.

Data sources and searches

A comprehensive search of studies, in any language, from several databases, beginning on May 6th, 2019 was conducted. The databases included Ovid MEDLINE(R) and Epub Ahead of Print, In-Process & Other Non-Indexed Citations, and Daily, Ovid EMBASE, Ovid Cochrane Central Register of Controlled Trials, Ovid Cochrane Database of Systematic Reviews, and Scopus. The search strategy was designed and conducted by an experienced librarian with input from the study's principal investigator. Controlled vocabulary supplemented with keywords was used to search for thymectomy performed in adults with myasthenia gravis25–27. Conference abstracts, literature reviews, case reports and editorials were excluded. The list of all search terms and combinations is available in tables e-1 and e-2.

Study selection

Search records were uploaded into a systematic review software program (DistillerSR, Ottawa, ON, Canada)28. Screening was performed by four reviewers (P.S-P., E.L-N., W.T., I.B.) separately and in duplicate, using standardized instructions. Prior to the abstract screening phase, a pilot was performed with 20 articles to assess understanding and accuracy of the eligibility criteria. Articles included by at least one reviewer during abstract screening advanced to the next phase: screening of full-text reports. A pilot was also performed before full-text screening. At this level, disagreements were resolved through consensus by three reviewers (P.S-P., E.L-N., O.J.P.). Screening agreement was assessed using Cohen’s kappa (k=0.42).
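As an illustration of the agreement statistic reported above, the following is a minimal sketch of how Cohen's kappa can be computed from two reviewers' include/exclude decisions. The function name and the example decisions are hypothetical and are not taken from the study data.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters' categorical decisions on the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must rate the same non-empty set of items")
    n = len(rater_a)

    # Observed agreement: share of items on which the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: product of each rater's marginal proportions per category.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    categories = counts_a.keys() | counts_b.keys()
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)

    if expected == 1.0:
        return 1.0  # both raters used a single identical category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical screening decisions for four abstracts.
reviewer_1 = ["include", "include", "exclude", "exclude"]
reviewer_2 = ["include", "exclude", "exclude", "exclude"]
kappa = cohens_kappa(reviewer_1, reviewer_2)
```

Kappa corrects raw agreement for the agreement expected by chance, which is why it can be modest even when reviewers agree on most records.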

Data collection and variables

Two reviewers (P.S-P., E.L-N.), working independently and in duplicate, extracted study data into a standardized form, and a third reviewer (O.J.P.) randomly audited 30% of these data for accuracy and completeness. The variables of interest were: 1) general characteristics (first author, date of publication, country, study design, data collection period); 2) setting (single center, multicenter); 3) patient characteristics before surgery (age at diagnosis of MG, sex, signs and symptoms, antibody titers, treatments before surgery); 4) thymectomy techniques; 5) patient characteristics after surgery (histology, complications, operative time, and hospital length of stay [LOS]); 6) outcomes of interest (CSR).

Data management

Extracted data was later classified in several ways. Clinical MG findings were grouped according to the MGFA classification29. Table e-3 describes the five classes and corresponding subclasses. Surgical techniques were classified according to the primary approach (e.g., transcervical, videoscopic, transsternal, or any combination),30 and further subdivisions were made according to the MGFA thymectomy classification24, as follows:  T-1: transcervical thymectomy (basic, extended, extended with partial sternal split, extended with videoscopic technology); T-2: videoscopic thymectomy (classic unilateral VATS, bilateral VATS with neck dissection, unilateral videoscopic with robotic technology, bilateral videoscopic with robotic technology); T-3: transsternal thymectomy (basic, extended); T-4: transcervical and transsternal thymectomy; T-5: infra-sternal thymectomy (combined transcervical/subxiphoid, videoscopic subxiphoid, videoscopic with robotic technology subxiphoid, infrasternal mediastinoscopy) (Table e-4).

The most common groups of surgeries in the publications that met our selection criteria were the transsternal and the various minimally invasive approaches to thymectomy (video-assisted extended thymectomy, robotic approaches, basic transcervical, subxiphoid, combined transcervical/subxiphoid). Classically, and in the MGFA classification, “extended” thymectomy refers to removing not only the intracapsular thymus, but also all of the anterior mediastinal fat between the phrenic nerves and from the thoracic inlet to the diaphragm31. If authors used the term “extended” in their procedure descriptor, we classified it as such. For all approaches, if the authors did not use the term “extended” but described an operation that matches the above classical description of an “extended” thymectomy, it was classified as “extended”. All other operations were classified as “basic” thymectomies. There is no standard accepted histological classification for excised, non-neoplastic thymic tissue29. We therefore described histology using the most common histologic types reported by the studies (hyperplastic, normal, involuted).

Author contact

We emailed corresponding authors of included studies asking them to share data about the association of different treatment approaches with CSR when this was unclear in the manuscript, or to otherwise clarify reported information. If authors failed to respond in a 2-week period, we contacted them with a second e-mail. If authors did not reply after 1 additional week, a final e-mail was sent. Those who still did not respond in another 1-week period were contacted by telephone.  Four authors replied. Two helped to clarify reported data, and two shared hazard ratio information32,33. The data shared by one of the authors did not match what was reported in the publication and that publication was therefore excluded from this study34.

Risk of bias in individual studies

Risk of bias was assessed by two reviewers (P.S-P., E.L-N.). Disagreements were resolved by including a third reviewer (O.J.P.). To assess the risk of bias of cohort studies, we used the CLARITY tool35. The domains for cohort studies were: 1) whether the exposed and non-exposed cohorts were selected from the same population; 2) confidence in the assessment of exposure; 3) whether the outcome of interest was absent at the start of the study; 4) whether the study matched exposed and unexposed for all variables that are associated with the outcome of interest or the statistical analyses adjusted for the specified prognostic variables; 5) confidence in the assessment regarding the presence or the absence of prognostic factors; 6) confidence in the assessment of the outcome; 7) whether the follow-up of the cohort was adequate; and 8) whether co-interventions between groups were similar. In this tool, there were four possible responses for each domain: “definitively yes”, “probably yes”, “probably no”, and “definitively no”. To ease interpretation, we mapped these responses to low, unclear, or high risk of bias: “definitively yes” was interpreted as low, “probably yes” and “probably no” as unclear, and “definitively no” as high risk of bias. The risk of bias of clinical trials was assessed using the Cochrane Collaboration risk assessment tool, which included the following domains: 1) randomization sequence generation; 2) allocation concealment; 3) blinding of participants and personnel; 4) blinding of outcome assessment; 5) incomplete outcome data; 6) selective reporting; and 7) other bias. Each of these domains had three possible responses: low, unclear, and high risk of bias36.

Studies with at least one domain considered as “high risk of bias” were judged to have a high overall risk of bias; studies with at least two domains at “unclear risk of bias” and without domains assessed as “high risk of bias” were considered to be at overall unclear risk of bias; and those studies with domains classified as “low risk of bias” without any “unclear” or “high risk of bias” domain were considered at overall low risk of bias.
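The response mapping and the overall-judgment rules above can be sketched as a small function. This is an illustrative encoding only; the names are hypothetical, and the case of exactly one “unclear” domain with no “high” domain is not defined in the text, so the sketch reports it as unspecified rather than guessing.

```python
# Mapping of per-domain responses to risk-of-bias levels, as described in the text.
RESPONSE_TO_RISK = {
    "definitively yes": "low",
    "probably yes": "unclear",
    "probably no": "unclear",
    "definitively no": "high",
}


def to_risk(response):
    """Translate one domain response into low / unclear / high."""
    return RESPONSE_TO_RISK[response.strip().lower()]


def overall_risk(domain_risks):
    """Combine per-domain risk levels into an overall study-level judgment."""
    if "high" in domain_risks:
        return "high"
    if domain_risks.count("unclear") >= 2:
        return "unclear"
    if all(risk == "low" for risk in domain_risks):
        return "low"
    # Exactly one "unclear" domain and no "high" domain: the text gives no rule.
    return "unspecified"


# Hypothetical cohort study rated on the eight domains.
responses = ["definitively yes"] * 6 + ["probably yes", "probably no"]
judgment = overall_risk([to_risk(r) for r in responses])
```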

Certainty in the body of evidence

The quality or certainty in the evidence was assessed with the “Grading of Recommendations Assessment, Development and Evaluation” (GRADE) approach37. This assessment reflects the level of confidence that the effect sizes or estimates from this systematic review are correct37.

We evaluated each treatment-comparison-outcome triad (e.g. extended transsternal thymectomy - basic transsternal thymectomy - complete stable remission). To ease interpretation, only the longest follow-up of each triad was evaluated and reported. Two reviewers (P.S-P., E.L-N), working individually, assessed the quality of evidence, and consensus was used to resolve disagreements by involving a third reviewer (O.J.P).

Overall, the quality of the evidence of each treatment-comparison-outcome triad can be graded as very low, low, moderate, and high. To assign these, we began by rating randomized trials as high quality of evidence and observational studies as low quality of evidence. Then, based on different factors, we either downgraded (risk of bias, inconsistency, indirectness, imprecision, and publication bias) or upgraded (large magnitude of effect, plausible confounding, and dose-response gradient) the initial rating. A detailed description of this methodology is found online37.
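The rating logic described above can be illustrated with a short sketch. It assumes each downgrade or upgrade moves the rating by whole levels and that the result is clamped to the four-level scale; the function and parameter names are hypothetical, and the number of levels to move for any given factor remains a reviewer judgment that the code does not model.

```python
# Four GRADE levels, ordered from lowest to highest certainty.
LEVELS = ["very low", "low", "moderate", "high"]

# Starting level by study design, as stated in the text.
START_INDEX = {"randomized": 3, "observational": 1}


def grade_rating(design, downgrades=0, upgrades=0):
    """Return the final certainty level after applying level changes.

    downgrades: total levels removed (risk of bias, inconsistency,
                indirectness, imprecision, publication bias).
    upgrades:   total levels added (large effect, plausible confounding,
                dose-response gradient).
    """
    index = START_INDEX[design] - downgrades + upgrades
    index = max(0, min(len(LEVELS) - 1, index))  # clamp to the scale
    return LEVELS[index]


# Hypothetical triad supported only by cohort studies, downgraded once.
rating = grade_rating("observational", downgrades=1)
```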

Statistical analyses

We had hoped to analyze time-to-event outcomes; however, none of the included studies reported the necessary data. Although we contacted the authors asking them to share this information, the low response rate rendered us unable to perform a meta-analysis of time-to-event data. As a second approach, we extracted data and analyzed it as relative risk (RR).

We calculated the RR and its 95% confidence interval (CI) for each treatment-comparison-outcome triad using a random-effects model with the restricted maximum-likelihood estimator. To observe how the effect changed over time, we calculated these effect sizes for different follow-up times.
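To make the calculation concrete, the sketch below computes a log relative risk per study and pools studies under a random-effects model. Note that it uses the closed-form DerSimonian-Laird estimator of between-study variance as a simpler stand-in, not the restricted maximum-likelihood estimator used in the study, so pooled values would not necessarily match the reported results. All function names and the example counts are hypothetical, and studies with zero events in an arm are not handled.

```python
import math


def log_rr(events_a, total_a, events_b, total_b):
    """Return (log RR, variance of log RR) for one two-arm study.

    Assumes at least one event in each arm (no continuity correction).
    """
    rr = (events_a / total_a) / (events_b / total_b)
    variance = 1 / events_a - 1 / total_a + 1 / events_b - 1 / total_b
    return math.log(rr), variance


def pool_random_effects(effects, variances):
    """Pool log effects with DerSimonian-Laird random effects.

    Returns (pooled log effect, standard error, tau squared).
    """
    weights = [1 / v for v in variances]
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    c = sum(weights) - sum(w ** 2 for w in weights) / sum(weights)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights add the between-study variance to each study.
    re_weights = [1 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)
    se = math.sqrt(1 / sum(re_weights))
    return pooled, se, tau2


def rr_with_ci(log_effect, se, z=1.96):
    """Back-transform a log RR and its 95% CI to the RR scale."""
    return (
        math.exp(log_effect),
        math.exp(log_effect - z * se),
        math.exp(log_effect + z * se),
    )


# Hypothetical counts: (CSR events, patients) per arm for two studies.
studies = [(10, 100, 10, 100), (10, 100, 10, 100)]
effects, variances = zip(*(log_rr(*s) for s in studies))
pooled, se, tau2 = pool_random_effects(effects, variances)
rr, ci_low, ci_high = rr_with_ci(pooled, se)
```

Working on the log scale keeps the sampling distribution of the RR approximately symmetric, which is why the interval is computed before back-transforming.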

Heterogeneity across studies was assessed with the I² statistic and visually38. We considered that I² < 25% reflected low inconsistency and I² > 75% reflected high inconsistency. Although we planned to perform subgroup analyses by age, sex, MGFA score, Osserman score, and anti-acetylcholine-receptor antibodies, not enough data were available. To perform this analysis and create the forest plots, we used RStudio, an integrated development environment for the statistical program R39.
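For reference, the inconsistency statistic can be derived from Cochran's Q as sketched below. The function names and the example values are hypothetical, and the "intermediate" label for values between the two stated thresholds is an assumption, since the text defines only the low and high bands.

```python
def i_squared(effects, variances):
    """I-squared (percent) from study effects and their variances via Cochran's Q."""
    weights = [1 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    if df <= 0 or q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100


def inconsistency_label(i2):
    """Apply the thresholds from the text; the middle band is an assumed label."""
    if i2 < 25:
        return "low"
    if i2 > 75:
        return "high"
    return "intermediate"


# Hypothetical log RRs and variances for two strongly discordant studies.
i2 = i_squared([0.0, 1.0], [0.01, 0.01])
label = inconsistency_label(i2)
```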

To ease interpretation and comparability of patient characteristics before and after surgery, we converted medians to means and ranges or interquartile ranges to standard deviations.
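The conversion formulas used are not stated here. As a rough illustration only, the sketch below applies widely used approximations that assume roughly normally distributed data: the mean estimated from the median and range, the standard deviation as the range divided by 4, and the standard deviation as the interquartile range divided by 1.35. These choices and all function names are assumptions and may differ from what was actually applied.

```python
def mean_from_range(minimum, median, maximum):
    """Approximate the mean from the median and range (assumed formula)."""
    return (minimum + 2 * median + maximum) / 4


def mean_from_quartiles(q1, median, q3):
    """Approximate the mean from the median and quartiles (assumed formula)."""
    return (q1 + median + q3) / 3


def sd_from_range(minimum, maximum):
    """Approximate the SD as range / 4 (rule of thumb for moderate samples)."""
    return (maximum - minimum) / 4


def sd_from_iqr(q1, q3):
    """Approximate the SD as IQR / 1.35 (exact only for a normal distribution)."""
    return (q3 - q1) / 1.35


# Hypothetical summary: operative time reported as median 20 (range 10-30).
approx_mean = mean_from_range(10, 20, 30)
approx_sd = sd_from_range(10, 30)
```

These approximations degrade for skewed data or very small samples, so converted values are best treated as descriptive rather than exact.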