
Lin28a induces SOX9 and chondrocyte reprogramming via HMGA2 and blunts cartilage loss in mice

Cite this dataset

Hay, Eric (2022). Lin28a induces SOX9 and chondrocyte reprogramming via HMGA2 and blunts cartilage loss in mice [Dataset]. Dryad.


Among tissues undergoing permanent stress, cartilage has low regenerative capacity. Irreversible cartilage lesions characterize osteoarthritis (OA), in which cartilage loss is not followed by tissue repair. We investigated the pathophysiological role of LIN28a, an RNA-binding protein, in adult cartilage. Lin28a was detected only in damaged cartilage of humans and mice. Here, inducible conditional cartilage deletion of Lin28a upregulates Mmp13 expression in sham mice and exacerbates cartilage destruction in OA mice. Lin28a-specific cartilage overexpression protects mice against cartilage breakdown, stimulates chondrocyte proliferation and the expression of Prg4 and Sox9, but reduces the expression of Mmp13. Lin28a overexpression inhibits Let-7b and Let-7c miRNA levels, facilitates the expression of HMGA2, and thereby activates the transcription of Sox9, a factor required for chondrocyte reprogramming. Finally, administration of siRNA against HMGA2 inhibits the cartilage protective effect in Lin28a-overexpressing mice. This study provides insights into a new pathway to promote chondrocyte anabolism in injured cartilage involving the Lin28a–Let7 axis.



RNA-seq analysis was performed in duplicate for the following conditions: primary murine chondrocytes were transduced with empty vector (CMV500 empty vector, Addgene plasmid #33348) for the control condition and with pMSCV-mLin28A (Addgene plasmid #26357) for Lin28a overexpression. Total RNA was extracted and sent to IntegraGen SA (Évry, France). Libraries were prepared with the NEBNext Ultra II Directional RNA Library Prep Kit for Illumina according to the supplier's recommendations. Briefly, the key stages of this protocol are, successively: purification of poly(A)-containing mRNA molecules from 1 µg of total RNA using poly-T oligo-attached magnetic beads (with the Magnetic mRNA Isolation Kit from NEB); fragmentation using divalent cations under elevated temperature to obtain fragments of approximately 300 bp; double-stranded cDNA synthesis; and finally Illumina adapter ligation and cDNA library amplification by PCR for sequencing. Sequencing was then carried out as paired-end 100-base reads on an Illumina NovaSeq. Image analysis and base calling were performed using Illumina Real Time Analysis (3.4.4) with default parameters.

Quantification of gene expression

STAR was used to obtain the number of reads associated with each gene in the Gencode vM24 annotation (restricted to protein-coding genes, antisense and lincRNAs). Raw counts for each sample were imported into R statistical software. The extracted count matrix was normalized for library size and coding length of genes to compute FPKM expression levels.
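The FPKM normalization described above can be illustrated with a minimal Python sketch. It is not the code used to produce this dataset (the original analysis was run in R). It assumes the library size is the sum of counts over the annotated genes, and the gene names, counts, and lengths below are hypothetical.

```python
def fpkm(counts, lengths_bp):
    """Return FPKM per gene for one sample.

    counts:     dict mapping gene -> raw fragment count
    lengths_bp: dict mapping gene -> coding length in base pairs

    Assumption: library size is the sum of counts over the listed genes.
    """
    library_size = sum(counts.values())
    if library_size == 0:
        raise ValueError("library size is zero")
    # FPKM = count * 1e9 / (gene length in bp * library size)
    return {
        gene: count * 1e9 / (lengths_bp[gene] * library_size)
        for gene, count in counts.items()
    }


# Hypothetical toy sample with three genes (illustrative values only).
toy_counts = {"GeneA": 500, "GeneB": 1500, "GeneC": 8000}
toy_lengths = {"GeneA": 1000, "GeneB": 3000, "GeneC": 2000}
toy_fpkm = fpkm(toy_counts, toy_lengths)
```

In this toy sample, GeneA and GeneB receive the same FPKM because GeneB has three times the reads but is also three times as long, which is the effect of the length correction.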

Unsupervised analysis

The Bioconductor edgeR package was used to import raw counts into R statistical software and to compute normalized log2 CPM (counts per million mapped reads) using the TMM (weighted trimmed mean of M-values) normalization procedure. The normalized expression matrix from the 1,000 most variable genes (based on standard deviation) was used to classify the samples according to their gene expression patterns using principal component analysis (PCA), hierarchical clustering and consensus clustering. PCA was performed by the FactoMineR::PCA function with “ncp = 10, scale.unit = FALSE” parameters. Hierarchical clustering was performed by the stats::hclust function (with euclidean distance and ward.D method). Consensus clustering was performed by the ConsensusClusterPlus::ConsensusClusterPlus function to examine the stability of the clusters. We established consensus partitions of the data set in K clusters (for K = 2, 3, . . . , 8), on the basis of 1,000 resampling iterations (80% of genes, 80% of samples) of hierarchical clustering, with euclidean distance and ward.D method. Then, the cumulative distribution functions (CDFs) of the consensus matrices were used to determine the optimal number of clusters (K = 3 for instance), considering both the shape of the functions and the area under the CDF curves. tSNE analysis was performed with the Bioconductor Rtsne package applied to the PCA object (theta=0.0, perplexity=, max_iter=1000).
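To make the gene-selection step concrete, the following Python sketch computes log2 CPM values and ranks genes by standard deviation across samples. It is a simplified illustration under stated assumptions, not the edgeR workflow used for this dataset: it scales by raw library size only (no TMM factors) and adds a pseudocount of 1 before the log transform, which differs from edgeR's handling. Sample names, gene names, and counts are hypothetical.

```python
import math
import statistics


def log2_cpm(counts_by_sample):
    """Convert raw counts to log2(CPM + 1), per sample.

    counts_by_sample: dict mapping sample -> dict mapping gene -> raw count
    Simplification: raw library-size scaling, no TMM normalization factors.
    """
    expr = {}
    for sample, counts in counts_by_sample.items():
        library_size = sum(counts.values())
        expr[sample] = {
            gene: math.log2(count * 1e6 / library_size + 1.0)
            for gene, count in counts.items()
        }
    return expr


def top_variable_genes(expr, n):
    """Return the n genes with the largest standard deviation across samples."""
    samples = list(expr)
    genes = list(expr[samples[0]])
    sd = {g: statistics.stdev([expr[s][g] for s in samples]) for g in genes}
    return sorted(genes, key=lambda g: sd[g], reverse=True)[:n]


# Hypothetical duplicates for two conditions (illustrative counts only).
toy_counts = {
    "ctrl_1": {"GeneA": 100, "GeneB": 10, "GeneC": 890},
    "ctrl_2": {"GeneA": 100, "GeneB": 10, "GeneC": 890},
    "lin28a_1": {"GeneA": 100, "GeneB": 400, "GeneC": 500},
    "lin28a_2": {"GeneA": 100, "GeneB": 400, "GeneC": 500},
}
toy_expr = log2_cpm(toy_counts)
toy_top = top_variable_genes(toy_expr, 2)
```

In the analysis described above, the equivalent selection keeps 1,000 genes, and the resulting matrix is passed to PCA, hierarchical clustering and consensus clustering.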


Institut National de la Santé et de la Recherche Médicale