Section 16 Paralogs in Complexes MS-Data
This script uses the protein complex stability and subunit variability computed from the MS data for each dataset to calculate the proportion of stoichiometrically variable and stable subunits that have paralogs. As already observed for the RNASeq data, paralogs are predominantly found in complexes whose stoichiometry varies during neurogenesis. We also examine each individual paralog substitution in terms of relative fold changes.
Specifically, for each dataset considered, the code below runs the script GetParalogs.R, which outputs several plots on the stability and variability of paralogs within protein complexes. It also returns a file Experiment_Condition_PutativeSwitch.csv containing all putative switches for that dataset. The *_PutativeSwitch.csv files are for annotation purposes only and are not processed further in the analysis. They contain comparisons between paralog pairs across the different organisms during neurogenesis and can be regarded as a “paralog paired” version of the subunit_stability files.
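As a rough illustration of the proportion described at the start of this section, the sketch below computes the fraction of subunits with at least one paralog within each stability class. The data frame, its column names (`stability`, `has_paralog`) and all values are invented for illustration; the actual columns of the stability files and the logic of GetParalogs.R may differ.

```r
# Toy data: every value here is invented for illustration only.
subunits <- data.frame(
  subunit     = c("A1", "A2", "B1", "C1", "C2", "D1"),
  stability   = c("variable", "variable", "stable",
                  "variable", "stable", "stable"),
  has_paralog = c(TRUE, TRUE, FALSE, FALSE, TRUE, FALSE)
)

# Proportion of subunits with a paralog, computed separately for the
# "stable" and "variable" classes (mean of a logical = fraction TRUE).
prop_paralog <- tapply(subunits$has_paralog, subunits$stability, mean)
print(prop_paralog)
```

In this toy table two of the three variable subunits have a paralog, against one of the three stable ones, which is the kind of contrast the plots are meant to show.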
#Change Directory
setwd("../ComplexScript/")
#Identifier
identifier <- "SYMBOL"
#Organism experims, and organism must be in the same order.
experims <- c("Djuric","Drerio","Frese","MouseTMT")
organism <- c("hsapiens","drerio","rnorvegicus","mmusculus")
#Output Dir
Dir <- "../out/complex_coexpr/"
#Run here <-----
#Files Dir
Files <- list.files(Dir)
#Dataset information
DataDF <- as.data.frame(cbind(experims,organism))
#Take Subunits Stabilities Files
SubunitsStabilities <- sort(Files[grep("subunits_stability",Files)])
ComplexStabilities <- sort(Files[grep("complex_stability",Files)])
DF <- as.data.frame(cbind(SubunitsStabilities,ComplexStabilities))
#Experiment label is the part of the file name before the first "_"
expLab <- (sapply(DF[,"SubunitsStabilities"],function(x){strsplit(as.character(x),"_")[[1]][1]}))
DF$exp <- expLab
#Look up the organism of each experiment (full column name, no partial matching)
DF$organism <- sapply(DF$exp,function(x){DataDF[DataDF$experims==x,"organism"]})
DF$identifier <- rep(identifier,nrow(DF))
#Prepend the output directory to both file columns
DF[,c(1,2)] <- apply(DF[,c(1,2)],2,function(x){paste(Dir,x,sep = "")})
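The label extraction and organism lookup above depend on two conventions: file names start with the experiment label followed by "_", and `experims` and `organism` are in the same order. The sketch below shows the same mapping with `match()`, which fails loudly when a label is unknown. The file names are hypothetical examples of the assumed naming pattern.

```r
# Hypothetical file names that follow the assumed "<experiment>_..." pattern
files <- c("Frese_subunits_stability.csv", "Djuric_subunits_stability.csv")

experims <- c("Djuric", "Drerio", "Frese", "MouseTMT")
organism <- c("hsapiens", "drerio", "rnorvegicus", "mmusculus")

# Experiment label = text before the first underscore
exp_label <- vapply(strsplit(files, "_", fixed = TRUE), `[`, character(1), 1)

# Position of each label in 'experims'; NA means the label is not listed
idx <- match(exp_label, experims)
if (anyNA(idx)) {
  stop("Unknown experiment label: ",
       paste(exp_label[is.na(idx)], collapse = ", "))
}

# Organisms in the same order as the input files
org <- organism[idx]
print(data.frame(file = files, exp = exp_label, organism = org))
```

Unlike the `sapply()` lookup, this returns a plain character vector and stops with an explicit message instead of silently producing an empty entry for an unlisted experiment.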
In this case the required files are already present in the folder, so the script does not need to be run again.
#Change Directory
setwd("../ComplexScript/")
#Run Script Command Line Args
for (R in c(1:nrow(DF)))
{
  cmd <- paste("/gsc/biosw/src/R-4.0.3/bin/Rscript GetParalogs.R",DF[R,1],DF[R,2],DF[R,4],DF[R,5])
  system(cmd)
}
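The loop above hard-codes the path to Rscript and pastes the arguments into a single string. A more portable variant, sketched below, locates Rscript from the running R installation and passes quoted arguments through `system2()`. The helper `build_args` and the toy row are hypothetical; the argument order (subunits file, complex file, organism, identifier) is taken from the loop above, and the actual call is left commented out.

```r
# Rscript binary of the R installation currently running
rscript <- file.path(R.home("bin"), "Rscript")

# Hypothetical helper: build the argument vector for one row of DF,
# using columns 1, 2, 4 and 5 as in the original loop.
build_args <- function(row) {
  vals <- unname(as.character(unlist(row[c(1, 2, 4, 5)])))
  c("GetParalogs.R", shQuote(vals))  # quote paths that may contain spaces
}

# Toy row with invented file names, laid out like DF
toy <- data.frame(
  SubunitsStabilities = "../out/complex_coexpr/Frese_subunits_stability.csv",
  ComplexStabilities  = "../out/complex_coexpr/Frese_complex_stability.csv",
  exp                 = "Frese",
  organism            = "rnorvegicus",
  identifier          = "SYMBOL",
  stringsAsFactors    = FALSE
)

args <- build_args(toy[1, ])
print(args)

# Uncomment to actually launch the script:
# status <- system2(rscript, args)
```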
- The resulting plots are located in the ComplexScript/Plots directory.
- The resulting datasets are located in the ComplexScript/Out directory.