runINIT Generates a model using the INIT algorithm, based on proteomics and/or transcriptomics and/or metabolomics and/or metabolic tasks model a reference model structure rxnScores a vector of scores for the reactions in the model. Positive scores are reactions to keep and negative scores are reactions to exclude (opt, default all 0.0) presentMets cell array with unique metabolite names that the model should produce (opt, default []) essentialRxns cell array of reactions that are essential and that have to be in the resulting model. This is normally used when fitting a model to task (see fitTasks) (opt, default []) prodWeight a score that determines the value of having net-production of metabolites. This is a way of having a more functional network as it provides a reason for including bad reactions for connectivity reasons. This score is for each metabolite, and the sum of these weights and the scores for the reactions is what is optimized (opt, default 0.5) allowExcretion true if excretion of all metabolites should be allowed. This results in fewer reactions being considered dead-ends, but all reactions in the resulting model may not be able to carry flux. If this is "false" then the equality constraints are taken from model.b. If the input model lacks exchange reactions then this should probably be "true", or a large proportion of the model would be excluded for connectivity reasons (opt, default false) noRevLoops true if reversible reactions should be constrained to only carry flux in one direction. This prevents reversible reactions from being wrongly assigned as connected (the forward and backward reactions can form a loop and therefore appear connected), but it makes the problem significantly more computationally intensive to solve (two more integer constraints per reversible reaction) (opt, default false) params parameter structure as used by getMILPParams (opt, default []) outModel the resulting model structure deletedRxns reactions which were deleted by the algorithm metProduction array that indicates which of the metabolites in presentMets that could be produced -2: metabolite name not found in model -1: metabolite found, but it could not be produced 1: metabolite could be produced fValue objective value (sum of (the negative of) reaction scores for the included reactions and prodWeight*number of produced metabolites) This function is the actual implementation of the algorithm. See getINITModel for a higher-level function for model reconstruction. See PLoS Comput Biol. 2012;8(5):e1002518 for details regarding the implementation. Usage: [outModel deletedRxns metProduction fValue]=runINIT(model,... rxnScores,presentMets,essentialRxns,prodWeight,allowExcretion,... noRevLoops,params) Rasmus Agren, 2013-07-16
0001 function [outModel deletedRxns metProduction fValue]=runINIT(model,rxnScores,presentMets,essentialRxns,prodWeight,allowExcretion,noRevLoops,params) 0002 % runINIT 0003 % Generates a model using the INIT algorithm, based on proteomics and/or 0004 % transcriptomics and/or metabolomics and/or metabolic tasks 0005 % 0006 % model a reference model structure 0007 % rxnScores a vector of scores for the reactions in the model. 0008 % Positive scores are reactions to keep and negative 0009 % scores are reactions to exclude (opt, default all 0.0) 0010 % presentMets cell array with unique metabolite names that the model 0011 % should produce (opt, default []) 0012 % essentialRxns cell array of reactions that are essential and that 0013 % have to be in the resulting model. This is normally 0014 % used when fitting a model to task (see fitTasks) (opt, 0015 % default []) 0016 % prodWeight a score that determines the value of having 0017 % net-production of metabolites. This is a way of having 0018 % a more functional network as it provides a reason for 0019 % including bad reactions for connectivity reasons. This 0020 % score is for each metabolite, and the sum of these weights 0021 % and the scores for the reactions is what is optimized 0022 % (opt, default 0.5) 0023 % allowExcretion true if excretion of all metabolites should be allowed. 0024 % This results in fewer reactions being considered 0025 % dead-ends, but all reactions in the resulting model may 0026 % not be able to carry flux. If this is "false" then the 0027 % equality constraints are taken from model.b. If the 0028 % input model lacks exchange reactions then this should 0029 % probably be "true", or a large proportion of the model 0030 % would be excluded for connectivity reasons 0031 % (opt, default false) 0032 % noRevLoops true if reversible reactions should be constrained to 0033 % only carry flux in one direction. This prevents 0034 % reversible reactions from being wrongly assigned as 0035 % connected (the forward and backward reactions can form a 0036 % loop and therefore appear connected), but it makes the 0037 % problem significantly more computationally intensive to 0038 % solve (two more integer constraints per reversible reaction) 0039 % (opt, default false) 0040 % params parameter structure as used by getMILPParams (opt, 0041 % default []) 0042 % 0043 % outModel the resulting model structure 0044 % deletedRxns reactions which were deleted by the algorithm 0045 % metProduction array that indicates which of the 0046 % metabolites in presentMets that could be 0047 % produced 0048 % -2: metabolite name not found in model 0049 % -1: metabolite found, but it could not be produced 0050 % 1: metabolite could be produced 0051 % fValue objective value (sum of (the negative of) 0052 % reaction scores for the included reactions and 0053 % prodWeight*number of produced metabolites) 0054 % 0055 % This function is the actual implementation of the algorithm. See 0056 % getINITModel for a higher-level function for model reconstruction. See 0057 % PLoS Comput Biol. 2012;8(5):e1002518 for details regarding the 0058 % implementation. 0059 % 0060 % Usage: [outModel deletedRxns metProduction fValue]=runINIT(model,... 0061 % rxnScores,presentMets,essentialRxns,prodWeight,allowExcretion,... 0062 % noRevLoops,params) 0063 % 0064 % Rasmus Agren, 2013-07-16 0065 % 0066 0067 if nargin<2 0068 rxnScores=zeros(numel(model.rxns),1); 0069 end 0070 if isempty(rxnScores) 0071 rxnScores=zeros(numel(model.rxns),1); 0072 end 0073 if nargin<3 0074 presentMets={}; 0075 end 0076 if isempty(presentMets) 0077 presentMets={}; 0078 end 0079 presentMets=presentMets(:); 0080 if nargin<4 0081 essentialRxns={}; 0082 end 0083 if isempty(essentialRxns) 0084 essentialRxns={}; 0085 end 0086 essentialRxns=essentialRxns(:); 0087 if nargin<5 0088 prodWeight=0.5; 0089 end 0090 if isempty(prodWeight) 0091 prodWeight=0.5; 0092 end 0093 if nargin<6 0094 allowExcretion=false; 0095 end 0096 if nargin<7 0097 noRevLoops=false; 0098 end 0099 if nargin<8 0100 params=[]; 0101 end 0102 echo=0; 0103 if isfield(params,'printReport') 0104 if params.printReport==true 0105 echo=3; 0106 end 0107 end 0108 0109 if numel(presentMets)~=numel(unique(presentMets)) 0110 throw(MException('','Duplicate metabolite names in presentMets')); 0111 end 0112 0113 %Default is that the metabolites cannot be produced 0114 if ~isempty(presentMets) 0115 metProduction=ones(numel(presentMets),1)*-2; 0116 presentMets=upper(presentMets); 0117 pmIndexes=find(ismember(presentMets,upper(model.metNames))); 0118 metProduction(pmIndexes)=-1; %Then set that they are at least found 0119 else 0120 metProduction=[]; 0121 pmIndexes=[]; 0122 end 0123 0124 %The model should be in the reversible format and all relevant exchange 0125 %reactions should be open 0126 if isfield(model,'unconstrained') 0127 fprintf('WARNING: Exchange metabolites are still present in the model. Use simplifyModel if this is not intended.\n'); 0128 end 0129 0130 %The irreversible reactions that are essential must have a flux and are therefore not 0131 %optimized for using MILP, which reduces the problem size. However, reversible 0132 %reactions must have a flux in one direction, so they have to stay in 0133 %the problem. The essentiality constraint on reversible reactions is 0134 %implemented in the same manner as for reversible reactions when 0135 %noRevLoops==true, but with the additional constraint that C ub=-1. This 0136 %forces one of the directions to be active. 0137 revRxns=find(model.rev~=0); 0138 essentialReversible=find(ismember(model.rxns(revRxns),essentialRxns)); 0139 essentialRxns=intersect(essentialRxns,model.rxns(model.rev==0)); 0140 0141 %Convert the model to irreversible 0142 irrevModel=convertToIrrev(model); 0143 rxnScores=[rxnScores;rxnScores(model.rev==1)]; 0144 %These are used if noRevLoops is true 0145 if noRevLoops==true 0146 forwardIndexes=find(model.rev~=0); 0147 backwardIndexes=(numel(model.rxns)+1:numel(irrevModel.rxns))'; 0148 else 0149 %Then they should only be used for essential reversible reactions 0150 forwardIndexes=revRxns(essentialReversible); 0151 backwardIndexes=essentialReversible+numel(model.rxns); 0152 end 0153 0154 %Get the indexes of the essential reactions and remove them from the 0155 %scoring vector 0156 essentialIndex=find(ismember(irrevModel.rxns,essentialRxns)); 0157 rxnScores(essentialIndex)=[]; 0158 0159 %Go through each of the presentMets (if they exist) and modify the S matrix 0160 %so that each reaction which produces any of them also produces a 0161 %corresponding fake metabolite and the opposite in the reverse direction. 0162 0163 %This is to deal with the fact that there is no compartment info regarding 0164 %the presentMets. This modifies the irrevModel structure, but that is fine 0165 %since it's the model structure that is returned. 0166 if any(pmIndexes) 0167 irrevModel.metNames=upper(irrevModel.metNames); 0168 metsToAdd.mets=strcat({'FAKEFORPM'},num2str(pmIndexes)); 0169 metsToAdd.metNames=metsToAdd.mets; 0170 metsToAdd.compartments=irrevModel.comps{1}; 0171 0172 %There is no constraints on the metabolites yet, since maybe not all of 0173 %them could be produced 0174 irrevModel=addMets(irrevModel,metsToAdd); 0175 end 0176 0177 %Modify the matrix 0178 for i=1:numel(pmIndexes) 0179 %Get the matching mets 0180 I=ismember(irrevModel.metNames,presentMets(pmIndexes(i))); 0181 0182 %Find the reactions where any of them are used. 0183 [crap K L]=find(irrevModel.S(I,:)); 0184 0185 %This ugly loop is to avoid problems if a metabolite occurs several 0186 %times in one reaction 0187 KK=unique(K); 0188 LL=zeros(numel(KK),1); 0189 for j=1:numel(KK) 0190 LL(j)=sum(L(K==KK(j))); 0191 end 0192 irrevModel.S(numel(irrevModel.mets)-numel(pmIndexes)+i,KK)=LL; 0193 end 0194 0195 %Some nice to have numbers 0196 nMets=numel(irrevModel.mets); 0197 nRxns=numel(irrevModel.rxns); 0198 nEssential=numel(essentialIndex); 0199 nNonEssential=nRxns-nEssential; 0200 nonEssentialIndex=setdiff(1:nRxns,essentialIndex); 0201 S=irrevModel.S; 0202 0203 %Add so that each non-essential reaction produces one unit of a fake metabolite 0204 temp=sparse(1:nRxns,1:nRxns,1); 0205 temp(essentialIndex,:)=[]; 0206 S=[S;temp]; 0207 0208 %Add another set of reactions (will be binary) which also produce these 0209 %fake metabolites, but with a stoichiometry of 1000 0210 temp=sparse(1:nNonEssential,1:nNonEssential,1000); 0211 temp=[sparse(nMets,nNonEssential);temp]; 0212 S=[S temp]; 0213 0214 %Add reactions for net-production of (real) metabolites 0215 if prodWeight~=0 0216 temp=[speye(nMets-numel(pmIndexes))*-1;sparse(nNonEssential+numel(pmIndexes),nMets-numel(pmIndexes))]; 0217 S=[S temp]; 0218 %To keep the number of reactions added like this 0219 nNetProd=nMets-numel(pmIndexes); 0220 else 0221 nNetProd=0; 0222 end 0223 0224 %Add constraints so that reversible reactions can only be used in one 0225 %direction. This is done by adding the fake metabolites A, B, C for each 0226 %reversible reaction in the following manner 0227 % forward: A + .. => ... 0228 % backwards: B + ... => ... 0229 % int1: C => 1000 A 0230 % int2: C => 1000 B 0231 % A ub=999.9 0232 % B ub=999.9 0233 % C lb=-1 0234 % int1 and int2 are binary 0235 if any(forwardIndexes) 0236 nRevBounds=numel(forwardIndexes); 0237 0238 %Add the A metabolites for the forward reactions and the B 0239 %metabolites for the reverse reactions 0240 I=speye(numel(irrevModel.rxns))*-1; 0241 temp=[I(forwardIndexes,:);I(backwardIndexes,:)]; 0242 0243 %Padding 0244 temp=[temp sparse(size(temp,1),size(S,2)-numel(irrevModel.rxns))]; 0245 0246 %Add the int1 & int2 reactions that produce A and B 0247 temp=[temp speye(nRevBounds*2)*1000]; 0248 0249 %And add that they also consume C 0250 temp=[temp;[sparse(nRevBounds,size(S,2)) speye(nRevBounds)*-1 speye(nRevBounds)*-1]]; 0251 0252 %Add the new reactions and metabolites 0253 S=[S sparse(size(S,1),nRevBounds*2)]; 0254 S=[S;temp]; 0255 else 0256 nRevBounds=0; 0257 end 0258 0259 %Add so that the essential reactions must have a small flux and that the 0260 %binary ones (and net-production reactions) may have zero flux. The 0261 %integer reactions for reversible reactions have [0 1] 0262 prob.blx=[irrevModel.lb;zeros(nNonEssential+nNetProd+nRevBounds*2,1)]; 0263 prob.blx(essentialIndex)=max(0.1,prob.blx(essentialIndex)); 0264 0265 %Add so that the binary ones and net-production reactions can have at the most flux 1.0 0266 prob.bux=[irrevModel.ub;ones(nNonEssential+nNetProd+nRevBounds*2,1)]; 0267 0268 %Add that the fake metabolites must be produced in a small amount and that 0269 %the A and B metabolites for reversible reactions can be [0 999.9] and C 0270 %metabolites [-1 0] 0271 prob.blc=[irrevModel.b(:,1);ones(nNonEssential,1);zeros(nRevBounds*2,1);ones(nRevBounds,1)*-1]; 0272 0273 %Add that normal metabolites can be freely excreted if allowExcretion==true, 0274 %and that the fake ones can be excreted 1000 units at most. C metabolites 0275 %for essential reversible reactions should have an upper bound of -1. 0276 %If noRevLoops is false, then add this constraint for all the reactions instead. 0277 if noRevLoops==true 0278 revUB=zeros(nRevBounds,1); 0279 revUB(essentialReversible)=-1; 0280 else 0281 revUB=ones(nRevBounds,1)*-1; 0282 end 0283 if allowExcretion==true 0284 metUB=inf(nMets,1); 0285 else 0286 metUB=irrevModel.b(:,min(size(irrevModel.b,2),2)); 0287 end 0288 prob.buc=[metUB;ones(nNonEssential,1)*1000;ones(nRevBounds*2,1)*999.9;revUB]; 0289 0290 %Add objective coefficients for the binary reactions. The negative is used 0291 %since we're minimizing. The negative is taken for the prodWeight 0292 %as well, in order to be consistent with the syntax that positive scores 0293 %are good 0294 prob.c=[zeros(nRxns,1);rxnScores;ones(nNetProd,1)*prodWeight*-1;zeros(nRevBounds*2,1)]; 0295 prob.a=S; 0296 0297 %We still don't know which of the presentMets that can be produced. Go 0298 %through them, force production, and see if the problem can be solved 0299 params.MSK_IPAR_OPTIMIZER='MSK_OPTIMIZER_FREE_SIMPLEX'; 0300 for i=1:numel(pmIndexes) 0301 prob.blc(numel(irrevModel.mets)-numel(pmIndexes)+i)=1; 0302 [crap,res] = mosekopt('minimize echo(0)', prob,getMILPParams(params)); 0303 isFeasible=checkSolution(res); 0304 if ~isFeasible 0305 %Reset the constraint again 0306 prob.blc(numel(irrevModel.mets)-numel(pmIndexes)+i)=0; 0307 else 0308 %Metabolite produced 0309 metProduction(pmIndexes(i))=1; 0310 end 0311 end 0312 0313 %Add that the binary reactions may only take integer values. 0314 allInt=[(nRxns+1):(nRxns+nNonEssential) size(S,2)-nRevBounds*2+1:size(S,2)]; 0315 prob.ints.sub=allInt; 0316 [crap,res] = mosekopt(['minimize echo(' num2str(echo) ')'], prob,getMILPParams(params)); 0317 0318 %I don't think that this problem can be infeasible, so this is mainly a way 0319 %of checking the licence stuff 0320 if ~checkSolution(res) 0321 throw(MException('','The problem is infeasible')); 0322 end 0323 0324 fValue=res.sol.int.pobjval; 0325 0326 %Get all reactions used in the irreversible model 0327 usedRxns=(nonEssentialIndex(res.sol.int.xx(nRxns+1:nRxns+nNonEssential)<0.1))'; 0328 0329 %Map to reversible model IDs 0330 usedRxns=[usedRxns(usedRxns<=numel(model.rxns));revRxns(usedRxns(usedRxns>numel(model.rxns))-numel(model.rxns))]; 0331 0332 %Then get the ones that are not used in either direction or is essential 0333 I=true(numel(model.rxns),1); 0334 I(usedRxns)=false; 0335 I(essentialIndex)=false; 0336 deletedRxns=model.rxns(I); 0337 outModel=removeRxns(model,I,true,true); 0338 end