runINIT Generates a model using the INIT algorithm, based on proteomics and/or transcriptomics and/or metabolomics and/or metabolic tasks model a reference model structure rxnScores a vector of scores for the reactions in the model. Positive scores are reactions to keep and negative scores are reactions to exclude (opt, default all 0.0) presentMets cell array with unique metabolite names that the model should produce (opt, default []) essentialRxns cell array of reactions that are essential and that have to be in the resulting model. This is normally used when fitting a model to task (see fitTasks) (opt, default []) prodWeight a score that determines the value of having net-production of metabolites. This is a way of having a more functional network as it provides a reason for including bad reactions for connectivity reasons. This score is for each metabolite, and the sum of these weights and the scores for the reactions is what is optimized (opt, default 0.5) allowExcretion true if excretion of all metabolites should be allowed. This results in fewer reactions being considered dead-ends, but all reactions in the resulting model may not be able to carry flux. If this is "false" then the equality constraints are taken from model.b. If the input model lacks exchange reactions then this should probably be "true", or a large proportion of the model would be excluded for connectivity reasons (opt, default false) noRevLoops true if reversible reactions should be constrained to only carry flux in one direction. This prevents reversible reactions from being wrongly assigned as connected (the forward and backward reactions can form a loop and therefore appear connected), but it makes the problem significantly more computationally intensive to solve (two more integer constraints per reversible reaction) (opt, default false) params parameter structure as used by getMILPParams (opt, default []) outModel the resulting model structure deletedRxns reactions which were deleted by the algorithm metProduction array that indicates which of the metabolites in presentMets that could be produced -2: metabolite name not found in model -1: metabolite found, but it could not be produced 1: metabolite could be produced fValue objective value (sum of (the negative of) reaction scores for the included reactions and prodWeight*number of produced metabolites) This function is the actual implementation of the algorithm. See getINITModel for a higher-level function for model reconstruction. See PLoS Comput Biol. 2012;8(5):e1002518 for details regarding the implementation. Usage: [outModel deletedRxns metProduction fValue]=runINIT(model,... rxnScores,presentMets,essentialRxns,prodWeight,allowExcretion,... noRevLoops,params) Rasmus Agren, 2012-12-12
0001 function [outModel deletedRxns metProduction fValue]=runINIT(model,rxnScores,presentMets,essentialRxns,prodWeight,allowExcretion,noRevLoops,params) 0002 % runINIT 0003 % Generates a model using the INIT algorithm, based on proteomics and/or 0004 % transcriptomics and/or metabolomics and/or metabolic tasks 0005 % 0006 % model a reference model structure 0007 % rxnScores a vector of scores for the reactions in the model. 0008 % Positive scores are reactions to keep and negative 0009 % scores are reactions to exclude (opt, default all 0.0) 0010 % presentMets cell array with unique metabolite names that the model 0011 % should produce (opt, default []) 0012 % essentialRxns cell array of reactions that are essential and that 0013 % have to be in the resulting model. This is normally 0014 % used when fitting a model to task (see fitTasks) (opt, 0015 % default []) 0016 % prodWeight a score that determines the value of having 0017 % net-production of metabolites. This is a way of having 0018 % a more functional network as it provides a reason for 0019 % including bad reactions for connectivity reasons. This 0020 % score is for each metabolite, and the sum of these weights 0021 % and the scores for the reactions is what is optimized 0022 % (opt, default 0.5) 0023 % allowExcretion true if excretion of all metabolites should be allowed. 0024 % This results in fewer reactions being considered 0025 % dead-ends, but all reactions in the resulting model may 0026 % not be able to carry flux. If this is "false" then the 0027 % equality constraints are taken from model.b. If the 0028 % input model lacks exchange reactions then this should 0029 % probably be "true", or a large proportion of the model 0030 % would be excluded for connectivity reasons 0031 % (opt, default false) 0032 % noRevLoops true if reversible reactions should be constrained to 0033 % only carry flux in one direction. This prevents 0034 % reversible reactions from being wrongly assigned as 0035 % connected (the forward and backward reactions can form a 0036 % loop and therefore appear connected), but it makes the 0037 % problem significantly more computationally intensive to 0038 % solve (two more integer constraints per reversible reaction) 0039 % (opt, default false) 0040 % params parameter structure as used by getMILPParams (opt, 0041 % default []) 0042 % 0043 % outModel the resulting model structure 0044 % deletedRxns reactions which were deleted by the algorithm 0045 % metProduction array that indicates which of the 0046 % metabolites in presentMets that could be 0047 % produced 0048 % -2: metabolite name not found in model 0049 % -1: metabolite found, but it could not be produced 0050 % 1: metabolite could be produced 0051 % fValue objective value (sum of (the negative of) 0052 % reaction scores for the included reactions and 0053 % prodWeight*number of produced metabolites) 0054 % 0055 % This function is the actual implementation of the algorithm. See 0056 % getINITModel for a higher-level function for model reconstruction. See 0057 % PLoS Comput Biol. 2012;8(5):e1002518 for details regarding the 0058 % implementation. 0059 % 0060 % Usage: [outModel deletedRxns metProduction fValue]=runINIT(model,... 0061 % rxnScores,presentMets,essentialRxns,prodWeight,allowExcretion,... 0062 % noRevLoops,params) 0063 % 0064 % Rasmus Agren, 2012-12-12 0065 % 0066 0067 if nargin<2 0068 rxnScores=zeros(numel(model.rxns),1); 0069 end 0070 if isempty(rxnScores) 0071 rxnScores=zeros(numel(model.rxns),1); 0072 end 0073 if nargin<3 0074 presentMets={}; 0075 end 0076 if nargin<4 0077 essentialRxns={}; 0078 end 0079 if isempty(essentialRxns) 0080 essentialRxns={}; 0081 end 0082 if nargin<5 0083 prodWeight=0.5; 0084 end 0085 if isempty(prodWeight) 0086 prodWeight=0.5; 0087 end 0088 if nargin<6 0089 allowExcretion=false; 0090 end 0091 if nargin<7 0092 noRevLoops=false; 0093 end 0094 if nargin<8 0095 params=[]; 0096 end 0097 echo=0; 0098 if isfield(params,'printReport') 0099 if params.printReport==true 0100 echo=3; 0101 end 0102 end 0103 0104 if numel(presentMets)~=numel(unique(presentMets)) 0105 throw(MException('','Duplicate metabolite names in presentMets')); 0106 end 0107 0108 %Default is that the metabolites cannot be produced 0109 if ~isempty(presentMets) 0110 metProduction=ones(numel(presentMets),1)*-2; 0111 presentMets=upper(presentMets); 0112 pmIndexes=find(ismember(presentMets,upper(model.metNames))); 0113 metProduction(pmIndexes)=-1; %Then set that they are at least found 0114 else 0115 metProduction=[]; 0116 pmIndexes=[]; 0117 end 0118 0119 %The model should be in the reversible format and all relevant exchange 0120 %reactions should be open 0121 if isfield(model,'unconstrained') 0122 fprintf('WARNING: Exchange metabolites are still present in the model. Use simplifyModel if this is not intended.\n'); 0123 end 0124 0125 %The irreversible reactions that are essential must have a flux and are therefore not 0126 %optimized for using MILP, which reduces the problem size. However, reversible 0127 %reactions must have a flux in one direction, so they have to stay in 0128 %the problem. The essentiality constraint on reversible reactions is 0129 %implemented in the same manner as for reversible reactions when 0130 %noRevLoops==true, but with the additional constraint that C ub=-1. This 0131 %forces one of the directions to be active. 0132 revRxns=find(model.rev~=0); 0133 essentialReversible=find(ismember(model.rxns(revRxns),essentialRxns)); 0134 essentialRxns=intersect(essentialRxns,model.rxns(model.rev==0)); 0135 0136 %Convert the model to irreversible 0137 irrevModel=convertToIrrev(model); 0138 rxnScores=[rxnScores;rxnScores(model.rev==1)]; 0139 %These are used if noRevLoops is true 0140 if noRevLoops==true 0141 forwardIndexes=find(model.rev~=0); 0142 backwardIndexes=(numel(model.rxns)+1:numel(irrevModel.rxns))'; 0143 else 0144 %Then they should only be used for essential reversible reactions 0145 forwardIndexes=revRxns(essentialReversible); 0146 backwardIndexes=essentialReversible+numel(model.rxns); 0147 end 0148 0149 %Get the indexes of the essential reactions and remove them from the 0150 %scoring vector 0151 essentialIndex=find(ismember(irrevModel.rxns,essentialRxns)); 0152 rxnScores(essentialIndex)=[]; 0153 0154 %Go through each of the presentMets (if they exist) and modify the S matrix 0155 %so that each reaction which produces any of them also produces a 0156 %corresponding fake metabolite and the opposite in the reverse direction. 0157 %This is to deal with the fact that there is no compartment info regarding 0158 %the presentMets. This modifies the irrevModel structure, but that is fine 0159 %since it's the model structure that is returned. 0160 if any(pmIndexes) 0161 irrevModel.metNames=upper(irrevModel.metNames); 0162 metsToAdd.mets=strcat({'FAKEFORPM'},num2str(pmIndexes)); 0163 metsToAdd.metNames=metsToAdd.mets; 0164 metsToAdd.compartments=irrevModel.compNames{1}; 0165 0166 %There is no constraints on the metabolites yet, since maybe not all of 0167 %them could be produced 0168 irrevModel=addMets(irrevModel,metsToAdd); 0169 end 0170 0171 %Modify the matrix 0172 for i=1:numel(pmIndexes) 0173 %Get the matching mets 0174 I=ismember(irrevModel.metNames,presentMets(pmIndexes(i))); 0175 0176 %Find the reactions where any of them are used. 0177 [crap K L]=find(irrevModel.S(I,:)); 0178 0179 %This ugly loop is to avoid problems if a metabolite occurs several 0180 %times in one reaction 0181 KK=unique(K); 0182 LL=zeros(numel(KK),1); 0183 for j=1:numel(KK) 0184 LL(j)=sum(L(K==KK(j))); 0185 end 0186 irrevModel.S(numel(irrevModel.mets)-numel(pmIndexes)+i,KK)=LL; 0187 end 0188 0189 %Some nice to have numbers 0190 nMets=numel(irrevModel.mets); 0191 nRxns=numel(irrevModel.rxns); 0192 nEssential=numel(essentialIndex); 0193 nNonEssential=nRxns-nEssential; 0194 nonEssentialIndex=setdiff(1:nRxns,essentialIndex); 0195 S=irrevModel.S; 0196 0197 %Add so that each non-essential reaction produces one unit of a fake metabolite 0198 temp=sparse(1:nRxns,1:nRxns,1); 0199 temp(essentialIndex,:)=[]; 0200 S=[S;temp]; 0201 0202 %Add another set of reactions (will be binary) which also produce these 0203 %fake metabolites, but with a stoichiometry of 1000 0204 temp=sparse(1:nNonEssential,1:nNonEssential,1000); 0205 temp=[sparse(nMets,nNonEssential);temp]; 0206 S=[S temp]; 0207 0208 %Add reactions for net-production of (real) metabolites 0209 if prodWeight~=0 0210 temp=[speye(nMets-numel(pmIndexes))*-1;sparse(nNonEssential+numel(pmIndexes),nMets-numel(pmIndexes))]; 0211 S=[S temp]; 0212 %To keep the number of reactions added like this 0213 nNetProd=nMets-numel(pmIndexes); 0214 else 0215 nNetProd=0; 0216 end 0217 0218 %Add constraints so that reversible reactions can only be used in one 0219 %direction. This is done by adding the fake metabolites A, B, C for each 0220 %reversible reaction in the following manner 0221 % forward: A + .. => ... 0222 % backwards: B + ... => ... 0223 % int1: C => 1000 A 0224 % int2: C => 1000 B 0225 % A ub=999.9 0226 % B ub=999.9 0227 % C lb=-1 0228 % int1 and int2 are binary 0229 if any(forwardIndexes) 0230 nRevBounds=numel(forwardIndexes); 0231 0232 %Add the A metabolites for the forward reactions and the B 0233 %metabolites for the reverse reactions 0234 I=speye(numel(irrevModel.rxns))*-1; 0235 temp=[I(forwardIndexes,:);I(backwardIndexes,:)]; 0236 0237 %Padding 0238 temp=[temp sparse(size(temp,1),size(S,2)-numel(irrevModel.rxns))]; 0239 0240 %Add the int1 & int2 reactions that produce A and B 0241 temp=[temp speye(nRevBounds*2)*1000]; 0242 0243 %And add that they also consume C 0244 temp=[temp;[sparse(nRevBounds,size(S,2)) speye(nRevBounds)*-1 speye(nRevBounds)*-1]]; 0245 0246 %Add the new reactions and metabolites 0247 S=[S sparse(size(S,1),nRevBounds*2)]; 0248 S=[S;temp]; 0249 else 0250 nRevBounds=0; 0251 end 0252 0253 %Add so that the essential reactions must have a small flux and that the 0254 %binary ones (and net-production reactions) may have zero flux. The 0255 %integer reactions for reversible reactions have [0 1] 0256 prob.blx=[irrevModel.lb;zeros(nNonEssential+nNetProd+nRevBounds*2,1)]; 0257 prob.blx(essentialIndex)=max(0.1,prob.blx(essentialIndex)); 0258 0259 %Add so that the binary ones and net-production reactions can have at the most flux 1.0 0260 prob.bux=[irrevModel.ub;ones(nNonEssential+nNetProd+nRevBounds*2,1)]; 0261 0262 %Add that the fake metabolites must be produced in a small amount and that 0263 %the A and B metabolites for reversible reactions can be [0 999.9] and C 0264 %metabolites [-1 0] 0265 prob.blc=[irrevModel.b(:,1);ones(nNonEssential,1);zeros(nRevBounds*2,1);ones(nRevBounds,1)*-1]; 0266 0267 %Add that normal metabolites can be freely excreted if allowExcretion==true, 0268 %and that the fake ones can be excreted 1000 units at most. C metabolites 0269 %for essential reversible reactions should have an upper bound of -1. 0270 %If noRevLoops is false, then add this constraint for all the reactions instead. 0271 if noRevLoops==true 0272 revUB=zeros(nRevBounds,1); 0273 revUB(essentialReversible)=-1; 0274 else 0275 revUB=ones(nRevBounds,1)*-1; 0276 end 0277 if allowExcretion==true 0278 metUB=inf(nMets,1); 0279 else 0280 metUB=model.b(:,min(size(model.b,2),2)); 0281 end 0282 prob.buc=[metUB;ones(nNonEssential,1)*1000;ones(nRevBounds*2,1)*999.9;revUB]; 0283 0284 %Add objective coefficients for the binary reactions. The negative is used 0285 %since we're minimizing. The negative is taken for the prodWeight 0286 %as well, in order to be consistent with the syntax that positive scores 0287 %are good 0288 prob.c=[zeros(nRxns,1);rxnScores;ones(nNetProd,1)*prodWeight*-1;zeros(nRevBounds*2,1)]; 0289 prob.a=S; 0290 0291 %We still don't know which of the presentMets that can be produced. Go 0292 %through them, force production, and see if the problem can be solved 0293 params.MSK_IPAR_OPTIMIZER=7; 0294 for i=1:numel(pmIndexes) 0295 prob.blc(numel(irrevModel.mets)-numel(pmIndexes)+i)=1; 0296 [crap,res] = mosekopt('minimize echo(0)', prob,getMILPParams(params)); 0297 if res.rcode==1001 0298 throw(MException('','The Mosek licence has expired')); 0299 end 0300 if res.rcode==1010 0301 throw(MException('','The Mosek licence used only supports small problems (up to 300 variables). Have you requested the correct licence?')); 0302 end 0303 0304 if any(strfind(res.sol.bas.prosta,'INFEASIBLE')) 0305 %Reset the constraint again 0306 prob.blc(numel(irrevModel.mets)-numel(pmIndexes)+i)=0; 0307 else 0308 %Metabolite produced 0309 metProduction(pmIndexes(i))=1; 0310 end 0311 end 0312 0313 %This should work in theory, but the initial solution is still infeasible. 0314 %Disable temporarily 0315 % %It can take a very long time to find a feasible solution, so that is done 0316 % %here 0317 % prob2=prob; 0318 % params2.MSK_IPAR_OPTIMIZER=7; 0319 % params2.relGap=0.9; 0320 % params2.maxTime=60; %Only spend an hour at most to get a feasible solution 0321 % 0322 % %This is done in the following manner. Alternatively maximize for sum of 0323 % %fluxes and minimize for sum of fluxes and keep track of when integer 0324 % %variables hit a bound. This is done for the continous problem. Then fix 0325 % %the flux of those reactions and continue. This gradually reduces the 0326 % %number of "free" integer reactions. When no more reactions can be fixed 0327 % %like this, then solve for the remaining ones using MILP 0328 % allInt=[(nRxns+1):(nRxns+nNonEssential) size(S,2)-nRevBounds*2+1:size(S,2)]; 0329 % currentInt=allInt; 0330 % while 1 0331 % addedSome=false; 0332 % 0333 % %First maximize for all fluxes and fit integer variables that are close to 0334 % %1.0 to 1.0. Note that there are no integer constraints at this point 0335 % prob2.c=ones(size(prob2.c))*-1; 0336 % while 1 0337 % [crap,res]=mosekopt('minimize echo(0)', prob2,getMILPParams(params2)); 0338 % used=intersect(currentInt,find(res.sol.bas.xx>0.999999)); 0339 % %If no new reactions was found in this way, break the loop 0340 % if isempty(used) 0341 % break; 0342 % end 0343 % prob2.blx(used)=1; 0344 % prob2.bux(used)=1; 0345 % currentInt=setdiff(currentInt,used); 0346 % addedSome=true; 0347 % end 0348 % %Then do the same thing when minimizing 0349 % prob2.c=ones(size(prob2.c)); 0350 % while 1 0351 % [crap,res]=mosekopt('minimize echo(0)', prob2,getMILPParams(params2)); 0352 % used=intersect(currentInt,find(res.sol.bas.xx<10^-10)); 0353 % %If no new reactions was found in this way, break the loop 0354 % if isempty(used) 0355 % break; 0356 % end 0357 % prob2.blx(used)=0; 0358 % prob2.bux(used)=0; 0359 % currentInt=setdiff(currentInt,used); 0360 % addedSome=true; 0361 % end 0362 % if addedSome==false 0363 % %Abort if no more reactions could be added in this manner 0364 % break; 0365 % end 0366 % end 0367 % 0368 % %After this has terminated there might be some remaining integer variables 0369 % %that could not be determined 0370 % prob2.ints.sub=currentInt; 0371 % [crap,res] = mosekopt('minimize echo(0)',prob2,getMILPParams(params2)); 0372 % 0373 % %Get the values for the integers 0374 % prob.sol.int.xx=res.sol.int.xx; 0375 0376 %Add that the binary reactions may only take integer values. 0377 allInt=[(nRxns+1):(nRxns+nNonEssential) size(S,2)-nRevBounds*2+1:size(S,2)]; 0378 prob.ints.sub=allInt; 0379 [crap,res] = mosekopt(['minimize echo(' num2str(echo) ')'], prob,getMILPParams(params)); 0380 if res.rcode==1001 0381 throw(MException('','The Mosek licence has expired')); 0382 end 0383 if res.rcode==1010 0384 throw(MException('','The Mosek licence used only supports small problems (up to 300 variables). Have you requested the correct licence?')); 0385 end 0386 0387 fValue=res.sol.int.pobjval; 0388 0389 %Get all reactions used in the irreversible model 0390 usedRxns=(nonEssentialIndex(res.sol.int.xx(nRxns+1:nRxns+nNonEssential)<0.1))'; 0391 0392 %Map to reversible model IDs 0393 usedRxns=[usedRxns(usedRxns<=numel(model.rxns));revRxns(usedRxns(usedRxns>numel(model.rxns))-numel(model.rxns))]; 0394 0395 %Then get the ones that are not used in either direction or is essential 0396 I=true(numel(model.rxns),1); 0397 I(usedRxns)=false; 0398 I(essentialIndex)=false; 0399 deletedRxns=model.rxns(I); 0400 outModel=removeRxns(model,I,true,true); 0401 end