Home > RAVEN > runINIT.m

runINIT

PURPOSE ^

runINIT

SYNOPSIS ^

function [outModel deletedRxns metProduction fValue]=runINIT(model,rxnScores,presentMets,essentialRxns,prodWeight,allowExcretion,noRevLoops,params)

DESCRIPTION ^

 runINIT
    Generates a model using the INIT algorithm, based on proteomics and/or
   transcriptomics and/or metabolomics and/or metabolic tasks

   model           a reference model structure
   rxnScores       a vector of scores for the reactions in the model.
                   Positive scores are reactions to keep and negative
                   scores are reactions to exclude (opt, default all 0.0)
   presentMets     cell array with unique metabolite names that the model
                   should produce (opt, default [])
   essentialRxns   cell array of reactions that are essential and that
                   have to be in the resulting model. This is normally
                   used when fitting a model to task (see fitTasks) (opt,
                   default [])
   prodWeight      a score that determines the value of having
                   net-production of metabolites. This is a way of having
                   a more functional network as it provides a reason for
                   including bad reactions for connectivity reasons. This
                   score is for each metabolite, and the sum of these weights
                   and the scores for the reactions is what is optimized
                   (opt, default 0.5)
   allowExcretion  true if excretion of all metabolites should be allowed.
                   This results in fewer reactions being considered
                   dead-ends, but all reactions in the resulting model may
                   not be able to carry flux. If this is "false" then the
                   equality constraints are taken from model.b. If the
                   input model lacks exchange reactions then this should
                   probably be "true", or a large proportion of the model
                   would be excluded for connectivity reasons
                   (opt, default false)
   noRevLoops      true if reversible reactions should be constrained to
                   only carry flux in one direction. This prevents
                   reversible reactions from being wrongly assigned as
                   connected (the forward and backward reactions can form a
                   loop and therefore appear connected), but it makes the 
                   problem significantly more computationally intensive to
                   solve (two more integer constraints per reversible reaction)
                   (opt, default false)
   params          parameter structure as used by getMILPParams (opt,
                   default [])

   outModel        the resulting model structure
   deletedRxns     reactions which were deleted by the algorithm
   metProduction   array that indicates which of the
                   metabolites in presentMets that could be
                   produced 
                   -2: metabolite name not found in model
                   -1: metabolite found, but it could not be produced
                   1: metabolite could be produced
   fValue          objective value (sum of (the negative of) 
                   reaction scores for the included reactions and
                   prodWeight*number of produced metabolites)

   This function is the actual implementation of the algorithm. See
   getINITModel for a higher-level function for model reconstruction. See 
   PLoS Comput Biol. 2012;8(5):e1002518 for details regarding the
   implementation.

   Usage: [outModel deletedRxns metProduction fValue]=runINIT(model,...
           rxnScores,presentMets,essentialRxns,prodWeight,allowExcretion,...
           noRevLoops,params)

   Rasmus Agren, 2012-12-12

CROSS-REFERENCE INFORMATION ^

This function calls: This function is called by:

SOURCE CODE ^

0001 function [outModel deletedRxns metProduction fValue]=runINIT(model,rxnScores,presentMets,essentialRxns,prodWeight,allowExcretion,noRevLoops,params)
0002 % runINIT
0003 %    Generates a model using the INIT algorithm, based on proteomics and/or
0004 %   transcriptomics and/or metabolomics and/or metabolic tasks
0005 %
0006 %   model           a reference model structure
0007 %   rxnScores       a vector of scores for the reactions in the model.
0008 %                   Positive scores are reactions to keep and negative
0009 %                   scores are reactions to exclude (opt, default all 0.0)
0010 %   presentMets     cell array with unique metabolite names that the model
0011 %                   should produce (opt, default [])
0012 %   essentialRxns   cell array of reactions that are essential and that
0013 %                   have to be in the resulting model. This is normally
0014 %                   used when fitting a model to task (see fitTasks) (opt,
0015 %                   default [])
0016 %   prodWeight      a score that determines the value of having
0017 %                   net-production of metabolites. This is a way of having
0018 %                   a more functional network as it provides a reason for
0019 %                   including bad reactions for connectivity reasons. This
0020 %                   score is for each metabolite, and the sum of these weights
0021 %                   and the scores for the reactions is what is optimized
0022 %                   (opt, default 0.5)
0023 %   allowExcretion  true if excretion of all metabolites should be allowed.
0024 %                   This results in fewer reactions being considered
0025 %                   dead-ends, but all reactions in the resulting model may
0026 %                   not be able to carry flux. If this is "false" then the
0027 %                   equality constraints are taken from model.b. If the
0028 %                   input model lacks exchange reactions then this should
0029 %                   probably be "true", or a large proportion of the model
0030 %                   would be excluded for connectivity reasons
0031 %                   (opt, default false)
0032 %   noRevLoops      true if reversible reactions should be constrained to
0033 %                   only carry flux in one direction. This prevents
0034 %                   reversible reactions from being wrongly assigned as
0035 %                   connected (the forward and backward reactions can form a
0036 %                   loop and therefore appear connected), but it makes the
0037 %                   problem significantly more computationally intensive to
0038 %                   solve (two more integer constraints per reversible reaction)
0039 %                   (opt, default false)
0040 %   params          parameter structure as used by getMILPParams (opt,
0041 %                   default [])
0042 %
0043 %   outModel        the resulting model structure
0044 %   deletedRxns     reactions which were deleted by the algorithm
0045 %   metProduction   array that indicates which of the
0046 %                   metabolites in presentMets that could be
0047 %                   produced
0048 %                   -2: metabolite name not found in model
0049 %                   -1: metabolite found, but it could not be produced
0050 %                   1: metabolite could be produced
0051 %   fValue          objective value (sum of (the negative of)
0052 %                   reaction scores for the included reactions and
0053 %                   prodWeight*number of produced metabolites)
0054 %
0055 %   This function is the actual implementation of the algorithm. See
0056 %   getINITModel for a higher-level function for model reconstruction. See
0057 %   PLoS Comput Biol. 2012;8(5):e1002518 for details regarding the
0058 %   implementation.
0059 %
0060 %   Usage: [outModel deletedRxns metProduction fValue]=runINIT(model,...
0061 %           rxnScores,presentMets,essentialRxns,prodWeight,allowExcretion,...
0062 %           noRevLoops,params)
0063 %
0064 %   Rasmus Agren, 2012-12-12
0065 %
0066 
0067 if nargin<2
0068     rxnScores=zeros(numel(model.rxns),1);
0069 end
0070 if isempty(rxnScores)
0071     rxnScores=zeros(numel(model.rxns),1);
0072 end
0073 if nargin<3
0074     presentMets={};
0075 end
0076 if nargin<4
0077     essentialRxns={};
0078 end
0079 if isempty(essentialRxns)
0080     essentialRxns={};
0081 end
0082 if nargin<5
0083     prodWeight=0.5;
0084 end
0085 if isempty(prodWeight)
0086     prodWeight=0.5;
0087 end
0088 if nargin<6
0089     allowExcretion=false;
0090 end
0091 if nargin<7
0092     noRevLoops=false;
0093 end
0094 if nargin<8
0095     params=[];
0096 end
0097 echo=0;
0098 if isfield(params,'printReport')
0099     if params.printReport==true
0100         echo=3;
0101     end
0102 end
0103 
0104 if numel(presentMets)~=numel(unique(presentMets))
0105     throw(MException('','Duplicate metabolite names in presentMets'));
0106 end
0107 
0108 %Default is that the metabolites cannot be produced
0109 if ~isempty(presentMets)
0110     metProduction=ones(numel(presentMets),1)*-2;
0111     presentMets=upper(presentMets);
0112     pmIndexes=find(ismember(presentMets,upper(model.metNames)));
0113     metProduction(pmIndexes)=-1; %Then set that they are at least found
0114 else
0115     metProduction=[];
0116     pmIndexes=[];
0117 end
0118 
0119 %The model should be in the reversible format and all relevant exchange
0120 %reactions should be open
0121 if isfield(model,'unconstrained')
0122     fprintf('WARNING: Exchange metabolites are still present in the model. Use simplifyModel if this is not intended.\n'); 
0123 end
0124 
0125 %The irreversible reactions that are essential must have a flux and are therefore not
0126 %optimized for using MILP, which reduces the problem size. However, reversible
0127 %reactions must have a flux in one direction, so they have to stay in
0128 %the problem. The essentiality constraint on reversible reactions is
0129 %implemented in the same manner as for reversible reactions when
0130 %noRevLoops==true, but with the additional constraint that C ub=-1. This
0131 %forces one of the directions to be active.
0132 revRxns=find(model.rev~=0);
0133 essentialReversible=find(ismember(model.rxns(revRxns),essentialRxns));
0134 essentialRxns=intersect(essentialRxns,model.rxns(model.rev==0));
0135 
0136 %Convert the model to irreversible
0137 irrevModel=convertToIrrev(model);
0138 rxnScores=[rxnScores;rxnScores(model.rev==1)];
0139 %These are used if noRevLoops is true
0140 if noRevLoops==true
0141     forwardIndexes=find(model.rev~=0);
0142     backwardIndexes=(numel(model.rxns)+1:numel(irrevModel.rxns))';
0143 else
0144     %Then they should only be used for essential reversible reactions
0145     forwardIndexes=revRxns(essentialReversible);
0146     backwardIndexes=essentialReversible+numel(model.rxns);
0147 end
0148 
0149 %Get the indexes of the essential reactions and remove them from the
0150 %scoring vector
0151 essentialIndex=find(ismember(irrevModel.rxns,essentialRxns));
0152 rxnScores(essentialIndex)=[];
0153 
0154 %Go through each of the presentMets (if they exist) and modify the S matrix
0155 %so that each reaction which produces any of them also produces a
0156 %corresponding fake metabolite and the opposite in the reverse direction.
0157 %This is to deal with the fact that there is no compartment info regarding
0158 %the presentMets. This modifies the irrevModel structure, but that is fine
0159 %since it's the model structure that is returned.
0160 if any(pmIndexes)
0161     irrevModel.metNames=upper(irrevModel.metNames);
0162     metsToAdd.mets=strcat({'FAKEFORPM'},num2str(pmIndexes));
0163     metsToAdd.metNames=metsToAdd.mets;
0164     metsToAdd.compartments=irrevModel.compNames{1};
0165     
0166     %There is no constraints on the metabolites yet, since maybe not all of
0167     %them could be produced
0168     irrevModel=addMets(irrevModel,metsToAdd);
0169 end
0170 
0171 %Modify the matrix
0172 for i=1:numel(pmIndexes)
0173     %Get the matching mets
0174     I=ismember(irrevModel.metNames,presentMets(pmIndexes(i)));
0175     
0176     %Find the reactions where any of them are used.
0177     [crap K L]=find(irrevModel.S(I,:));
0178     
0179     %This ugly loop is to avoid problems if a metabolite occurs several
0180     %times in one reaction
0181     KK=unique(K);
0182     LL=zeros(numel(KK),1);
0183     for j=1:numel(KK)
0184        LL(j)=sum(L(K==KK(j)));
0185     end
0186     irrevModel.S(numel(irrevModel.mets)-numel(pmIndexes)+i,KK)=LL;
0187 end
0188 
0189 %Some nice to have numbers
0190 nMets=numel(irrevModel.mets);
0191 nRxns=numel(irrevModel.rxns);
0192 nEssential=numel(essentialIndex);
0193 nNonEssential=nRxns-nEssential;
0194 nonEssentialIndex=setdiff(1:nRxns,essentialIndex);
0195 S=irrevModel.S;
0196 
0197 %Add so that each non-essential reaction produces one unit of a fake metabolite
0198 temp=sparse(1:nRxns,1:nRxns,1);
0199 temp(essentialIndex,:)=[];
0200 S=[S;temp];
0201 
0202 %Add another set of reactions (will be binary) which also produce these
0203 %fake metabolites, but with a stoichiometry of 1000
0204 temp=sparse(1:nNonEssential,1:nNonEssential,1000);
0205 temp=[sparse(nMets,nNonEssential);temp];
0206 S=[S temp];
0207 
0208 %Add reactions for net-production of (real) metabolites
0209 if prodWeight~=0
0210     temp=[speye(nMets-numel(pmIndexes))*-1;sparse(nNonEssential+numel(pmIndexes),nMets-numel(pmIndexes))];
0211     S=[S temp];
0212     %To keep the number of reactions added like this
0213     nNetProd=nMets-numel(pmIndexes);
0214 else
0215     nNetProd=0;
0216 end
0217 
0218 %Add constraints so that reversible reactions can only be used in one
0219 %direction. This is done by adding the fake metabolites A, B, C for each
0220 %reversible reaction in the following manner
0221 % forward: A + .. => ...
0222 % backwards: B + ... => ...
0223 % int1: C => 1000 A
0224 % int2: C => 1000 B
0225 % A ub=999.9
0226 % B ub=999.9
0227 % C lb=-1
0228 % int1 and int2 are binary
0229 if any(forwardIndexes)
0230     nRevBounds=numel(forwardIndexes);
0231     
0232     %Add the A metabolites for the forward reactions and the B
0233     %metabolites for the reverse reactions
0234     I=speye(numel(irrevModel.rxns))*-1;
0235     temp=[I(forwardIndexes,:);I(backwardIndexes,:)];
0236    
0237     %Padding
0238     temp=[temp sparse(size(temp,1),size(S,2)-numel(irrevModel.rxns))];
0239     
0240     %Add the int1 & int2 reactions that produce A and B
0241     temp=[temp speye(nRevBounds*2)*1000];
0242     
0243     %And add that they also consume C
0244     temp=[temp;[sparse(nRevBounds,size(S,2)) speye(nRevBounds)*-1 speye(nRevBounds)*-1]];
0245     
0246     %Add the new reactions and metabolites
0247     S=[S sparse(size(S,1),nRevBounds*2)];
0248     S=[S;temp];
0249 else
0250     nRevBounds=0;
0251 end
0252 
0253 %Add so that the essential reactions must have a small flux and that the
0254 %binary ones (and net-production reactions) may have zero flux. The
0255 %integer reactions for reversible reactions have [0 1]
0256 prob.blx=[irrevModel.lb;zeros(nNonEssential+nNetProd+nRevBounds*2,1)];
0257 prob.blx(essentialIndex)=max(0.1,prob.blx(essentialIndex));
0258 
0259 %Add so that the binary ones and net-production reactions can have at the most flux 1.0
0260 prob.bux=[irrevModel.ub;ones(nNonEssential+nNetProd+nRevBounds*2,1)];
0261 
0262 %Add that the fake metabolites must be produced in a small amount and that
0263 %the A and B metabolites for reversible reactions can be [0 999.9] and C
0264 %metabolites [-1 0]
0265 prob.blc=[irrevModel.b(:,1);ones(nNonEssential,1);zeros(nRevBounds*2,1);ones(nRevBounds,1)*-1];
0266 
0267 %Add that normal metabolites can be freely excreted if allowExcretion==true,
0268 %and that the fake ones can be excreted 1000 units at most. C metabolites
0269 %for essential reversible reactions should have an upper bound of -1.
0270 %If noRevLoops is false, then add this constraint for all the reactions instead.
0271 if noRevLoops==true
0272     revUB=zeros(nRevBounds,1);
0273     revUB(essentialReversible)=-1;
0274 else
0275     revUB=ones(nRevBounds,1)*-1;
0276 end
0277 if allowExcretion==true
0278     metUB=inf(nMets,1);
0279 else
0280     metUB=model.b(:,min(size(model.b,2),2));
0281 end
0282 prob.buc=[metUB;ones(nNonEssential,1)*1000;ones(nRevBounds*2,1)*999.9;revUB];
0283 
0284 %Add objective coefficients for the binary reactions. The negative is used
0285 %since we're minimizing. The negative is taken for the prodWeight
0286 %as well, in order to be consistent with the syntax that positive scores
0287 %are good
0288 prob.c=[zeros(nRxns,1);rxnScores;ones(nNetProd,1)*prodWeight*-1;zeros(nRevBounds*2,1)];
0289 prob.a=S;
0290 
0291 %We still don't know which of the presentMets that can be produced. Go
0292 %through them, force production, and see if the problem can be solved
0293 params.MSK_IPAR_OPTIMIZER=7;
0294 for i=1:numel(pmIndexes)
0295     prob.blc(numel(irrevModel.mets)-numel(pmIndexes)+i)=1;
0296     [crap,res] = mosekopt('minimize echo(0)', prob,getMILPParams(params));
0297     if res.rcode==1001
0298         throw(MException('','The Mosek licence has expired'));
0299     end
0300     if res.rcode==1010
0301         throw(MException('','The Mosek licence used only supports small problems (up to 300 variables). Have you requested the correct licence?'));
0302     end
0303     
0304     if any(strfind(res.sol.bas.prosta,'INFEASIBLE'))
0305         %Reset the constraint again
0306         prob.blc(numel(irrevModel.mets)-numel(pmIndexes)+i)=0;
0307     else
0308         %Metabolite produced
0309         metProduction(pmIndexes(i))=1;
0310     end
0311 end
0312 
0313 %This should work in theory, but the initial solution is still infeasible.
0314 %Disable temporarily
0315 % %It can take a very long time to find a feasible solution, so that is done
0316 % %here
0317 % prob2=prob;
0318 % params2.MSK_IPAR_OPTIMIZER=7;
0319 % params2.relGap=0.9;
0320 % params2.maxTime=60; %Only spend an hour at most to get a feasible solution
0321 %
0322 % %This is done in the following manner. Alternatively maximize for sum of
0323 % %fluxes and minimize for sum of fluxes and keep track of when integer
0324 % %variables hit a bound. This is done for the continous problem. Then fix
0325 % %the flux of those reactions and continue. This gradually reduces the
0326 % %number of "free" integer reactions. When no more reactions can be fixed
0327 % %like this, then solve for the remaining ones using MILP
0328 % allInt=[(nRxns+1):(nRxns+nNonEssential) size(S,2)-nRevBounds*2+1:size(S,2)];
0329 % currentInt=allInt;
0330 % while 1
0331 %     addedSome=false;
0332 %
0333 %     %First maximize for all fluxes and fit integer variables that are close to
0334 %     %1.0 to 1.0. Note that there are no integer constraints at this point
0335 %     prob2.c=ones(size(prob2.c))*-1;
0336 %     while 1
0337 %         [crap,res]=mosekopt('minimize echo(0)', prob2,getMILPParams(params2));
0338 %         used=intersect(currentInt,find(res.sol.bas.xx>0.999999));
0339 %         %If no new reactions was found in this way, break the loop
0340 %         if isempty(used)
0341 %             break;
0342 %         end
0343 %         prob2.blx(used)=1;
0344 %         prob2.bux(used)=1;
0345 %         currentInt=setdiff(currentInt,used);
0346 %         addedSome=true;
0347 %     end
0348 %     %Then do the same thing when minimizing
0349 %     prob2.c=ones(size(prob2.c));
0350 %     while 1
0351 %         [crap,res]=mosekopt('minimize echo(0)', prob2,getMILPParams(params2));
0352 %         used=intersect(currentInt,find(res.sol.bas.xx<10^-10));
0353 %         %If no new reactions was found in this way, break the loop
0354 %         if isempty(used)
0355 %             break;
0356 %         end
0357 %         prob2.blx(used)=0;
0358 %         prob2.bux(used)=0;
0359 %         currentInt=setdiff(currentInt,used);
0360 %         addedSome=true;
0361 %     end
0362 %     if addedSome==false
0363 %         %Abort if no more reactions could be added in this manner
0364 %         break;
0365 %     end
0366 % end
0367 %
0368 % %After this has terminated there might be some remaining integer variables
0369 % %that could not be determined
0370 % prob2.ints.sub=currentInt;
0371 % [crap,res] = mosekopt('minimize echo(0)',prob2,getMILPParams(params2));
0372 %
0373 % %Get the values for the integers
0374 % prob.sol.int.xx=res.sol.int.xx;
0375 
0376 %Add that the binary reactions may only take integer values.
0377 allInt=[(nRxns+1):(nRxns+nNonEssential) size(S,2)-nRevBounds*2+1:size(S,2)];
0378 prob.ints.sub=allInt;
0379 [crap,res] = mosekopt(['minimize echo(' num2str(echo) ')'], prob,getMILPParams(params));
0380 if res.rcode==1001
0381     throw(MException('','The Mosek licence has expired'));
0382 end
0383 if res.rcode==1010
0384     throw(MException('','The Mosek licence used only supports small problems (up to 300 variables). Have you requested the correct licence?'));
0385 end
0386 
0387 fValue=res.sol.int.pobjval;
0388 
0389 %Get all reactions used in the irreversible model
0390 usedRxns=(nonEssentialIndex(res.sol.int.xx(nRxns+1:nRxns+nNonEssential)<0.1))';
0391 
0392 %Map to reversible model IDs
0393 usedRxns=[usedRxns(usedRxns<=numel(model.rxns));revRxns(usedRxns(usedRxns>numel(model.rxns))-numel(model.rxns))];
0394 
0395 %Then get the ones that are not used in either direction or is essential
0396 I=true(numel(model.rxns),1);
0397 I(usedRxns)=false;
0398 I(essentialIndex)=false;
0399 deletedRxns=model.rxns(I);
0400 outModel=removeRxns(model,I,true,true);
0401 end

Generated on Tue 23-Apr-2013 15:18:37 by m2html © 2005