getExpressionStructure
  Loads a representation of an experiment from an Excel file (the expected
  sheet layout is described further down)

  fileName            an Excel representation of an experiment

  experiment          an experiment structure
    data              matrix with expression values
    orfs              the corresponding ORFs
    experiments       the titles of the experiments
    boundNames        reaction names for the bounds
    upperBoundaries   matrix with the upper bound values
    fitNames          reaction names for the measured fluxes
    fitTo             matrix with the measured fluxes

  A very common data set when working with genome-scale metabolic models
  consists of measured fermentation data, gene expression data, and some
  different 'bounds' (for example different carbon sources or genes that
  are knocked out) for a number of conditions. This function reads an
  Excel representation of such an experiment.
  The Excel file must contain three sheets: 'EXPRESSION', 'BOUNDS', and
  'FITTING'. The examples below show how they should be formatted:

  -EXPRESSION
  ORF          dsm_paa       wisc_paa
  Pc00e00030   79.80942723   78.14755338
  Shows the expression of the gene Pc00e00030 under two different
  conditions (in this case a DSM strain and a Wisconsin strain of
  P. chrysogenum with PSS in the media)

  -BOUNDS
  Fixed Upper  dsm_paa       wisc_paa
  paaIN        0.1           0.2
  The upper bound for the reaction paaIN should be 0.1 for the first
  condition and 0.2 for the second

  -FITTING
  Fit to       dsm_paa       wisc_paa
  co2OUT       2.85          3.05
  glcIN        1.2           0.9
  The measured fluxes for CO2 production and glucose uptake for the two
  conditions. The model(s) can later be fitted to match these values as
  well as possible.

  Usage: experiment=getExpressionStructure(fileName)

  Rasmus Agren, 2013-08-01
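To make the field layout concrete, the sketch below builds by hand the structure that the three example sheets would be expected to yield. It does not read any Excel file. It assumes that the first row of each sheet holds the condition titles and the first column holds the ORF or reaction names; the orientation of the cell arrays (column for names, row for titles) is likewise an assumption.

```matlab
% Hand-built sketch of the expected structure for the example sheets.
% Assumption: row 1 = condition titles, column 1 = ORF/reaction names.
experiment = struct();
experiment.data            = [79.80942723 78.14755338]; % ORFs x conditions
experiment.orfs            = {'Pc00e00030'};            % one name per data row
experiment.experiments     = {'dsm_paa','wisc_paa'};    % one title per column
experiment.boundNames      = {'paaIN'};
experiment.upperBoundaries = [0.1 0.2];                 % bounds x conditions
experiment.fitNames        = {'co2OUT';'glcIN'};
experiment.fitTo           = [2.85 3.05; 1.2 0.9];      % fluxes x conditions

% Every matrix has one row per name and one column per condition.
nConditions = numel(experiment.experiments);
assert(size(experiment.data,1) == numel(experiment.orfs));
assert(size(experiment.data,2) == nConditions);
assert(size(experiment.upperBoundaries,1) == numel(experiment.boundNames));
assert(size(experiment.fitTo,1) == numel(experiment.fitNames));
```

Each row of a matrix lines up with one entry in the matching name list, and each column with one condition title.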
function experiment=getExpressionStructure(fileName)
% getExpressionStructure
%   Loads a representation of an experiment from an Excel file (see
%   comments further down)
%
%   fileName            an Excel representation of an experiment
%
%   experiment          an experiment structure
%     data              matrix with expression values
%     orfs              the corresponding ORFs
%     experiments       the titles of the experiments
%     boundNames        reaction names for the bounds
%     upperBoundaries   matrix with the upper bound values
%     fitNames          reaction names for the measured fluxes
%     fitTo             matrix with the measured fluxes
%
%   A very common data set when working with genome-scale metabolic models
%   consists of measured fermentation data, gene expression data, and some
%   different 'bounds' (for example different carbon sources or genes that
%   are knocked out) for a number of conditions. This function reads an
%   Excel representation of such an experiment.
%   The Excel file must contain three sheets: 'EXPRESSION', 'BOUNDS', and
%   'FITTING'. The examples below show how they should be formatted:
%
%   -EXPRESSION
%   ORF          dsm_paa       wisc_paa
%   Pc00e00030   79.80942723   78.14755338
%   Shows the expression of the gene Pc00e00030 under two different
%   conditions (in this case a DSM strain and a Wisconsin strain of
%   P. chrysogenum with PSS in the media)
%
%   -BOUNDS
%   Fixed Upper  dsm_paa       wisc_paa
%   paaIN        0.1           0.2
%   The upper bound for the reaction paaIN should be 0.1 for the first
%   condition and 0.2 for the second
%
%   -FITTING
%   Fit to       dsm_paa       wisc_paa
%   co2OUT       2.85          3.05
%   glcIN        1.2           0.9
%   The measured fluxes for CO2 production and glucose uptake for the two
%   conditions. The model(s) can later be fitted to match these values as
%   well as possible.
%
%   Usage: experiment=getExpressionStructure(fileName)
%
%   Rasmus Agren, 2013-08-01
%

[type, sheets]=xlsfinfo(fileName);

%Check if the file is a Microsoft Excel Spreadsheet
if ~strcmp(type,'Microsoft Excel Spreadsheet')
    dispEM('The file is not a Microsoft Excel Spreadsheet');
end

%Check that all sheets are present and save the index of each
exprSheet=find(strcmp('EXPRESSION', sheets));
boundSheet=find(strcmp('BOUNDS', sheets));
fitSheet=find(strcmp('FITTING', sheets));

if length(exprSheet)~=1 || length(boundSheet)~=1 || length(fitSheet)~=1
    dispEM('Not all required spreadsheets are present in the file');
end

%Load the expression data. xlsread returns the numeric block first and the
%text cells second; the ORFs are in the first column and the experiment
%titles in the first row of the text cells
[values,dataSheet]=xlsread(fileName,exprSheet);
experiment.data=values;
experiment.orfs=dataSheet(2:end,1);
experiment.experiments=dataSheet(1,2:end);

%Load the upper bounds
[values,dataSheet]=xlsread(fileName,boundSheet);
experiment.boundNames=dataSheet(2:end,1);
experiment.upperBoundaries=values;

%Load the experimental data to fit to
[values,dataSheet]=xlsread(fileName,fitSheet);
experiment.fitNames=dataSheet(2:end,1);
experiment.fitTo=values;

%Check that the dimensions of the expression data are correct
if length(experiment.orfs)~=size(experiment.data,1) || (length(experiment.experiments)~=size(experiment.data,2) && ~isempty(experiment.data))
    dispEM('The expression data does not seem to be formatted in the expected manner');
end
end
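The listing relies on xlsfinfo and xlsread, which the MATLAB documentation now marks as not recommended. As a rough sketch only, and not part of the original function, the same split into names, titles, and values could be done for one sheet with sheetnames and readcell. This assumes MATLAB R2019b or later, and the temporary file is purely illustrative.

```matlab
% Sketch: read the EXPRESSION example with readcell instead of xlsread.
% Assumptions: MATLAB R2019b+, a hypothetical temporary workbook, and a
% sheet with no empty cells in the numeric block.
fileName = [tempname '.xlsx'];
writecell({'ORF','dsm_paa','wisc_paa'; ...
           'Pc00e00030',79.80942723,78.14755338}, ...
          fileName,'Sheet','EXPRESSION');

% Confirm that the sheet exists before reading it
assert(any(sheetnames(fileName) == "EXPRESSION"));

% readcell returns labels and numbers together in one cell array
raw = readcell(fileName,'Sheet','EXPRESSION');
orfs        = raw(2:end,1);              % first column, below the header
experiments = raw(1,2:end);              % first row, right of the corner cell
data        = cell2mat(raw(2:end,2:end)); % numeric block as a matrix

delete(fileName); % remove the temporary workbook
```

Empty cells would come back from readcell as missing values and make cell2mat fail, so a real replacement would need to handle them explicitly.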