Difference between revisions of "Tutorial:Using regular expression: Selecting sequence motifs of a Chain"
(One intermediate revision by the same user not shown) | |||
Line 1: | Line 1: | ||
This is an example on how to select sequence motifs from '''[[MSL Objects:Chain|Chain]]''' objects. A '''[[MSL Objects:Chain|Chain object]]''' versus a '''[[MSL Objects:System|System object]]''' is used because regular expressions can not span across chains. | This is an example on how to select sequence motifs from '''[[MSL Objects:Chain|Chain]]''' objects. A '''[[MSL Objects:Chain|Chain object]]''' versus a '''[[MSL Objects:System|System object]]''' is used because regular expressions can not span across chains. | ||
− | |||
− | |||
[http://mslib.svn.sourceforge.net/viewvc/mslib/trunk/examples/example_regular_expressions.cpp?view=markup Complete source of example_regular_expressions.cpp] | [http://mslib.svn.sourceforge.net/viewvc/mslib/trunk/examples/example_regular_expressions.cpp?view=markup Complete source of example_regular_expressions.cpp] | ||
Line 17: | Line 15: | ||
Go to the main directory and run the command ('''note''', the location of the exampleFiles subdirectory needs to be provided as an argument) | Go to the main directory and run the command ('''note''', the location of the exampleFiles subdirectory needs to be provided as an argument) | ||
<source lang="text"> | <source lang="text"> | ||
− | % bin/ | + | % bin/example_regular_expressions exampleFiles/example0004.pdb |
</source> | </source> | ||
Latest revision as of 22:52, 1 April 2010
This is an example on how to select sequence motifs from Chain objects. A Chain object versus a System object is used because regular expressions can not span across chains.
Complete source of example_regular_expressions.cpp
In MSL the program grepSequence utilizes the type of code in this tutorial to search for sequences in a list of PDB files, and then structurally align them.
To compile
% make bin/example_regular_expresssions
To run the program
Go to the main directory and run the command (note, the location of the exampleFiles subdirectory needs to be provided as an argument)
% bin/example_regular_expressions exampleFiles/example0004.pdb
Program description
Read in structure into System object, check that chain "A" exists.
string file = "example0004.pdb";
file = (string)argv[1] + "/" + file;
cout << "Create an AtomContainer and read the atoms from " << file << endl;
System sys;
if (!sys.readPdb(file)) {
// reading failed, error handling code here
cerr << "ERROR could not read in "<<file<<endl;
exit(0);
}
// Check to make sure chain A exits in sys
if (!sys.chainExists("A")){
// error code here.
cerr << "ERROR chain A does not exist in file "<<file<<endl;
exit(0);
}
// Get a Chain object
Chain &ch = sys.getChain("A");
Setup a regular expression object (RegEx) and a regular expression string to match 2 Valines followed by Isoleucine and then a Leucine. The RegEx match gets residue (or position) indices into the parent System object.
// Regular Expression Object
RegEx re;
// Find 3 Prolines surrounded by two Glycines on one side and three Glycines on the other
string regex = "V{2}IL";
// Now do a sequence search...
vector<pair<int,int> > matchingResidueIndices = re.getResidueRanges(ch,regex);
// Loop over each match.
for (uint m = 0; m < matchingResidueIndices.size();m++){
// Loop over each residue for this match
int match = 1;
for (uint r = matchingResidueIndices[m].first; r <= matchingResidueIndices[m].second;r++){
// Get the residue
Residue &res = ch.getResidue(r);
// .. do something cool with matched residues ...
cout << "MATCH("<<match<<"): RESIDUE: "<<res.toString()<<endl;
}
}