Academia.eduAcademia.edu
Advances in E-Activities, Information Security and Privacy Regular Expressions Fusion using Emergent Computing RAFAEL TORRES1, JUNIOR ALTAMIRANDA2, JOSE AGUILAR3 3 1 Centro de Estudios en Microelectrónica y Sistemas Distribuidos (CEMISID) University of Los Andes Faculty of Engineering. Campus La Hechicera. Mérida VENEZUELA torresrafael@ula.ve 2 Centro de Estudios en Microelectrónica y Sistemas Distribuidos (CEMISID) University of Los Andes Faculty of Engineering. Campus La Hechicera. Mérida VENEZUELA altamira@ula.ve Department of Computer Science, Centro de Estudios en Microelectrónica y Sistemas Distribuidos (CEMISID) University of Los Andes Faculty of Engineering. Campus La Hechicera. Mérida VENEZUELA aguilar@ula.ve CHRISTIAN DELAMARCHE Structure et Dynamique des Macromolecules University of Rennes I Campus de Beaulieu, Nb 13, Rennes FRANCE Christian.delamarche@univ-rennes1.fr Abstract: - In this article is presented an approach for the comparison and establishment of patterns (motifs) from regular expressions that denote protein families, through the fusion of the sames. This task has a high computational cost, for this reason our approach uses a combinatorial optimization algorithm based on Ant Colonies to obtain the patterns that help the study and classification of families of protein chains. Particularly, we propose a system that can efficiently generate a pattern of the fusion of two regular expressions that describe protein chains, denoted in PROSITE language, in order to facilitate the analysis and classification of protein families from the structural point of view. Key-Words: - Bioinformatics, Artificial Ant Colony, Combinatorial Optimization, Regulars Expressions, Motifs, Amyloid Protein. Currently, the algorithms that allow to find the optimal fusion of these motifs (patterns) have poor performances (at level of time and computational resources utilization) [4], [5], so it is necessary to develop new algorithms using emerging computational techniques. The solution proposed in this paper is based on a computational technique known as Ant Colony Optimization, which is based 1 Introduction The discovery and the identification of patterns (motifs) in sequences of amino acids are part of one of the main objectives in biosequences analysis [1]. One of the tasks in this domain is the protein patterns fusion described by means of regular expressions denoted in PROSITE language [2], [3], which allows to know to how the proteins evolved and if exist relationship between them. ISBN: 978-960-474-258-5 64 Advances in E-Activities, Information Security and Privacy on the behavior of ants to find the shortest path between a nest and a food source [6], [7]. Fig.1. Fusion of two regular expressions 2 Problem Formulation A Motif is a region or portion of a protein sequence that possesses a specific structure and describes a specific function. Different representations of motifs have been proposed; in this paper we use a deterministic representation. Deterministic motifs are described by a regular expression. On the other hand, PROSITE is a motifs database with biological relevance that describes the motifs using a set of rules for represented them like regular expressions [2], [3] The identification and fusion of motifs in the amino acid sequences is one of the fundamental aims in the modern biology. It‟s implications in the study of proteins known will allow the discovery of new sequences of amino acids with structures and functions that might help to the treatment of diseases. For this reason, it‟s necessary the search of new approaches to find the motifs, denoted in PROSITE language, with biological sense, in a group of proteins. Particularly, is very important to find if there are similarities between them, by then to construct a general pattern of them. The patterns found may be explained by the existence of segments that have been preserved by the natural evolution of proteins, and suggests that the obtained regions play a functional and structural role in these mechanisms. 2.1 Create the route graph Because the problem of fusion of protein motifs emerges from the study of the primary structure of proteins, which is a linear structure consisting of the amino acids constituent of a protein, there are two basic conditions for the design of the graph where will walk the ants: The first stems from an analysis in the construction of motifs, which shows that is essential for this task the position of different amino acids along the protein chains, which can be viewed as onedimensional arrays. So, a two-dimensional data structure will represent the two motifs. In the second one, we establish that the product of the fusion of motifs must generate a new pattern from which to build new amino acid chains that belong to regular expressions fused. For the previous reasons, our graph will be represented in the plane, and each node will have arcs at the right and left sides, in this way the ants can only move them in horizontal direction. The nodes must store the pheromone level deposited by the ants that visit them and the biological information about the amino acid that represent (see Fig. 2). This information will be constituted by the type of amino acid that represents, and the family to which it belongs (see Table 1), or an identifier for special nodes (see Table 2) 3 Problem Solution We propose an algorithm for the fusion of protein motifs denoted as regular expressions. This algorithm can efficiently find the union between two regular expressions described by PROSITE [2], [3], and allows the generation of a new regular expression. This algorithm is based on the Ant Colony Optimization [6], [7], [8] with some modifications with the purpose of to fit to the fusion of protein motifs. In each execution of our algorithm, two regular expressions are fused (see Fig. 1). In general, the macro-algorithm for the fusion process of protein motifs is as follows: 1) Create the route graph. 2) Walk of the ants on the route graph. 3) Choose the best nodes 4) Build the Resultant Regular Expression. ISBN: 978-960-474-258-5 Fig 2. Data structure of a node Amino Acids Family Amino Acids Classification Aliphatic Amino Acids Aromatic Amino Acids Basic Amino Acids Neutral Amino Acids Acid Amino Acids Sulfur Amino Acids Imino Acid GAVILM FYW KRH STNQ DE C P 1 2 3 4 5 6 7 Table 1. Classification and family of the amino acids Information Gap Empty Start End Special identifier x _ Start End Classification 0 -1 -2 -2 Table 2. Identifiers for special nodes 65 Advances in E-Activities, Information Security and Privacy regular expression to fuse. We define an ant type data structure composed of 9 elements, whose characteristics are described in Table 3. It contains the information necessary in order to that the ants can walk on the route graph. For example, this information determines whether an ant can visit a given node, and help to establish the amount of pheromone that it must deposite on each node visited. For the graph construction we transform a regular expression to a stack data structure (for ex., see the regular expression S-A(1,3)-x-[KV] in Fig. 3) Fig. 3. Transformation of a regular expression to a Element Characteristics Address of the node where start the ant Start Node to walk the route graph Stack that contains the regular expression that must follow the ant, Route Map and serves to know that nodes should be visited by the ant in the route graph Real number (0,1), it is used to Pheromone establish the pheromone concentration Increase that deposits the ants in each visited Coefficient node of the route graph. Integer number [0.10], it determines the pheromone level deposited by the Equalities ant, when the node found in the graph Similarity Index is identical to the expected to the route map. Integer number [0.10], it determines the pheromone level deposited by the Families Similarity ant, when the amino acid found in the route graph is not equal to the route Index map, but belongs to the same family of amino acid Integer number [0.10], it determines the pheromone level deposited by the ant, when the amino acid found in the Differences Similarity Index route graph is not equal to the route map, and does not belong to the amino acid family Integer number [0.10], it serves to Gaps Similarity mark the selected node, if node type is Index a Gap. Integer number [0.10], that indicates the level of minimum similarity necessary to consider that the visited node is similar. If visited node has a Approving similarity index superior to the Similarity approving similarity, the ant looks for the following amino acid in the route map, otherwise it continuous looking for the same amino acid in the route graph. Integer number greater to -1. It is the number of no successful searches before to look for a new amino acid in the route map. It serves to avoid that the ant does not remain indefinitely looking for a given amino acid of the Failures Maximum route map on the route graph. Thus, Number when this number of searches is reached, the ant decides with a probability of 0.5 if search in the route map a new amino acid or looking for the same amino acid in the route graph. stack Additionally, two nodes are defined, that serve as guide for the construction of the graph, to indicate the beginning and end of the route (see fig. 4). Then, we proceed to extract the elements that are at the top of the stack iteratively, and built the nodes in the network (amino acids) which are in the same position in the graph. Also adds a node gap, which will serve as an auxiliary route for cases in which the ants must not continue for any of the available nodes. In this way, we avoid that an ant stops itself. On the other hand, when we extracts a gap from the stack it is not necessary any additional node. For the special case when there are 2 values within the parentheses, we define a special node, called empty "_", to avoid the deadlock. It is necessary because the arcs that lead to these nodes must meet certain conditions: when an agent decides to go to an empty node, it would continue its route by nodes of this type until it does not find another node empty. For example, in the case of the Fig. 3 “A(1,3)” indicates that it is possible to have one to three Alanines, for this reason we need to include a given number of items as empty positions ((in this case 2). Finally, when the stack is empty we stop the construction of the graph. In our approach, we build the route graph using the first regular expression to fuse. Fig 4. Route graph of the regular expression of the Fig. 3 2.2 Walk of the ants on the route graph The artificial ants colony, like in natural ant colonies, evolves by the actions performed by its members. This way, the route graph is walked by the N-ants that constitute the colony. So, it is necessary to define the number of individuals of the colony, before they begin to walk on the route graph. On our case, each ant has a route map defined by the second ISBN: 978-960-474-258-5 Table 3. Elements of the ant data structure 66 Advances in E-Activities, Information Security and Privacy In the previous section we build the route graph using the first regular expression (ER1) “S-A(1.3)-x[KV]” to fuse. We will fuse it with the regular expression (ER2) “L (2) - A (2) - Q”. Using ER2 we build the route map of the ants. The stack of ER2 is shown in Fig. 5. to the same family, then we use the families similarity index, otherwise, if the visited node contains gap, then the gaps index is used; otherwise, is used the differences similarity Index. In our example, the final route of an ant using ER2 is observed in the figure 7. Fig. 5. Transformation of ER2 to a stack At the beginning, the ant is placed in the initial node of the route graph, and with the route map it observes the contiguous nodes at the right side (see Fig. 6). Fig. 7 The final route of the ant For a colony, the previous process is repeated for each one of the ants of the colony. Additionally, the same process is executed recursively until the number of colony cycles desired. At the end of a cycle, there is an evaporate pheromone traces, decrementing the pheromone levels of all nodes in the graph, as shown in equation 3, where "ρ" is the pheromone evaporation coefficient. (3) Fig. 6 Ant in the initial node of the route graph. The ant executes the function of transition to each one of the nodes that can visit in the next position. This function consists of two phases; the first phase calculates the probability of visiting each one of the contiguous nodes ( ) based on its pheromone level „τr‟ and the index of similarity `φr' of each node (‘r’ indicates the neighboring node in the position `k', and `n' is the number of neighboring nodes at the right side for that position `k' (see Equation 1) 2.3 Choose the best nodes Once the colony has completed its work, we delete the arcs that lead to those nodes with a pheromone level below the pheromone threshold that the user has defined (for our example, we fixe the pheromone threshold to 1,0), which help to preselect to the amino acids that contribute to the best solutions. Figure 8 shows the nodes selected because they exceeded the threshold (in blue). (1) The second phase decides the node to visit using the simulation of Monte Carlo. For that, we adjust all the probabilities calculated for the position `k', and choose a random number between 0 and 1. With this number we determine the node to be visited by the ant. When the ant moves to a node deposits pheromone, that increases the pheromone concentration in the node. The quantity of pheromone deposited depend on the similarity index with respect to the amino acid waited according to the route map (Equation 2) (2) Fig. 8. Route graph with the pheromone levels of each node 2.4 Build the Resultant Regular Expression Finally, the route graph modified is filtered to delete irrelevant information and to define the resulting patterns. To carry out this task we analyze the marked nodes of the graph, position by position, and insert the amino acids selected in a list of chains that will contain the value of the amino acids corresponding to each pattern position. To achieve this goal the following criteria are used: The similarity index is defined as follows: if the amino acid of the route graph is equal to the amino acid of the route map of the ant, then we use the equalities similarity index; otherwise, if both belong ISBN: 978-960-474-258-5 67 Advances in E-Activities, Information Security and Privacy 1. If in the position exists only one node (amino acid) that has passed the pheromone threshold, it will be inserted in the list. 2. If exist more than one node in the same position, then we check if there is a gap or empty node. If there is not one on these nodes, we insert a string list with the amino acids for that position. Otherwise, If there is a gap or empty node, the following conditions apply: a) If the level of pheromone of the gap node is superior to the result of the multiplication of the threshold for the rest of the nodes on this position, then we establis a new threshold for the position studied, which has the same value of the pheromone level of the gap node, and compare the rest of the nodes in the same position with this new threshold. If at least one of them exceeds the new pheromone level, then we discard the gap insertion for this position and instead adding a string with the amino acids with a higher level of pheromone that it, otherwise we insert the gap in the list. b) If the pheromone level of the gap node is inferior, then we discard this node and a chain with the information of the remaining nodes on the same position is inserted to the list c) We apply the same conditions of the gap nodes for the empty nodes. We take the list that contains the amino acids corresponding to each position, and we use the PROSITE language to build the fusion regular expression (see Fig. 9) System Parameters Pheromone Increase Coefficient Similarity indices for the amino acids that are the same Similarity indices for the amino acids that belong to a family Similarity indices for the amino acids that are different Similarity indices for Gaps Approving Similarity Index Failures Maximum number Pheromone Initial level on the graph nodes Pheromone evaporation coefficient 10 8 1 3 3 0 1,0 0,05 Table 4. Parameter List a) Fusion of [ST]-x(2)-[ST] with [ST]-x-[RK] It‟s possible to observe that with the expression “[ST]-x(2)-[ST]” can be obtained a chain of 4 amino acids (for example, SAKT), whereas with the expression “[ST]-x-[RK]” is obtained a chain of 3 amino acids (for example, TER). Our algorithm takes the regular expression "[ST]-x(2)-[ST]” as ER1 for the construction of the route graph (the longest), and the other regular expression is ER2 (with it the ants define the route map). In addition, the pheromone threshold is equal to the pheromone initial level in the nodes (1,0). For the tests, thirty fusions of the two regular expressions with the same group of parameters were studied. Also, we study the average time used by the algorithm for obtaining the respective solutions and thus determine when a set of parameters is better than another. The first tests were carried out for the parameters set: ants numbers = 4, cycles number = 4. We observe that the algorithm converges in 90% of the times (see Table 5), with an average time of 0.76 seconds. Fig 9. Resultant fusion pattern. Pattern S-x(3) [ST]-x(3) [ST]-x(3) [ST]-x(3) [ST]-x(3) [ST]-x(3) 3 Experiments 3.1 Test Pattern [ST]-x(3) [ST]-x(3) [ST]-x(3) [ST]-x(3) [ST]-x(3) T-x(3) Pattern [ST]-x(3) [ST]-x(3) [ST]-x(3) T-x(3) [ST]-x(3) [ST]-x(3) Pattern [ST]-x(3) [ST]-x(3) [ST]-x(3) [ST]-x(3) [ST]-x(3) [ST]-x(3) Table 5. Results for the first tests The motif proteins were taken from the database AMYPdb [9], [10]. To run the system is necessary to adjust a set of parameters. Because the number of adjustable parameters in the developed system is quite extensive, some values for the tests were left fix (see Table 4). The only parameters that we have varied are the parameters that determine the collective behavior: the cycles number and the ants number. This way, the solution depends fundamentally on the behavior of the colony. ISBN: 978-960-474-258-5 Value 0,1 The second tests were carried out for the parameters set: ants number = 4, cycles number = 8. We observe that the algorithm converges in a 92.33% of the times (see Table 6). In addition, the algorithm had an average time of 1.10 seconds. 68 Advances in E-Activities, Information Security and Privacy Pattern [ST]-x(3) [ST]-x(3) [ST]-x(3) [ST]-x(3) [ST]-x(3) [ST]-x(3) Pattern [ST]-x(3) [ST]-x(3) [ST]-x(3) [ST]-x(3) [ST]-x(3) [ST]-x(3) Pattern [ST]-x(3) [ST]-x(3) [ST]-x(3) [ST]-x(3) [ST]-x(3) [ST]-x(3) Pattern [ST]-x(3) [ST]-x(3) [ST]-x(3) [ST]-x(3) [ST]-x(3) T-x(3) [GT]-[AN]-[EY]-[IQ]-[PV]-x-K-S-[DP]-[GV]-[HKV] Pattern [ST]-x(3) S-x(3) [ST]-x(3) [ST]-x(3) [ST]-x(3) [ST]-x(3) Table 6. Results for the second tests The third tests are for the parameters set: ants number = 8, cycles number = 4. We see in the Table 7 that the algorithm converges in 96.66% of the times and it presents a best runtime with respect to the previous tests, the average time is 0.89 seconds Pattern [ST]-x(3) [ST]-x(3) [ST]-x(3) [ST]-x(3) [ST]-x(3) [ST]-x(3) Pattern [ST]-x(3) S-x(3) [ST]-x(3) [ST]-x(3) [ST]-x(3) [ST]-x(3) Pattern [ST]-x(3) [ST]-x(3) [ST]-x(3) [ST]-x(3) [ST]-x(3) [ST]-x(3) Pattern [ST]-x(3) [ST]-x(3) [ST]-x(3) [ST]-x(3) [ST]-x(3) [ST]-x(3) Pattern [ST]-x(3) [ST]-x(3) [ST]-x(3) [ST]-x(3) [ST]-x(3) [ST]-x(3) Table 7. Results for the third tests The last tests is for the next parameters: ants number = 8, cycles number = 8, obtaining with this set a better precision level, since it has a convergence of 100% compared to the expected pattern (see Table 8). Additionally, the average time was 1.09 seconds. Pattern [ST]-x(3) [ST]-x(3) [ST]-x(3) [ST]-x(3) [ST]-x(3) [ST]-x(3) Pattern [ST]-x(3) [ST]-x(3) [ST]-x(3) [ST]-x(3) [ST]-x(3) [ST]-x(3) Pattern [ST]-x(3) [ST]-x(3) [ST]-x(3) [ST]-x(3) [ST]-x(3) [ST]-x(3) Pattern [ST]-x(3) [ST]-x(3) [ST]-x(3) [ST]-x(3) [ST]-x(3) [ST]-x(3) Fig 10. Graphic representation of the first pattern Pattern [ST]-x(3) [ST]-x(3) [ST]-x(3) [ST]-x(3) [ST]-x(3) [ST]-x(3) Table 8. Results of the last tests The best resultant fusion pattern of the two regular expressions is “[ST]-x(3)”. (see Fig. 10). We conclude that we can obtain the fusion of the expressions [ST]-x (2)-[ST] and [ST]-x-[RK] in a very short time, when the ants and cycles number are equal to the positions number of ER1; Nevertheless, the best performance is obtained when the ants and cycles number duplicates the positions number of ER1 (the selected pattern to construct the route graph) with a very similar runtime. Family Protein Name Protein description Organism Tau O02592_C AEEL (O02592) PTL-1A protein (Protein with tau-like repeats protein 1, isoform a) Caenorhabd itis elegans Tau Q17364_C AEEL (Q17364) TAU-1a (Fragment) Caenorhabd itis elegans Tau Q17365_C AEEL (Q17365) TAU-1b (Fragment) Caenorhabd itis elegans Tau Q53YB1_H UMAN (Q53YB1) Microtubuleassociated protein tau (Microtubuleassociated protein tau, isoform 4) Homo sapiens (Human) Tau Q547J4_M OUSE (Q547J4) Microtubule binding protein tau Mus musculus (Mouse) Tau Q5CZI7_H UMAN (Q5CZI7) Microtubuleassociated protein tau Homo sapiens (Human) Tau TAU_MOU SE (P10637) Microtubuleassociated protein tau (Neurofibrillary tangle protein) (Paired helical filament-tau) (PHFtau) Mus musculus (Mouse) Tau TAU_PAN TR (Q5YCW1) Microtubuleassociated protein tau Pan troglodytes (Chimpanze e) Fig. 10. Graph of the fusion pattern [ST]-x(3) b) Biological Patterns Fusion Now perform the fusion of two patterns and analyze its biological sense, to see if the patterns generated by the system are useful for the study of protein chains. The first pattern is (see Fig. 10 and Table 9): K-x-G-S-L-[DGK]-N-[AIV]-T-H-V-[AP]-G-G-G[AHN]-[KV]-[KQ]-I-E-[NST]-[HR]-K-L-[DST]-F[RS]-x-[AN]-[AS]-[KP]-x-[KV]-[GT]-[DS]-[HK]- ISBN: 978-960-474-258-5 69 Advances in E-Activities, Information Security and Privacy Tau Tau Tau Tau Microtubuleassociated protein tau (Neurofibrillary tangle protein) (Paired helical filament-tau) (PHFtau) Papio hamadryas (Hamadryas baboon) TAU_PON PY (Q5S6V2) Microtubuleassociated protein tau Pongo pygmaeus (Orangutan) TAU_RAT (P19332) Microtubuleassociated protein tau (Neurofibrillary tangle protein) (Paired helical filament-tau) (PHFtau) Rattus norvegicus (Rat) Microtubuleassociated protein tau Spermophil us citellus (European suslik) (Citellus citellus) TAU_PAP HA (Q9MYX8) TAU_SPEC I (Q6TS35) (Q91WK4) tau TAU_BOVIN (P29172) Microtubuleassociated protein tau (Neurofibrillary tangle protein) (Paired helical filament-tau) (PHFtau) Bos taurus (Bovine) Tau TAU_CAPHI (O02828) Microtubuleassociated protein tau (Neurofibrillary tangle protein) (Paired helical filament-tau) (PHFtau) Capra hircus (Goat) Tau TAU_GORG O (Q5YCW0) Microtubuleassociated protein tau Gorilla gorilla gorilla (Lowland gorilla) Tau TAU_HUMA N (P10636) Microtubuleassociated protein tau (Neurofibrillary tangle protein) (Paired helical filament-tau) (PHFtau) Homo sapiens (Human) Tau TAU_HYLL A (Q5YCV9) Microtubuleassociated protein tau Hylobates lar (Common gibbon) Tau TAU_PONPY (Q5S6V2) Microtubuleassociated protein tau Pongo pygmaeus (Orangutan) Microtubuleassociated protein tau Spermophilus citellus (European suslik) (Citellus citellus) Tau Table 9. Some proteins chains that contain the first pattern [9]. The second pattern is (see Fig. 11 and Table 10): G-S-[KT]-D-N-[IM]-[KNR]-H-x-P-G-G-G-[KNS]-VQ-I-[FV]-[DHY]-[EK] Tau TAU_SPECI (Q6TS35) Table 10. Some proteins chains that contain the second pattern [9]. To realize the fusion the parameters are taken from the Table 4. In addition, 94 ants and equal number of colony cycles used. The system execution is realised 3 times and the following results are obtained (see Table 11): Fig 11. Graphic representation of the second pattern Family Protein Name Protein Description Organism Tau O02592_CAE EL (O02592) PTL-1A protein (Protein with taulike repeats protein 1, isoform a) Caenorhabditi s elegans Tau Q17364_CAE EL (Q17364) TAU-1a (Fragment) Caenorhabditi s elegans Tau Q8JIW8_XE NLA (Q8JIW8) Tau-like protein-1 Xenopus laevis (African clawed frog) Tau Q91WK4_M OUSE Microtubuleassociated protein Mus musculus (Mouse) ISBN: 978-960-474-258-5 Execution Pattern 1 x(2)-G-S-x-[DGK]-N-[AIV]-T-H-x-[AP]-G(3)[HN]-[KV]-Q-I-x(2)-[HR]-K-(24) 2 x(2)-G-S-x-[DGK]-N-[AIV]-T-H-x-P-G(3)[AHN]-V-Q-I-x(2)-[HR]-K-x(24) 3 x(2)-G-S-x-[DK]-N-[AIV]-T-H-x-[AP]-G(3)[AHN]-V-Q-I-x(2)-H-K-x(24) Table 11. Fusion Resultant The data base AMYPdb is consult for the results obtained and is observed that the second and third 70 Advances in E-Activities, Information Security and Privacy pattern are not found. For the first pattern the result is the following (see Table 12): Amyloid Protein: x(2)-G-S-x-[DGK]-N-[AIV]-T-H-x-[AP]G(3)-[HN]-[KV]-Q-I-x(2)-[HR]-K-(24) Family Protein Name Protein Description Organism Tau O02592_CAE EL (O02592) PTL-1A protein (Protein with taulike repeats protein 1, isoform a) Caenorhabditi s elegans Tau Q17364_CAE EL (Q17364) TAU-1a (Fragment) Caenorhabditi s elegans Tau Q17364_CAE EL (Q17364) TAU-1b (Fragment) Caenorhabditi s elegans Table 11. Protein that contain the Pattern Resultant 4 Conclusion The algorithm based on Ant Colony Optimization is a good solution for the discovery of protein patterns. It‟s possible to find fusion patterns for groups of regular expressions with biological sense. The main parameters of our algorithm are that determine the collective behavior of the ants. References: [1] Srinivas V., Bioinformatics A modern Approach, Eastern Economy Edition, 2005 [2] Database PROSITE. Available in: [http://www.expasy.ch/prosite/] [3] Bairoch A., Bucher P., Hofmann K., The PROSITE database, its status in 1997 Nucleic Acids Research, Vol. 25, No. 1, pp. 217-221, 1997 [4] Pevsner J., Bioinformatics and Functional Genomics, Second Edition, Wiley – Backwell, 2009 [5] Mathura V., Kangueane P. Bioinformatic A Concept-Based Introduction, Springer, 2009 [6] Dorigo M., Di Caro G., Sampels, M. Ant Algorithms. Bruselas, Springer, 2002 [7] Dorigo M., Thomas, S. Ant Colony Optimization. Massachusetts Institute of Technology (MIT), Boston, 2004. [8] Aguilar J., Rivas F. (Ed.), Introducción a la Computación Inteligente, MERITEC, Venezuela. 2001. [9] AMYPdb. Available in: [http://amypdb.univrennes1.fr] [10] Pawlicki S., Le Béchec A., Delamarche C., AMYPdb: A database dedicated to amyloid precursor proteins, BMC Bioinformatics, Vol. 9, pp. 273-28, 2008 ISBN: 978-960-474-258-5 71