Manuel de la Herrán Gascón
Index:
- How to make the computer learn
- How to build a program capable of learning about anything
- Autonomous agents
It is astonishing how many people, even with a basic knowledge of statistics and the rational certainty that every combination of numbers is equally likely to win the lottery, look (we look) with distrust at ticket number "11111", or prefer a ticket brought specially from a certain town where the jackpot ("el Gordo") has fallen two years in a row.
This reflects a fundamental mental process in living beings, especially those that move and must seek food, shelter or a mate: the ability to establish cause-effect relationships, which in the example above operate even where our conscious logic tells us that no such relationships exist.
In the example, the mental process that leads to these associations is probably something like the following. The person observes the numbers that win the different draws and unconsciously tries to detect regularities in them. Numbers in general are not easy to remember, so numbers such as "72719" are easily forgotten. However, there are some very striking but scarce ones, such as "33333" or "12345", that are hard to forget. The person then notices that most winning numbers are of the "hard to remember" kind and therefore distrusts the "easy to remember" ones.
The establishment of cause-effect relationships ("after autumn comes winter") and the prediction of the future that these relationships allow ("it is autumn, so winter is near") is an evolutionary trait that has been selected for its great usefulness to the survival of the individual.
How to make the computer learn
There are many ways to make a program learn. One of them is to use Genetic Algorithms and Artificial Life. These methods are described extensively throughout this website. In this way it is possible to obtain programs that adapt to circumstances, learn by themselves, and are able to react to unanticipated situations.
How to build a program capable of learning about anything
Once we have an idea of how to make a program learn, we can launch into the adventure of building one. First we must decide which problem the program should learn to solve. The easiest choice is a game. For example, we can think of chess, checkers or tic-tac-toe. The program would start with a certain amount of knowledge of the game and would gradually increase that knowledge with each game it plays.
The program would mainly do two things. One would be to play games against itself and draw conclusions from them. The other would be to play against its creator, showing how far its learning has progressed.
As for playing against itself, these programs lend themselves to being left running for nights, whole weeks or even months. If we want the program to acquire enough knowledge to play a game against a person without boring them, we will probably have to leave it running for quite some time. Often this time will be excessive (for example, several million years), and then one opts to include as much initial knowledge as possible in the program.
But this is very costly in programming effort. As a consequence, most programmers have to strike a compromise between two options:
The first is to write a simple program that includes little initial knowledge about the problem to be solved. This is an advantage, but it has the drawback of producing programs that are very slow to learn up to an acceptable level.
The second is to strive to represent in the program the maximum amount of initial knowledge about the problem. This entails a huge programming effort, with the aim of obtaining a program that, by combining the knowledge it already has, can quickly reach other conclusions unknown to us, which would be a huge advantage.
We want to draw attention to the theoretical interest of taking the first of these options to its limit: doing the minimum possible programming, despite having to wait and wait until the program reaches an acceptable level of knowledge.
The minimum possible programming would consist of including absolutely no knowledge about the game the program is to play. For example, the program would move chess pieces, but initially it would not even know the rules for moving knights, bishops, etc. Not only would it lose when playing against us, but it would rarely manage to finish a game without making an illegal move.
Autonomous agents
The tic-tac-toe program was written, for convenience, from the point of view of the process. There are operators (such as crossover and mutation) that are applied in series to all the entities. That is, first all the entities are evaluated, then the best are selected, later they reproduce, and so on. Each agent decides how to play tic-tac-toe, but it cannot decide whom to play with, whom to reproduce with, or when to share its knowledge.
To give the agents greater autonomy, while also achieving a closer approximation to the real world, it is preferable to program these simulations from the point of view of the entities (the data), with the entity itself deciding at each moment how to act.
Both programming styles try to simulate parallelism: the first is a parallelism of processes, the second a parallelism of entities. The ideal would be to have one processor for each entity (true entity parallelism). Failing that, the program would distribute execution time among all the entities. This is more complex to program, because the state in which each entity was left has to be kept before moving on to run the next one, but it has the advantage that a large part of the code would be independent of the problem being solved.
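As a rough illustration of sharing execution time among entities, here is a minimal sketch in Python. It assumes each entity is written as a generator, so the language itself keeps the entity's state between turns; the names `entity` and `run_round_robin` are hypothetical and not part of the original program.

```python
from collections import deque

def entity(name, steps, log):
    """A hypothetical entity. Each `yield` hands control back to the
    scheduler; the generator keeps the entity's state between turns."""
    for i in range(steps):
        log.append((name, i))   # stand-in for one problem-specific action
        yield

def run_round_robin(entities):
    """Problem-independent scheduler: one turn per live entity, in order."""
    queue = deque(entities)
    while queue:
        current = queue.popleft()
        try:
            next(current)           # resume the entity where it left off
            queue.append(current)   # still alive: back of the queue
        except StopIteration:
            pass                    # the entity has finished

log = []
run_round_robin([entity("A", 2, log), entity("B", 3, log)])
print(log)
```

Only `entity` knows anything about the problem; the scheduler would stay the same for a different kind of agent.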
How is the same result achieved when the initiative is left to the entities? Well, in reality the agents are not going to have much freedom, though perhaps they think otherwise. First, it is advisable to give the entities coordinates that refer to their physical location, letting them move at random through their virtual world. From now on, two entities must be together in order to reproduce or to play a game. This would be the equivalent of the shuffle function that the program already has. In addition, we will call the weight "energy". Energy increases on winning games and decreases on losing them. We make it so that only agents with a certain level of energy can reproduce, and that those whose energy falls below a certain value die. Actions such as reproducing, moving or simply standing still can subtract energy from the agent. As you might guess, the agents have no choice but to play and win if they want to stay "alive" or have descendants. The global energy of the system should remain constant (or at least not vary too much) for consistency, and to avoid extinctions or overpopulation, which could happen anyway and which in general should be avoided.
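The energy bookkeeping described above might be sketched as follows. This is only an illustration: the threshold values, the stake transferred per game, and the rule that each parent gives a quarter of its energy to the child are all assumptions, chosen so that the total energy of the system stays constant.

```python
from dataclasses import dataclass

REPRODUCTION_THRESHOLD = 10.0   # assumed value
DEATH_THRESHOLD = 0.0           # assumed value
STAKE = 2.0                     # assumed energy transferred per game

@dataclass
class Agent:
    energy: float
    x: int = 0
    y: int = 0

    def can_reproduce(self):
        return self.energy >= REPRODUCTION_THRESHOLD

    def is_dead(self):
        return self.energy < DEATH_THRESHOLD

def are_together(a, b):
    """Two entities must share a location to play or reproduce."""
    return (a.x, a.y) == (b.x, b.y)

def settle_game(winner, loser, stake=STAKE):
    """Move energy from loser to winner; the system total is unchanged."""
    loser.energy -= stake
    winner.energy += stake

def reproduce(a, b):
    """Return a child funded by both parents, or None if not allowed."""
    if not (are_together(a, b) and a.can_reproduce() and b.can_reproduce()):
        return None
    gift_a, gift_b = a.energy / 4, b.energy / 4   # assumed split
    a.energy -= gift_a
    b.energy -= gift_b
    return Agent(energy=gift_a + gift_b, x=a.x, y=a.y)

a, b = Agent(12.0), Agent(12.0)
settle_game(winner=a, loser=b)
child = reproduce(a, b)
total = a.energy + b.energy + child.energy
```

Costs for moving or standing still are left out here; to keep the total constant they would have to be returned to the system in some way, for example by redistributing them among the other agents.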
This model has many advantages. For one, once the general scheme is built, we realize that some functions become much easier and quicker to program. Reproduction is greatly simplified, since the entities themselves choose a mate. Moreover, this model lets us broaden and enrich the concept of agent, giving it greater power and turning it into a small problem solver. And not only that: to solve a different problem, it will suffice to define new agents; the rest of the program will stay the same.
We will see step by step everything our agents are capable of doing. Each entity has a set of actions it can perform, but now it does not just place pieces: it can also, on its own, share its knowledge, seek a mate, reproduce, or start a game. We can arrange that in each cycle each entity performs only a fixed number of actions, chosen from the set of possible actions, before moving on to run the next one (remember that we have to simulate parallelism). Another possibility is to assign a time slice to each entity and let each one execute all the actions it can. We could also limit the execution time (or the number of actions) of some of the worst entities, instead of eliminating them.
In the "Ants and Plants" program included in "Ejemplos de Vida", each entity performed one action per cycle, which depended on the state it was in, calculated from the neighbors it had at that moment. When the same state could trigger more than one action, the action was chosen according to a certain probability. Now we are talking about something much more complete and simpler to understand (since it looks more like real life), though more laborious to program. The entity will basically have four attributes: objective, beliefs, senses and actions. The agent (automaton) will devote itself to one task or another (action, output) depending on an internal decision process (intelligence), more or less complex, based on its beliefs (memory, rules) and senses (inputs).
The entity has an objective to fulfil, and its senses inform it, among other things, of when and to what degree that objective is fulfilled. Using its beliefs, the agent performs in each case the action it believes will bring it closest to its goal. In tic-tac-toe, the objective is to win the game. The senses report the state of the board at each moment, which is stored in memory. Actions are not only those that act directly on the problem being solved, such as placing a piece, but also those that affect any other component of the environment (such as sharing knowledge with another agent), and those that manipulate in some way the knowledge stored in memory and that will later determine the sequence of actions to perform on the environment, such as predicting the opponent's behavior.
The information the program receives about the environment is stored in variables. A state is a set of values for those variables. For example, our agent can see the content of each of the 9 squares of the tic-tac-toe board, so it has 9 variables, and the possible combinations of their values form the set of possible states.
What the entity believes can be grouped into three categories. First, there is a memory of the events that have occurred. Second, some variables may depend on the value of others, so there are rules that define this decomposition. Finally, there are the rules that predict the behavior of the environment. The rules used for the tic-tac-toe game are of the type "if I observe such a situation, I perform such an action". These rules serve to decide what to do, but do not explain why. A more general format should allow rules that describe an initial state, a final state, and the action needed to go from one to the other (see figure). Rules such as (1) "If it is X's turn, I play X, squares 1 and 2 hold an X piece and square 3 is empty, and I perform the action of placing a piece in position 3, then the winner will be X with a probability of 100%, and this rule has maximum importance". There could also be rules without an action and/or rules where the final state must be computed from other variables: "If squares 1 and 2 hold an X piece, squares 4 and 5 hold an O piece, and squares 3 and 6 are empty, then the winner will be whoever has the current turn, with a probability of 40%, and this rule has an importance of 0.7".
A format that admits these possibilities is:
State, Action --> Conclusion, Probability, Importance
State is a set of "variable = value" pairs. Action is a simple or complex action. Conclusion is a set of "variable = formula" pairs which, once the formulas are evaluated, yield "variable = value" pairs. Probability is the confidence the agent has in the rule, and Importance indicates the degree to which the rule can affect the objective.
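One possible way to hold a rule of this format in memory is sketched below. The class name `Rule`, the variable names (`turn`, `me`, `c1` to `c9`), the action label `put_3` and the use of `"-"` for an empty square are assumptions made for the example; a formula is modelled as either a constant or a function of the observed state.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Union

State = Dict[str, object]
Formula = Union[object, Callable[[State], object]]

@dataclass
class Rule:
    state: State                     # "variable = value" pairs that must hold
    action: Optional[str]            # None for rules without an action
    conclusion: Dict[str, Formula]   # "variable = formula" pairs
    probability: float               # confidence in the rule
    importance: float                # how much the rule affects the objective

    def matches(self, observed, action=None):
        """True if the action is the same and every required pair holds."""
        return self.action == action and all(
            observed.get(var) == val for var, val in self.state.items())

    def conclude(self, observed):
        """Evaluate each formula to obtain "variable = value" pairs."""
        return {var: f(observed) if callable(f) else f
                for var, f in self.conclusion.items()}

# Rule (1) from the text, and the rule without an action.
rule1 = Rule({"turn": "X", "me": "X", "c1": "X", "c2": "X", "c3": "-"},
             "put_3", {"winner": "X"}, probability=1.0, importance=1.0)
rule2 = Rule({"c1": "X", "c2": "X", "c4": "O", "c5": "O", "c3": "-", "c6": "-"},
             None, {"winner": lambda s: s["turn"]},
             probability=0.4, importance=0.7)

observed = {"turn": "O", "me": "O", "c1": "X", "c2": "X", "c3": "-",
            "c4": "O", "c5": "O", "c6": "-", "c7": "-", "c8": "-", "c9": "-"}
if rule2.matches(observed):
    print(rule2.conclude(observed))
```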
How does the entity decide which action to execute with this type of rule? The entity has a special variable, called the objective, which represents the degree to which the entity behaves in the way its creator wished. The objective variable depends on the values of other variables according to a given formula; in this case, placing three of its own pieces in a line. The evaluation of the entity, or in other words the check of how far the objective is fulfilled, is encapsulated within the entity itself. No other agent needs to evaluate it: the entity itself cannot avoid losing "energy" if it does not achieve its objective.
The agent will try to perform the actions that bring it closest to its objective. To do so, it must predict the environment's response to its actions. But remember that the agent starts from complete ignorance of the particular problem. What should it do? First, it observes the current situation (initially, the empty board) and checks whether the objective is fulfilled. Since nobody has yet been declared the winner, and for lack of a better option, the agent executes actions at random. After an action it observes the consequences, and its "mind" produces a cause-effect association that it stores in the form of a rule. Subsequent actions will be conditioned by the knowledge stored in memory, avoiding recourse to chance as far as possible.
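The loop of acting at random, observing the consequence and storing an association can be sketched in a few lines. This is a deliberately simplified stand-in, not the author's implementation: the class `LearningAgent` is hypothetical, states are plain hashable labels, and each stored "rule" is just a count of how often an action in a state led to the objective being met.

```python
import random

class LearningAgent:
    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        # (state, action) -> (times the objective was met, times tried)
        self.rules = {}

    def confidence(self, state, action):
        """Observed success rate, or None if the pair was never tried."""
        met, tried = self.rules.get((state, action), (0, 0))
        return met / tried if tried else None

    def choose(self, state, actions):
        known = {a: self.confidence(state, a) for a in actions}
        good = [a for a in actions if known[a]]        # success rate > 0
        if good:                                       # use stored knowledge
            return max(good, key=lambda a: known[a])
        untried = [a for a in actions if known[a] is None]
        return self.rng.choice(untried or list(actions))  # fall back on chance

    def observe(self, state, action, objective_met):
        """Store the cause-effect association produced by one action."""
        met, tried = self.rules.get((state, action), (0, 0))
        self.rules[(state, action)] = (met + int(objective_met), tried + 1)

# Toy environment (an assumption): only action "b" fulfils the objective.
agent = LearningAgent()
actions = ["a", "b", "c"]
for _ in range(20):
    act = agent.choose("start", actions)
    agent.observe("start", act, objective_met=(act == "b"))
print(agent.choose("start", actions))   # prints b
```

After at most three random tries the agent has found the action that works and stops relying on chance for that state.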
If an action is performed and it leads to an unknown state, information representing that new state is stored, and a rule is created representing the conditions needed to reach it. The problem is how to identify which action was responsible for a transition. For example, placing a piece in a particular square may be the immediate cause of winning the game, but it is of little use to suppose that this action alone is responsible for the victory. It is more appropriate to say that the cause is the complete succession of actions from the start of the game. To be able to use this kind of rule, actions can be organized in a hierarchical structure. Thus, an interesting succession of actions forms a higher-level action, and this action can in turn form part of a rule.
However, this does not explain most known intelligent behavior. A game is not won by performing a particular series of actions, but rather by a strategy, by knowledge of the game in general, by expert knowledge gained through many experiences. At this level of analysis, the cause of having won is having performed a series of actions (making a decision is performing an action) that, besides acting on the environment or problem to be solved, handle symbols referring to it and manipulate, or indicate how to manipulate, the acquired information, ultimately determining which succession of actions to perform on the game.
Although only one basic action can be executed at each instant, higher-level actions can overlap in time. Making a decision whose influence shows up in the course of other actions can also be understood as an action. For example, deciding to use, under certain conditions, only a subset of the total set of possible rules can be considered a (higher-level) action. This action overlaps with the basic actions executed at each moment. The example also illustrates how to allow a certain hierarchy within the rules, since this action, which refers to the rules, can itself be included within another rule.
There is also a hierarchy among the variables. The number of variables that make up a state will normally be very large, and it will often happen that several different states behave as one. This situation arises in tic-tac-toe because of the symmetry of the board. For these reasons it is useful to represent certain combinations of variable values as a single value of a higher-level variable.
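For the board-symmetry case, one common way to make equivalent states behave as one is to map every board to a canonical representative of its symmetry class. The sketch below assumes a board stored as a row-major tuple of 9 cells and picks the smallest of its 8 rotations and reflections; the function names are illustrative.

```python
def rotate(board):
    """Rotate a 3x3 board (row-major tuple of 9 cells) 90 degrees clockwise."""
    return tuple(board[3 * (2 - c) + r] for r in range(3) for c in range(3))

def mirror(board):
    """Reflect the board left to right."""
    return tuple(board[3 * r + (2 - c)] for r in range(3) for c in range(3))

def symmetries(board):
    """All 8 images of the board: 4 rotations, each with its mirror image."""
    images = []
    current = tuple(board)
    for _ in range(4):
        images.append(current)
        images.append(mirror(current))
        current = rotate(current)
    return images

def canonical(board):
    """A single value standing for the whole symmetry class."""
    return min(symmetries(board))

corner_a = ("X", "-", "-", "-", "-", "-", "-", "-", "-")
corner_b = ("-", "-", "-", "-", "-", "-", "-", "-", "X")
edge     = ("-", "X", "-", "-", "-", "-", "-", "-", "-")
print(canonical(corner_a) == canonical(corner_b))
print(canonical(corner_a) == canonical(edge))
```

Rules keyed on `canonical(board)` instead of the raw board would then be shared by all symmetric positions.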
Thus, we will have a mechanism that lets us do something like create concepts. For example, the concept of "two pieces in a line" could be created; it would be a new variable, with any name or identifier, that takes the value true when this situation occurs. Fuzzy logic can be applied to the variables. There are concepts of a continuous nature, which are defined by reference to a property and the degree to which that property is present. For example, we can say that the sky is blue. However, there is no numerical boundary that marks the difference between a blue sky and a white sky as a function of the number of points of light of each color. Both concepts can be true simultaneously to a certain extent. Thus, we could say: the sky has a bluish tone to a degree of 70%. Moreover, the information handled will not always be given total confidence. A concept may hold to a certain degree, and that information may itself be assigned a certain confidence, so that we could say: the sky has a bluish tone to a degree of 70%, and this information has a 90% probability of being true. These values can be included in each variable.
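A small sketch of both ideas follows: a higher-level variable computed from the nine basic ones, and a value that carries a degree and a confidence. The names `two_in_line` and `GradedValue` are hypothetical, and the example assumes that "two pieces in a line" means two pieces of one player on a line whose third square is empty.

```python
from dataclasses import dataclass

# The 8 winning lines of a 3x3 board stored as a row-major tuple.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

def two_in_line(board, player, empty="-"):
    """Higher-level variable: True if `player` holds two squares of some
    line and the third is empty (an assumed reading of the concept)."""
    for line in LINES:
        cells = [board[i] for i in line]
        if cells.count(player) == 2 and cells.count(empty) == 1:
            return True
    return False

@dataclass
class GradedValue:
    degree: float       # how strongly the property holds, 0.0 to 1.0
    confidence: float   # probability that this information is true

board = ("X", "X", "-",
         "-", "O", "-",
         "-", "-", "-")
concepts = {
    "two_in_line_X": two_in_line(board, "X"),
    "sky_is_bluish": GradedValue(degree=0.7, confidence=0.9),
}
```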
How do we recognize which concepts are important? When an agent reaches its storage limit, it must select some beliefs to be erased. To do this, we try to relate the states to the objective, so that more weight is given to those that influence it most, whether by favoring or by hindering its attainment.
Creating higher-level variables also reduces the space needed to store a rule. With a sufficient number of rules describing the behavior of the environment, and knowing the current state and the state we wish to reach, we can search for the succession of actions that takes us from one state to another, up to the goal. The creation of higher-level variables need not be limited to a single instant of time, to a single reading of the environment. In fact, time can be treated as just another variable, which lets us create concepts that describe behavior with continuity.
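The search for a succession of actions can be illustrated with a plain breadth-first search over the stored rules. This sketch makes several simplifying assumptions: each rule is reduced to a `(state, action, resulting state)` triple, states are hashable labels, and probability and importance are ignored. The function name `plan` is hypothetical.

```python
from collections import deque

def plan(rules, start, goal):
    """Return a shortest list of actions leading from `start` to `goal`
    according to the known rules, or None if no such sequence is known."""
    transitions = {}
    for state, action, result in rules:
        transitions.setdefault(state, []).append((action, result))
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, path = queue.popleft()
        if state == goal:
            return path
        for action, result in transitions.get(state, []):
            if result not in seen:
                seen.add(result)
                queue.append((result, path + [action]))
    return None

rules = [("s0", "a", "s1"), ("s1", "b", "s2"),
         ("s0", "c", "s3"), ("s3", "d", "s2"),
         ("s2", "e", "goal")]
print(plan(rules, "s0", "goal"))   # prints ['a', 'b', 'e']
```

A fuller version could weight each step by the rule's probability and prefer the most reliable path rather than the shortest one.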
Programming Oriented to... Agents?
We have described the reasoning possibilities within an agent. The entities we have studied resemble Unix processes (which are born, die, have child processes, and communicate by sending messages), and they also have an air of Object-Oriented Programming (OOP) instances: if in OOP we have objects that encapsulate data and procedures, here we have agents that encapsulate an objective, senses, beliefs and actions.
In any case, what is evident is that if the search method is independent of the problem to be solved, we sorely miss the existence of libraries of entities and of a compiler for a language for defining agents and problem environments. In theory, with something like that, little would be left for us to program. In practice, given the inherent slowness of evolutionary learning, whose nearest example is ourselves (we have taken 3,800 million years to evolve from the first living microorganisms), it will be more useful to combine these techniques with the human representation of knowledge. All this could result in a General Problem Solver of great power. The possibilities of this new paradigm are, without doubt, very exciting.
You are invited to comment, criticize and contribute your ideas!