Thesis on data mining

Among the different data mining algorithms, probabilistic graphical models (in particular Bayesian networks) is a sound and powerful methodology grounded on probability and statistics, which allows building tractable joint probabilistic models that represents the relevant dependencies among a set of variables (hundreds of variables in real-life applications). The resulting models allow for efficient probabilistic inference. For example, a Bayesian network could represent the probabilistic relationships between large-scale synoptic fields and local observation records, providing a new methodology for probabilstic downscaling: . allowing to compute P(observation|large-scale prediction). For instance, the red dots in the figure below correspond to the grid nodes of a GCM, whereas the blue dots correspond to a network of stations with historical records (the links show the relevant dependencies, automatically discovered from data).

Formally, Bayesian networks are directed acyclic graphs whose nodes represent variables, and whose arcs encode conditional independencies between the variables. The graph provides an intuitive description of the dependency model and defines a simple factorization of the joint probability distribution leading to a tractable model which is compatible with the encoded dependencies. Efficient algorithms exist to learn both the graphical and the probabilistic models from data, thus allowing for the automatic application of this methodogy in complex problems. Bayesian networks that model sequences of variables (such as, for example, time series of historical records) are called dynamic Bayesian networks. Generalizations of Bayesian networks that can represent and solve decision problems under uncertainty are called influence diagrams.

WSN nodes resource constrained. In order to keep the size and the cost of the nodes down, the nodes have limited processing power, memory and radio range. However, the resource constraint which has the most significant impact on many WSNs is the constraint on energy. WSN nodes are battery operated. Many wireless sensor networks are deployed in locations where battery replacement is not feasible. A node has to be discarded when the battery depletes. Energy scavenging may alleviate this problem in some sensor networks. Most WSN protocols are very conscious of the limited supply of energy, and try to conserve energy.

Thesis on data mining

thesis on data mining


thesis on data miningthesis on data miningthesis on data miningthesis on data mining