
Wednesday, May 15, 2019

Data mining Essay Example | Topics and Well Written Essays - 3000 words

Data Mining - Essay Example

Automated prospective analysis provided by data mining techniques, as will be discussed below, goes beyond the simple analysis of past records offered by the retrospective tools used in decision support systems (DSS). These techniques are fundamentally the result of a long process of research and product development, driven at first by the pressing need to support business data collection, storage and retrieval. Considering all aspects of data mining, the commonly used techniques are artificial neural networks, biclustering, PageRank, genetic algorithms, nearest neighbor methods and rule induction.

A) Data Mining Classification over Large Databases

1. kNN: k-nearest neighbor classification

In its simplest form, this kind of classifier memorizes the entire training data and classifies a test object only when its attributes match one of the training samples exactly. kNN relaxes the exact-match requirement: it seeks a group of k objects in the training set that are closest to the test object, and bases the assignment of a label on the predominance of a particular class in that neighborhood. The key elements of the algorithm are a set of labeled objects, a distance or similarity metric to compute the distance between objects, and the number of nearest neighbors (the value of k).

Advantages: It is simple and easy to understand. Its classification technique is easy to implement. It can perform well in varied situations, which makes it widely usable. It is well suited to multi-modal classes and to applications in which an object can have many class labels.

Disadvantages: The choice of k is a limiting factor. If k is too small, the result is highly sensitive to noise points. If k is too large, the neighborhood is likely to include a large number of points from other classes.
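The role of k and of the distance metric described above can be illustrated with a minimal Python sketch. The Euclidean metric, the function names and the toy training set are assumptions made for illustration and do not come from the essay.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length feature tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_classify(train, query, k=3, distance=euclidean):
    """Label `query` by majority vote among its k nearest training objects.

    `train` is a list of (features, label) pairs (hypothetical layout).
    """
    # Sort the labeled objects by distance to the query and keep the k closest.
    neighbors = sorted(train, key=lambda item: distance(item[0], query))[:k]
    # The predominant class in the neighborhood becomes the predicted label.
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Toy data: two well-separated classes plus one mislabeled noise point.
train = [
    ((1.0, 1.0), "A"), ((1.2, 0.8), "A"), ((0.9, 1.1), "A"),
    ((5.0, 5.0), "B"), ((5.2, 4.9), "B"), ((4.8, 5.1), "B"),
    ((1.1, 1.0), "B"),  # noise point sitting inside class A
]
query = (1.1, 1.05)

label_k1 = knn_classify(train, query, k=1)  # decided by the noise point alone
label_k3 = knn_classify(train, query, k=3)  # the noise point is outvoted
label_k7 = knn_classify(train, query, k=7)  # neighborhood covers every point
```

On this toy data, k=1 follows the noise point and returns "B", k=3 returns "A", and k=7 returns "B" again because the neighborhood now includes all of the distant class, which mirrors both failure modes of a poorly chosen k.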
Exact matching also limits the number of test records that can be classified, since in most instances a test record will not match any of the training records exactly. The approach to combining the class labels is also considered very complicated.

2. PageRank

This is a search ranking algorithm that uses hyperlinks on the World Wide Web. PageRank produces a static ranking of web pages: a PageRank value is computed offline for each page and does not depend on search queries, but rather on the democratic nature of the World Wide Web, using its vast link structure as an indicator of an individual page's quality. It is worth noting that these features have contributed to the success of the Google search engine.

Advantages: It is considered dependable, as its outputs are generally accurate and precise. It is simple and efficient to use once one has the knowledge and skills to apply it.

Disadvantages: Search outcomes are based on literal items (keywords, metadata and tags) rather than on their actual meanings. Web pages can be ranked poorly in different topological Web structures, i.e. in Google's ranking algorithm. New pages receive low page ranks and take too much time to be listed and to gain high ranks. Subsequent citation of inaccurate information on different web pages may lead to the indexing of such inaccurate pages, resulting in a great deal of fiction.

3. Naive Bayes

Advantages: It is
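Returning to the PageRank description in item 2, the idea of a static, query-independent score computed offline from link structure can be sketched as a simple power iteration. The damping factor of 0.85, the iteration count, the function name and the four-page toy graph are all assumptions for illustration; the essay does not specify them.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Compute PageRank scores by power iteration.

    `links` maps each page to the list of pages it links to; every link
    target must itself be a key of `links` (hypothetical layout).
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # start from a uniform ranking

    for _ in range(iterations):
        # Every page receives a small baseline share regardless of its links.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outlinks in links.items():
            # A page with no outgoing links spreads its rank over all pages.
            targets = outlinks if outlinks else pages
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Toy web: "new" links out but nothing links to it yet.
links = {
    "home": ["about", "blog"],
    "about": ["home"],
    "blog": ["home", "about"],
    "new": ["home"],
}
ranks = pagerank(links)
best_page = max(ranks, key=ranks.get)
worst_page = min(ranks, key=ranks.get)
```

In this sketch the scores depend only on the link graph, never on a query, and the page without inbound links ends up with the baseline score, which is consistent with the disadvantage noted above that new pages rank low until other pages link to them.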
