Clustering with Multiviewpoint-Based Similarity Measure

Platform : Java

IEEE Projects Years : 2012 - 13

Abstract:

            All clustering methods have to assume some cluster relationship among the data objects that they are applied to. Similarity between a pair of objects can be defined either explicitly or implicitly. In this paper, we introduce a novel multiviewpoint-based similarity measure and two related clustering methods. The major difference between a traditional dissimilarity/similarity measure and ours is that the former uses only a single viewpoint, which is the origin, while the latter utilizes many different viewpoints, which are objects assumed not to be in the same cluster as the two objects being measured. Using multiple viewpoints, a more informative assessment of similarity can be achieved. Theoretical analysis and empirical study are conducted to support this claim. Two criterion functions for document clustering are proposed based on this new measure. We compare them with several well-known clustering algorithms that use other popular similarity measures on various document collections to verify the advantages of our proposal.
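
            The viewpoint idea can be illustrated with a minimal Java sketch. The abstract does not give the exact formula, so the form used here is an assumption for illustration: the two documents are compared through their difference vectors relative to each viewpoint, and the resulting dot products are averaged. The class and method names are hypothetical.

```java
/** Sketch of a multiviewpoint-style similarity between two document vectors. */
final class MultiViewpointSimilarity {

    /** Dot product of (a - v) and (b - v): a and b as seen from viewpoint v. */
    static double relativeDot(double[] a, double[] b, double[] v) {
        double sum = 0.0;
        for (int k = 0; k < a.length; k++) {
            sum += (a[k] - v[k]) * (b[k] - v[k]);
        }
        return sum;
    }

    /**
     * Averages the relative dot product over all viewpoints.
     * Assumption: the viewpoints are documents taken from outside the
     * cluster that a and b are assumed to share.
     */
    static double similarity(double[] a, double[] b, double[][] viewpoints) {
        if (viewpoints.length == 0) {
            throw new IllegalArgumentException("at least one viewpoint is required");
        }
        double total = 0.0;
        for (double[] v : viewpoints) {
            total += relativeDot(a, b, v);
        }
        return total / viewpoints.length;
    }
}
```

            With the origin as the only viewpoint, this reduces to the ordinary dot product, which matches the single-viewpoint case described above.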

 

Existing System:

            A common approach to the clustering problem is to treat it as an optimization process. An optimal partition is found by optimizing a particular function of similarity (or distance) among data. Basically, there is an implicit assumption that the true intrinsic structure of the data can be correctly described by the similarity formula defined and embedded in the clustering criterion function. Hence, the effectiveness of clustering algorithms under this approach depends on the appropriateness of the similarity measure to the data at hand. For instance, the original k-means has a sum-of-squared-error objective function that uses Euclidean distance. In a very sparse and high-dimensional domain like text documents, spherical k-means, which uses cosine similarity (CS) instead of Euclidean distance as the measure, is deemed to be more suitable.
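
            The contrast between the two measures can be seen on a small example. The sketch below is illustrative only and assumes documents are represented as dense term-frequency arrays; the class and method names are hypothetical.

```java
/** Cosine similarity and Euclidean distance on term-frequency vectors. */
final class VectorMeasures {

    static double dot(double[] a, double[] b) {
        double sum = 0.0;
        for (int k = 0; k < a.length; k++) {
            sum += a[k] * b[k];
        }
        return sum;
    }

    /** Cosine of the angle between a and b; 0 if either vector is all zeros. */
    static double cosineSimilarity(double[] a, double[] b) {
        double normA = Math.sqrt(dot(a, a));
        double normB = Math.sqrt(dot(b, b));
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot(a, b) / (normA * normB);
    }

    /** Straight-line distance between a and b. */
    static double euclideanDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int k = 0; k < a.length; k++) {
            double diff = a[k] - b[k];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }
}
```

            For a short document {1, 1, 0} and a document ten times longer with the same term proportions, {10, 10, 0}, the cosine similarity is 1 (up to rounding) while the Euclidean distance is about 12.7. Cosine similarity ignores document length, which is one reason it suits text data.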

 

Proposed System:

            The work in this paper is motivated by investigations from the above and similar research findings. It appears to us that the nature of the similarity measure plays a very important role in the success or failure of a clustering method. Our first objective is to derive a novel method for measuring similarity between data objects in sparse and high-dimensional domains, particularly text documents. From the proposed similarity measure, we then formulate new clustering criterion functions and introduce their respective clustering algorithms, which are fast and scalable like k-means, but are also capable of providing high-quality and consistent performance.

 

Software Requirement Specification

Software Specification

Operating System       :           Windows XP

Technology                 :           Java 1.6, JFreeChart

Hardware Specification

Processor                     :           Pentium IV

RAM                           :           512 MB

Hard Disk                   :           80 GB

 

Modules:

  • Select File

The HTML root file is selected from the list of files displayed in the window.

  • Process

Processing the root file retrieves the child files that are linked from it.

  • Histogram

The histogram displays the number of documents in each similarity range between 0 and 1.

  • Clusters

Clusters are formed according to the similarity of the documents.

  • Similarity

Similarity between two files is calculated from their keyword tags.

  • Result

The result is displayed as a bar chart whose axis shows the file-to-file similarity.
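
The Similarity and Histogram modules can be sketched as follows. The module descriptions do not say how keyword similarity is computed, so Jaccard overlap of the two keyword sets is used here purely as an assumed stand-in, and the equal-width binning over [0, 1] is likewise an assumption. All names are hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

/** Keyword-overlap similarity and a fixed-range histogram of the scores. */
final class SimilarityHistogram {

    /** Jaccard overlap: shared keywords divided by all distinct keywords. */
    static double keywordSimilarity(Set<String> a, Set<String> b) {
        if (a.isEmpty() && b.isEmpty()) {
            return 0.0;
        }
        Set<String> common = new HashSet<String>(a);
        common.retainAll(b);
        Set<String> union = new HashSet<String>(a);
        union.addAll(b);
        return (double) common.size() / union.size();
    }

    /** Counts scores per equal-width bin over [0, 1]; 1.0 goes in the last bin. */
    static int[] histogram(double[] similarities, int bins) {
        int[] counts = new int[bins];
        for (double s : similarities) {
            if (!(s >= 0.0 && s <= 1.0)) {
                throw new IllegalArgumentException("similarity outside [0, 1]: " + s);
            }
            int index = Math.min((int) (s * bins), bins - 1);
            counts[index]++;
        }
        return counts;
    }
}
```

The resulting counts array could then be passed to a charting library such as JFreeChart to draw the bars; that step is omitted here.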

