
Slicing: A New Approach for Privacy Preserving Data Publishing

Platform : Java

IEEE Projects Years : 2012 - 13


 

Abstract

 

Several anonymization techniques, such as generalization and bucketization, have been designed for privacy-preserving microdata publishing. Recent work has shown that generalization loses a considerable amount of information, especially for high-dimensional data. Bucketization, on the other hand, does not prevent membership disclosure and does not apply to data that do not have a clear separation between quasi-identifying attributes and sensitive attributes. In this paper, we present a novel technique called slicing, which partitions the data both horizontally and vertically. We show that slicing preserves better data utility than generalization and can be used for membership disclosure protection. Another important advantage of slicing is that it can handle high-dimensional data. We show how slicing can be used for attribute disclosure protection and develop an efficient algorithm for computing the sliced data that obeys the ℓ-diversity requirement. Our workload experiments confirm that slicing preserves better utility than generalization and is more effective than bucketization in workloads involving the sensitive attribute. Our experiments also demonstrate that slicing can be used to prevent membership disclosure.

 

 

 

Existing System

 

 

 

In the existing system, generalization loses a considerable amount of information, especially for high-dimensional data. Bucketization, on the other hand, does not prevent membership disclosure and does not apply to data that do not have a clear separation between quasi-identifying attributes and sensitive attributes. Generalization cannot handle high-dimensional data and does not reduce the dimensionality of the data. Bucketization does not improve data utility. It does not give attribute disclosure protection or membership disclosure protection.

 

Proposed System

 

In this paper, we present a novel technique called slicing, which partitions the data both horizontally and vertically. We show that slicing preserves better data utility than generalization and can be used for membership disclosure protection. Another important advantage of slicing is that it can handle high-dimensional data. We show how slicing can be used for attribute disclosure protection and develop an efficient algorithm for computing the sliced data that obeys the ℓ-diversity requirement. Our workload experiments confirm that slicing preserves better utility than generalization and is more effective than bucketization in workloads involving the sensitive attribute. The idea of slicing is to achieve a better trade-off between privacy and utility by preserving correlations between highly correlated attributes and breaking correlations between uncorrelated attributes.
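The horizontal and vertical partitioning described above can be illustrated with a minimal sketch. It assumes the attribute grouping (columnGroups) and a fixed bucket size are already given; the class name SlicingSketch, the method slice, and the fixed-size bucketing are hypothetical choices for illustration, not the algorithm of the paper.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Illustrative sketch only: names and the fixed-size bucketing are assumptions.
class SlicingSketch {

    /**
     * Returns sliced[row][column] = the attribute values of that column group.
     * Vertical partition: columnGroups lists the attribute indices of each column.
     * Horizontal partition: every bucketSize consecutive tuples form one bucket.
     * Inside each bucket, each column is shuffled independently, which breaks
     * the linkage between columns while keeping values within a column together.
     */
    static String[][][] slice(String[][] table, int[][] columnGroups,
                              int bucketSize, Random rnd) {
        int n = table.length;
        String[][][] sliced = new String[n][columnGroups.length][];
        for (int start = 0; start < n; start += bucketSize) {
            int end = Math.min(start + bucketSize, n);
            for (int c = 0; c < columnGroups.length; c++) {
                // Collect this column's cells for the current bucket.
                List<String[]> cells = new ArrayList<String[]>();
                for (int r = start; r < end; r++) {
                    String[] cell = new String[columnGroups[c].length];
                    for (int k = 0; k < cell.length; k++) {
                        cell[k] = table[r][columnGroups[c][k]];
                    }
                    cells.add(cell);
                }
                // Permute the cells of this column within the bucket.
                Collections.shuffle(cells, rnd);
                for (int r = start; r < end; r++) {
                    sliced[r][c] = cells.get(r - start);
                }
            }
        }
        return sliced;
    }
}
```

For a table with attributes (Age, Sex, Zipcode, Disease), calling slice with columnGroups {{0, 1}, {2, 3}} keeps each (Age, Sex) pair and each (Zipcode, Disease) pair intact, but a row of the result no longer reveals which pair of the first column belongs to which pair of the second column inside a bucket.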

 

System Requirements

 

 

 

 

 

 

 

Hardware requirement

 

Processor:  Pentium IV

 

Speed     :  1.1GHz

 

RAM      :  512 MB

 

Hard Disk: 40 GB

 

General Keyboard, monitor, and mouse

 

Software Requirement

 

 Operating System: Windows XP

 

 Software              :  NetBeans IDE 6.1

 

 Front End         : Java (JDK 6.1)

 

 Back End         : SQL Server 2000

 

 

 

Modules

 

1.Privacy Threats

 

2.Attribute Partitioning

 

3.Column Generalization

 

4.Membership Disclosure Protection

 

 

 

Modules Description

 

 

 

1.Privacy Threats

 

There are three types of privacy disclosure threats. The first type is membership disclosure. When the data set to be published is selected from a large population and the selection criteria are sensitive, one needs to prevent adversaries from learning whether an individual's record is included in the published data set.

 

The second type is identity disclosure, which occurs when an individual is linked to a particular record in the released table. In some situations, one wants to protect against identity disclosure when the adversary is uncertain of membership. In this case, protection against membership disclosure helps protect against identity disclosure.

 

The third type is attribute disclosure, which occurs when new information about some individuals is revealed, that is, when the released data make it possible to infer the attributes of an individual more accurately than would be possible before the release.

 

2.Attribute Partitioning

 

Our algorithm partitions attributes so that highly correlated attributes are in the same column. This is good for both utility and privacy. In terms of data utility, grouping highly correlated attributes preserves the correlations among those attributes. In terms of privacy, the association of uncorrelated attributes presents higher identification risks than the association of highly correlated attributes, because the association of uncorrelated attribute values is much less frequent and thus more identifiable. Therefore, it is better to break the associations between uncorrelated attributes in order to protect privacy.
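To decide which attributes are "highly correlated", some association measure between two attributes is needed. The text above does not name one; the sketch below assumes the mean-square contingency coefficient for categorical attributes (the chi-square statistic divided by the number of tuples and by min(d1, d2) - 1, where d1 and d2 are the domain sizes). The class and method names are hypothetical. Pairs with a score close to 1 are candidates for the same column, and pairs close to 0 are candidates for different columns.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only: the choice of measure is an assumption.
class AttributeCorrelationSketch {

    /** Returns a value in [0, 1]; 0 means independent, 1 means fully associated. */
    static double meanSquareContingency(String[] a, String[] b) {
        int n = a.length;
        Map<String, Integer> countA = new HashMap<String, Integer>();
        Map<String, Integer> countB = new HashMap<String, Integer>();
        Map<String, Integer> countAB = new HashMap<String, Integer>();
        for (int i = 0; i < n; i++) {
            increment(countA, a[i]);
            increment(countB, b[i]);
            // "\u0000" separates the two values in the joint key.
            increment(countAB, a[i] + "\u0000" + b[i]);
        }
        int d = Math.min(countA.size(), countB.size());
        if (d < 2) {
            return 0.0; // a constant attribute carries no association
        }
        double sum = 0.0;
        for (Map.Entry<String, Integer> ea : countA.entrySet()) {
            for (Map.Entry<String, Integer> eb : countB.entrySet()) {
                double fa = ea.getValue() / (double) n;   // marginal fraction of a-value
                double fb = eb.getValue() / (double) n;   // marginal fraction of b-value
                Integer joint = countAB.get(ea.getKey() + "\u0000" + eb.getKey());
                double fab = (joint == null ? 0 : joint) / (double) n; // joint fraction
                double diff = fab - fa * fb;
                sum += diff * diff / (fa * fb);
            }
        }
        return sum / (d - 1);
    }

    private static void increment(Map<String, Integer> m, String key) {
        Integer c = m.get(key);
        m.put(key, c == null ? 1 : c + 1);
    }
}
```

Continuous attributes would first have to be discretized before such a measure applies, and a clustering step over the pairwise scores would still be needed to form the actual columns; neither is shown here.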

 

 

 

 

 

 

 

We now present an efficient slicing algorithm to achieve ℓ-diverse slicing. Given a microdata table T and two parameters c and ℓ, the algorithm computes the sliced table that consists of c columns and satisfies the privacy requirement of ℓ-diversity.
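The text above does not spell out how the diversity requirement is verified. As a rough illustration, the sketch below assumes the simplest variant, distinct diversity, in which every bucket must contain at least l distinct sensitive values. This is a stand-in under that assumption and not the check used by the slicing algorithm itself; the class name, method name, and fixed-size bucketing are hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

// Illustrative sketch only: distinct diversity is a simplifying assumption.
class DiversityCheckSketch {

    /**
     * sensitive[i] is the sensitive value of tuple i; every bucketSize
     * consecutive tuples form one bucket. Returns true only if each bucket
     * holds at least l distinct sensitive values.
     */
    static boolean isDistinctLDiverse(String[] sensitive, int bucketSize, int l) {
        for (int start = 0; start < sensitive.length; start += bucketSize) {
            int end = Math.min(start + bucketSize, sensitive.length);
            Set<String> distinct = new HashSet<String>();
            for (int r = start; r < end; r++) {
                distinct.add(sensitive[r]);
            }
            if (distinct.size() < l) {
                return false; // this bucket exposes too few alternatives
            }
        }
        return true;
    }
}
```

A bucket that fails the check could be merged with a neighbouring bucket or its tuples regrouped, after which the check is repeated.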

 

 

 

 

 

 

 

 

 

 

 

3.Column Generalization

 

Although column generalization is not a required phase, it can be useful in several aspects. First, column generalization may be required for identity/membership disclosure protection. If a column value is unique in a column, a tuple with this unique column value can only have one matching bucket. This is not good for privacy protection, as in the case of generalization/bucketization, where each tuple can belong to only one equivalence class/bucket.

 

4.Membership Disclosure Protection

 

Slicing offers protection against membership disclosure because QI attributes are partitioned into different columns and correlations among different columns within each bucket are broken. Consider the sliced table in Table 1f. The table has two columns. The first bucket results from four tuples; we call them the original tuples.

 

Future Enhancement

 

This work motivates several directions for future research. First, in this paper, we consider slicing where each attribute is in exactly one column. An extension is the notion of overlapping slicing, which duplicates an attribute in more than one column. This releases more attribute correlations. For example, in Table 1f, one could choose to include the Disease attribute also in the first column. That is, the two columns are {Age, Sex, Disease} and {Zipcode, Disease}. This could provide better data utility, but the privacy implications need to be carefully studied and understood. It is interesting to study the trade-off between privacy and utility.

Second, we plan to study membership disclosure protection in more detail. Our experiments show that random grouping is not very effective. We plan to design more effective tuple grouping algorithms.

 

 

 

Third, slicing is a promising technique for handling high-dimensional data. By partitioning attributes into columns, we protect privacy by breaking the association of uncorrelated attributes and preserve data utility by preserving the association between highly correlated attributes.

 

Another direction is to design data mining tasks that use the anonymized data computed by various anonymization techniques.

 

 


