Statistical Disclosure Control
Contact: Anco Hundepool
Statistics Netherlands
P.O. Box 24500
2490 HA The Hague
The Netherlands
Phone: +31 70 337 5038
Fax: +31 70 337 5990
Last update: 10 Oct 2011

Microdata: new disclosure risk assessment methodology (WP 1.2)

Leading partner: URV
Participating partners: Soton, IStat

This workpackage is devoted to research on assessing the disclosure risk for microdata at the individual record level. The work focuses mainly on unperturbed microdata (e.g. microdata resulting from sampling a population of records), which is the kind of microdata currently released by many major statistical offices (e.g. the U.S. Bureau of the Census, ONS). Unlike WP 1.1, this workpackage does not deal with the development of new SDC masking methods. The two workpackages are therefore complementary, and both aim at improving µ-ARGUS.

The objectives of this workpackage are broken down by task as follows.

Task T1 (responsible Soton)

Objectives
Disclosure risk can be measured at either the file level or the record level. Record-level measures are useful in conjunction with disclosure limitation methods that are applied at the record level, for example local suppression.
The objectives of this workpackage are:

Description of the work
Skinner and Holmes (1998) consider records r with key variable values

These measures will depend on specified assumptions about the nature and degree of misclassification, both in the microdata and in the external data.

In the absence of measurement error, Skinner and Holmes (1998) consider a simple measure of risk for records that are unique in the sample with respect to some categorical key variables. The measure is given by exp(-(1-π)f/π), where π is the sampling fraction and f is a fitted frequency for the combination of key variable values of the given record. It may be interpreted as the estimated probability that this combination, which is unique in the sample, is also unique in the population. This is the simplest measure that will be considered in the framework of µ-ARGUS. Computing the fitted frequencies would require some iterative proportional fitting. The measure could be extended to records that are not unique in the sample.

References
Copas, J. B. and Hilton, F. J. (1990) Record linkage: statistical models for matching computer records (with discussion). J. Roy. Statist. Soc., A, 287-320.

Milestones and expected result
- Development of theory for record-level measures under misclassification

Task T2 (responsible Soton)

Objectives
- To apply the methods developed in Task T1 to the Labour Force Survey (an example of a survey of EU-wide interest).

Description of the work
The Labour Force Survey will be considered (as a survey of EU-wide interest).
The following will be done:

Milestones and expected result
- Determination of misclassification rates

Task T3 (responsible IStat)

Objectives
To build into µ-ARGUS the individual disclosure risk approach for complex (hierarchical) microdata as defined in the Esprit project n° 20462, SDC, taking advantage of the developments of Task T1.

Description of the work
In order to improve the capabilities of µ-ARGUS and give the user a wider choice of methodology, the individual unit risk approach (called record-level risk in T1 and T2) will be implemented in µ-ARGUS. The programs already available in SAS as output of the SDC project will be used as a basis for defining a C procedure to estimate the individual risk. Moreover, efficient protection algorithms that take into account dependencies in the data will have to be developed.

Milestones and expected result
The implementation of the individual risk of disclosure in µ-ARGUS will widen user choice. The resulting evaluation of disclosure risk will enable the user to measure the safety levels reached in the microdata file.
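
As an illustration of the kind of record-level risk that Task T3 proposes to implement, the Skinner and Holmes (1998) measure described under Task T1, exp(-(1-π)f/π), could be sketched as follows. This is a minimal sketch and not the µ-ARGUS implementation: the function name `uniqueness_risk`, the record labels, the fitted frequencies and the sampling fraction are all hypothetical values chosen for illustration. In practice f would come from a fitted model rather than being supplied by hand.

```python
import math


def uniqueness_risk(fitted_freq, sampling_fraction):
    """Estimated probability that a sample-unique record is also unique
    in the population: exp(-(1 - pi) * f / pi).

    fitted_freq       -- fitted frequency f for the record's key combination
    sampling_fraction -- sampling fraction pi, in (0, 1]
    """
    if not 0.0 < sampling_fraction <= 1.0:
        raise ValueError("sampling fraction must lie in (0, 1]")
    if fitted_freq < 0.0:
        raise ValueError("fitted frequency must be non-negative")
    return math.exp(-(1.0 - sampling_fraction) * fitted_freq / sampling_fraction)


# Hypothetical fitted frequencies for three sample-unique records
# (illustrative numbers only, not taken from any survey).
fitted = {"record_a": 0.01, "record_b": 0.1, "record_c": 1.0}
pi = 0.02  # hypothetical sampling fraction of 2%

risks = {rec: uniqueness_risk(f, pi) for rec, f in fitted.items()}
for rec, risk in risks.items():
    print(f"{rec}: fitted f = {fitted[rec]:.2f}, risk = {risk:.4g}")
```

Under this formula a smaller fitted frequency gives a higher risk, and at π = 1 (a complete enumeration) the measure equals 1, since a record that is unique in the sample is then necessarily unique in the population.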