OpenAlex · Updated hourly · Last updated: 20.03.2026, 07:11

This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

Synthesizing Realistic Substitute Data for a Law Enforcement Database using a Python Library

2022 · 0 citations · Open Access

Citations: 0
Authors: 1
Year: 2022

Abstract

Many databases contain private or sensitive data that should be accessible to only a few individuals, such as data protected under HIPAA (Health Insurance Portability and Accountability Act) or LE (law enforcement) data. However, developers in particular often need to work with the data or change it for proper and thorough testing. In some cases, the developers may be authorized to access and view the data, but they are rarely permitted to change it. Further, it is unlikely, especially on a large project, that all of the developers will be authorized to view the data. In this case, it is useful to have easily created synthetic or 'fake' data to fill the database, data that mimics the real data closely enough to be used in all the same tests and to develop endpoints and APIs that will work with the real data. There are many possible ways to achieve this, such as shuffling the sensitive data or filling the sensitive fields with garbled information. Such methods have drawbacks, however, as the data then becomes unwieldy or nonsensical to work with. Therefore, this study used a Python library called Factory Boy. Factory Boy can inherit the Django database models and then be used to generate randomized but realistic-looking data, capable of mimicking all the complexities of actual database relationships and information.

Topics

Privacy-Preserving Technologies in Data · Computational Physics and Python Applications · Artificial Intelligence in Healthcare and Education