Data Loss Prevention on GCP

Rich Lee

2022/03

Agenda

  • What is DLP?

  • Google Cloud DLP API

  • DLP Solutions

What is DLP?

Data Loss Prevention (DLP) is a strategy for ensuring that sensitive data is not lost, misused, or accessed by unauthorized users.

Personally Identifiable Information (PII)

Data Loss Themes

People

  • Lack of awareness
  • Lack of accountability
  • Lack of user responsibility for their actions

Process

  • Lack of data usage policies
  • Lack of data transmission procedures
  • Lack of data usage monitoring

Technology

  • Lack of flexibility in remote connectivity
  • No content-aware DLP tools
  • Lack of secure communication platforms

Google Cloud DLP API

Cloud DLP API

The Cloud DLP API provides access to a platform for inspecting, classifying, and de-identifying sensitive data.

  • Data types: text, structured data, images
  • De-identification techniques: redaction, masking, format-preserving encryption, date shifting, and more
  • Integration services: Cloud Storage, Cloud Healthcare API, BigQuery, Apigee

Cloud DLP detectors

  • Cloud DLP API uses infoTypes to define what it scans for.
  • It ships with 150+ built-in infoTypes.

Please update my records with the following information:

phone number: 0921123456, email address: rich.lee@google.com

phone_number

email_address

  • Inspect a string for sensitive information

Cloud DLP API - Inspect

{
  "item":{
    "value":"Please update my records: phone number: 0921123456, email address: rich.lee@google.com"
  },
  "inspectConfig":{
    "infoTypes":[
      {
        "name":"PHONE_NUMBER"
      },
      {
        "name":"US_TOLLFREE_PHONE_NUMBER"
      },
      {
        "name":"EMAIL_ADDRESS"
      }
    ],
    "minLikelihood":"POSSIBLE",
    "limits":{
      "maxFindingsPerItem":0
    },
    "includeQuote":true
  }
}
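A minimal Python sketch of how a request body like the one above could be assembled and wrapped in a POST to the `content:inspect` endpoint, using only the standard library. The project ID `my-project`, the token placeholder, and the helper names `build_inspect_body` and `build_inspect_request` are assumptions made for illustration; obtaining a real OAuth access token is out of scope here.

```python
import json
import urllib.request

# Endpoint pattern for the inspect call; {project_id} is filled in per request.
DLP_INSPECT_URL = "https://dlp.googleapis.com/v2/projects/{project_id}/content:inspect"


def build_inspect_body(text, info_types, min_likelihood="POSSIBLE"):
    """Return a content:inspect request body as a plain dict."""
    return {
        "item": {"value": text},
        "inspectConfig": {
            "infoTypes": [{"name": name} for name in info_types],
            "minLikelihood": min_likelihood,
            "limits": {"maxFindingsPerItem": 0},
            "includeQuote": True,
        },
    }


def build_inspect_request(project_id, access_token, body):
    """Wrap the body in a POST request object without sending it."""
    return urllib.request.Request(
        DLP_INSPECT_URL.format(project_id=project_id),
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json; charset=utf-8",
        },
        method="POST",
    )


body = build_inspect_body(
    "Please update my records: phone number: 0921123456, "
    "email address: rich.lee@google.com",
    ["PHONE_NUMBER", "US_TOLLFREE_PHONE_NUMBER", "EMAIL_ADDRESS"],
)
# "my-project" and the token are placeholders, not real values.
request = build_inspect_request("my-project", "ACCESS_TOKEN_PLACEHOLDER", body)
```

Sending is left to the caller, for example with `urllib.request.urlopen(request)` once a valid token is in place.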

Cloud DLP API - Inspect

  • Inspect API request body
  • Inspect API endpoint: https://dlp.googleapis.com/v2/projects/dasea-lab/content:inspect
{
  "result": {
    "findings": [
      {
        "quote": "886921123456",
        "infoType": {
          "name": "PHONE_NUMBER"
        },
        "likelihood": "LIKELY",
        "location": {
          "byteRange": {
            "start": "71",
            "end": "83"
          },
          "codepointRange": {
            "start": "71",
            "end": "83"
          }
        },
        "createTime": "2022-03-13T08:25:50.056Z",
        "findingId": "2022-03-13T08:25:50.060104Z8545937852827138005"
      },
      {
        "quote": "rich.lee@google.com",
        "infoType": {
          "name": "EMAIL_ADDRESS"
        },
        "likelihood": "VERY_LIKELY",
        "location": {
          "byteRange": {
            "start": "100",
            "end": "119"
          },
          ...

Cloud DLP API - Inspect

  • response body
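A short sketch of how a client might pull the findings out of a response shaped like the one above. The sample JSON is abbreviated to the fields used here, and the helper `summarize_findings` and its client-side likelihood filter are illustrative assumptions, not part of the API.

```python
import json

# Abbreviated response in the shape shown above (only the fields used here).
SAMPLE_RESPONSE = """
{
  "result": {
    "findings": [
      {"quote": "886921123456",
       "infoType": {"name": "PHONE_NUMBER"},
       "likelihood": "LIKELY"},
      {"quote": "rich.lee@google.com",
       "infoType": {"name": "EMAIL_ADDRESS"},
       "likelihood": "VERY_LIKELY"}
    ]
  }
}
"""

# Likelihood values ordered from least to most certain.
LIKELIHOOD_RANK = {
    name: rank
    for rank, name in enumerate(
        ["VERY_UNLIKELY", "UNLIKELY", "POSSIBLE", "LIKELY", "VERY_LIKELY"]
    )
}


def summarize_findings(response, min_likelihood="POSSIBLE"):
    """Return (infoType, quote, likelihood) tuples at or above a threshold."""
    threshold = LIKELIHOOD_RANK[min_likelihood]
    findings = response.get("result", {}).get("findings", [])
    return [
        (f["infoType"]["name"], f.get("quote", ""), f.get("likelihood", ""))
        for f in findings
        # Unknown or missing likelihoods rank below every threshold.
        if LIKELIHOOD_RANK.get(f.get("likelihood"), -1) >= threshold
    ]


parsed = json.loads(SAMPLE_RESPONSE)
all_findings = summarize_findings(parsed)
high_confidence = summarize_findings(parsed, min_likelihood="VERY_LIKELY")
```

The `quote` field is only present when the request sets `includeQuote` to true, so the sketch falls back to an empty string.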

Please update my records with the following information:

phone number: 0921123456, email address: rich.lee@google.com

  • Redacting sensitive data from text content

Cloud DLP API - Deidentify

Please update my records with the following information:

phone number: ********56, email address: ****.***@*oogle.com

redacting

{
  "item": {
    "value": "Please update my records: phone number: 0921123456, email address: rich.lee@google.com"
  },
  "deidentifyConfig": {
    "infoTypeTransformations": {
      "transformations": [
        {
          "primitiveTransformation": {
            "characterMaskConfig": {
              "maskingCharacter": "*",
              "numberToMask": 8,
              "reverseOrder": false,
              "charactersToIgnore": [
                {
                  "commonCharactersToIgnore": "PUNCTUATION"
                }
              ]
            }
          }
        }
      ]
    }
  },
  "inspectConfig": {
    "infoTypes": [
      {
        "name": "EMAIL_ADDRESS"
      },
      {
        "name": "PHONE_NUMBER"
      }
    ]
  }
}

Cloud DLP API - Deidentify

  • Deidentify API request body
  • Deidentify API endpoint: https://dlp.googleapis.com/v2/projects/{PROJECT_ID}/content:deidentify
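To make the `characterMaskConfig` settings concrete, here is a local Python simulation of the masking rule: replace up to `numberToMask` characters, skipping punctuation. It is an illustration only, not the service's implementation. The function `mask_characters` is hypothetical, and it assumes that PUNCTUATION corresponds to the ASCII punctuation set in Python's `string.punctuation` and that a value of 0 means "mask everything".

```python
import string


def mask_characters(value, masking_character="*", number_to_mask=8,
                    reverse_order=False, ignore=string.punctuation):
    """Mask up to number_to_mask characters of value, skipping ignored ones."""
    if number_to_mask < 0:
        raise ValueError("negative counts are not modeled in this sketch")
    chars = list(value)
    # Assumption: 0 is treated as "mask every non-ignored character".
    remaining = number_to_mask if number_to_mask > 0 else len(chars)
    if reverse_order:
        indices = range(len(chars) - 1, -1, -1)
    else:
        indices = range(len(chars))
    for i in indices:
        if remaining == 0:
            break
        if chars[i] in ignore:
            continue  # ignored characters are kept and do not use up the budget
        chars[i] = masking_character
        remaining -= 1
    return "".join(chars)


masked_phone = mask_characters("0921123456")           # "********56"
masked_email = mask_characters("rich.lee@google.com")  # "****.***@*oogle.com"
```

Both results match the masked sample text on the slide: the `.` and `@` in the email address are skipped, so the eighth masked character is the `g` of `google`.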

Cloud DLP API - Deidentify

  • Redacting sensitive data from images
{
  "byteItem": {
    "data": "[BASE64-ENCODED-IMAGE]",
    "type": "IMAGE_JPEG"
  }
}

Cloud DLP API - Deidentify

  • request body
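A hedged sketch of building the image request body in Python. It assumes the body is posted to the `image:redact` method (`https://dlp.googleapis.com/v2/projects/{PROJECT_ID}/image:redact`), which is the REST method for image redaction as far as I know. The helper `build_image_redact_body` and the placeholder bytes are illustrative, and the `inspectConfig` and `imageRedactionConfigs` fields are optional additions that are not on the slide.

```python
import base64


def build_image_redact_body(image_bytes, info_types, image_type="IMAGE_JPEG"):
    """Return a redact-image request body with the image base64-encoded."""
    return {
        "byteItem": {
            "data": base64.b64encode(image_bytes).decode("ascii"),
            "type": image_type,
        },
        # Optional: limit detection and redaction to specific infoTypes.
        "inspectConfig": {
            "infoTypes": [{"name": name} for name in info_types],
        },
        "imageRedactionConfigs": [
            {"infoType": {"name": name}} for name in info_types
        ],
    }


# Placeholder bytes standing in for a real file, e.g. open("scan.jpg", "rb").read()
fake_image = b"\xff\xd8\xff\xe0 not a real JPEG"
image_body = build_image_redact_body(fake_image, ["PHONE_NUMBER", "EMAIL_ADDRESS"])
```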

Cloud DLP API - Job Trigger

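A job is an action that Cloud Data Loss Prevention (DLP) runs either to scan content for sensitive data or to calculate the risk of re-identification. A job trigger is an event that automates the creation of DLP jobs, for example on a recurring schedule.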
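As a rough sketch, a job trigger that scans a Cloud Storage location once a day could be described with a body like the one built below and posted to `https://dlp.googleapis.com/v2/projects/{PROJECT_ID}/jobTriggers`. The bucket URL, display name, and helper `build_job_trigger_body` are assumptions for illustration; check the field names against the JobTrigger reference before relying on them.

```python
def build_job_trigger_body(bucket_url, info_types, period_seconds=86400):
    """Return a create-job-trigger request body for a scheduled inspect job."""
    return {
        "jobTrigger": {
            "displayName": "daily-storage-scan",  # hypothetical name
            "inspectJob": {
                # What to scan: files under a Cloud Storage URL.
                "storageConfig": {
                    "cloudStorageOptions": {"fileSet": {"url": bucket_url}}
                },
                # What to look for.
                "inspectConfig": {
                    "infoTypes": [{"name": name} for name in info_types],
                    "minLikelihood": "POSSIBLE",
                },
            },
            # When to run: a recurring schedule, expressed in seconds.
            "triggers": [
                {"schedule": {"recurrencePeriodDuration": f"{period_seconds}s"}}
            ],
            "status": "HEALTHY",
        }
    }


# "gs://example-bucket/**" is a placeholder, not a real bucket.
trigger_body = build_job_trigger_body(
    "gs://example-bucket/**", ["EMAIL_ADDRESS", "PHONE_NUMBER"]
)
```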

 

DLP Solutions

DLP Solutions - Apigee APIM

  • Add Extension Callout policy

DLP Solutions - Healthcare

  • Use the Cloud Healthcare API to remove personally identifiable information (PII) from medical images

  • De-identifying FHIR data

DLP Solutions - Database