If an organization's data is its most important resource, then losing that data is one of the most devastating things that can happen to it. While data can be lost to malicious attackers or held hostage by ransomware, it is often lost in other, more benign ways. Sometimes, organizations don't even know it's happening.
With the landscape changing and a heavier focus on data protection and privacy, stopping data loss is becoming more and more important. One of the ways to combat it is through the use of DLP tools.
In this article, we'll cover what DLP is and how organizations use DLP tools to stop data loss and oversharing.
What does DLP stand for?
DLP stands for Data Loss Prevention, and like the name suggests, it refers to ways to stop data from being lost. Where this can get confusing is the "loss" part. In this context, "loss" doesn't necessarily mean vanished, as in the data being gone; it can also refer to it going to the wrong location or the wrong people gaining access to it.
Usually, it would refer to one of these scenarios:
- The data is actually gone. Either it got deleted, got sent somewhere by accident with no way to track where it went, or it simply isn't showing up where it should for whatever reason and no one knows why because there isn't an audit trail.
- The wrong people gained access. This doesn't have to mean someone like a competitor got hold of proprietary company secrets. Any time someone has access they shouldn't, it falls under this category.
- External tools or applications have too many permissions. Oversharing with tools or applications is very common, especially when it's unclear what scopes they need to function properly. This can really become a problem if the app/tool is compromised.
DLP tools focus on these and other data loss scenarios and put safeguards in place to reduce the likelihood of them happening. These safeguards can range from light warnings and nudges, to altering permissions and transfers on your behalf, to forcing an admin to manually acknowledge and confirm specific actions.
There are many different tools and methods that can be used for data loss prevention. To keep it simple—and to match the acronym—we'll focus on three categories here: Detect, Limit, Protect.
Detect
The fundamental part of DLP is identifying data that is important or sensitive. Data that is classified as sensitive gets a different level of protection, and depending on whether the data is sensitive or not, specific actions are triggered.
So why not just do it for all data?
While some organizations choose to do this, it comes with some tradeoffs. The main ones are performance and efficiency. Scanning significantly slows down file transfers, and scanning files that are almost certainly low risk takes time and can lead to delays. It is also resource-intensive, which often comes with an associated cost, so scanning everything can essentially be flushing money down the drain when it isn't needed.
Because of this, DLP detection is usually precise and targeted around important and sensitive locations. Two frequently used detection methods that we'll cover here are file classification and data labels.
File classification
File classification is a proactive method of DLP that checks each document for specific characteristics. A common method is quickly checking for known patterns, such as social security numbers that would be in the format of XXX-XX-XXXX.
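As a rough illustration of pattern-based classification, here is a minimal Python sketch. The pattern table and the `classify_text` function are hypothetical names made up for this example, and real DLP tools use far larger and more carefully validated pattern sets than these two regular expressions.

```python
import re

# Hypothetical pattern table for illustration only.
PATTERNS = {
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),  # XXX-XX-XXXX
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
}

def classify_text(text: str) -> set:
    """Return the names of every pattern that appears in the text."""
    return {name for name, pattern in PATTERNS.items() if pattern.search(text)}

print(classify_text("Employee SSN: 123-45-6789"))   # → {'us_ssn'}
print(classify_text("Quarterly sales were up 4%"))  # → set()
```

A format match like this only says the text looks like a social security number. It can't confirm that it is one, which is why pattern checks are usually a first pass rather than the final word.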
One method of file classification checks files that are in transit, such as when they are being uploaded or transferred to a new directory. Before the file transfers to the final destination, it is scanned for common patterns (either a known list or ones chosen by the organization). Depending on what is found, the action may vary.
Typically, a file that doesn't match any pattern is allowed to upload, but in some workflows a certain pattern is expected, so files with other signatures could be denied instead. In some highly secure environments, a file that can't be classified can also be grounds for denial.
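The three situations above could be sketched as a single decision function. This is an assumption-heavy illustration: `allow_transfer`, its parameters, and the choice to deny any unexpected sensitive match are all made up for the example, and a real policy would likely offer more outcomes than allow or deny.

```python
from typing import Optional

def allow_transfer(found: set, expected: Optional[str] = None,
                   deny_unclassified: bool = False) -> bool:
    """Decide whether a scanned file may continue to its destination.

    found: classification names detected in the file (may be empty).
    expected: if set, only files carrying this classification are allowed.
    deny_unclassified: if True, files with no classification are rejected.
    """
    if expected is not None:
        # Workflow expects a specific pattern; anything else is denied.
        return expected in found
    if not found:
        # Nothing matched: allowed by default, denied in strict environments.
        return not deny_unclassified
    # Assumption for this sketch: an unexpected sensitive match is denied.
    return False

print(allow_transfer(set()))                              # nothing matched
print(allow_transfer({"invoice"}, expected="invoice"))    # expected pattern
print(allow_transfer(set(), deny_unclassified=True))      # strict environment
```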
Scanning software may also check data at rest. This is typically used to periodically review existing stores for files that fit specified classification types.
Data labels and sensitivity classification
Similar to file classification, data labels allow you to quickly apply a label to denote what type of content is in a file. Depending on the specific software you're using, what else these labels can do can vary drastically.
At their simplest, data labels define what's in a folder, such as "not sensitive" or "confidential". These can serve as a reminder to double-check what you allow to transfer into the folder.
With more advanced functionality, sensitivity classification and data labels limit where data with specific labels can be sent and who can access it. For instance, "confidential" may only be accessible by the C-suite, and data might not be allowed to be transferred or copied once the label is applied.
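To make the idea concrete, here is a small Python sketch of label-driven rules. The label names come from the examples above, but the roles, the `LabelPolicy` class, and both helper functions are hypothetical; this is not how any particular product models its labels.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class LabelPolicy:
    allowed_roles: frozenset  # roles that may open files with this label
    allow_copy: bool          # whether labeled files may be copied or moved

# Hypothetical policies keyed by label name.
POLICIES = {
    "not sensitive": LabelPolicy(frozenset({"employee", "manager", "executive"}), True),
    "confidential": LabelPolicy(frozenset({"executive"}), False),
}

def can_access(label: str, role: str) -> bool:
    """Unknown labels are denied by default."""
    policy = POLICIES.get(label)
    return policy is not None and role in policy.allowed_roles

def can_copy(label: str, role: str) -> bool:
    """Copying requires access and a label that permits copying."""
    return can_access(label, role) and POLICIES[label].allow_copy

print(can_access("confidential", "employee"))   # False
print(can_copy("confidential", "executive"))    # False
```

Denying unknown labels by default is a design choice in this sketch: it means a typo in a label name fails closed rather than open.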
Microsoft 365 (M365) offers one of the more widely used approaches to this through its sensitivity labels. They can be used to identify content, protect it through encryption, and restrict actions for specific users.
Limit
This section of DLP focuses on limiting who has access to data and where it's stored. In high-compliance industries like healthcare and finance, simply storing the data in an improper place can leave the organization in breach of compliance, even if they're unaware of it.
Putting limitations on data storage and transfer can make this process automatic instead of manual, which is especially useful for catch-22 situations where the person who would manually do the checking should never have had access to view the file in the first place.
Content filtering
Filtering checks the content that is scanned and then performs an action based on the file contents. Often this is used to either allow the file to continue to be uploaded/transferred, or to deny it.
So what does it mean if a file is "denied"? That will depend on the rules and process that go along with it. It might mean that it is sent to a quarantine zone for manual approval. It could mean that the file is deleted immediately. Or it could be routed to another endpoint. Whatever the specifics, the original, intended process is stopped and another takes precedence.
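The three outcomes described here could look something like the following Python sketch. The `DenyAction` names, the `handle_denied` function, and the directory parameters are all hypothetical, chosen only to mirror the quarantine, delete, and reroute options above.

```python
import shutil
from enum import Enum
from pathlib import Path
from typing import Optional

class DenyAction(Enum):
    QUARANTINE = "quarantine"  # hold for manual approval
    DELETE = "delete"          # remove immediately
    REROUTE = "reroute"        # send to a different endpoint

def handle_denied(file_path: Path, action: DenyAction,
                  quarantine_dir: Path, reroute_dir: Path) -> Optional[Path]:
    """Apply the configured action to a denied file.

    Returns the file's new location, or None if it was deleted.
    """
    if action is DenyAction.DELETE:
        file_path.unlink()
        return None
    target_dir = quarantine_dir if action is DenyAction.QUARANTINE else reroute_dir
    target_dir.mkdir(parents=True, exist_ok=True)
    # shutil.move returns the destination path.
    return Path(shutil.move(str(file_path), str(target_dir / file_path.name)))
```

In every branch the file never reaches its original destination, which is the point: the intended process stops and the configured one takes over.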
Filtering can also stop oversharing. Take the example of sensitivity labels above. Files that are transferred into a certain folder may trigger the "CONFIDENTIAL" label, which automatically disallows sharing. Any number of processes could be activated by filtering depending on the organization's needs and compliance requirements.
Manual approvals
Some organizations need to limit what can be transferred to sensitive or secure locations. One way of doing this is to have an authorized person manually approve every file that is added to a particular directory. If approved, the file can be uploaded, and if not, the file will be denied (again this could lead to any number of processes).
The same can apply to sharing files from that folder. Suppose that someone decides to share a file with a coworker, someone who shouldn't have access to the file. Even though the first person is authorized to share data in general, sharing from that particular folder might require oversight or approval. So while they can request to share a file, that request can be denied manually, which can help stop oversharing.
Manual approvals are a powerful lockdown tool, but they also require a lot of oversight, so they can become impractical at scale compared to automated filtering.
Protect
When files that shouldn't be there are sent through an organization's infrastructure, DLP settings might activate protection actions. These safeguard the organization by stopping potentially harmful files from entering the file infrastructure.
"Harm" can mean different things depending on the types of file contents involved. It could be directly damaging, or it may put the organization in breach of compliance.
Two common methods for this are malware scanning and data scrubbing.
Malware scanning
Just like it sounds, malware scanning checks for files that might be dangerous and could contain viruses or other malware.
Having some kind of malware protection is essential these days, especially with dangerous variants like ransomware that can completely cripple operations. While the file transfer infrastructure shouldn't be the only safeguard (and isn't for businesses that take security seriously), malware scanning at this level can work as an additional layer of protection.
Malware scanning often works by checking files for known malware risks sourced from a database that is updated regularly. Files that have patterns that match can be automatically denied. Depending on the intensity of the scans, this could go even further, like completely stopping files with elements that could be malware.
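One very simplified form of this is comparing a file's hash against a list of known-bad hashes. The Python sketch below assumes a hypothetical in-memory set standing in for the regularly updated database; real scanners also rely on byte signatures and heuristics, since an exact hash only catches files that are identical to a known sample.

```python
import hashlib

def sha256_of(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()

# Hypothetical stand-in for a regularly updated threat database.
KNOWN_BAD_HASHES = {sha256_of(b"pretend-malware-payload")}

def is_known_malware(data: bytes) -> bool:
    """True if the file's hash appears in the known-bad list."""
    return sha256_of(data) in KNOWN_BAD_HASHES

print(is_known_malware(b"pretend-malware-payload"))  # True
print(is_known_malware(b"harmless report"))          # False
```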
Malware scanning is an important part of protecting against data loss, but it sits more in the realm of cybersecurity than typical DLP, although there can be some overlap.
Scrubbing and masking
Sometimes, it's essential to work with files that have sensitive data like PII, but that data can't be exposed to certain people, apps, or environments. This is where data scrubbing and masking come in.
These data protection features allow files with certain signatures through the filters, but with the caveat of limiting access to specific parts. For instance, patient data used in research can have the patient's personal information anonymized while still retaining enough information to be valid for the research purposes.
So how does this work?
Data scrubbing removes personal data once the detection stage catches information that fits a sensitive category, like names, addresses, or phone numbers. The data is then "scrubbed" of this information. This means that the personal info may be redacted, or it could be replaced with something like hash marks. In this state, the file is still usable for its intended purpose, but the sensitive aspects have been scrubbed out.
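Here is a minimal Python sketch of both scrubbing styles, assuming the sensitive values have a predictable format. The two patterns and the `scrub` function are illustrative only; free-form data like names and addresses needs much more than a regular expression to detect reliably.

```python
import re

# Illustrative patterns for US-style SSNs and phone numbers.
SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
PHONE = re.compile(r"\b\d{3}-\d{3}-\d{4}\b")

def scrub(text: str, redact: bool = False) -> str:
    """Mask digits with hash marks, or replace the whole value if redact=True."""
    def replace(match):
        if redact:
            return "[REDACTED]"
        # Keep the separators so the field's shape stays recognizable.
        return re.sub(r"\d", "#", match.group())

    for pattern in (SSN, PHONE):
        text = pattern.sub(replace, text)
    return text

print(scrub("SSN 123-45-6789, phone 555-867-5309"))
# → SSN ###-##-####, phone ###-###-####
```

Masking with hash marks keeps the layout of the file intact, which can matter if another system parses it later. Full redaction hides even the format of the value.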
Hashing works similarly, only it transforms data cryptographically so it can still be used for things like matching and comparison, but can't feasibly be reversed. If someone were to get ahold of the file, they wouldn't be able to see the original, untransformed version, which is why hashing (usually combined with a salt) is a common way to store passwords and other highly sensitive data. In the context of DLP, data that matches specific signatures and comes across in plain text can be hashed automatically to avoid exposure.
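As a sketch of what automatic hashing might look like, the Python example below uses a keyed hash (HMAC with SHA-256) from the standard library. The key and the `pseudonymize` function are hypothetical. A keyed hash is used here because short, predictable values like ID numbers can be guessed by brute force if they are hashed without a secret.

```python
import hashlib
import hmac

# Hypothetical secret; in practice this would come from a secrets manager.
SECRET_KEY = b"replace-with-a-real-secret"

def pseudonymize(value: str) -> str:
    """Return a keyed SHA-256 digest of the value as 64 hex characters."""
    return hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()

token = pseudonymize("123-45-6789")
print(len(token))                             # 64
print(token == pseudonymize("123-45-6789"))   # True: same input, same token
```

Because the same input always produces the same token, records can still be matched and joined without exposing the original value. For passwords specifically, purpose-built functions such as `hashlib.scrypt` or `hashlib.pbkdf2_hmac` with a per-user salt are generally a better fit than a plain or keyed hash.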
DLP with Couchdrop
Data Loss Prevention can cover a number of different scenarios and include different feature sets. In Couchdrop, we now offer DLP through our Transfer Shield feature.
Transfer Shield stops sensitive data from ending up in the wrong place by watching your file transfers for it. It can detect and classify files, flag them for potential risks, and give you the controls to stay compliant. Transfer Shield scans files for specific signature types and can identify files that may contain data like banking information, personal information, and UUIDs. It can also recognize and deny file transfers based on file type, such as applications, images, and videos.
Transfer Shield is currently in beta and you can request to have it enabled in your account. To find out more, visit Transfer Shield, or you can request access now by visiting the Transfer Shield request page.