Introduction
Suspicious data modification on a resource is called a Data Anomaly. A user or malicious software can make such changes. For example, if a resource in your organization is under attack, malicious software on the resource can start modifying and deleting the files present on it. A resource is a device, a server, or a SharePoint site where data is stored.
❗ Important
Data Anomalies displays insights about the data protected only for the following resources:
Endpoints
Microsoft 365 - OneDrive
Microsoft 365 - SharePoint
NAS
File Servers (Windows/Linux)
VMware (Windows and Linux, 64-bit)
Data is displayed for up to the last 30 days.
When such a potential threat manipulates the data on a resource, the activity is suspicious in nature and unlike how the resource owner normally works with data on that resource. Because anomalies of this type often indicate issues that require attention, Druva flags any such anomalous behavior on a resource and generates an alert.
Prerequisites for VMware Data Anomalies
If you are using Data Anomalies for virtual machines, ensure that the following prerequisites are met:
VMware tools are installed and enabled on the virtual machine. For more information, see Install and Upgrade VMware tools.
Keep the guest OS credentials handy, as you need to provide these details:
For Windows virtual machines: The user credentials provided must have administrator privileges or access rights
For Linux virtual machines: The user credentials provided must have either root privileges or sudo user access rights. For more information about configuring and managing sudo user credentials, see Manage credentials for VMware servers.
The Data Anomalies algorithm requires proxy version 7.0.0::r438902 or later to detect anomalous file actions such as bulk file creation, deletion, and modification.
The Data Anomalies algorithm requires proxy version 7.0.2::r518961 or later to detect anomalous file encryption actions. Contact Support for assistance related to anomalous encryption file actions and alerts.
For Windows virtual machines: Enable the USN journal for each drive, with enough storage allocated.
The default USN journal size for most Windows versions is 32 MB, which is insufficient for Data Anomalies on large virtual machines. Druva recommends the following USN journal sizes for different disk sizes:
| File Count | Disk Size | Maximum Size |
| --- | --- | --- |
| Files > 10 million | 500 GB | 2 GB |
| Files > 5 million | 200 GB | 1 GB |
| Files > 2 million | 50 GB | 512 MB |
| Files > 1 million | 10 GB | 256 MB |
To increase the USN journal size manually, see the Microsoft documentation.
For Linux virtual machines: The inotify maximum watches limit must be greater than the number of directories on the virtual machine.
For Linux virtual machines: One of the following file system types must be present on the virtual machine: xfs, ext4, or ext3.
For more information about the software requirements for VMware, see the Support matrix for VMware.
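The Linux inotify prerequisite above can be checked with a short script. This is an illustrative sketch, not a Druva tool: the function names are hypothetical, and it assumes a Linux guest where the watch limit is exposed at /proc/sys/fs/inotify/max_user_watches.

```python
# Sketch: check the inotify prerequisite for Linux virtual machines.
# Assumes a Linux guest OS; function names are illustrative only.
import os


def count_directories(root):
    """Count directories under root (each directory needs an inotify watch)."""
    total = 0
    for _dirpath, dirnames, _filenames in os.walk(root):
        total += len(dirnames)
    return total + 1  # include the root directory itself


def inotify_limit(proc_file="/proc/sys/fs/inotify/max_user_watches"):
    """Read the current inotify watch limit from /proc (Linux only)."""
    with open(proc_file) as f:
        return int(f.read().strip())


def prerequisite_met(root="/"):
    """True when the watch limit exceeds the number of directories."""
    return inotify_limit() > count_directories(root)
```

If the limit is too low, an administrator can typically raise it with `sysctl fs.inotify.max_user_watches=<value>` and persist the value in /etc/sysctl.conf.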
Ensure that the following URLs are whitelisted and allowed for a successful VMware Data Anomalies scan:
*s3.amazonaws.com/*
s3-*.amazonaws.com
s3*.*.amazonaws.com
Support matrix for VMware Data Anomalies
The following are the supported Windows versions for VMware Data Anomalies:
Windows 10 (32 and 64-bit)
Windows Server 2012 (64-bit)
Windows Server 2016 (64-bit)
Windows Server 2019 (64-bit)
Windows Server 2022 (64-bit)
The following are the supported Linux (64-bit) versions for VMware Data Anomalies:
Red Hat Enterprise Linux (RHEL) 7.0, 7.1, 7.2, 7.3, 7.4, 7.5
CentOS 7.0, 7.1, 7.2, 7.3, 7.4, 7.5
Ubuntu 16.04, 18.04
Things to Consider
The following are a few limitations that you should know before using Data Anomalies for VMware:
Error: Data Anomalies scan fails with the following error: Invalid pid for Guest VM execution.
Description: This error is observed in the following scenarios:
- The glibc library version of the guest virtual machine is lower than 2.14.
- The default SELinux restriction enforced by Red Hat. This is specifically observed for the SELinux policy version selinux-policy-3.13.1-268.el7_9.2.noarch.
Workaround: To resolve this issue, do the following:
- Upgrade the glibc library version of the guest virtual machine to 2.14 or later.
- To bypass the SELinux restriction enforced by Red Hat, perform the following steps:
  1. Check the SELinux status:
     # sestatus
  2. Set the SELinux policy to permissive:
     # setenforce 0
  3. To persist the enforcement policy, update the SELinux configuration file:
     # sudo vi /etc/sysconfig/selinux
For more information, see the Red Hat documentation.
Data Anomalies for VMware - Linux: A Modified alert displays an event count when only file permissions change, without any modification to the file contents. You can safely ignore those events.
Data Anomalies for VMware - Windows: When you delete files, the Data Anomalies scan cannot find file metadata from the USN journal or Windows for the given file ID. The scan displays the timestamp for such files as the scan launch timestamp.
How does Druva detect Data Anomalies
Druva’s automated intelligence analyzes and monitors the data activity trend for a given resource, and after a sufficient sample size, it builds the anomaly baseline. An alert is automatically generated and reported in case of any anomalous activity.
What do we mean by baseline?
In the Data Anomalies feature context, a baseline refers to the expected pattern of data behavior over a specific period. It serves as a reference point or benchmark against which you can detect deviations or anomalies.
Step 1 Learning period: In this step, Druva performs a data backup pattern analysis. See Data backup pattern analysis period.
Step 2 Data Anomalies detection process: In this step, Druva checks the backed-up files to detect anomalous file actions such as creation, update, deletion, and encryption.
❗ Important
For VMware resources, backup and the Data Anomalies detection process run simultaneously.
Step 3: Generate and send a Data Anomalies alert: If any data anomalous activity is detected, a Data Anomalies alert is sent.
Following are the algorithm input parameters that Druva requires and uses to analyze the data activity trend and generate alerts in case of any suspicious data activity:
Data backup pattern analysis period for resources - Endpoints, File Server, NAS, VMware, Microsoft 365 (OneDrive and SharePoint): Displayed in Days or Snapshots
Number of files in a snapshot: A minimum number of files required within a snapshot to initiate Data Anomalies learning and scanning.
💡 Tip
If the total number of files in a snapshot is less than the minimum number of files, then that snapshot is not scanned for Data Anomalies detection.
Deviation in the files from the baseline and total files in a snapshot: Percentage deviation threshold compared to the baseline and total files in a snapshot required to qualify as anomalous data.
Recommended Data Anomalies settings
Following are the recommended Data Anomalies settings for the analysis period, for starting encryption checks on a resource, and for generating Data Anomalies alerts.
❗ Important
We recommend that you keep the default Recommended Data Anomalies Settings if you are not sure about the data backup pattern of your organization.
Data backup pattern analysis period for resources
30 days (For Endpoints and OneDrive): The default and recommended setting for Endpoints and OneDrive data backup pattern analysis. The Data Anomalies detection for Endpoints and OneDrive will start only if data has been successfully backed up for the past 30 days. The permissible settings for days or snapshots are between 2 and 45.
30 days (For File Server/NAS/VMware/SharePoint): The default and recommended setting for File Server/NAS/VMware/SharePoint data backup pattern analysis. The Data Anomalies detection for File Server/NAS/VMware/SharePoint will start only if data has been successfully backed up for the past 30 days. The permissible settings for days or snapshots are between 2 and 45.
100 or more files in a snapshot: The default and recommended setting for the minimum required files in a snapshot of a resource to initiate Data Anomalies detection for resources - Endpoints, OneDrive, File Server, NAS, VMware, and SharePoint. The permissible setting for a minimum count of files is between 20 and 500.
75% of baseline in the snapshot: The default and recommended maximum deviation setting for file actions (Create, Update, and Delete) in a snapshot of a resource. A Data Anomalies alert is generated if the observed deviation exceeds the set baseline value. The permissible setting for the baseline is between 50 and 99%.
% of the total files in a snapshot: The default and recommended setting for the minimum change in the count of files out of the total files in a snapshot to generate Data Anomalies alert.
Endpoints, File Server, NAS, and OneDrive: 70% of the total files in a snapshot
VMware and SharePoint: 20% of the total files in a snapshot
The permissible setting for a minimum change in the count of files is between 5 and 90% for all of these resources (Endpoints, File Server, NAS, OneDrive, VMware, and SharePoint).
Both the deviation-from-baseline condition and the minimum-percent-of-total-files condition must be met for a Data Anomalies alert to be generated.
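The alert conditions above can be sketched as a small check. This is an illustrative reading only: the function and parameter names are hypothetical, and the exact deviation formula (here, the action count exceeding baseline × maximum deviation, following the wording used in the worked example in this article) is an assumption rather than Druva's published algorithm.

```python
# Illustrative sketch of the Data Anomalies alert conditions.
# The names and the deviation formula are assumptions, not Druva's code.

def anomaly_alert(action_count, baseline, total_files,
                  min_files=100, max_deviation=0.75, min_percent_change=0.70):
    """Return True when one file action (create/update/delete) should alert.

    action_count  - files affected by one action type in this snapshot
    baseline      - learned baseline for that action type
    total_files   - total files in the snapshot
    The default thresholds mirror the recommended settings for Endpoints.
    """
    if total_files < min_files:
        return False  # snapshot too small; not scanned for anomalies
    exceeds_baseline = action_count > baseline * max_deviation
    exceeds_total_share = action_count > total_files * min_percent_change
    # Both conditions must hold for an alert to be generated.
    return exceeds_baseline and exceeds_total_share
```

For example, with a baseline of 10, a 50% maximum deviation, and a 20% minimum change, 100 created files out of 120 would raise an alert, while 3 created files would not.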
You can use the Data Anomalies Settings > Edit option to customize and update the Data Anomalies configuration settings as per your organizational requirements and if you are aware of the data backup patterns.
❗ Important
If you have selected snapshots as your data backup pattern learning period criteria, ensure that the learning duration is completed within 45 days.
The following table explains the Data Anomalies behavior for Endpoints, OneDrive, and File Server/NAS/VMware/SharePoint resources:
❗ Important
The first backup is not considered for Data Anomalies detection.
Example
Scenario: Data Anomalies is enabled for a resource that has a total of 500 files, with the following Data Anomalies settings.

| Backup pattern learning period | Minimum number of files required in a snapshot for Data Anomalies detection | Maximum deviation | Minimum percent of total file change |
| --- | --- | --- | --- |
| 05 snapshots | 125 | 50% | 20% |
The following example explains the Data Anomalies behavior using the Data Anomalies settings mentioned in the table above.
For the first backup, 500 files were backed up. Being the first backup, it is excluded by the Data Anomalies algorithm.
Let's consider subsequent backups in the following trend:
| Snapshot # | Created | Modified | Deleted |
| --- | --- | --- | --- |
| 2 | 20 | 5 | 8 |
| 3 | 12 | 7 | 1 |
| 4 | 0 | 0 | 10 |
| 5 | 0 | 0 | 0 |
| 6 | 5 | 0 | 8 |
We have a total of 520 files after the 6th backup. The learning duration (05 snapshots) is now complete, so Data Anomalies detection starts, and alerts can be generated if an anomaly is found.
Now, the baseline is as follows:
Baseline for creation = maximum number of new files created across the learning-duration snapshots, that is, max(20, 12, 0, 0, 5) = 20
Baseline for modification/update = maximum number of modified/updated files across the learning-duration snapshots, that is, max(5, 7, 0, 0, 0) = 7
Baseline for deletion = maximum number of deleted files across the learning-duration snapshots, that is, max(8, 1, 10, 0, 8) = 10
The baseline for creation, modification, and deletion is 20, 7, and 10, respectively.
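As a quick sketch, each baseline is simply the maximum count over the learning window. Using the example numbers:

```python
# Baselines from the example's five learning snapshots (snapshots 2-6).
created  = [20, 12, 0, 0, 5]
modified = [5, 7, 0, 0, 0]
deleted  = [8, 1, 10, 0, 8]

# Each baseline is the maximum observed count for that action type.
baseline = {action: max(counts) for action, counts in
            [("create", created), ("update", modified), ("delete", deleted)]}
print(baseline)  # {'create': 20, 'update': 7, 'delete': 10}
```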
Let's proceed with the next round of backups in the following trend:
| Snapshot # | Created (Baseline for Creation) | Modified (Baseline for Modification) | Deleted (Baseline for Deletion) | Total files in last backup |
| --- | --- | --- | --- | --- |
| 7 | 10 (20) | 5 (7) | 7 (10) | 520 |
| 8 | 2 (Max of 12, 0, 0, 5, 10 = 12) | 1 (Max of 7, 0, 0, 0, 5 = 7) | 2 (Max of 1, 10, 0, 8, 7 = 10) | 523 |
| 9 | 100 (Max of 0, 0, 5, 10, 2 = 10) | 0 (Max of 0, 0, 0, 5, 1 = 5) | 8 (Max of 10, 0, 8, 7, 2 = 10) | 523 |
| 10 | 80 (Max of 0, 5, 10, 2, 100 = 100) | 4 (Max of 0, 0, 5, 1, 0 = 5) | 10 (Max of 0, 8, 7, 2, 8 = 8) | 615 |
| 11 | 0 (Max of 5, 10, 2, 100, 80 = 100) | 12 (Max of 0, 5, 1, 0, 4 = 5) | 8 (Max of 8, 7, 2, 8, 10 = 10) | 685 |
| 12 | 5 (Max of 10, 2, 100, 80, 0 = 100) | 50 (Max of 5, 1, 0, 4, 12 = 12) | 70 (Max of 7, 2, 8, 10, 8 = 10) | 677 |
| 13 | 200 (Max of 2, 100, 80, 0, 5 = 100) | 0 (Max of 1, 0, 4, 12, 50 = 50) | 0 (Max of 2, 8, 10, 8, 70 = 70) | 612 |
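The parenthesized baselines in the table are a rolling maximum over the previous five snapshots. A minimal sketch for the Created column, using the example counts:

```python
# Created-file counts per snapshot from the example (snapshots 2-13).
created = {2: 20, 3: 12, 4: 0, 5: 0, 6: 5, 7: 10,
           8: 2, 9: 100, 10: 80, 11: 0, 12: 5, 13: 200}


def rolling_baseline(series, snapshot, window=5):
    """Baseline for a snapshot: max over the preceding `window` snapshots."""
    return max(series[s] for s in range(snapshot - window, snapshot))


for snap in range(7, 14):
    print(snap, rolling_baseline(created, snap))
# Creation baselines: 7->20, 8->12, 9->10, 10->100, 11->100, 12->100, 13->100
```

The same function applied to the Modified and Deleted columns reproduces the other baselines in the table.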
At the 9th snapshot, a creation alert is generated: 100 files are created, and all three required conditions are met:
Total number of files > minimum number of files required, that is, 125
Number of files created (100) > baseline for creation (10) × maximum deviation
New files created > minimum percent of total files change
Similarly, at the 12th snapshot, modification and deletion alerts are generated as all three required conditions are met for both.
Administrators can take action based on the security policies of the organization to identify and isolate a possible threat and prevent additional losses.
❗ Important
Anomaly detection kicks in only after the backup job is complete and a snapshot is created. For incomplete backup jobs or interrupted backup jobs, no anomalous behavior is tracked.
View Data Anomalies alerts
📝 Note
In the case of deleted resources (devices, sites, and backupsets), you cannot view the alerts for those resources. However, you can retrieve the deleted resources and view their alerts with the Rollback Action option.
Log in to the Management Console and go to Cyber Resiliency > Posture & Observability > Data Anomalies > Anomalies tab to view Data Anomalies details.
Take action on an alert
For any Data Anomalies alert, you can do any of the following:
Ignore the alert: If you deem an alert a false positive, click the resource name and select the false-positive alert. Click Ignore to resolve the alert.
Quarantine the resource: Select an alert and click Quarantine Resource to stop the ransomware from spreading further. Before you quarantine, see Know the impact of quarantining to learn more about the effects of quarantining the resource. To learn about the options to quarantine a resource, see Quarantine Response.
You can also download the logs for a particular alert and use them for further inspection.
📝 Note
For each backup of all workloads, you can download logs for up to 1.5 million files.
The downloaded logs provide information about the following:
File Name: Name of the file.
Full Path: Path of the file.
File Type: The type of the file. For example, .txt.
File Size (Bytes): Size of the file in bytes.
File Modified Timestamp: The date and time when the file was last modified.
Operation: The operation performed on the file. For example, file created, file modified, file deleted, or files encrypted.
SHA1 Checksum (Only for Endpoints, OneDrive, and SharePoint): The SHA1 checksum value of the file.
File Owner (Only for Endpoints, OneDrive, and SharePoint): The details of the file owner.
File Created Timestamp (Only for Endpoints, OneDrive, and SharePoint): The date and time when the file was created.
File Modified By (Only for OneDrive and SharePoint): The user who last modified the file.
Alert Reason: The reason for encryption alerts.
❗ Important
In case of encryption, the downloaded logs will contain details for a maximum of 100 encrypted files.
After you have taken an action, the status of the alert changes to Resolved.
Related Keywords:
Unusual Data Activity
UDA
unusualdataactivity
Data Anomalies
dataanomalies
data anomaly
Data Anomaly