Use the Scanner CLI utility
The Scanner CLI utility allows you to analyze the file system or NAS shares and provides insight into the file and directory structure. Before you run the Scanner CLI utility for the NAS shares, mount the NAS shares manually.
This utility is bundled with the Enterprise Workloads agent. When you install the latest version of the Enterprise Workloads agent, you get access to the Scanner CLI utility. You can get information such as the number of folders and files present, directory and file level, and data change rate.
When you run the Scanner CLI utility for the first time, a full scan is performed. All subsequent scans can be incremental or full based on the configuration parameter specified in the configuration file. You will notice a significant improvement in the incremental scans performed after the first full scan.
For NAS, the Scanner CLI utility uses Advanced Smart Scan. For File Server, this utility uses Folderwalk scan.
You can run the Scanner CLI utility using:
Command line interface procedure
Run one of the following commands:
scanner-cli.exe <Configuration file path>
Or
scanner-cli.exe <Directory path for analysis> <Directory in which to create the output files>
If you use this command, the default parameters are used. To override these parameters, create a configuration file and run the utility with it.
You can find the Scanner CLI utility at the following locations:
| Operating system | Agent version prior to 7.0.0 | Agent version 7.0.0 and later |
| Windows | C:\ProgramData\Phoenix | C:\Program Files\Druva\EnterpriseWorkloads |
| Linux | /opt/Druva/Phoenix/bin | /opt/Druva/EnterpriseWorkloads/bin |
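The two invocation forms above can also be assembled from a script. The following is a minimal Python sketch under stated assumptions, not part of the product: the helper name build_scanner_command and the example configuration path are hypothetical, and the executable is assumed to be named scanner-cli.exe inside the Windows location listed for agent version 7.0.0 and later.

```python
from pathlib import PureWindowsPath

# Assumption: the executable sits directly in the install location listed in
# the table above (Windows, agent version 7.0.0 and later). Adjust as needed.
SCANNER_EXE = str(
    PureWindowsPath(r"C:\Program Files\Druva\EnterpriseWorkloads") / "scanner-cli.exe"
)


def build_scanner_command(config_file=None, scan_dir=None, output_dir=None, exe=SCANNER_EXE):
    """Return the argument list for one of the two documented invocation forms.

    Hypothetical helper for illustration only.
    """
    if config_file is not None:
        if scan_dir is not None or output_dir is not None:
            raise ValueError("Pass either a configuration file or a directory pair, not both.")
        # Form 1: scanner-cli.exe <Configuration file path>
        return [exe, config_file]
    if scan_dir is None or output_dir is None:
        raise ValueError("Both the directory to analyze and the output directory are required.")
    # Form 2: scanner-cli.exe <Directory path for analysis> <Output directory>
    return [exe, scan_dir, output_dir]


if __name__ == "__main__":
    # Hypothetical configuration file path.
    print(build_scanner_command(config_file=r"C:\scanner\config.yml"))
```

The returned list can be passed to subprocess.run on a host where the agent is installed. Building a list rather than a single string avoids quoting problems with paths that contain spaces, such as C:\Program Files.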
Scanner CLI using configuration file (YAML)
Perform the following:
Create a configuration file by copying the following snippet to a text file and saving it in the YAML format. Alternatively, download the FS and NAS sample config.yml from Sample files.
root_paths: [Z:\]
fset_dir: Z:\
scan_worker_count: 50
sqlite_n_conns: 8
results_threshold: 10000
results_file: C:\results
processed_data_file: C:\ProcessedDataFile
db_file_path: C:\sqlite.db
use_usn: false
smart_scan: false
force_scan: false
ss_age_threshold: 0
skip_acl: false
statemap: false
log_file: C:\ScannerLog.log
advss: true
debug: 0
username: " "
password: " "
sharename: " "
filters:
exclude_folders: ["/proc", "/sys", "/dev", "/tmp", "/lost+found", "/etc/Phoenix", "/var/Phoenix", "/selinux", "$Recycle.Bin", "ProgramData", "Recovery", "System Volume information", "RECYCLER",
"C:\\Program Files (x86)", "C:\\Program Files", "C:\\Windows", ".snapshot"]
exclude_extensions: ""
include_extensions: ""
root_paths: Specify the absolute or full path of the directories that you want to scan. For NAS, the root path should be the path where the share is mounted.
Default: [ ]
fset_dir: Specify the drive letter.
For Windows and CIFS share, the fset directory is '<Drive letter:\>'.
For Linux, the fset directory is '/'.
For SMB and NFS share, the fset directory is the mount path of the share. For example, if the SMB share is mounted on Z:\, then the fset directory will be Z:\.
Default: NA
This is a mandatory parameter.
scan_worker_count: The number of threads that are to be used for scanning.
Default: 50
sqlite_n_conns: The number of connections to be established with SQLite.
Default: 8 (recommended)
results_threshold: Minimum batch size using which the output will be displayed on the console when the utility is run.
Default: 10000 (recommended)
results_file: Specify the location of the results file, which will contain information about the changed data. A timestamp is appended to the results file name after each Scanner CLI utility run.
ResultsFile_<FsetDir>_<Timestamp>
use_usn: Windows USN
Default: false
force_scan: Set to 'true' if you want to run a full scan forcefully instead of an incremental scan. The recommended value is 'false'. This is not applicable for the first full scan.
Default: false
skip_acl: Set to 'true' to skip detecting Access Control List (ACL) changes. This is not applicable for the first full scan.
Default: false
log_file: Specify the location where you want to save the scanner log files.
Default: ScannerLog_<FsetDir>.log
advss: Run the utility with Advanced Smart Scan. This option applies only to NAS devices. The default value is 'true'. For File Server, the value is set to 'false'. If Advanced Smart Scan fails, then the folder walk scan is performed.
debug: Enable debug logs. The default value is '0'. You can set the value to '9' to print debug logs.
username: Username for the NAS device.
password: Password for the NAS device.
sharename: Name of the share for which the Scanner CLI utility is run.
statemap: Set to false.
Default: false
📝 Note
This improves scan performance. If you are planning to run an incremental scan after the first full scan, this parameter needs to be set to 'true' for all runs, including the first full run.
db_file_path: Specify the location of the file which will be used to store the persistent state of the scanner.
Default: DBFile_<FsetDir>.db
exclude_folders: List of folders to be excluded from the scan. For example,
exclude_folders: [dev, /proc, /etc, Phoenix]
Default: ["/proc", "/sys", "/dev", "/tmp", "/lost+found", "/etc/Phoenix", "/var/Phoenix", "/selinux", "$Recycle.Bin", "ProgramData", "Recovery", "System Volume information", "RECYCLER", "C:\\Program Files (x86)", "C:\\Program Files", "C:\\Windows", ".snapshot", ".Snapshot", ".SNAPSHOT"]
exclude_extensions: List of file extensions to be excluded from the scan. The extensions must be separated by a semicolon. For example,
exclude_extensions: "*.log;*.bat"
Default: " " (no extensions are excluded)
include_extensions: List of file extensions to be included in the scan. The extensions must be separated by a semicolon. For example,
include_extensions: "<semicolon-separated extensions>"
If a file extension is added to the include list and exclude list, then the file extension will be excluded as the exclusion takes precedence over inclusion.
Default: " " (no extensions are included)
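The precedence rule for the extension filters can be sketched in a few lines of Python. This is an illustration of the rule described above, not the utility's actual implementation. The function names are hypothetical, matching is assumed to be case-insensitive, and an empty include list is assumed to mean that no include filter is applied.

```python
from fnmatch import fnmatchcase


def split_patterns(value):
    """Split a semicolon-separated pattern string such as "*.log;*.bat"."""
    return [part.strip().lower() for part in value.split(";") if part.strip()]


def passes_extension_filters(filename, include_extensions="", exclude_extensions=""):
    """Return True if the file name passes the include and exclude filters.

    Hypothetical helper; case-insensitive matching is an assumption.
    """
    name = filename.lower()
    # Exclusion takes precedence over inclusion.
    if any(fnmatchcase(name, pattern) for pattern in split_patterns(exclude_extensions)):
        return False
    includes = split_patterns(include_extensions)
    # Assumption: an empty include list means every non-excluded file is scanned.
    if not includes:
        return True
    return any(fnmatchcase(name, pattern) for pattern in includes)


if __name__ == "__main__":
    # The same extension appears in both lists, so the file is excluded.
    print(passes_extension_filters("app.log", "*.log", "*.log"))
```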
📝 Note
You can customize the values of the above parameters by specifying them in the YAML file.
Ensure that the parameter fset_dir: <Specify fset directory path> is included in the YAML file, as it is mandatory.
If the results_file, processed_data_file, log_file, and db_file_path parameters are not defined in the YAML file, the utility will automatically create files with default names in the directory where it is executed.
📝 Note
The statistics in the ProcessedDataFile are applicable only to the first run.
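Before running the utility, the rules from the notes above can be checked against a configuration that has already been loaded into a dictionary, for example with a YAML parser of your choice. The following Python sketch is a hypothetical pre-flight check, not a feature of the Scanner CLI utility. The function name check_config and the plan_incremental flag are assumptions introduced for illustration.

```python
# Parameters that fall back to default file names in the directory where the
# utility is executed when they are not defined in the YAML file.
OPTIONAL_FILE_PARAMS = ("results_file", "processed_data_file", "log_file", "db_file_path")


def check_config(config, plan_incremental=False):
    """Return (errors, notes) for a configuration given as a plain dict.

    Hypothetical helper for illustration only.
    """
    errors, notes = [], []
    # fset_dir is the only parameter documented as mandatory.
    if not config.get("fset_dir"):
        errors.append("fset_dir is mandatory and must be set")
    # statemap must be true on all runs if incremental scans are planned.
    if plan_incremental and config.get("statemap") is not True:
        errors.append(
            "statemap must be true on every run, including the first full run, "
            "when incremental scans are planned"
        )
    for name in OPTIONAL_FILE_PARAMS:
        if name not in config:
            notes.append(f"{name} not set: a file with a default name is created")
    return errors, notes


if __name__ == "__main__":
    errors, notes = check_config({"fset_dir": "Z:\\", "statemap": False}, plan_incremental=True)
    for message in errors + notes:
        print(message)
```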
Download and install the latest version of the Enterprise Workloads agent from the Downloads page.
On Linux, increase the file descriptor (FD) limit by using the following command:
ulimit -n 65000
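Because ulimit changes the limit only for the current shell and the processes started from it, it can be useful to confirm the value from the same session before launching a scan. The following Python sketch uses the standard resource module, which is available on Unix-like systems only. The function name fd_limit_sufficient is hypothetical.

```python
import resource  # Unix-only standard library module

REQUIRED_FD_LIMIT = 65000  # value used in the ulimit command above


def fd_limit_sufficient(required=REQUIRED_FD_LIMIT):
    """Return True if the soft limit on open file descriptors is at least `required`."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # RLIM_INFINITY means that no limit is enforced.
    return soft == resource.RLIM_INFINITY or soft >= required


if __name__ == "__main__":
    if fd_limit_sufficient():
        print("File descriptor limit is sufficient.")
    else:
        print("Run 'ulimit -n 65000' in this shell before starting the scan.")
```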
Review scan result
Once the scan is complete:
A result file is generated at the location specified in the configuration file. This result file contains the following information about the changed data:
ChangeType - Indicates the type of change, such as file added, file modified, file deleted.
ItemType - Indicates the type of item: 'F' indicates a file, 'D' indicates a directory, and 'L' indicates a link.
Mode - Indicates the Standard OS File Mode (uint32).
MTime - Indicates the modification time of the file or the folder.
Size - Indicates the size of the file in bytes.
Path - Indicates the full path of the file.
A log file is generated at the location specified in the configuration file.
An output file (processed_data_file) with the formatted data is generated that contains the following telemetry information.
Scanned directory: Z:\
Include path(s): [Z:\perfdata\10M_ZeroSize-Accel\800k_exl]
Exclude folders: /proc, /sys, /dev, /tmp, /lost+found, /etc/Phoenix, /var/Phoenix, /selinux, /var/log/Phoenix, /etc/*/EnterpriseWorkloads, /var/*/EnterpriseWorkloads, /var/log/*/EnterpriseWorkloads, $Recycle.Bin, ProgramData, Recovery, System Volume information, RECYCLER, C:\Program Files (x86), C:\Program Files, C:\Windows, .snapshot
Exclude extensions: NA
Include extensions: NA
Summary
Total Count (files and folders): 800767
Directories/Folders Count: 767
Files Count: 800000
Softlink Files Count: 0
Total Size of the files: 0 Bytes, or 0 B
Average file size: 0 Bytes, or 0 B
Directory modification age distribution:
Age distribution Count Count %
0-90 Days 0 0.00 %
90-180 Days 0 0.00 %
180-270 Days 0 0.00 %
270 Days-1 Year 0 0.00 %
1-2 Years 0 0.00 %
> 2 Years 0 0.00 %
Total Folders Count: 767
File size distribution:
Size distribution Count Count % Size Size % Avg Size
0-1KB 800000 100.00 % 0 B 0.00 % 0 B
>1-10KB 0 0.00 % 0 B 0.00 % 0 B
>10-100KB 0 0.00 % 0 B 0.00 % 0 B
>100KB-1MB 0 0.00 % 0 B 0.00 % 0 B
>1-16MB 0 0.00 % 0 B 0.00 % 0 B
>16MB 0 0.00 % 0 B 0.00 % 0 B
File modification age distribution:
Age distribution Count Count % Size Size %
0-90 Days 0 0.00 % 0 B 0.00 %
90-180 Days 0 0.00 % 0 B 0.00 %
180-270 Days 0 0.00 % 0 B 0.00 %
270 Days-1 Year 0 0.00 % 0 B 0.00 %
1-2 Years 800000 100.00 % 0 B 0.00 %
> 2 Years 0 0.00 % 0 B 0.00 %
Total Files Count: 800000
Extensions list sorted by files count:
Large ext: Files with >=5 chars filename extension
No ext : files with no extension to the filename
File extension Count Count % Size Size %
.COB 365973 45.75 % 0 B 0.00 %
.RC 126754 15.84 % 0 B 0.00 %
.H 126251 15.78 % 0 B 0.00 %
.XML 56317 7.04 % 0 B 0.00 %
.SQL 30456 3.81 % 0 B 0.00 %
.GNT 25644 3.21 % 0 B 0.00 %
.XAML 22269 2.78 % 0 B 0.00 %
.DES 7230 0.90 % 0 B 0.00 %
.K1 6674 0.83 % 0 B 0.00 %
.SQB 5827 0.73 % 0 B 0.00 %
.DSC 5643 0.71 % 0 B 0.00 %
.DAT 5462 0.68 % 0 B 0.00 %
.C 3837 0.48 % 0 B 0.00 %
.K2 3112 0.39 % 0 B 0.00 %
.K3 1588 0.20 % 0 B 0.00 %
.K4 829 0.10 % 0 B 0.00 %
.ZIP 779 0.10 % 0 B 0.00 %
.DOC 501 0.06 % 0 B 0.00 %
.PDF 448 0.06 % 0 B 0.00 %
No Ext 426 0.05 % 0 B 0.00 %
.LIB 421 0.05 % 0 B 0.00 %
Large Ext 28 0.00 % 0 B 0.00 %
Extensions list sorted by the size of files:
File extension Count Count % Size Size %
.DIR 56 0.01 % 0 B 0.00 %
.log 29 0.00 % 0 B 0.00 %
.INT 7 0.00 % 0 B 0.00 %
.K3 1588 0.20 % 0 B 0.00 %
.DES 7230 0.90 % 0 B 0.00 %
.db 2 0.00 % 0 B 0.00 %
.INI 43 0.01 % 0 B 0.00 %
.IDX 7 0.00 % 0 B 0.00 %
.SH 28 0.00 % 0 B 0.00 %
.OBJ 47 0.01 % 0 B 0.00 %
.dat 18 0.00 % 0 B 0.00 %
.SQL 30456 3.81 % 0 B 0.00 %
.WPM 100 0.01 % 0 B 0.00 %
.PDF 448 0.06 % 0 B 0.00 %
.K8 63 0.01 % 0 B 0.00 %
.CLI 48 0.01 % 0 B 0.00 %
.HLP 149 0.02 % 0 B 0.00 %
.H 126251 15.78 % 0 B 0.00 %
.LIB 421 0.05 % 0 B 0.00 %
.GNT 25644 3.21 % 0 B 0.00 %
No Ext 426 0.05 % 0 B 0.00 %
Large Ext 28 0.00 % 0 B 0.00 %
Average Width: 1042 (average number of files in each directory)
Average Depth: 9 (average directory depth)
Maximum Depth: 10 (max directory depth found during the scan)
Maximum Width: 7491 (max number of files found in a single directory)
Scanning Rate: 7357 (files scanned per second)
Scanning Time: 108 (total scan time in seconds)
Scanned directory: Directory path for analysis.
Include path(s): Path(s) to include under scanned directory.
Exclude folders: Shows the folders to be excluded.
Exclude extensions: Shows the extensions to be excluded.
Include extensions: Shows the extensions to be included.
Total Count (files and folders): Shows the total number of files and folders.
Directories/Folders Count: Shows the count of all the folders/directories.
Files Count: Shows the total count of files.
Softlink Files Count: Shows the count of soft link files.
Total Size of the files: Shows the total size of all the files in the directory.
Average file size: Shows the average size of a file in the directory.
File size distribution: Shows the size distribution of files in a backup set. For example, [0-1KB : 1] indicates that a single file with a size between 0 and 1 KB was encountered during the scan.
File modification age distribution: Distribution of files according to their modification age.
Directory modification age distribution: Distribution of directories according to their modification age.
Extensions list sorted by files count: Shows the list of file extensions sorted by file count.
Large ext: Shows the files that have greater than or equal to five characters in the filename extensions.
No ext: Shows the files that have no extensions to the filename.
Extensions list sorted by the size of files: Shows the list of file extensions sorted by file size.
Average Width: Shows the average number of files in each directory.
Average Depth: Shows the average depth of the directory tree.
Maximum Depth: Shows the maximum depth of the directory tree during the scan.
Maximum Width: Shows the maximum number of files found in a single directory.
Scanning Rate: Shows the rate (in files per second) with which the files were scanned.
Scanning Time: Shows the total scan duration (in seconds).
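The "Key: number" lines of the output file lend themselves to simple post-processing, for example to compare several scans. The following Python sketch collects those numeric fields into a dictionary. It assumes that the file follows the layout of the sample output shown earlier, and the function name parse_summary is hypothetical. Table rows and non-numeric fields such as the excluded folders are skipped.

```python
import re

# Matches lines of the form "<label>: <integer> ...", such as
# "Scanning Rate: 7357 (files scanned per second)".
_NUMERIC_LINE = re.compile(r"^(?P<key>[^:]+):\s*(?P<value>\d+)\b")


def parse_summary(text):
    """Return a dict of the numeric "Key: value" fields found in the output text.

    Hypothetical helper; assumes the layout of the sample output.
    """
    summary = {}
    for line in text.splitlines():
        match = _NUMERIC_LINE.match(line.strip())
        if match:
            summary[match.group("key").strip()] = int(match.group("value"))
    return summary


if __name__ == "__main__":
    sample = "\n".join([
        "Files Count: 800000",
        "Directories/Folders Count: 767",
        "Scanning Time: 108 (total scan time in seconds)",
    ])
    for key, value in parse_summary(sample).items():
        print(f"{key} = {value}")
```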