According to Microsoft, Windows Server 2019 still does not support Windows Search on volumes with Data Deduplication enabled (source):
Windows Search doesn't support Data Deduplication. Data Deduplication uses reparse points, which Windows Search can't index, so Windows Search skips all deduplicated files, excluding them from the index. As a result, search results might be incomplete for deduplicated volumes. Vote for this item for Windows Server vNext on the Windows Server Storage UserVoice.
This has been a problem/challenge for a long time now (example).
I maintain a Windows Server 2019 file server that stores its data on a ReFS volume with Data Deduplication enabled, and I am also struggling to provide working search functionality.
Before implementing a third-party search engine, I'd like to know whether there is any workaround that makes Windows Search work on volumes with Data Deduplication enabled using only built-in tools.
So, if someone is aware of a valid workaround, I'd appreciate any information on a way to implement this without using 3rd party software.
Two options:
1. Exclude file types that contain full-text content (pdf, doc, docx, xls, xlsx, htm, html, etc.) from deduplication. These are often not the largest files; in our office, at least, the big files are Photoshop and CAD files. Deduplication then applies to the big files, and search covers the files that contain text.
2. Create a .vhdx, mount it, and store your files on it. Inside the .vhdx you have NTFS with Windows Search; the .vhdx file itself sits on the ReFS volume with deduplication. This works very well. It is like running the file server as a Hyper-V VM, only without the VM. The .vhdx can be mounted by a scheduled task at startup that runs diskpart with a diskpart script.
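As a rough sketch of both options, the Python helper below only builds the command text; it does not run anything on the server. The volume letter `D:` and the path `D:\vhd\fileshare.vhdx` are hypothetical placeholders. It assumes that the `-ExcludeFileType` parameter of `Set-DedupVolume` takes extensions without a leading dot, which should be checked against the cmdlet documentation before use.

```python
from typing import Iterable

# Extensions named in the answer; adjust the list to match your own data.
TEXT_EXTENSIONS = ["pdf", "doc", "docx", "xls", "xlsx", "htm", "html"]


def dedup_exclusion_command(volume: str, extensions: Iterable[str]) -> str:
    """Build a PowerShell command that excludes file types from deduplication.

    Assumption: -ExcludeFileType expects comma-separated extensions
    without a leading dot.
    """
    cleaned = [ext.lower().lstrip(".") for ext in extensions]
    return f'Set-DedupVolume -Volume "{volume}" -ExcludeFileType {",".join(cleaned)}'


def diskpart_attach_script(vhdx_path: str) -> str:
    """Build a diskpart script that attaches (mounts) an existing .vhdx."""
    return f'select vdisk file="{vhdx_path}"\nattach vdisk\n'


if __name__ == "__main__":
    # Hypothetical volume letter and .vhdx path, for illustration only.
    print(dedup_exclusion_command("D:", TEXT_EXTENSIONS))
    print(diskpart_attach_script(r"D:\vhd\fileshare.vhdx"))
```

The generated diskpart text can be saved to a file and passed to `diskpart /s <scriptfile>` from the startup task; the `Set-DedupVolume` line is meant to be run once in an elevated PowerShell session.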