Indexing and Searching Overview


dtSearch products offer instant searching across terabytes of text in a wide range of online and offline data types. Search time (including concurrent search time) is typically less than a second.

  • dtSearch Desktop with Spider and dtSearch Network with Spider run in a classic Windows environment for individual or shared network-based searching.
  • dtSearch Web with Spider runs in an Internet or Intranet environment, with no limit on the number of concurrent searches.
  • The dtSearch Engine developer SDK comes in versions for multiple platforms. Running in an Internet or Intranet server-based environment, the dtSearch Engine supports efficient multithreaded searching, with no limit on the number of concurrent search threads.

Building an Index. dtSearch products can instantly search terabytes of text because dtSearch builds a search index that stores each unique word and its location in the data.

  • A single index can hold up to a terabyte of data, spanning multiple directories, emails and attachments, online data and other databases. (See supported data types.)
  • dtSearch can build and simultaneously search any number of terabyte indexes.
  • Indexing is easy: just point to the folders or online data you want to index.
  • No need to tell dtSearch what files, emails or other content you have; dtSearch will figure that out for itself.
  • Indexing, searching and displaying documents do not alter the original files or other data, including hash values.
  • dtSearch also offers automated indexing via the Windows Task Scheduler.
  • See optimizing indexing of large collections of data for important tips on index building.
  • See indexing tips for information on unindexed searching, forensics tips, etc.
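The idea of an index that stores each unique word and its location can be illustrated with a minimal Python sketch. This is a generic inverted index written for illustration only; it is not dtSearch's index format or API, and the names `build_index` and `search` and the sample documents are hypothetical.

```python
import re
from collections import defaultdict


def build_index(docs):
    """Map each unique word to the (document name, word position) pairs where it occurs."""
    index = defaultdict(list)
    for name, text in docs.items():
        # Lowercase and split into words; position counts words from 0.
        for position, word in enumerate(re.findall(r"\w+", text.lower())):
            index[word].append((name, position))
    return index


def search(index, word):
    """Look a word up directly in the index instead of rescanning every document."""
    return index.get(word.lower(), [])


# Hypothetical sample data standing in for files on disk.
docs = {
    "memo.txt": "Quarterly report attached",
    "mail.txt": "See the attached report",
}
index = build_index(docs)
hits = search(index, "report")
```

Here `hits` lists both documents along with the word position of "report" in each. A lookup costs one dictionary access regardless of how much text was indexed, which is the intuition behind fast indexed searching.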

Updating an Index. dtSearch can update your indexes by adding only new or updated items, removing deleted items, and compressing the index, without affecting searching.
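As a rough sketch of what such an update involves, the Python below compares what was indexed last time against the current documents, then re-indexes only new or changed items and drops deleted ones. It is an illustration under stated assumptions, not dtSearch's implementation: the function names, the version stamps (standing in for something like a file modification time) and the in-memory index are all hypothetical, and "compressing" is reduced here to discarding index entries that no longer point anywhere.

```python
import re


def tokenize(text):
    return re.findall(r"\w+", text.lower())


def update_index(index, indexed_versions, current_docs):
    """Bring the index up to date without rebuilding it from scratch.

    index:            word -> set of document names
    indexed_versions: document name -> version stamp seen at last indexing
    current_docs:     document name -> (version stamp, text)
    """
    deleted = set(indexed_versions) - set(current_docs)
    changed = {
        name
        for name, (version, _) in current_docs.items()
        if indexed_versions.get(name) != version  # new or modified
    }

    # Remove stale entries, then drop words left with no documents.
    stale = deleted | changed
    for word in list(index):
        index[word] -= stale
        if not index[word]:
            del index[word]
    for name in deleted:
        del indexed_versions[name]

    # Index only the new or changed documents.
    for name in changed:
        version, text = current_docs[name]
        for word in tokenize(text):
            index.setdefault(word, set()).add(name)
        indexed_versions[name] = version

    return sorted(changed), sorted(deleted)


index, versions = {}, {}
update_index(index, versions, {"a.txt": (1, "alpha beta"), "b.txt": (1, "beta gamma")})
result = update_index(index, versions, {"a.txt": (2, "alpha delta"), "c.txt": (1, "gamma")})
```

In the second call only `a.txt` (changed) and `c.txt` (new) are read again, and `b.txt` is removed, so the work is proportional to what changed rather than to the size of the whole collection.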

Indexing Tip #1: Build an index. Unindexed searching is almost never more efficient. While indexing is much slower than searching, the time it takes to build an index and then search for multiple search terms (as is typical in forensics and e-discovery) is significantly less than the time it takes to run multiple unindexed searches. And once the index is in place, if you think of more search terms, additional search time is pretty much instantaneous.

Indexing Tip #2: Watch for encrypted files. After building an index, dtSearch’s “off the shelf” products, for example, create a log of encrypted files dtSearch cannot read. Take a look at this log so you know what you need to separately decrypt and run again through dtSearch. (More)

Indexing Tip #3: Access emails directly as PST, OST, MSG etc. files, instead of going through Outlook/MAPI. If you are not searching your own personal email collection (and sometimes even if you are searching your own emails and have a large collection), it is much more efficient to bypass the Outlook/MAPI “middleman,” and directly access the data. (More) And don’t forget fuzzy searching to sift through potential typographical errors in emails and attachments!

Indexing Tip #4: Update your indexes by telling dtSearch to add any new or changed documents, remove deleted documents and compress the updated index. This type of update tends to be much less time consuming than completely re-indexing. Even better, dtSearch can update its indexes automatically with no effect on ongoing concurrent searching. (More)

Indexing Tip #5: Check out general tips on optimizing indexing before you start a large index job. The following is just one example of the kind of thing you need to know.

While search options like fuzzy searching are adjustable at search time, if you build a case- and accent-sensitive index, the only way to change that setting is to rebuild the entire index. With case- and accent-sensitive indexing on, your index size will be much larger, as your index will store Frank, frank and FRANK as separate words, instead of the same word. Worse, with case- and accent-sensitive indexing on, a search for Frank Harvey would miss both frank harvey and FRANK HARVEY. (More)
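The effect of case sensitivity on index size and on search results can be seen in a small Python sketch. This is a toy model for illustration, not dtSearch's API: `build_index` and `search_all` are hypothetical names, the three one-line documents are made up, and accent handling is left out for brevity.

```python
def build_index(docs, case_sensitive):
    """Map each word to the documents containing it, optionally folding case."""
    index = {}
    for name, text in docs.items():
        for word in text.split():
            key = word if case_sensitive else word.lower()
            index.setdefault(key, set()).add(name)
    return index


def search_all(index, query, case_sensitive):
    """Return the documents that contain every word of the query."""
    words = query.split() if case_sensitive else query.lower().split()
    matches = [index.get(word, set()) for word in words]
    return set.intersection(*matches) if matches else set()


docs = {
    "1.txt": "Frank Harvey",
    "2.txt": "frank harvey",
    "3.txt": "FRANK HARVEY",
}
sensitive = build_index(docs, case_sensitive=True)
insensitive = build_index(docs, case_sensitive=False)

sensitive_hits = search_all(sensitive, "Frank Harvey", case_sensitive=True)
insensitive_hits = search_all(insensitive, "Frank Harvey", case_sensitive=False)
```

The case-sensitive index stores six distinct words where the case-insensitive one stores two, and the case-sensitive search finds only `1.txt` while the case-insensitive search finds all three documents. Because the folding decision is baked into the stored keys, switching modes means building the index again.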

Instantly Search Terabytes of Text
Enterprise and developer products
dtSearch’s document filters support popular file types, emails with multilevel attachments, databases, web data
Highlights hits in all data types; 25+ search options
Developer APIs for .NET, Java and C++; SDKs for multiple platforms. (Articles on faceted search, SQL, SharePoint, MS Azure, etc.)
The Smart Choice for Text Retrieval® since 1991