Product Description
MindCite Profiler is a scalable, high-performance text profiling tool, capable of matching thousands of documents a minute against a large set of profiles.
Features:
- Supports up to 100,000 profiles per server.
- Keywords match the document text, as well as metadata.
- Profiles are Boolean expressions consisting of AND, OR and NOT operators. The expression syntax is structured, based on XML.
- A Boolean expression can combine data and metadata conditions in the same profile.
- A single profile can contain up to 50 words.
- The engine supports the NEAR operator (directed) with distance designation.
- A pluggable mechanism allows the engine to integrate with various systems in order to notify the client of profile matching.
- The engine issues match notifications, which include:
- The document identifier.
- The matched profile ID.
- The match score.
- The words which caused the match.
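
The profile features listed above (XML-structured Boolean expressions, a directed NEAR operator with a distance, and a match notification carrying document ID, profile ID, score and matched words) can be illustrated with a minimal sketch. The XML element names (`profile`, `and`, `or`, `not`, `near`, `word`), the `distance` attribute, the `MatchNotification` class and the scoring rule (number of matched words) are all assumptions made for illustration; the actual MindCite profile schema and scoring are not described here.

```python
import re
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple

# Hypothetical profile syntax; the real MindCite XML schema is not shown in this document.
PROFILE_XML = """
<profile id="p1">
  <and>
    <near distance="3"><word>interest</word><word>rates</word></near>
    <not><word>sports</word></not>
  </and>
</profile>
"""


@dataclass
class MatchNotification:
    """The four fields the feature list says a notification includes."""
    document_id: str
    profile_id: str
    score: float
    matched_words: List[str]


def evaluate(node: ET.Element, tokens: List[str]) -> Tuple[bool, Set[str]]:
    """Return (matched, words that caused the match) for one expression node."""
    tag = node.tag
    if tag == "word":
        word = node.text.strip().lower()
        found = word in tokens
        return found, ({word} if found else set())
    if tag == "not":
        matched, _ = evaluate(node[0], tokens)
        return (not matched), set()
    if tag == "near":
        # Directed NEAR: the first word must precede the second
        # by at most `distance` token positions.
        first, second = (child.text.strip().lower() for child in node)
        distance = int(node.get("distance"))
        second_positions = [i for i, t in enumerate(tokens) if t == second]
        for i, token in enumerate(tokens):
            if token == first and any(0 < j - i <= distance for j in second_positions):
                return True, {first, second}
        return False, set()
    results = [evaluate(child, tokens) for child in node]
    if tag == "and":
        ok = all(matched for matched, _ in results)
    elif tag == "or":
        ok = any(matched for matched, _ in results)
    else:
        raise ValueError(f"unknown operator: {tag}")
    words = set().union(*(w for matched, w in results if matched)) if ok else set()
    return ok, words


def match_profile(profile_xml: str, document_id: str, text: str) -> Optional[MatchNotification]:
    """Match one document against one profile; return a notification or None."""
    root = ET.fromstring(profile_xml.strip())
    tokens = re.findall(r"\w+", text.lower())
    matched, words = evaluate(root[0], tokens)
    if not matched:
        return None
    # Assumed scoring rule: one point per matched word.
    return MatchNotification(document_id, root.get("id"), float(len(words)), sorted(words))
```

For example, `match_profile(PROFILE_XML, "doc-1", "Interest in higher rates is growing")` yields a notification whose matched words are `interest` and `rates`, while `"rates of interest"` does not match because the NEAR operator is directed.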

- The engine supports the following languages and features:
| Language | Recognition | Stemming Support |
| --- | --- | --- |
| Arabic | Yes | Yes |
| Danish | Yes | Yes |
| English | Yes | Yes |
| Farsi | Yes | Yes |
| French | Yes | Yes |
| German | Yes | Yes |
| Hebrew | Yes | Yes |
| Italian | Yes | Yes |
| Norwegian | Yes | Yes |
| Polish | Yes | Yes |
| Portuguese | Yes | Yes |
| Russian | Yes | Yes |
| Spanish | Yes | Yes |
| Swedish | Yes | Yes |
- Other languages can be added by MindCite upon request in a pluggable manner.
- The engine is able to extract text from the following file formats:
- MS-Word
- MS-Excel
- MS-PowerPoint
- PDF
- HTML
- XML
- RTF (MS-Write / WordPad)
- EML (RFC 822) including attachments.
- API:
- The engine exposes an API for feeding data (documents):
- By an RPC call that passes the document stream.
- By an RPC call that designates the file location.
- The engine exposes an API for profile management:
- Add, remove and update profiles.
- Profiles can be added dynamically while the system is running.
- Optional (Not Included) features – to be developed upon need:
- Wildcard support (*,?).
- Soundex (phonetic matching).
- Support for the WordPerfect® file format.
- The engine is cross-platform and runs on:
- Microsoft Windows 2000 Server / Windows Server 2003.
- Linux (kernels 2.4 and above).
- Solaris (Solaris 10 / OpenSolaris recommended).
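
The API items above (feeding documents by stream or by file location, managing profiles while the system runs, and notifying a client through a pluggable mechanism) can be pictured with a small in-memory stand-in. This is a sketch only: `ProfilerStub`, its method names and the callback signature are hypothetical and are not the MindCite RPC interface, and a profile is simplified here to a keyword set that matches when any keyword appears in the document.

```python
import io
import re
from typing import BinaryIO, Callable, Dict, Iterable, List, Set, Tuple

# Pluggable notifier: receives (document_id, profile_id, matched_words).
Notifier = Callable[[str, str, List[str]], None]


class ProfilerStub:
    """In-memory stand-in for the engine API (hypothetical, for illustration)."""

    def __init__(self, notifier: Notifier) -> None:
        self._profiles: Dict[str, Set[str]] = {}
        self._notifier = notifier

    # Profile management: add, update, remove.
    def add_profile(self, profile_id: str, keywords: Iterable[str]) -> None:
        if profile_id in self._profiles:
            raise KeyError(f"profile already exists: {profile_id}")
        self._profiles[profile_id] = {k.lower() for k in keywords}

    def update_profile(self, profile_id: str, keywords: Iterable[str]) -> None:
        if profile_id not in self._profiles:
            raise KeyError(f"unknown profile: {profile_id}")
        self._profiles[profile_id] = {k.lower() for k in keywords}

    def remove_profile(self, profile_id: str) -> None:
        del self._profiles[profile_id]

    # Document feeding: by stream, or by file location.
    def feed_stream(self, document_id: str, stream: BinaryIO) -> None:
        text = stream.read().decode("utf-8", errors="replace")
        self._match(document_id, text)

    def feed_file(self, document_id: str, path: str) -> None:
        with open(path, "rb") as handle:
            self.feed_stream(document_id, handle)

    def _match(self, document_id: str, text: str) -> None:
        tokens = set(re.findall(r"\w+", text.lower()))
        for profile_id, keywords in self._profiles.items():
            hits = sorted(keywords & tokens)
            if hits:  # simplified rule: any keyword present counts as a match
                self._notifier(document_id, profile_id, hits)


# Usage: collect notifications in a list instead of calling an external system.
received: List[Tuple[str, str, List[str]]] = []
engine = ProfilerStub(lambda doc, prof, words: received.append((doc, prof, words)))
engine.add_profile("finance", ["rates", "inflation"])
engine.feed_stream("doc-1", io.BytesIO(b"Inflation is rising"))
engine.add_profile("weather", ["storm"])  # added after documents have been fed
engine.feed_stream("doc-2", io.BytesIO(b"A storm raised inflation fears"))
engine.remove_profile("finance")
engine.feed_stream("doc-3", io.BytesIO(b"Inflation again"))
```

Passing the notifier as a callable keeps the matching logic independent of how the client is informed, which is one plausible reading of the "pluggable mechanism" in the feature list.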