ADVESTIGO
Text
The technology developed by Advestigo applies a dedicated technique to each media type, or Elementary Media (Text, Image, Sound), in order to extract content information from it.

Depending on the type of text to be analyzed, both formal and fuzzy logic approaches are used.
Computer programs or source code, for example, are analyzed for regular repetitive patterns and forms, whereas text documents are analyzed on the basis of natural language usage. The analysis applied to text documents can differentiate between formal data (for example, a credit card number, which must be copied exactly) and natural language expressions or clichés, where comparisons can include equivalent but non-identical expressions. Dictionaries can be used to enhance or fine-tune the recognition of natural language equivalences.
The resulting text fingerprints are:
  • Independent of grammatical and syntactical modifications, because fingerprinting is based on concepts extracted from the text using dictionary-based inflexion analysis.
  • Independent of changes in language style, through the use of dictionary/thesaurus synonym detection.
  • Robust with respect to document editing and formatting: additions, deletions, re-ordering…
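
The properties listed above can be illustrated with a minimal sketch. It is not Advestigo's implementation, which this page does not describe in detail. Every name in it (INFLECTIONS, SYNONYMS, STOPWORDS, FORMAL, concepts, compare) is hypothetical, the dictionaries are toy stand-ins for real lexicons, and the use of a word set scored with Jaccard overlap is an assumption chosen for simplicity.

```python
import re

# Hypothetical toy dictionaries; a real system would rely on full lexicons.
INFLECTIONS = {"documents": "document", "files": "file", "copied": "copy",
               "copies": "copy", "duplicated": "duplicate",
               "was": "be", "were": "be", "is": "be", "are": "be"}
SYNONYMS = {"file": "document", "duplicate": "copy"}
STOPWORDS = {"the", "a", "an", "of", "and", "be", "by", "to"}

# Assumption: "formal data" is approximated by long digit runs
# (digits optionally separated by spaces or hyphens).
FORMAL = re.compile(r"\d[\d -]{6,}\d")


def concepts(text):
    """Return (formal_tokens, concept_set) extracted from a text."""
    # Formal data is kept exactly, apart from separator characters.
    formal = {re.sub(r"[ -]", "", m) for m in FORMAL.findall(text)}
    prose = FORMAL.sub(" ", text).lower()
    found = set()
    for word in re.findall(r"[a-z]+", prose):
        base = INFLECTIONS.get(word, word)   # inflexion analysis
        base = SYNONYMS.get(base, base)      # synonym detection
        if base not in STOPWORDS:
            found.add(base)
    # A set ignores word order, so re-ordering does not change the result.
    return formal, found


def compare(a, b):
    """Return (formal_data_matches_exactly, concept_overlap in [0, 1])."""
    formal_a, concepts_a = concepts(a)
    formal_b, concepts_b = concepts(b)
    union = concepts_a | concepts_b
    overlap = len(concepts_a & concepts_b) / len(union) if union else 1.0
    return formal_a == formal_b, overlap
```

In this sketch, compare("The documents were copied. Card 4111 1111 1111 1111", "Card 4111-1111-1111-1111: the file was duplicated") reports an exact formal match and full concept overlap, even though the wording, word order and number formatting differ. Changing a single digit of the card number makes the formal comparison fail while the concept overlap is unchanged.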