| Text |
|
The technology developed by Advestigo uses a separate technique to extract content information from each media type, or Elementary Media: Text, Image, and Sound. For text, both formal and fuzzy logic approaches are used, depending on the kind of text being analyzed. Computer programs or source code, for example, are analyzed for regular repetitive patterns and forms, whereas text documents are analyzed on the basis of natural language usage. The analysis applied to text documents can differentiate between formal data (for example, a credit card number, which must be copied exactly) and natural language expressions or clichés, where comparisons can include non-exact equivalent expressions. Dictionaries can be used to enhance or fine-tune recognition of natural language equivalences.
|
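The distinction between formal data and natural language can be illustrated with a minimal Python sketch. It is not Advestigo's implementation, which is not described here. It rests on three assumptions made only for illustration: any token containing a digit is treated as formal data, the `EQUIVALENTS` dictionary is hypothetical, and the similarity threshold of 0.8 is arbitrary. The function names `split_tokens` and `texts_match` are likewise invented for this example.

```python
from difflib import SequenceMatcher

# Hypothetical equivalence dictionary (an assumption for this sketch):
# maps a word to the canonical form it should be compared as.
EQUIVALENTS = {"purchased": "bought", "automobile": "car"}


def split_tokens(text):
    """Separate formal tokens (kept verbatim) from natural-language words."""
    formal, natural = [], []
    for raw in text.split():
        token = raw.strip(".,;:!?()\"'")
        if not token:
            continue
        if any(ch.isdigit() for ch in token):
            # Assumption: a token with a digit is formal data, e.g. a card number.
            formal.append(token)
        else:
            word = token.lower()
            natural.append(EQUIVALENTS.get(word, word))
    return formal, natural


def texts_match(a, b, threshold=0.8):
    """Formal data must be identical; natural language may match approximately."""
    formal_a, natural_a = split_tokens(a)
    formal_b, natural_b = split_tokens(b)
    if formal_a != formal_b:
        return False  # exact comparison for formal data
    # Word-level similarity after dictionary normalization.
    ratio = SequenceMatcher(None, natural_a, natural_b).ratio()
    return ratio >= threshold
```

Under these assumptions, two sentences that differ only in equivalent wording are reported as matching, while changing a single digit of the card number makes the comparison fail regardless of how similar the surrounding words are.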