SYMPTOMS
The Field Appender Auto Tagging phase uses the language recognition provided by our Information Management Platform. You want to optimize this language recognition for Legal Processing in new matters.
CAUSE
The Information Management Platform supports a wide selection of languages, so the recognition functionality can match against all of them. This may result in matches for languages that are uncommon in Legal Processing. The output is stored in the zz_lanc and zz_lann fields in the database.
RESOLUTION
To restrict the results, it is recommended to minimize the number of languages that can be recognized. This is done by temporarily removing the files for those languages from the Information Management Platform. Follow the steps below:
- Make a backup of \Program Files (x86)\ZyLAB\Information Management Platform\Reference Data\Language Resources\Noise Words\
- Make a backup of \Program Files (x86)\ZyLAB\Information Management Platform\Reference Data\Language Resources\Ocr Language Recognition\
- Make a backup of \Program Files (x86)\ZyLAB\Information Management Platform\Reference Data\Language Resources\Other Language Recognition\
- Decide which languages you wish to exclude
- In each of the above folders, remove the corresponding files for the languages on your exclude list. For an overview of which language code corresponds to which language, see languages.xml in \Program Files (x86)\ZyLAB\Information Management Platform\Reference Data\Language Resources\Common\
As an example, languages.xml contains the following entry:
<Language description="Balinese (Latin)">
<Code>ban</Code>
</Language>
The file that you need to remove from the folder Noise Words is 'ban.noi'.
The file that you need to remove from the folder Ocr Language Recognition is 'ban.ocr'.
The file that you need to remove from the folder Other Language Recognition is 'ban.ngr'.
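To decide which codes to exclude, it can help to list every code in languages.xml next to its description. The following is a minimal Python sketch, not a ZyLAB tool. It assumes the file consists of <Language> elements shaped like the entry shown above, without XML namespaces; the function name language_codes and the <Languages> wrapper element in the sample are hypothetical.

```python
import xml.etree.ElementTree as ET


def language_codes(root):
    """Map each <Code> value to the description of its <Language> element."""
    codes = {}
    for lang in root.iter("Language"):
        code = lang.findtext("Code")
        if code:
            codes[code.strip()] = lang.get("description", "")
    return codes


# The <Languages> wrapper is an assumption made for this sample;
# only the <Language> entry itself is taken from the article.
sample = """<Languages>
  <Language description="Balinese (Latin)">
    <Code>ban</Code>
  </Language>
</Languages>"""

for code, description in language_codes(ET.fromstring(sample)).items():
    print(code, description)
```

For the real file, the root element could be obtained with ET.parse(path_to_languages_xml).getroot() instead of parsing the sample string.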
Changes will only apply to new matters.
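The backup and removal steps can also be scripted. The following is a minimal Python sketch under stated assumptions, not an official utility: the function name exclude_languages, its parameters, and the backup location are hypothetical, and the folder-to-extension mapping is taken from the 'ban' example above. It defaults to a dry run that only reports which files would be removed.

```python
import shutil
from pathlib import Path

# Folder name -> file extension, as given in the 'ban' example above.
FOLDERS = {
    "Noise Words": ".noi",
    "Ocr Language Recognition": ".ocr",
    "Other Language Recognition": ".ngr",
}


def exclude_languages(resources_dir, backup_dir, codes, dry_run=True):
    """Back up the three resource folders, then remove the files for `codes`.

    resources_dir is the 'Language Resources' folder; backup_dir is any
    location outside it (hypothetical, chosen by the administrator).
    Returns the list of files that were (or, in a dry run, would be) removed.
    """
    resources = Path(resources_dir)
    backup = Path(backup_dir)
    removed = []
    for folder, ext in FOLDERS.items():
        src = resources / folder
        if not src.is_dir():
            continue  # folder not present in this installation
        dst = backup / folder
        # Copy the whole folder once, before anything is deleted from it.
        if not dry_run and not dst.exists():
            shutil.copytree(src, dst)
        for code in codes:
            target = src / f"{code}{ext}"
            if target.is_file():
                removed.append(target)
                if not dry_run:
                    target.unlink()
    return removed
```

A call such as exclude_languages(resources_dir, backup_dir, ["ban"]) lists the three 'ban' files without touching them; passing dry_run=False performs the backup and the removal. Restoring a language means copying its files back from the backup folder.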
APPLIES TO
2.3; 3.0; 3.1