Tata wrote: Yes. In Slovak glossaries the nouns are almost always written in the nominative singular.
...
Thank you, Tata, for the detailed answer. I will continue to explore this approach.
I'll try to explain my method in English:
In French, a horse is written "un cheval" (singular) and horses "des chevaux" (plural).
Regex search:
`(al)\b`
and replace by:
`($1|aux|aus|als)`
result for the regex used to search words:
chev(al|aux|aus|als)
It also covers special cases such as "un carnaval", "des carnavals".
That is the principle.
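A minimal sketch of this principle in Python, for illustration only. The post writes the replacement with `$1`; Python's `re` module uses `\1` for the same back-reference. The function name `build_pattern` and the sample sentence are assumptions, not part of the original tool.

```python
import re

def build_pattern(word: str) -> str:
    # Rewrite a trailing "al" as an alternation of candidate endings.
    return re.sub(r"(al)\b", r"(\1|aux|aus|als)", word)

pattern = build_pattern("cheval")
print(pattern)  # chev(al|aux|aus|als)

# Search a text for any form covered by the generated pattern.
regex = re.compile(r"\b" + pattern + r"\b", re.IGNORECASE)
text = "Un cheval, des chevaux, un carnaval, des carnavals."
matches = [m.group(0) for m in regex.finditer(text)]
print(matches)  # ['cheval', 'chevaux']

# The same rule applied to "carnaval" also covers the plural in "-als".
carnaval = re.compile(r"\b" + build_pattern("carnaval") + r"\b")
print([m.group(0) for m in carnaval.finditer(text)])  # ['carnaval', 'carnavals']
```

The generated pattern deliberately over-generates (for example it would also accept "chevals"), which is acceptable as long as the goal is only to find glossary terms in a text.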
This can become complicated with the accumulation of rules, for example in Danish:
ansvar => (a|ø|æ)nsv(ar|arr|rar|art)(e|er|et|ne|ene|erne)?
For Slovak, it is even worse:
chlap =>
chlap((a)?(mi)?|e|(i|í)(a|ach|am)?|o(ch|m|u|v(i)?(a)?)?|u|y)?
But it does find chlap, chlapi, chlapa, chlapov, chlapovi, chlapom, chlapoch, chlapmi.
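The claim can be checked with a short Python sketch. The pattern is copied from the post; using `re.fullmatch` to require that the whole word matches is an assumption about how the pattern would be applied.

```python
import re

# Pattern for "chlap", copied from the post.
CHLAP = r"chlap((a)?(mi)?|e|(i|í)(a|ach|am)?|o(ch|m|u|v(i)?(a)?)?|u|y)?"

forms = ["chlap", "chlapi", "chlapa", "chlapov",
         "chlapovi", "chlapom", "chlapoch", "chlapmi"]

# Collect every listed form that the pattern fails to match completely.
missed = [f for f in forms if re.fullmatch(CHLAP, f) is None]
print(missed)  # []

# A longer word that merely starts with "chlap" is not accepted as a whole.
print(re.fullmatch(CHLAP, "chlapec") is None)  # True
```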
A word ending in a vowel:
hrdina =>
hrdin(a)?(y|u|e|o(ch|m|u|v|vi|via)?|(ho|m|mu)|(i)?(á|a)?(ch|m|mi)?|a(t|t)(a(m|mi)?|á(ch)?|om|u|i)?|en(iec|cami|ce|com)?)?
It finds hrdina, hrdinovia, hrdinu, hrdinov, hrdinovi, hrdinom, hrdinoch, hrdinmi.
The pattern is this long because it must cover every kind of word ending in a vowel (feminine, neuter, foreign words, etc.) in all case forms and all plurals.
And I have not yet integrated the cases of missing vowels (chlieb => chleba) or the consonant mutations (c, k => s, c).
But the tests are successful this time.
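A rough Python sketch of how such a rule for vowel-final words could be applied and tested. The suffix group is copied verbatim from the hrdina pattern above; the rule that triggers on a final "a" and the name `pattern_for_a_word` are assumptions made for illustration.

```python
import re

# Suffix group copied verbatim from the post's pattern for "hrdina".
VOWEL_SUFFIXES = (
    "(y|u|e|o(ch|m|u|v|vi|via)?|(ho|m|mu)|(i)?(á|a)?(ch|m|mi)?"
    "|a(t|t)(a(m|mi)?|á(ch)?|om|u|i)?|en(iec|cami|ce|com)?)?"
)

def pattern_for_a_word(word: str) -> str:
    # Hypothetical rule: make the final "a" optional, then allow any suffix.
    # A function is used as replacement so the suffix text is inserted literally.
    return re.sub(r"a\b", lambda m: "(a)?" + VOWEL_SUFFIXES, word)

pattern = pattern_for_a_word("hrdina")

forms = ["hrdina", "hrdinovia", "hrdinu", "hrdinov",
         "hrdinovi", "hrdinom", "hrdinoch", "hrdinmi"]

# List the forms from the post that the generated pattern does not cover.
missed = [f for f in forms if re.fullmatch(pattern, f) is None]
print(missed)  # []
```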
Tata wrote:
but there is also the word "chlapov", which is not a noun but an adjective meaning "belonging to a man" (chlapov otec = man's father)
There are indeed cases that cannot be handled, when a word can have several meanings depending on the context in which it is used.
Besides, this is not specific to the plural. But in the context of a technical article, or one about a sport or an activity that requires defining certain words and special terms (to explain them to novices, for example), I do not think this case will be encountered often.
The more you automate something, the more you are exposed to special cases.
I am on holiday soon, that's good
regards,
Ludovic
Glossaire_XH is no longer available.