
Question david clifte · Dec 2, 2016

How to remove accentuation?

How to remove accentuation of a word?

Ex:

Árvore = Arvore

você = voce

Então = entao

The words above are in Brazilian Portuguese. I need to strip the accents so that I can compare two sentences.

Thanks in advance.

Comments

Evgeny Shvarov · Dec 2, 2016

Use $translate?

e.g.

ClassMethod NoAccents(stringWithAccents As %String) As %String
{
 // Each character in accent is replaced by the character at the same position in usual
 set accent="Áêã",usual="Aea"
 w "before: ",stringWithAccents,!
 set val=$translate(stringWithAccents,accent,usual)
 w "after: ",val,!
 return val
}
Jon Willeke · Dec 5, 2016

To handle this in the general case, you would decompose the string, then strip out non-spacing marks. Unicode normalization has been requested previously, and will hopefully make it into the product at some point.
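
The decompose-then-strip approach described above can be sketched as follows. Since the comment notes that Unicode normalization is not available in the product, this is only an illustration in Python using the standard unicodedata module, not ObjectScript; the function names strip_accents and same_ignoring_accents are hypothetical.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Remove accents by decomposing and dropping non-spacing marks."""
    # NFD splits each accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Category "Mn" (Mark, Nonspacing) covers the combining accents.
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


def same_ignoring_accents(a: str, b: str) -> bool:
    """Compare two strings, ignoring accents and letter case."""
    return strip_accents(a).casefold() == strip_accents(b).casefold()
```

With this sketch, strip_accents("Árvore") gives "Arvore", and same_ignoring_accents("Então", "entao") is true, which covers the comparison use case from the question without maintaining a list of accented characters by hand.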

Henry Pereira · Dec 6, 2016

Another option is to use a regular expression, like this:

ClassMethod ReplaceAccents(ByRef pWord As %String) As %Status
{
  Set tSC = $$$OK
  Try {
      Set dictionary = ##class(%ArrayOfDataTypes).%New()
      Do dictionary.SetAt("ÀÁÂÃÄÅ","A")
      Do dictionary.SetAt("àáâãäå","a")
      Do dictionary.SetAt("ÈÉÊË","E")
      //.... all the rest

      // Walk the keys explicitly; each key is the replacement for its set of accented characters
      Set key = dictionary.Next("")
      While key '= "" {
        Set matcher = ##class(%Regex.Matcher).%New("["_ dictionary.GetAt(key) _ "]", pWord)
        Set pWord = matcher.ReplaceAll(key)
        Set key = dictionary.Next(key)
      }
  } Catch tException {
    Set:$$$ISOK(tSC) tSC = tException.AsStatus()
  }
  Quit tSC
}
Pravin Barton · Feb 26, 2024

This is a very delayed answer to an old question, but there is now a $zconvert mode in IRIS that will do this for you:

> write $zconvert("Árvore", "A")

Arvore
Enrico Parisi · Feb 26, 2024 to Pravin Barton

WOW! Nice!

But...is there any reason why this is not documented?

When was it introduced?
A quick test shows it is not available in Caché-based products (that is, 2018) but is available in 2022.1.
At the moment I cannot test versions 2019 to 2021.
