@rodrigocorreaecastro
Last active May 24, 2021 15:46
Remove Special Characters
import re
import unicodedata


def removeSpecialCharacters(word):
    # NFKD normalization decomposes each accented character into its base
    # letter plus combining marks; dropping the marks leaves the plain letter.
    nfkd = unicodedata.normalize('NFKD', word)
    var = "".join(c for c in nfkd if not unicodedata.combining(c))
    # Keep only letters, digits, spaces, newlines, and the punctuation $ % , . / :
    # (assumption: the second blank in the extracted character class was a newline).
    var = re.sub(r'[^a-zA-Z0-9 $%,./:\n]', '', var)
    # Fix known typos coming from the subscriber management system.
    var = re.sub('quaquer', 'qualquer', var)
    var = re.sub('prazo,entrar', 'prazo, entrar', var)
    # Convert line breaks to HTML <br/> tags (assumption: the extracted pattern
    # '[ ]' stood for a newline, since the original comment says "line break").
    var = re.sub(r'\n', '<br/>', var)
    # Remove leading and trailing spaces.
    var = var.strip(" ")
    return var
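
The accent-stripping step can be tried on its own. The following is a minimal sketch, assuming a hypothetical helper name `strip_accents` that is not part of the original gist; it uses only `unicodedata.normalize` and `unicodedata.combining` from the standard library.

```python
import unicodedata


def strip_accents(text):
    # Hypothetical helper for illustration only (not in the original gist).
    # NFKD splits "ç" into "c" + U+0327 (combining cedilla), "ã" into
    # "a" + U+0303 (combining tilde), and so on.
    decomposed = unicodedata.normalize('NFKD', text)
    # unicodedata.combining() returns a non-zero class for combining marks,
    # so filtering on it keeps only the base characters.
    return "".join(c for c in decomposed if not unicodedata.combining(c))


print(strip_accents("Atenção: não é possível"))  # prints "Atencao: nao e possivel"
```

Doing this before the regular-expression filter matters: applied to the raw text, a pattern such as `[^a-zA-Z0-9 ]` would delete "ç" entirely instead of turning it into "c".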