# Algospeak and Adversarial Text Manipulations

Tisane uses a built-in spellchecker module to process text containing both unintentional errors (misspellings) and adversarial text manipulations (e.g. algospeak).

The spellchecker employs several techniques to handle different types of manipulations (masking characters, substitutions, etc.). These corrections are not limited to profanities or slurs, and they take context into account: the same misspelled word may be interpreted differently in different sentences.

If a sentence required corrections, it gets a `corrected_text` attribute containing the corrected text. (Set `words` to `true` to output sentence data.)
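As a rough illustration of reading this attribute, the sketch below collects the original and corrected text of every sentence that was changed. It assumes the response has already been parsed into a Python dictionary shaped like the example response on this page; the helper name `corrected_sentences` is hypothetical and not part of Tisane.

```python
from typing import Any


def corrected_sentences(response: dict[str, Any]) -> list[tuple[str, str]]:
    """Return (original, corrected) pairs for sentences that were corrected.

    Hypothetical helper: assumes `response` is a parsed response containing
    a `sentence_list`, as in the example on this page.
    """
    pairs = []
    for sentence in response.get("sentence_list", []):
        corrected = sentence.get("corrected_text")
        # Sentences that needed no correction carry no corrected_text.
        if corrected is not None:
            pairs.append((sentence["text"], corrected))
    return pairs


# Trimmed-down version of the example response on this page.
sample = {
    "text": "I will br*k his neck and kll him",
    "sentence_list": [
        {
            "offset": 0,
            "text": "I will br*k his neck and kll him",
            "corrected_text": "I will break his neck and kill him",
        }
    ],
}

for original, corrected in corrected_sentences(sample):
    print(f"{original!r} -> {corrected!r}")
```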

## Limitations

Spell-checking is not a ["did you mean" tool](https://stackoverflow.com/questions/307291/how-does-the-google-did-you-mean-algorithm-work), as many people seem to believe:

- If the word is a legitimate word, however misused or esoteric, Tisane will not correct it. For example, if *noun* is misspelled as *nun*, or *house* is misspelled as *horse*, Tisane won't help (unless it's part of a known, frequently obfuscated concept, e.g. *corn star* in English).
- The primary purpose of the spellchecker is to decipher obfuscations. Therefore, the spellchecker is biased toward more profane, objectionable, or heavily used concepts.


## Excluding Esoteric Senses And Words To Get Better Results

To work around esoteric words and senses blocking a correction, you can use the `min_generic_frequency` parameter.

This allows you to exclude the most esoteric senses and words.

The frequency is graded between 0 and 10, with 10 being the most frequent.

Some esoteric senses are also graded at -10.

We recommend initially setting `min_generic_frequency` to `1` or `2` and checking whether it works in your situation.
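A minimal sketch of a request body with this threshold set might look as follows. It assumes that `min_generic_frequency` is passed inside the `settings` object, next to `words`, as in the example request on this page; the function `build_request` is a hypothetical helper, not part of any Tisane client library.

```python
import json


def build_request(content, language="en", min_generic_frequency=None):
    """Build a request body as a dict (hypothetical helper).

    Assumption: min_generic_frequency belongs in the `settings` object,
    alongside `words`.
    """
    settings = {"words": True}  # needed to output sentence data
    if min_generic_frequency is not None:
        settings["min_generic_frequency"] = min_generic_frequency
    return {"language": language, "content": content, "settings": settings}


# Start with a low threshold, as recommended above, and raise it if needed.
payload = build_request("I will br*k his neck and kll him", min_generic_frequency=1)
print(json.dumps(payload, indent=2))
```

Leaving the parameter out of the request entirely keeps whatever default the service applies.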

## Excluding Potential Proper Nouns

If you need to avoid spell-checking potential proper nouns, set `lowercase_spellcheck_only` to `true`.
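As a small sketch, the flag can be added to an existing request body without modifying the original. This again assumes the flag lives in the `settings` object; `with_lowercase_spellcheck_only` is a hypothetical helper name.

```python
def with_lowercase_spellcheck_only(request):
    """Return a copy of the request with lowercase_spellcheck_only enabled.

    Assumption: the flag belongs in the `settings` object of the request.
    """
    settings = dict(request.get("settings", {}))  # copy, don't mutate the input
    settings["lowercase_spellcheck_only"] = True
    return {**request, "settings": settings}


base = {"language": "en", "content": "some text", "settings": {"words": True}}
guarded = with_lowercase_spellcheck_only(base)
print(guarded["settings"])
```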

## Example

Request:


```json
{
  "language": "en",
  "content": "I will br*k his neck and kll him",
  "settings": {
    "words": true,
    "topics": false,
    "sentiment": false,
    "snippets": true
  }
}
```

Response:


```json
{
	"text": "I will br*k his neck and kll him",
	"abuse": [
		{
			"sentence_index": 0,
			"offset": 0,
			"length": 32,
			"text": "I will br*k his neck and kll him",
			"type": "criminal_activity",
			"severity": "medium",
			"tags": [
				"threat",
				"violence",
				"death"
			]
		}
	],
	"sentence_list": [
		{
			"offset": 0,
			"text": "I will br*k his neck and kll him",
			"words": [
				{
					"type": "word",
					"offset": 0,
					"text": "I",
					"lettercase": "capitalized",
					"role": "agent",
					"lexeme": 63061,
					"family": 301,
					"grammar": [
						"PRON"
					],
					"stopword": true
				},
				{
					"type": "word",
					"offset": 2,
					"text": "will",
					"lexeme": 146938,
					"family": 316,
					"grammar": [
						"VERB"
					],
					"stopword": true
				},
				{
					"type": "word",
					"offset": 7,
					"text": "br*k",
					"role": "verb",
					"lexeme": 20996,
					"family": 107846,
					"grammar": [
						"VERB"
					]
				},
				{
					"type": "word",
					"offset": 12,
					"text": "his",
					"lexeme": 63064,
					"family": 303,
					"grammar": [
						"DET"
					],
					"stopword": true
				},
				{
					"type": "word",
					"offset": 16,
					"text": "neck",
					"lexeme": 93293,
					"family": 40510,
					"wikidata": "Q9633",
					"grammar": [
						"NOUN"
					]
				},
				{
					"type": "word",
					"offset": 21,
					"text": "and",
					"lexeme": 4096,
					"family": 322,
					"grammar": [
						"CCONJ"
					],
					"stopword": true
				},
				{
					"type": "word",
					"offset": 25,
					"text": "kll",
					"role": "verb",
					"lexeme": 77380,
					"family": 113102,
					"grammar": [
						"VERB"
					]
				},
				{
					"type": "word",
					"offset": 29,
					"text": "him",
					"role": "patient",
					"lexeme": 63062,
					"family": 303,
					"grammar": [
						"PRON"
					],
					"stopword": true
				}
			],
			"corrected_text": "I will break his neck and kill him"
		}
	]
}
```
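To see which individual words were changed, one naive approach is to align the entries of `words` with the whitespace-separated tokens of `corrected_text`. This is only a sketch under a strong assumption: that both sequences have the same number of items, which holds for the response above but may not hold in general (for example, if a correction splits or merges words). The helper `word_corrections` is hypothetical.

```python
def word_corrections(sentence):
    """Return (original, corrected) pairs for words that differ.

    Naive positional alignment; returns an empty list when the word list
    and the corrected text cannot be aligned one-to-one.
    """
    corrected = sentence.get("corrected_text")
    if corrected is None:
        return []
    original_tokens = [word["text"] for word in sentence.get("words", [])]
    corrected_tokens = corrected.split()
    if len(original_tokens) != len(corrected_tokens):
        return []  # alignment assumption does not hold
    return [(o, c) for o, c in zip(original_tokens, corrected_tokens) if o != c]


# The sentence from the response above, reduced to the fields used here.
example_sentence = {
    "text": "I will br*k his neck and kll him",
    "words": [
        {"text": t}
        for t in ["I", "will", "br*k", "his", "neck", "and", "kll", "him"]
    ],
    "corrected_text": "I will break his neck and kill him",
}

print(word_corrections(example_sentence))
```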