From Diplomacy to Division: Quantitatively Assessing the Kremlin’s Rhetorical Descent

January 24, 2025

In recent years, the rhetoric used by the Kremlin, particularly the Russian Ministry of Foreign Affairs (MFA), has undergone a marked shift. On social media and Russian state television, researchers and public sector experts have documented a flood of insults, hate speech, and unprofessional language spewing from official Russian government channels. This rhetoric has, of course, been aimed at Ukraine, but also at the European Union, the United States, and other countries the Kremlin has deemed “unfriendly.”

In this research brief, I present quantitative evidence to support so-far anecdotal and qualitative observations about the degradation of the Kremlin’s rhetoric, arguing that it is not merely incidental but a deliberate strategy of war, employed to sow division and chaos in Western democracies. The analysis, which relies on semantic similarity, also aims to showcase how a few lines of code can unlock years’ worth of insights from seemingly mundane data (in this case, a boatload of old press releases!).

Methods

To quantify the Kremlin's rhetorical shift, I analyzed MFA press releases published from 2003 to 2023 that mention Ukraine, Belarus, the Baltic states, Georgia, and their elected officials. The dataset, sourced from an open-source GitHub repository, totaled 71,351 individual sentences after filtering for specific keywords. These MFA sentences were compared to 36,953 non-Russia-specific sentences, each containing one of three types of hate speech: dehumanizing language (17,933 sentences), calls for violence (11,194 sentences), and genocidal language (7,826 sentences). These hate speech sentences, all labeled as the most hateful for their type (4, on a scale of 1 to 4), were pulled from the UC Berkeley Measuring Hate Speech Dataset.
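The keyword filtering step described above can be sketched roughly as follows. This is a minimal illustration rather than the script used for this brief: the keyword stems, the naive sentence splitter, and the function names are all assumptions, since the actual filter terms are not listed here.

```python
import re

# Hypothetical keyword stems; the brief does not publish its real filter list.
KEYWORD_STEMS = ["ukrain", "belarus", "georgia", "baltic",
                 "estonia", "latvia", "lithuania"]

# Match any word that starts with one of the stems, ignoring case.
_KEYWORD_RE = re.compile(r"\b(?:" + "|".join(KEYWORD_STEMS) + r")\w*",
                         re.IGNORECASE)

# Naive splitter: break on whitespace that follows ., ! or ?
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def split_sentences(text):
    """Split a press release into sentences (rough heuristic)."""
    return [s.strip() for s in _SENTENCE_END.split(text) if s.strip()]


def filter_sentences(text):
    """Keep only the sentences that mention at least one keyword."""
    return [s for s in split_sentences(text) if _KEYWORD_RE.search(s)]


# Made-up example text, for illustration only.
release = ("The minister met his counterpart. Talks on Ukraine continued! "
           "Trade with the Baltic states was discussed. The weather was fine.")
kept = filter_sentences(release)
```

A production pipeline would likely swap the regex splitter for a proper sentence tokenizer, since abbreviations such as "U.S." break this heuristic.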

Essentially, by comparing the tone and context of the MFA's linguistic choices to those of thousands of individual, known examples of hate speech, we can assess whether the MFA's rhetoric has or has not become more hateful over time.

I coded a bespoke Python script that generates semantic similarity scores ranging from 0 (no similarity) to 1 (perfect similarity) between two sentences, comparing their meanings objectively (e.g., "He left the house" vs. "He went outside" ≈ 1; "He left the house" vs. "He stayed inside" ≈ 0). The script first converts each MFA sentence and each hate speech sentence into an embedding, a numerical representation of a sentence that lets computers map and compare meanings. It then averages all embeddings for each hate speech type, yielding three average embeddings. Next, each MFA sentence embedding is compared to each of the three average hate speech embeddings, producing similarity scores between 0 and 1. The similarity scores corresponding to each hate speech type were then averaged by year and plotted in Figure 1, below.
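The steps above (average the embeddings per hate speech type, compare each MFA sentence to each average, then average by year) can be sketched as follows. This is a hedged illustration, not the original script: it assumes cosine similarity as the comparison measure, takes embeddings as plain lists of numbers because the embedding model is not named in this brief, and uses made-up two-dimensional vectors and hypothetical function names.

```python
import math
from collections import defaultdict


def mean_vector(vectors):
    """Element-wise mean of equal-length vectors (the 'average embedding')."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors.

    Raw cosine similarity can be negative; the 0-to-1 range described in
    the brief is assumed to hold for the embeddings actually used.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def yearly_similarity(mfa_rows, hate_embeddings):
    """Return {hate_type: {year: mean similarity}}.

    mfa_rows: list of (year, embedding) pairs, one per MFA sentence.
    hate_embeddings: {hate_type: [embedding, ...]}.
    """
    centroids = {t: mean_vector(v) for t, v in hate_embeddings.items()}
    buckets = defaultdict(lambda: defaultdict(list))
    for year, embedding in mfa_rows:
        for hate_type, centroid in centroids.items():
            score = cosine_similarity(embedding, centroid)
            buckets[hate_type][year].append(score)
    return {
        hate_type: {y: sum(s) / len(s) for y, s in sorted(by_year.items())}
        for hate_type, by_year in buckets.items()
    }


# Toy 2-dimensional "embeddings", invented purely for illustration.
hate = {
    "dehumanizing": [[1.0, 0.0], [1.0, 0.0]],
    "violence": [[0.0, 1.0], [0.0, 1.0]],
}
mfa = [(2003, [0.0, 1.0]), (2003, [1.0, 0.0]), (2023, [1.0, 0.0])]
scores = yearly_similarity(mfa, hate)
```

Comparing against one average embedding per hate speech type, rather than against every hate speech sentence individually, keeps the cost at three comparisons per MFA sentence.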

Results

Figure 1: Graphical display of the semantic similarity between Russian MFA press releases and three kinds of hate speech, averaged by year

Since 2003, there has been a general upward trend in the similarity between MFA language and each of the three types of generic hate speech. Notable spikes in similarity occurred around Russia’s 2008 invasion of Georgia and 2014 annexation of Crimea, and similarity has sharply increased since 2021, when Russia began amassing troops on Ukraine’s border in preparation for the 2022 full-scale invasion. In between these spikes, however, semantic similarity decreased between 2009 and 2013, a period that largely overlaps with the presidency of Dmitry Medvedev (2008 to 2012) and coincides with Russia’s diplomatic “reset” with the US and the broader West.

Average similarity scores by year reached as high as 0.61 in 2023 between MFA language and genocidal language, surpassing the commonly held threshold of 0.6 for a “high” degree of similarity. The scores between MFA language and dehumanizing and violent language also reached their highest levels in 2023, at 0.57 and 0.58 respectively, near the upper end of the moderate range. This finding corroborates recent qualitative documentation of the Kremlin's unprecedented, increasingly incendiary diplomacy.

Furthermore, in the disaggregated daily and monthly average similarity scores across all years, scores as high as 0.80 were observed between MFA sentences and all three hate speech types during years in which Russia invaded neighboring countries.
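Daily and monthly averages of this kind can be produced by regrouping the same per-sentence scores by publication date. The sketch below is illustrative only: the function name, the `period` parameter, and the dates and scores are invented, and it assumes each sentence carries the date of its press release.

```python
from collections import defaultdict
from datetime import date


def average_by_period(scored, period="month"):
    """Average similarity scores per day or per month.

    scored: iterable of (date, similarity) pairs, one per MFA sentence.
    period: "day" groups by calendar date, "month" by (year, month).
    """
    if period not in ("day", "month"):
        raise ValueError("period must be 'day' or 'month'")
    buckets = defaultdict(list)
    for day, score in scored:
        key = day if period == "day" else (day.year, day.month)
        buckets[key].append(score)
    return {k: sum(v) / len(v) for k, v in sorted(buckets.items())}


# Invented scores for illustration; these are not values from the dataset.
scored = [
    (date(2014, 3, 1), 0.5),
    (date(2014, 3, 1), 0.7),
    (date(2014, 3, 18), 0.9),
    (date(2014, 4, 2), 0.3),
]
monthly = average_by_period(scored, "month")
daily = average_by_period(scored, "day")
```

Finer-grained averages rest on fewer sentences each, so individual daily values are noisier than the yearly figures plotted in Figure 1.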

These spikes not only indicate that the MFA’s language has become more genocidal, violent, and dehumanizing leading up to and during Russia’s invasions of its neighbors, but they also undercut the Kremlin’s claims that it has sought to de-escalate these conflicts. While actions don’t always mirror rhetoric, and semantic similarity certainly isn't a measure of geopolitical strategy, this frame of analysis nonetheless carries weight. If the Kremlin were seeking to de-escalate such conflicts, it would not make linguistic choices casting, for example, Ukraine, the Baltics, and the West as evil or as Nazis, nor would it threaten them with military action. Genuine de-escalation would instead be reflected in lower or downward-trending similarity scores, such as those seen between 2009 and 2013.

One of the highest spikes in similarity between the MFA's language and all three hate speech types can be seen in 2014. It marks the end of the diplomatic reset and reflects the Kremlin's flood of untoward rhetoric about Ukraine and the West during its annexation of Crimea. Since then, the Kremlin has continued to fall out with the rest of the democratic world and has grown only more hostile toward Ukraine and its other neighbors, which is reflected in the near-continuous increase in similarity scores from 2014 to 2023. The MFA’s surge of hateful, “rage baiting” language post-2014 has served to portray Ukraine and its allies in a false light, further justify Russian aggression, and simultaneously polarize the West.

Closing Thoughts

This research brief offers quantitative backing for numerous qualitative assessments of the deterioration of Russia’s diplomatic rhetoric over the last two decades. The clear overall increase in semantic similarity between the MFA’s language and three types of hate speech, along with the sharp spikes in similarity seen during Russian acts of aggression against Ukraine and Georgia, suggests that the Kremlin is intentionally using hateful rhetoric as a tool of war. This strategy aligns with the Kremlin’s broader goals of undermining global democratic institutions and deepening divisions, both within Russia’s neighboring countries and across the West. Although a full set of data for 2024 is not yet available, this trend is likely to continue as tensions between Russia and the West keep rising on the global stage.