112 113
Slovenščina 2.0, 2020 (2)
UPDATING THE DICTIONARY: SEMANTIC 
CHANGE IDENTIFICATION BASED ON CHANGE 
IN BIGRAMS OVER TIME
S a n n i  N I M B ,  N i c o l a i  H A R T V I G  S Ø R E N S E N ,  
H e n r i k  L O R E N T Z E N 
Society for Danish Language and Literature
Nimb, S., Hartvig Sørensen, N., Lorentzen, H. (2020): Updating the dictionary: semantic 
change identification based on change in bigrams over time. Slovenščina 2.0, 8(2): 112–138
DOI: https://doi.org/10.4312/slo2.0.2020.2.112-138
We investigate a method of updating a Danish monolingual dictionary with new 
semantic information on already included lemmas in a systematic way, based 
on the hypothesis that the variation in bigrams over time in a corpus might 
indicate changes in the meaning of one of the words. The method combines 
corpus statistics with manual annotations. The first step consists in measuring 
the collocational change in a homogeneous newswire corpus with texts from a 
14 year time span, 2005 through 2018, by calculating all the statistically sig-
nificant bigrams. These are then applied to a new version of the corpus that is 
split into one sub-corpus per year. We then collect all the bigrams that do not 
appear at all in the first three years, but appear at least 20 times in the following 
11 years. The output, a dataset of 745 bigrams considered to be potentially new 
in Danish, are double annotated, and depending on the annotations and the 
inter-annotator agreement, either discarded or divided into groups of relevant 
data for further investigation. We then carry out a more thorough lexicographi-
cal study of the bigrams in order to determine the degree to which they support 
the identification of new senses and lead to revised sense inventories for at least 
one of the words Furthermore we study the relation between the revisions car-
ried out, the annotation values and the degree of inter-annotator agreement. 
Finally, we compare the resulting updates of the dictionary with Cook et al. 
(2013), and discuss whether the method might lead to a more consistent way of 
revising and updating the dictionary in the future.
Keywords: corpus statistics, bigrams, dictionary update, semantic change, Danish
113
S. NIMB, N. HARTVIG SØRENSEN, H. LORENTZEN: Updating the dictionary
1 INTRODUCTION AND MOTIVATION
The Danish Dictionary (DDO) was originally edited from 1994 to 2003 based 
on studies of Danish word senses in corpus texts from 1983-1992, in total 40 
million tokens (cf. Norling-Christensen and Asmussen, 1998). It was initially 
published in print 2003-2005 and at the time it described the senses of 66,000 
lemmas (cf. Lorentzen, 2004). Since 2009 it has been available online at ordnet.
dk/ddo, and in recent years the main focus has been to update it with new lem-
mas. Today, 25 years after the first editorial work was carried out, the dictionary 
covers 100,000 lemmas, and time has come to update the earliest edited ones 
by supplying them with new senses, new fixed expressions, new collocations, 
and also new citations. After the first published version of the dictionary, this 
has only been done sporadically, as a result of user suggestions and whenever 
the lexicographers observed new ways of using a word in the language. When 
it comes to citations, the dating of these in the dictionary can be used as an 
indicator since entries with only older ones probably need an update. The edi-
torial staff is currently going through all senses which are only illustrated with 
a citation from the 1980s. However, presenting more updated citation infor-
mation would also be relevant in many other cases, but these are hard to find 
systematically, as are those cases where there is a need for new collocations or 
even more importantly, for a slightly different sense description or even a new 
sense, maybe in the form of a fixed expression. Our aim is to be able to supply 
the current practice building on suggestions from users and editorial observa-
tions with a more systematic approach across the whole vocabulary, based on 
corpus statistics.
2 METHOD
It is a well-established fact that collocational change might indicate sense 
change (Tahmasebi et al., 2018; Pollak et al. 2019; Traugott, 2017). For in-
stance, Pollak et al. (2019) compare automatically extracted collocations 
from computer-mediated communication (such as blogs and social networks) 
with those from a general language reference corpus and discover not only 
topic/genre-related new words, but also new meanings of previously lexi-
cographically described vocabulary. In contrast to this, the present paper is 
based on the comparison of sets of automatically extracted collocations from 
corpora which are similar in composition and genre, but which instead cover 
114 115
Slovenščina 2.0, 2020 (2)
different timespans. We describe a method where the collocational change 
in these corpora is used as input for lexicographers in their search for new 
meanings of already included vocabulary in a dictionary. We initially calcu-
late the statistically significant variation in bigrams in a corpus and create a 
dataset of those that are estimated to be new in Danish texts. Independently 
of each other, two lexicographers judge whether, at a first glance, the bi-
grams indicate the need for a semantic revision of the lemmas involved, and 
if so, should it be 1) in the form of a defined sense or fixed expression, or 2) 
in the form of a collocation added to an existing sense with no need of ex-
planation? Afterwards, the lemmas represented by the bigrams which were 
marked as 1) or 2) either by one or both lexicographers are more thoroughly 
inspected, leading to a revision in the dictionary when required, otherwise 
not. The judgments of the data are based on a set of internal guidelines to 
be followed by editors of the dictionary when new lemmas, senses and fixed 
expressions are to be added. 
In this paper, we study and discuss the relation between annotation value (1 
or 2), inter-annotator agreement and the final type of update to be carried 
out. We conclude that especially when the annotators agree that the bigram is 
semantically relevant, but disagree upon which exact type of semantic change 
it indicates, we find many new senses. Finally, we compare our findings with 
Cook et al. (2013). 
In the next section we describe the statistical method that we estimate to be 
suitable for our purpose, as well as the computational creation of the dataset.
3 CREATING THE DATASET
Since 2005, the Society for Danish Language and Literature has collected news-
wire data of roughly the same size daily. The newswire corpus consists of 20 to 
40 million tokens for each year, 512 million running words in all. It consists of 
articles that are randomly selected from major Danish newspapers each day 
(due to license restrictions the corpus is not publicly available, but see korpus.
dsl.dk/resources.html for other Danish corpora from DSL that are).
The homogeneous data type, the relatively even distribution, and the suf-
ficiently long time-scale make this corpus ideal for investigating our 
115
S. NIMB, N. HARTVIG SØRENSEN, H. LORENTZEN: Updating the dictionary
hypothesis. If lexical data in the form of a token or e.g. a bigram has not 
occurred at all in the initial period of the text collection, but occurs regularly 
in the more recent corpus texts, it might indicate that it is a neologism or, in 
the case of bigrams, either a new expression in the language, or a new way 
of using one (or more) of the words involved. We have previously used this 
method to identify potential new single lemmas for DDO, but have never 
evaluated the method formally. We divided the corpus by year, and selected 
all tokens which do not appear at all in the first 3 years, 2005-7, but appear 
frequently during the remaining 11 years. The set of tokens was checked by 
a lexicographer who removed proper nouns and errors, and now it is used 
as input to lexicographers in the task of supplying DDO with new lemmas. 
However, it has not been studied to which degree these lemma candidates do 
end up being included as new lemmas in the dictionary. This paper describes 
the same method carried out on bigrams, but takes it a step further. In this 
case not just one, but two lexicographers check and annotate the output data 
independently of each other. Furthermore we also check how useful the re-
maining manually selected part of the data turns out to be when it comes to 
the concrete task of updating the dictionary, and study the relation between 
the initial annotations and the usefulness. The updates that we decide upon 
are either carried out immediately or listed as future tasks in the editorial 
process of keeping the dictionary up to date.
Once again, we use the corpus text collection divided by year, and now collect 
all the bigrams which do not appear at all in the first three, but appear with a 
certain frequency during the next 11 years. Our method is easily reproducible. 
1. We calculate the statistically significant bigrams for the complete 
newswire corpus 2005 - 2018 (~ 512 million tokens), see [3.1] below 
for details; 
2. We divide the corpus into 14 sub-corpora, one for each year; 
3. We count the occurrences of the bigrams for each sub-corpus, i.e. each 
year, separately;
4. We make a dataset of all bigrams that meet the following two 
requirements: 
116 117
Slovenščina 2.0, 2020 (2)
a. The bigram does not occur in the first three years, 2005, 2006, and 
2007, 3 being the lowest number of years that we felt would prevent 
accidental gaps in the distribution of the bigram.
b. The bigram occurs at least 20 times in the following time period of 
11 years, 
 (--> frequency ~20/400 million = 0.00000005).
The output of the process is a dataset of 745 bigrams considered to be new 
in Danish. These bigrams are listed and used as input for the manual anno-
tation task.
3.1 Calculating the statistically significant bigrams 
In order to calculate the statistically significant bigrams we developed a small 
Python script using the Phrases module of the Gensim package (Řehůřek and 
Sojka, 2010; Řehůřek, 2020). We used the so-called original scorer algorithm 
based on the bigram scoring function developed by Mikolov et al. (2013) for 
calculating the bigrams. 
The bigrams are calculated using the formula:
score = (count(wi, wj) - m) * count(vocab) / count(wi)*count(wj)
where count(wi, wj) is the frequency of the bigram, count(vocab) is the size of 
the vocabulary, count(wi) is the frequency of the first word, count(wj) is the fre-
quency of the second word, and m is the minimum frequency of the bigrams. 
We chose the minimum frequency of bigrams to consider (m) to be 5 and 
we chose the threshold of 7 for significant bigrams. This threshold was cho-
sen based on manual inspection in order to select only the most significant 
bigrams without letting too much noise into the dataset. This threshold re-
moves arbitrary, ad-hoc bigrams like nævne nogle (‘mention some’, score 
3.9) and skal betale (‘must pay’, score 1.2), but keeps wanted bigrams like 
offentlig institution (‘public institution’, score 8.8) and monopolagtige til-
stande (‘monopoly-like conditions’, score 385.0). However, any fixed thresh-
old must of course be expected to give some unfortunate results. In our case 
we find that some bigrams that are clearly non-collocational are included in 
the dataset (e.g. stormer flyet, ‘raid the plane’, score 7.3), and some excellent 
117
S. NIMB, N. HARTVIG SØRENSEN, H. LORENTZEN: Updating the dictionary
ones are excluded (e.g. stor betydning, ‘great importance, score 6.8). We have 
not investigated the perfect threshold for this experiment, but it is clearly a 
task we wish to perform.
4 MANUAL ANNOTATION OF THE DATASET 
We established the following five questions for the manual annotation task. 
The categories we chose are closely related to the type of information described 
in the dictionary which is to be updated with new semantic information.
1. Is the bigram likely to represent a new sense of one of the words, pos-
sibly in the form of a fixed expression, to be included in the dictionary? 
2. Is it instead more likely to represent a new collocation, both words 
being transparent in sense? 
3. Is the bigram (part of) a proper noun? For example the title of a Dan-
ish movie Den skaldede frisør (English title: Love is all you need), or a 
Danish tv-program Den store bagedyst (corresponding to the English 
program: The Great British Bake Off).
4. Is it a grammatical construction, for example anno 2013 (‘in the year 
2013’), arvelovens paragraf (X) (‘section (X) of the Inheritance Act’). 
5. Is it not at all relevant to include in the dictionary? Eurozonens tred-
jestørste (‘the third largest of the Eurozone’, din smartphone (‘your 
smartphone’).
The first 2 categories are particularly important in the semantic update task. 
In Figure 1, the DDO entry design is shown, and here we see how the two cat-
egories are used. Category 1 refers to defined senses in the dictionary which 
can be expressed as either a main sense or subsense (1., 1.a and 1.b in Figure 
1), or in the form of a multiword unit where the lemma is included, initiated 
by the headline ‘Faste udtryk’ (‘Fixed expressions’) in the figure illustrated 
by intelligent design (‘intelligent design’). Category 2 refers to the use of bi-
grams (or trigrams) as examples of how the word combines with other words 
in this sense, e.g. industrielt design (‘industrial design’) and italiensk design 
(‘italian design’). We have chosen to call only these example bigrams ‘colloca-
tions’ in this paper. Others use the term ‘collocations’ differently. In a similar 
118 119
Slovenščina 2.0, 2020 (2)
work, Pollak et al. (2019) use it in a broader sense, corresponding to the entire 
set of bigrams that they operate with, due to the fact that this only contain 
noun lemmas and their collocates. They operate with only bigrams containing 
noun lemmas in the dataset. Only their term ‘collocationally new collocations’, 
which is used to define one of the 7 core categories among their initially ex-
tracted collocations, correspond to what we call ‘collocations’.
Figure 1: The noun lemma design in DDO.
Two of us, both experienced lexicographers, annotated the output of 745 bigrams 
independently of one another with one of the 5 categories listed above. We both 
have a good knowledge of the lexical content of the DDO, and are very familiar 
with the task of updating the dictionary with new lemmas, senses etc. Table 1 
shows an extract of one of the two independently annotated lists of bigrams. 
119
S. NIMB, N. HARTVIG SØRENSEN, H. LORENTZEN: Updating the dictionary
Table 1: The list of bigrams with frequency information and annotation, one annotator
Bigram Frequency Annotation
amerikanske=internetgigant 23 2
amerikanske=jobmarked 32 5
amerikanske=medicinalselskab 57 5
amerikanske=whistleblower 74 5
analyserer=kulturelle 123 5
anbefalinger=fordeler 94 5
andengenerations=bioethanol 32 2
anno=2012 124 4
anno=2013 111 4
anno=2015 113 4
anno=2017 103 4
annoncerede=ordrer 26 5
antisemitiske=hændelser 25 2
anvendte=billedmateriale 422 5
arabiske=forårs 45 1
arabiske=opstande 21 2
arabiske=revolutioner 30 2
arktiske=kyststater 26 2
arktiske=stater 46 2
To compare our annotation task with similar work carried out by Pollak et al. 
(2019), they instead initially annotated a dataset manually (not double-an-
notated) in only three categories (p. 190): ‘non-relevant data’ (correspond-
ing to 4 and 5 in our task), ‘proper words and abbreviations’ (corresponding 
to 3 in our task), and finally ‘core results’, which correspond to our catego-
ries 1 and 2. Afterwards the ‘core results’ in their study were annotated by 
two linguists (again not double-annotated) into 7 more specific categories, 
some of which are related to their specific interest in non-standard vocabu-
lary and therefore not relevant to our case. But their 4 categories: ‘lexically’, 
‘collocationally’, as well as ‘semantically new vocabulary’, and finally ‘termi-
nology’, are all covered by the content of our first 2 categories: ‘new sense or 
fixed expression’ or 'new collocation’. 
Pollak et al. (2019) apparently do not double-annotate the data, and as we 
shall see, the double annotation is in our case an important part of our method, 
120 121
Slovenščina 2.0, 2020 (2)
and likewise plays an important role in the analysis and conclusions. Nor do 
Pollak et al. (2019) investigate to which degree the annotated data in each 
case entails an update in a practical lexicographic project, and what exact type 
of update that ends up being carried out on the basis of each bigram in the 
dictionary. Our study allows us to compare on the one hand the annotations 
and the inter-annotator agreement, on the other hand the different types of 
resulted updates, and to draw some conclusions based on the combinations.
The output of the annotation task that we carried out – two lists with 745 
annotated bigrams – was subsequently compared in order to calculate the in-
ter-annotator agreement. The results are discussed in the next subsection.
4.1 Inter-annotator agreement and relevant data
The overall inter-annotator agreement was 85% in the annotation task de-
scribed above. However, there was almost 100% agreement between the two 
lexicographers on whether the data was unlikely to influence the semantic de-
scription in the DDO (the categories 3, 4 and 5, covering proper nouns, gram-
matical constructions or simply not relevant information to include in a dic-
tionary). This data, 1/3 of the statistically significant bigrams, was therefore 
discarded as non-relevant for further lexicographic inspection, a share which 
corresponds roughly to the 37,4% of the extracted data which was found irrel-
evant in the Slovene study (Pollak et al., 2019, p. 191). The high inter-annota-
tor agreement indicates that the task of discarding non-relevant bigrams from 
the automatically extracted list could probably have been carried out by just 
one experienced lexicographer.
The bigrams said to belong to either category 1 or 2 by both lexicographers, 
and thus likely to influence the semantic description of one of the lemmas (or 
both), constituted 482 bigrams, corresponding to 2/3 of all statistically sig-
nificant bigrams. These were selected as highly relevant for a more thorough 
lexicographic inspection. 
4.2 Frequency
Our choice of a frequency criteria of 0.00000005 seems suitable for our pur-
pose of finding enough data to initiate a more systematic update process of 
the dictionary. A large part, namely more than 1/3 of the new bigrams, had a 
121
S. NIMB, N. HARTVIG SØRENSEN, H. LORENTZEN: Updating the dictionary
frequency between 20 and 30 (of 400 million tokens), and most of them, 3/4, 
had a frequency lower than or equal to 50. If the initial criteria on frequency 
had been raised from 20 to 50, we would only have obtained 1/4 of the rele-
vant data that was found. It might even pay off to also check bigrams with a 
frequency between only 10 and 20 in the corpus, since more than a third of the 
relevant bigrams had 30 or less occurrences.
5 L E X I C O G R A P H I C I N S P E C T I O N O F T H E B I G R A M S A G R E E D 
U P O N T O B E R E L E V A N T D A T A
Figure 2 illustrates how the 745 statistically significant bigrams are overall 
distributed in non-relevant and relevant ones as described above and, maybe 
more importantly, how the relevant 2/3 (482 bigrams) are further divided into 
three groups: two groups with those where the lexicographers agreed upon the 
type of semantic update (both chose category 1, or both chose category 2) and 
one where they disagreed (the one chose category 1, the other chose category 
2), or put differently, agreed upon it to be either category 1 or 2 (and not any 
of the categories 3, 4 or 5). 
Figure 2: Double annotation of 745 statistically significant bigrams results in 4 groups: one with 
bigrams agreed upon as being non-relevant, one with bigrams agreed upon to represent 1) a new 
sense or fixed expression, one with bigrams agreed upon to represent 2) a new collocation, and 
finally one where the one annotator chose 1) new sense or fixed expression, and the other chose 
2) new collocation.
By dividing the relevant bigrams in this way we obtain a distinction between 
the relatively clear cases (the first two groups where the annotators agreed 
 
122 123
Slovenščina 2.0, 2020 (2)
upon the type of update) in opposition to the more unclear, albeit relevant 
cases (the third group where the annotators disagreed on the type of update). 
Interesting data concerning sense change tends to hide in the unclear data, as 
we shall see in section 6.3.
Our next step was to thoroughly inspect the bigrams from all three groups 
with the purpose of updating one or maybe even both lemmas in the diction-
ary with new semantic information. As an example, the multiword expres-
sion fri fagskole (‘free vocational school’, a new type of educational institution 
in Denmark) was added to the noun entry of fagskole (‘vocational school’) 
based on the bigram frie fagskoler (‘free vocational schools’). The collocation 
streame musik (‘to stream music’) was inserted in the verb entry streame (‘to 
stream’) based on the identical bigram streame musik, and the collocation 
nordisk køkken (‘Nordic cuisine’) was added to the noun entry of køkken (‘cui-
sine’) based on the bigram nordiske køkkens (genitiv: ‘of the Nordic cuisine’). 
It turned out that the updates would not only consist in a new sense, fixed 
expression or collocation, but also a slightly changed definition, or an added 
citation illustrating the bigram. In some cases the lemma was even updated in 
more ways than one, e.g. the bigram intelligente løsninger (‘intelligent solu-
tions’) entailed both a new collocation as well as a slightly changed definition 
in the adjective entry intelligent, which now includes the new digital and com-
puterized aspect of the sense. 
Other bigrams turned out to be of less relevance than originally expected 
during the initial annotation task when they were more thoroughly inspect-
ed. E.g. the bigrams forbyde burkaer (‘to ban burkas’, reflecting a political 
debate) and levende myrer (‘live ants’, a much debated dish at the famous 
Danish restaurant, Noma) did not entail any revision of entries in the dic-
tionary, estimated to be connected to very specific former events, and there-
fore, from a linguistic and lexicographic point of view, less relevant to in-
clude in the DDO today. 
After having closely studied 189 bigrams and the corresponding two lemmas 
in the dictionary, we ended up deciding upon 103 semantic updates to be car-
ried out in the dictionary. However, 300 bigrams from the collocation group 
have not yet been thoroughly analysed, but based on our studies of 1/5 of the 
123
S. NIMB, N. HARTVIG SØRENSEN, H. LORENTZEN: Updating the dictionary
group, we estimate the total amount of bigrams leading to an update to be 
approx. 41% of all the bigrams annotated to be relevant (category 1 or 2), and 
thereby 27% of the initial dataset of automatically extracted and calculated 
bigrams. This will be discussed further in the next section, where we will study 
the relation between the annotations carried out and the resulting types of up-
dates, and draw conclusions on how to profit in more than one way from the 
double annotation of the bigrams.
6 T H E R E L A T I O N B E T W E E N T Y P E O F A N N O T A T I O N A N D T Y P E 
O F R E S U L T I N G U P D A T E I N T H E D I C T I O N A R Y
In Table 2, the number of updates (some of which are not yet carried out but 
listed as future editorial tasks), are presented in relation to the annotated data.
Table 2: Bigrams divided into three groups depending on inter-annotator agreement 
482 relevant 
bigrams (of 
745 statistically 
significant 
bigrams)
Agree 1: 55 
bigrams.
Both annotators 
agree: new 
sense or fixed 
expression
Agree 2: 367 bigrams.
Both annotators agree: 
collocation
Agree 1 or 2: 
60 bigrams 
One annotator: 
collocation
Another annotator: 
new sense or fixed 
expression
Number 
leading to 
update
All inspected
49 lead to update
1/5 inspected (a sample of 
74 bigrams) 
24 lead to update (estimate 
full set: ~120)
All inspected
30 lead to update
Note. For each group, the number of bigrams leading to an update is given.
The same data is illustrated in Figure 3. When at least one of the annotators 
estimate the bigram to represent a new sense or new fixed expression, the 
data very often turns out to be useful in the process of updating previously 
described lexicographical vocabulary with new semantic information, as illus-
trated by the first and last columns.
Furthermore, and perhaps quite surprisingly, Figure 3 also clearly shows that 
when both annotators agree that a bigram constitutes a new collocation, the 
bigram quite often does not result in any update at all. 
Apart from studying the amount of updates made up by the bigrams of each 
annotation group, it is also interesting to find out what kind of updates the 
124 125
Slovenščina 2.0, 2020 (2)
three different groups typically entail. Table 3 presents the number of specific 
updates in relation to the type of annotation.
Table 3: Bigrams leading to updates and the types of updates that they entailed related to 
annotations 
Type of annotation 
leading to update 
→
Type of update
Agree 1:
Both 
annotators: 
new sense 
or fixed 
expression
 = 49
Agree 2:
Both annotators: 
collocation
= 24 of sample 
(estimation full set 
~ 120)
Agree 1 or 2: 
One annotator: 
collocation.
The other annotator: 
new sense or fixed 
expression
= 30
Estimated 
total 
number 
of updates
= 200
new lemma 22 2 (full set ~10) 2 34
fixed expression 19 0 8 27
new sense 1 3 (full group ~15) 7 23
changed definition 3 0 4 7
collocation 4 11 (full group ~ 55) 10 69
new citation 0 8 (full group ~40) 0 40
Note. The table also presents the estimated total number of updates entailed by the extracted 
dataset of bigrams.
We also estimate how many updates the dataset will lead to when the total 
set of annotated data is thoroughly studied. Around 27% of the automatically 
Figure 3: The figure illustrates how often the each of the three groups of relevant bigrams con-
tained data which was useful in the task of updating the dictionary. 
 
125
S. NIMB, N. HARTVIG SØRENSEN, H. LORENTZEN: Updating the dictionary
extracted bigrams lead to an update, which constitutes around 41% of the bi-
grams annotated as relevant for the semantic revision of the dictionary by 
both lexicographers. A little over 1/3 of the updates take the form of a new 
collocation in the dictionary, 1/4 take the form of a new senses or fixed expres-
sion, equally distributed. 1/5 is in the form of new citations, and almost 1/5 are 
new lemmas. See Figure 4.
Figure 4: The share of the different types of updates entailed by the information on extracted 
bigrams.
In the next 3 subsections, we will go into detail with the data from each group.
6.1 Agree 1: Both annotators agree that it is a new sense, maybe in the form 
of a fixed expression
The two lexicographers agreed that a rather small, but valuable part of the 
semantically relevant bigrams represented a new sense or fixed expression. 
Here we find the most useful data when it comes to updating the already in-
cluded lemmas in the dictionary, since almost all of it leads to revisions when 
the bigrams and the two corresponding dictionary entries are thoroughly in-
spected. See Figure 5.
 
126 127
Slovenščina 2.0, 2020 (2)
Figure 5: The distribution of different types of semantic updates entailed by the group of bi-
grams agreed to be a new sense or fixed expression by the two annotators. 
Somewhat surprisingly, almost half turned out to constitute new lemmas 
based on an English multiword expression (e.g. urban farming, augmented 
reality). Danish neologisms are highly influenced by English, and loans from 
multiword expressions are often written in one word when they are included 
in Danish dictionaries, due to Danish spelling rules (street food → streetfood, 
game changer → gamechanger), if not, simply constituting a lemma entry 
spelled in two word. Pollak et al. (2019, p. 192) also deal with such loan words 
from English.
A substantial part of the bigrams in the group leads to a new fixed expression 
in the dictionary as foreseen by the annotators. In contrast to this, only very 
few led to the addition of a new main sense or subsense. More frequently they 
led to a change in existing definitions of the lemmas so that they now include 
the new phenomena described by the bigram. This was the case of the adjective 
præhospital ‘prehospital’ (based on the bigram regionens præhospitale), and 
funktionel (‘functional’), based on the bigram funktionelle lidelser (‘functional 
diseases’), see also other examples and a comparison with Cook et al. (2013) in 
section 7. Another rather small part led to new collocations in the entries. It is 
worth noticing that only among the bigrams in this group do we find the cases 
where the semantic information they represent had already been included in 
the dictionary, discovered during recent editorial work carried, for example 
due to user suggestions. In fact this goes for 12% of the updates, and most of 
 
127
S. NIMB, N. HARTVIG SØRENSEN, H. LORENTZEN: Updating the dictionary
them are fixed expressions which apparently attract the attention to a much 
higher extent than new senses and collocations.
6.2 Agree 2, inter-annotator agreement: collocations
Now we turn to the other part of the relevant bigrams in which the type of up-
date was agreed upon by the two lexicographers, in this case judged to be new 
collocations by both. This part constitutes the largest group of the relevant 
data by far, namely ¾ (367 bigrams), and we have not inspected all of them 
yet. Here we find bigrams like tørrede tranebær (‘dried cranberries’), syriske 
borgerkrig (‘Syrian civil war’), klimatiske udfordringer (‘climate challeng-
es’), and brystforstørrende operation (‘breast enlargement surgery’). In our 
investigation, we have previously only studied one fifth (74 bigrams) in de-
tail, however we estimate this to be a sufficient number to enable us to draw 
some conclusions. We have compared them with the current lexical descrip-
tion of the two lemmas in the dictionary and also studied the occurrences 
in the corpora. As seen in Figure 5 above, only one third of the studied ones 
lead to an update of the dictionary. Many of them turn out to be very topi-
cal, time-limited and related to specific political or economic events in recent 
years. Therefore they are discarded in the final analysis and not integrated in 
the dictionary. One example of this is the bigram amerikanske droneangreb 
(‘American drone strikes’).
Figure 6: The distribution of updates entailed by the group of bigrams agreed to be collocations 
(category 2) by the two annotators. 
 
128 129
Slovenščina 2.0, 2020 (2)
Figure 6 illustrates how those of the category 2 bigrams that did result in an 
update are distributed when they are to be implemented in the dictionary. 
Almost half of them are added in the form of a collocation as also foreseen 
by both lexicographers, i.e. trådløs opladning (‘wireless charging’) which 
has been added to the adjective trådløs, politiets vagtchef (‘police officer 
on call’) which has been added to the noun vagtchef (‘officer on call’), ulov-
lig overvågning (‘illegal surveillance’) which has been added to the noun 
overvågning (‘surveillance’), and kriseramte banker (‘crisis-stricken banks’) 
which has been added to the adjective kriseramt (‘crisis-stricken’). But Fig-
ure 6 also reveals that quite a lot of the bigrams that were estimated to be 
collocations in the first place instead have led to the adding of a new citation 
representing the bigram. It is worth noticing that only this group of bigrams 
(agreed upon to be collocations by both lexicographers) leads to this type of 
update in the dictionary. This suggests the future use of the same method in 
the task of updating citations in the dictionary, as a supplement to the criteria 
we use at the moment where we only look at entries with old citations from 
specific magazines. Another interesting fact about the updates based on the 
collocation group is that none of the data had already been discovered and in-
cluded in the dictionary by other editors in the period since the bigrams were 
extracted for our experiments, indicating that this type of information, which 
is in fact highly needed in order to keep the dictionary content up to date at a 
more general level, would probably have been overlooked without the statisti-
cal investigation of bigrams. 
However, the group of collocations also contains the highest amount of in-
applicable data. It contains a lot of time-limited bigrams which according to 
the editorial guidelines of the DDO are not relevant to include in the diction-
ary. This is due to the fact that we are dealing with bigrams extracted mainly 
from newspapers. From a structural point of view, they are of course typical 
collocations: adjective + noun, verb + object etc., which is also why the two 
lexicographers easily agreed upon their status as such at first hand, but from a 
more pragmatic point of view they are not, and we should probably have been 
aware of this problem from the beginning. We can also conclude that very 
few bigrams in this group led the lexicographers on the track of new senses or 
new lemmas. One rare example is the loanword big data based on the English 
129
S. NIMB, N. HARTVIG SØRENSEN, H. LORENTZEN: Updating the dictionary
multiword expression. The lemma data is already part of the DDO which is 
why both lexicographers annotated it as a new collocation. However, since 
it is a term and a direct new loan pronounced in English it has instead to be 
included at lemma level in the dictionary. 
6.3 Agree 1 or 2: inter-annotator disagreement whether it is a collocation 
or rather a new sense, maybe in the form of a fixed expression
The third and last part of the data selected for further lexicographic inspection 
consists of 60 bigrams that the two lexicographers agreed to be highly rele-
vant. They disagreed, however, upon how to include them in the dictionary 
structure. While one annotator estimated that the bigram was most likely to 
represent a new sense or fixed expression, the other believed that it was more 
likely to represent a new collocation. In fact, only half of the bigrams in this 
group entailed a dictionary update. See Figure 7 for the distribution of the 
different types of updates. 
Figure 7: The distribution of updates entailed by the bigrams agreed to be relevant. However the 
annotators disagreed upon whether the bigram represented a new sense or fixed expression, or 
rather a collocation. 
The vast majority of those which entailed an update did so in the form that 
was suggested by either one or the other annotator, more or less equally dis-
tributed. For the first time, we find quite a lot of new senses and not only fixed 
expressions. One third of the bigrams were included as collocations (e.g. bære-
dygtig omstilling (‘sustainable conversion’, mentalt helbred (‘mental health’)), 
 
130 131
Slovenščina 2.0, 2020 (2)
almost another third as a fixed expression (bibelske dimensioner (‘biblical pro-
portions’), pædagogiske assistenter (‘teaching assistents’, new job title)), and, 
particularly interesting, one quarter in the form of a new main sense or sub-
sense. E.g. the new subsense of the noun boble (‘bubble’) discovered from the 
bigram glas bobler (lit. ‘glass of bubbles’ – i.e. ‘a glass of sparkling wine, e.g. 
champagne’) was included in the dictionary, and the adjective mobil (‘mobile’) 
is planned to be provided with a new sense triggered by the bigrams mobile 
bredbånd and mobilt internet (‘broadband/internet via a cellular phone’). 
Some of the bigrams will result in several changes. In the case of the new 
concept selvkørende bil (‘self-driving car’) which is also a part of the new data 
described in Pollak et al. (2019, p. 193), the definition of the adjective entry 
selvkørende needs to be changed in DDO, as does the entry of bil (‘car’). The 
entry will be extended with a new fixed expression with its own definition. 
It is worth noticing that this group of bigrams is the one reveals the larg-
est amount of new senses by far. Several bigrams lead to the inclusion of a 
new main sense or subsense in the dictionary. Many also entail the need of a 
changed definition for one of the lemmas. For instance, a revision of the defi-
nition of digital (‘digital’) is needed due to the bigram digital dannelse (‘dig-
ital code of conduct/digital education’), likewise a revision of the definition 
of cannabis (‘cannabis; marijuana’) was needed due to the bigram medicinsk 
cannabis (‘medicinal marijuana’). We also found one new lemma in the group, 
the adjective æresrelateret (‘honor-related’), due to the bigram æresrelatere-
de konflikter (‘honor-related conflicts’). This lemma would also be discovered 
by single lemma extraction methods, but since it very often occurs together 
with konflikter in our data, this should be added as collocational information 
when the new lemma is included and edited.
Among the discarded data in the group were bigrams that had only been fre-
quent for a short period of time (based on the study of the occurrences in our 
corpus), others were considered to be terminology which is not suitable for 
inclusion in the dictionary. As in the case of the agreed collocations, it's worth 
noticing that no lexical information discovered from our study of this group of 
bigrams had been registered in the dictionary by other editors since the data 
was extracted, and it would probably have been hard to discover without the 
use of statistical methods.
131
S. NIMB, N. HARTVIG SØRENSEN, H. LORENTZEN: Updating the dictionary
6.4 Conclusions on annotation and resulting updates
Our computational measure of the appearance of new bigrams in homogenous 
newswire corpora combined with double annotations of the output dataset and 
the entailed updates of the dictionary allow us to draw a number of conclusions. 
6.4.1 How useful was the automatically calculated dataset?
First of all, we can conclude that quite a lot, i.e. approx. 1/4, of the automatically 
extracted dataset leads (or will lead) to a resulting update in the dictionary, while 
3/4 do not. In comparison, Pollak et al. (2019) find a little less “lexically, collo-
cationally, or semantically new data that can be considered in the process of up-
dating existing lexical resources for Slovene” (p. 197), namely 21.6%. The initial 
annotation by two lexicographers made it possible to discard many bigrams in 
the extracted dataset in an efficient and not very time-consuming way. The data 
that the lexicographers selected as most likely to be relevant turned out to be 
useful when more thoroughly inspected and compared to the content of the dic-
tionary entries in almost half of the cases. Had the initial annotation task been 
carried out on the basis of more detailed and elaborated guidelines, we could 
probably have avoided even more ‘noise’ (bigrams not leading to any updates 
after all), for example the many time-limited bigrams. The automatic extraction 
of the bigrams can maybe also be tuned in a way so that such time-limited data 
is better avoided in the first place, and not even included in the output dataset. 
Pollak et al. (2019) also propose that the automatic extraction procedure should 
include language recognition in the preprocessing step in order to identify and 
remove the English bigrams from the list. However, this would entail that sev-
eral new loan words would not have been discovered and included in the DDO.
6.4.2 New lemmas
We found far more lemma candidates in the dataset than expected, namely 4%, 
due to the fact that many English multiword expressions are to be integrated in 
the dictionary at lemma level. This is in line with the results of Pollak et al. (2019).
6.4.3 Fixed expressions
A little over 4% of the initial dataset ended up being included in the dictionary 
in the form of fixed expressions. They constitute 14% of the updates carried 
out. From our investigations, we can see that when a bigram is recognized by 
132 133
Slovenščina 2.0, 2020 (2)
two lexicographers as a fixed expression, it very often holds true, and it almost 
surely will influence the semantic description of one or both lemmas that are 
part of the bigram in one way or another. Very few bigrams that had been 
annotated as a fixed expression by both lexicographers led to no update at all, 
so if you want to make sure you find relevant data for the updating task of a 
dictionary, then this a way to go. Furthermore we can conclude that when two 
lexicographers agree that a bigram is not a fixed expression but rather a collo-
cation, we can also be sure that it is not. Fixed expressions also seem to be the 
easiest to discover without applying any systematic method, since around 1/6 
of them had already recently been included in the dictionary.
6.4.4 New main senses and subsenses
We found quite a lot of new senses via the dataset. Around 3% of the auto-
matically extracted bigrams led us to this information, and among the anno-
tated relevant data one in every 20 bigrams revealed a new sense. Pollak et al. 
(2019) find a bit more (4.9% of the extracted data), but they state that many 
are found in non-standard colloquial language (p. 193), which might explain 
the higher amount – this type of language is not included in our corpus texts. 
Due to the method of double annotation, we discovered that new senses tend 
to hide between the more ambiguous data where the lexicographer is not so 
sure whether the bigram represents a sense or a fixed expression that needs to 
be explained to the dictionary user, or whether it is rather a collocation with 
transparent meanings of both words. However, new senses can also be found 
among bigrams which when presented to the lexicographers in the first place, 
were estimated to be merely collocations of already included senses in the dic-
tionary. In contrast, new fixed expressions were in fact found only when both 
annotators estimated the bigram to be either a new sense or a fixed expression.
6.4.5 Collocations 
Bigrams resulting in updates in the form of a collocation constitute 9% of the 
extracted data, and almost half of those that were annotated as category 2 by 
both lexicographers, also turned out to lead to a new collocation in the diction-
ary. Thereby they constitute the cases in which inter-annotator agreement is 
very high and at the same time they most often corresponded to the type of re-
sulting update Pollak et al. (2019) find a higher percentage of ‘collocationally 
133
S. NIMB, N. HARTVIG SØRENSEN, H. LORENTZEN: Updating the dictionary
new collocations’ in their extracted data (13.3%, p. 193), but the many collo-
cations that we chose not to include in the dictionary after a more thorough 
investigation probably explains the difference. In contrast to the DDO update 
guidelines, Pollak et al. (2019) propose that such data should not necessarily 
be left out of dictionaries: “trending vocabulary that is often bound to specific 
political and social events”, should instead be included in digital dictionaries. 
They advocate for “a faster and more fluid lexicography that focuses not only 
on the stable and established, but also on the changeable and variable aspects 
of language – which is where language users often need assistance” (p. 200). 
We find that the inclusion of such data would probably entail an ongoing and 
maybe time-consuming control with the already lexicographically described 
vocabulary in the DDO in order to be sure to avoid lexical information that 
has become outdated. 
Since two thirds of the collocation bigrams did not lead to any updates, we can 
conclude that when two lexicographers independently of one another agree 
that a bigram is a collocation, it is much less likely to represent useful data for 
the semantic update of a dictionary than if at least one of them consider it a 
new sense or fixed expression as described above.
6.4.5 Citations
Many collocations were included in the form of a citation when the data was 
thoroughly inspected, and we are in fact pleased to have discovered a more sys-
tematic way of updating this part of the dictionary information across lemmas.
7  R E S U L T S C O M P A R E D W I T H P R E V I O U S R E S E A R C H
In this section we compare our study with a similar project presented by Cook 
et al. (2013). They used a reference corpus from 1995 and a focus corpus from 
2008 to identify new elements to be included in an English learner’s diction-
ary (Macmillan). In their paper, they use three categories:
1. the uninteresting findings, which are mostly due to the many news sto-
ries in the corpus; certain items exhibit a sudden spike and then they 
disappear and never turn up again; one example of this is the word jun-
ta referring to the regime in Myanmar that would not accept humani-
tarian help from the outside world after a disastrous cyclone that caused 
134 135
Slovenščina 2.0, 2020 (2)
many deaths; another example is the word candy that popped up be-
cause some Chinese candy had been contaminated with melamine;
2. much more interesting are the cases where a dictionary entry should 
be changed in some way, it needs ‘tweaking’; for instance the existing 
entry for cleric, which only referred to clerics typical of the Church of 
England, but in the 2008 corpus, clerics are often Muslim and this 
should be reflected in the entry; the example video is obvious: in the 
1990s a video would be a video tape of the VHS type, but nowadays 
it is typically a digital recording of images and sounds distributed via 
online media;
3. the third category is cases where new senses should be included in 
specific entries in the dictionary, for instance the verb to search (= ‘do 
a web search’), and text as in text messaging, send someone a text or 
text someone, a technology that was not yet available in 1995.
Let us take a look at our findings using more or less the same categories as 
Cook et al. (2013) We have a high number of irrelevant findings, which we first 
categorized as collocations without deciding if they would lead to an actual 
change in the entries for the two words (cf. Section 6.2). The high amount of 
newspaper texts in our corpus accounts for findings related to specific events 
and political discussions; tibetansk flag (‘Tibetan flag’) for instance refers to a 
demonstration where Danish police unlawfully removed a Tibetan flag so that 
it would not be seen by the Chinese president who was visiting Copenhagen.
As is the case for Cook et al. (2013) we have changed (tweaked) several diction-
ary entries, for instance cannabis, where the collocation medicinsk cannabis 
(‘medicinal marijuana’) shows that cannabis may also be used for medical 
purposes nowadays; or intelligente løsninger (‘intelligent solutions’), which 
indicates a new nuance in the meaning of intelligent involving digital func-
tions and computers - so this has been added to the definition (cf. Section 6.3).
The entirely new senses include the word digital; the current entry describes 
the situation in the 1980s and 1990s when you would distinguish between a 
digital watch and an analogue one; of course, this is not up to date and the 
entry digital needs a new sense that will account for collocations like digitale 
indfødte (‘digital natives’) and digital mail.
135
S. NIMB, N. HARTVIG SØRENSEN, H. LORENTZEN: Updating the dictionary
A fourth category not mentioned by Cook et al. (2013) is new fixed expressions. 
As mentioned in section 5.4 this category is very salient in the list of bigrams 
and we have decided to include several of these. The most significant one is 
probably sociale medier (‘social media’), which had already been discovered 
by other methods and added to the dictionary; other interesting examples are 
assisteret reproduktion (‘assisted reproduction’), cirkulær økonomi (‘circular 
economy’) and brændende platform (‘burning platform’, i.e. a difficult situa-
tion that urgently needs taking care of); the expression refers to a fire on an oil 
platform in 1988 which resulted in many deaths.
A fifth category contains new lemma candidates, mostly of English origin; 
many of the English bigrams in the list may be included in our dictionary, 
either as headwords consisting of two words (pulled pork) or as a solid com-
pound like komfortzone (‘comfort zone’ in English); even a pragmatic phrase 
like oh, my god and its abbreviation omg are lemma candidates if you take 
into account how common the phrase has become in everyday Danish, and the 
same goes for other English phrases that have been included in the DDO in 
recent years, such as you name it, whatever, and take it or leave it.
8 F I N A L C O N C L U S I O N S A N D P E R S P E C T I V E S
In this final section we make a brief evaluation of our study: what are the 
overall pros and cons of this method and of our approach? On the upside, 
it provides the editors of the DDO with very useful input for updating sens-
es, definitions, collocations, etc. In fact, the editors are so happy with it that 
the plan is to repeat the bigram calculation regularly, for instance every three 
years. It is also very encouraging that the material supports updates that have 
already been made - quite reassuring for a corpus-based dictionary. The ma-
terial is a necessary supplement to other methods used by the dictionary edi-
tors to keep track of lexical and semantic change, like user suggestions, other 
corpus-linguistic data and good old editorial observations since it guarantees 
a systematic check across the entire vocabulary.
A drawback, of course, is that manual filtering is indispensable, but the good 
news is that one experienced lexicographer can fulfill the first phase (discard-
ing non-relevant bigrams), whereas it takes two (or more) lexicographers 
to annotate the rest reliably and eventually make the actual changes in the 
136 137
Slovenščina 2.0, 2020 (2)
dictionary. An important lesson from the experience is that a very large pro-
portion of the bigrams consists of topical (time-limited) examples, which is 
due to the composition of the corpus (mostly newspaper material). Other 
types of corpus texts are too scarce for the time being, and this is a task that 
the dictionary staff intends to work on in the future, keeping in mind, howev-
er, that a homogeneous data type as well as an even distribution of text types 
over time is absolutely necessary in order to obtain good results with the sta-
tistical method that we have described in this paper.
Acknowledgments
The authors would like to thank the anonymous reviewers for their sugges-
tions and careful reading of the manuscript. We would also like to thank our 
colleague Jonas Jensen for useful feedback and for proofreading the article.
R E F E R E N C E S
Dictionaries
DDO = Den Danske Ordbog [The Danish Dictionary]. Retrieved from https://
ordnet.dk/ddo (17. 2. 2020)
Macmillan = Macmillan English Dictionary. Retrieved from https://www.mac-
millandictionary.com/ (17. 2. 2020)
Corpora
Korpus.dsl.dk = Language Technology Resources for Danish. Retrieved from 
https://korpus.dsl.dk/resources.html
Other
Cook, P., Lau, J. H., Rundell, M., McCarthy, D., & Baldwin, T. (2013). A lexico-
graphic appraisal of an automatic approach for detecting new word-sens-
es. In Electronic lexicography in the 21st century: thinking outside the 
paper. Proceedings of the eLex 2013 conference (pp. 49–65). Tallinn, 
Estonia.
Lorentzen, H. (2004). The Danish Dictionary at large: Presentation, Problems 
and Perspectives. In G. Williams & S. Vessier (Eds.), Proceedings of the 
11th EURALEX International Congress (pp. 285–294). Lorient, France.
137
S. NIMB, N. HARTVIG SØRENSEN, H. LORENTZEN: Updating the dictionary
Mikolov, T., Sutskever, I, Chen, K., Corrado, G., & Dean, J. (2013). Distribut-
ed Representations of Words and Phrases and their Compositionality. In 
Advances in neural information processing systems 26. Retrieved from 
https://arxiv.org/abs/1310.4546
Norling-Christensen, O., & Asmussen, J. (1998). The Corpus of The Danish 
Dictionary. Lexikos (Afrilex Series) 8, 223–242.
Pollak, S., Gantar, P., & Arhar Holdt, Š. (2019). What’s New on the Internetz? 
Extraction and Lexical Categorization of Collocations in Computer-Medi-
ated Slovene. In International Journal of Lexicography, 32(2), 184–206.
Řehůřek, R., & Sojka, P. (2010). Software Framework for Topic Modelling with 
Large Corpora. In Proceedings of LREC 2010 workshop New Challenges 
for NLP Frameworks (pp. 46–50). Valletta, Malta: University of Malta.
Řehůřek, R. (2020). models.phrases – Phrase (collocation) detection. Re-
trieved from https://radimrehurek.com/gensim/models/phrases.html (17. 2. 2020)
Tahmasebi, N., Borin, L., & Jatowt, A. (2018). Survey of Computational Ap-
proaches to Lexical Semantic Change [Preprint at ArXiv 2018]. Retrieved 
from https://arxiv.org/abs/1811.06278
Traugott, E. C. (2017). Semantic Change. Oxford Research Encyclopedias 
[Online publication]. doi: 10.1093/acrefore/9780199384655.013.323
138 139
Slovenščina 2.0, 2020 (2)
POSODABLJANJE SLOVARJA: PREPOZNAVANJE 
SEMANTIČNIH SPREMEMB NA PODLAGI 
DIAHRONIH SPREMEMB BIGRAMOV
V prispevku preizkusimo metodo sistematičnega posodabljanja Danskega eno-
jezičnega slovarja z novimi semantičnimi podatki o obstoječih lemah. Metoda 
temelji na hipotezi, da so diahrone spremembe bigramov v korpusnih podatkih 
lahko pokazatelj sprememb pomena ene od besed v bigramu. Pri metodi kom-
biniramo korpusno statistiko z ročnim označevanjem. V prvem koraku izmeri-
mo kolokacijske spremembe v homogenem korpusu novic za 14-letno obdobje 
(2005 do 2018), tako da izračunamo vse statistično pomembne bigrame. Te 
bigrame potem preverimo v novi različici korpusa, razdeljenega na podkorpuse, 
pri čemer vsak podkorpus zajema obdobje enega leta. Nato izluščimo vse bi-
grame, ki se nikoli ne pojavijo v prvih treh letih, se pa pojavijo vsaj 20-krat v 
naslednjih 11 letih. Na podlagi tega postopka dobljenih 745 bigramov, ki jih 
obravnavamo kot potencialno nove v danskem jeziku, označita dva označev-
alca. Bigrami so glede na rezultate označevanja in ujemanje označevalcev bodisi 
izločeni bodisi razvrščeni v skupine glede na relevantnost za nadaljnjo obravna-
vo. Sledi temeljitejša leksikografska analiza, s katero določimo, do kakšne mere 
gre za nove pomene besed in posledično potrebo po spremembi pomenske 
členitve pri vsaj eni od besed v bigramu. Poleg tega analiziramo tudi povezavo 
med potrebnimi popravki, oznakami in odstotkom ujemanja označevalcev. V 
zadnjem delu prispevka primerjamo slovarske posodobitve s pristopom, ki so 
ga izvedli Cook idr. (2013), in podamo razmisleke o tem, ali tovrstna metoda 
lahko predstavlja doslednejše popravljanje in dopolnjevanje slovarskih gesel.
Ključne besede: korpusna statistika, bigrami, posodabljanje slovarja, semantične 
spremembe, danski jezik
To delo je ponujeno pod licenco Creative Commons: Priznanje avtorstva-Deljenje pod enakimi 
pogoji 4.0 Mednarodna. / This work is licensed under the Creative Commons Attribution-Share-
Alike 4.0 International.
https://creativecommons.org/licenses/by-sa/4.0/