After performing K-means clustering, let us suppose that we examine the clusters by sight and assign names to them. For example, one cluster may represent documents about sports, another may represent documents about politics, and yet another may represent documents about animals. Let us assume that we name these clusters sports, politics, and animals.

Sometimes, words are used in multiple contexts. For example, the word duck is ambiguous. Sometimes it means a waterfowl and would fall into the animals category. Sometimes it is used in politics, as in a lame-duck Congress, and would fall into the politics category. Sometimes it is used in sports, as in the name of the National Hockey League team the Anaheim Ducks, and would fall into the sports category. Knowing the context in which the word is used makes the clustering much better. To understand why, suppose that we had two documents, one with the words duck and water, and the other with the words duck and ice. Without understanding the context of the word duck, our similarity metric may find that these documents are similar, because they share the word duck. However, when duck appears with water, it probably refers to an animal, whereas when duck appears with ice, it probably refers to sports. With this knowledge, our similarity metric would find these documents not very similar at all.
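The effect described above can be sketched in a few lines of Java. This is only an illustration under stated assumptions, not the method the exercise prescribes: it uses Jaccard similarity on word sets as a stand-in for "our similarity metric", and the names ContextSimilarity, CONTEXT_HINTS, jaccard, and disambiguate, as well as the hint entries themselves, are hypothetical.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class ContextSimilarity {
    // Hypothetical hint table (an assumption, not from the source): for each
    // ambiguous word, co-occurring words that suggest a category.
    static final Map<String, Map<String, String>> CONTEXT_HINTS = Map.of(
        "duck", Map.of("water", "animals", "ice", "sports", "congress", "politics"));

    // Jaccard similarity: size of the intersection divided by size of the union.
    static double jaccard(Set<String> a, Set<String> b) {
        Set<String> union = new HashSet<>(a);
        union.addAll(b);
        if (union.isEmpty()) {
            return 0.0; // convention chosen here for two empty documents
        }
        Set<String> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        return (double) intersection.size() / union.size();
    }

    // Rewrites each ambiguous word as "word#category" when a hint word occurs
    // in the same document. If several hints are present, whichever is
    // encountered first wins (Set iteration order is unspecified), which is a
    // simplification.
    static Set<String> disambiguate(Set<String> doc) {
        Set<String> result = new HashSet<>();
        for (String word : doc) {
            String tagged = word;
            Map<String, String> hints = CONTEXT_HINTS.get(word);
            if (hints != null) {
                for (String other : doc) {
                    String category = hints.get(other);
                    if (category != null) {
                        tagged = word + "#" + category;
                        break;
                    }
                }
            }
            result.add(tagged);
        }
        return result;
    }

    public static void main(String[] args) {
        Set<String> doc1 = Set.of("duck", "water");
        Set<String> doc2 = Set.of("duck", "ice");
        // Without context, the shared word "duck" makes the documents overlap.
        System.out.println("naive:   " + jaccard(doc1, doc2));
        // With context, "duck" becomes two different tokens and the overlap vanishes.
        System.out.println("context: " + jaccard(disambiguate(doc1), disambiguate(doc2)));
    }
}
```

Under these assumptions the naive score is one third (one shared word out of three distinct words), while the context-aware score is zero, since duck#animals and duck#sports no longer match.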

Suppose we had a library of words that are used in multiple contexts such as:

String[] multiContextWords = {"duck", "crane", "book", …};

Suppose also that we have a multi-dimensional array that shows the multi-context words and common words that are used with them:
