Content

Assignment 1 explores two types of word vectors: those derived from co-occurrence matrices, and those derived via GloVe.

Co-occurrence Matrix

Example: Co-Occurrence with Fixed Window of n=1:

Document 1: “all that glitters is not gold”

Document 2: “all is well that ends well”

*<START>allthatglittersisnotgoldwellends<END>
<START>0200000000
all2010100000
that0101000110
glitters0010100000
is0101010100
not0000101000
gold0000010001
well0010100011
ends0010000100
<END>0000001100
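
To make the counting rule concrete, here is a minimal sketch that rebuilds the counts for the two example documents. It is not the assignment solution: the function name, the sorted vocabulary order, and the use of `str.split` for tokenizing are my own assumptions for illustration.

```python
import numpy as np

def compute_co_occurrence_matrix(corpus, window_size=1):
    """Count how often each pair of words appears within `window_size` tokens."""
    # A sorted vocabulary gives a deterministic row/column order (an assumption;
    # the table above lists words in order of first appearance instead).
    words = sorted({w for doc in corpus for w in doc})
    word2ind = {w: i for i, w in enumerate(words)}
    M = np.zeros((len(words), len(words)), dtype=int)
    for doc in corpus:
        for i, center in enumerate(doc):
            lo = max(0, i - window_size)
            hi = min(len(doc), i + window_size + 1)
            for j in range(lo, hi):
                if j != i:  # a token is not its own context
                    M[word2ind[center], word2ind[doc[j]]] += 1
    return M, word2ind

corpus = [
    "<START> all that glitters is not gold <END>".split(),
    "<START> all is well that ends well <END>".split(),
]
M, word2ind = compute_co_occurrence_matrix(corpus, window_size=1)
print(M[word2ind["<START>"], word2ind["all"]])  # 2
```

Because every neighbouring pair is counted once from each side, the resulting matrix is symmetric, which matches the table.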

The assignment code itself is not shown here; it is attached in the Code section.

Results

  1. After completing the co-occurrence matrix part and passing all the simple checks, my result figure is flipped upside down compared to the reference answer.
  2. In the GloVe part, the result figure is flipped left to right.
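
One plausible explanation for the mirrored figures, offered as an assumption rather than something verified in this post, is that the singular vectors returned by an SVD are only determined up to sign. The sketch below uses a small made-up matrix and plain `numpy.linalg.svd` (not necessarily the routine used in the assignment) to show that flipping one component still gives a valid factorization while negating one plot coordinate.

```python
import numpy as np

# Small symmetric count matrix, made up for illustration only.
M = np.array([[0.0, 2.0, 0.0],
              [2.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])

U, S, Vt = np.linalg.svd(M)

# Flip the sign of the second left and right singular vectors together.
signs = np.array([1.0, -1.0, 1.0])
U_flipped = U * signs              # scales the columns of U
Vt_flipped = Vt * signs[:, None]   # scales the rows of Vt

# Both factorizations reconstruct M equally well.
same = np.allclose(U @ np.diag(S) @ Vt, U_flipped @ np.diag(S) @ Vt_flipped)
print(same)  # True

# 2-D embeddings as they would be plotted: only the second coordinate changes sign.
emb = U[:, :2] * S[:2]
emb_flipped = U_flipped[:, :2] * S[:2]
```

If this is the cause, a figure mirrored along one axis carries the same information as the reference, since distances and angles between the word vectors are unchanged.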

Note

  1. For a word with multiple meanings, it is hard for the embedding to cover every sense, so such words tend to fit the analogy test poorly.
  2. Bias in the vectors, such as gender, race, or religion bias, also deserves attention. The training corpus should be chosen carefully.
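
For readers unfamiliar with the analogy test mentioned above, here is a minimal sketch of the usual vector-arithmetic form of it. The 2-D vectors are hypothetical values chosen by hand, not real GloVe vectors, and the helper names are my own.

```python
import numpy as np

# Hypothetical 2-D vectors chosen by hand; real GloVe vectors have far more dimensions.
vectors = {
    "man":   np.array([1.0, 0.0]),
    "woman": np.array([1.0, 1.0]),
    "king":  np.array([3.0, 0.0]),
    "queen": np.array([3.0, 1.0]),
    "apple": np.array([0.0, 3.0]),
}

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def analogy(a, b, c):
    """Return the word closest to b - a + c, excluding the three input words."""
    target = vectors[b] - vectors[a] + vectors[c]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))

print(analogy("man", "king", "woman"))  # queen
```

Each word has a single vector here, so a word with several senses has to share one point among all of them, which is why the first note applies.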

Code

Assign 1 Completion