@ -1,8 +1,8 @@
function x = emailFeatures(word_indices)
%EMAILFEATURES takes in a word_indices vector and produces a feature vector
%from the word indices
%   x = EMAILFEATURES(word_indices) takes in a word_indices vector and
%   produces a feature vector from the word indices.

% Total number of words in the dictionary
n = 1899;
@ -12,20 +12,20 @@ x = zeros(n, 1);

% ====================== YOUR CODE HERE ======================
% Instructions: Fill in this function to return a feature vector for the
%               given email (word_indices). To help make it easier to
%               process the emails, we have already pre-processed each
%               email and converted each word in the email into an index in
%               a fixed dictionary (of 1899 words). The variable
%               word_indices contains the list of indices of the words
%               which occur in one email.
%
%               Concretely, if an email has the text:
%
%                  The quick brown fox jumped over the lazy dog.
%
%               Then, the word_indices vector for this text might look
%               like:
%
%                   60  100   33   44   10     53  60  58   5
%
%               where we have mapped each word onto a number, for example:
@ -48,14 +48,11 @@ x = zeros(n, 1);
%
%

% Set x(i) = 1 for every dictionary index i that occurs in the email.
% Repeated indices simply set the same entry to 1 again.
for j = 1:numel(word_indices)
    x(word_indices(j)) = 1;
end
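
% Possible vectorized alternative to the loop above (a sketch, not part of
% the original exercise; it assumes word_indices contains only integers in
% the range 1..n). Indexed assignment sets every listed entry in one step,
% and an empty word_indices leaves x unchanged:
%
%   x(word_indices) = 1;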

% =========================================================================

end