The link between the delta rule in homework 2, question 1a, and homework 1, question 2c, is easiest to see if you think of the labels (y values) as vectors rather than integers. That is, instead of treating the labels as {1, 2, ..., k}, treat them as {[1 0 ... 0], [0 1 0 ... 0], ..., [0 ... 0 1]}. It may be instructive to first reformulate question 2c on homework 1 as a problem where the labels are the vectors {[1 0], [0 1]} rather than {0, 1}.
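As a rough illustration of the relabeling described above, here is a minimal Python sketch. The function names `one_hot` and `binary_to_vector` are made up for this example and do not come from the homework. The choice that label 0 maps to [1 0] and label 1 maps to [0 1] is an assumption, since either ordering would work as long as it is used consistently.

```python
def one_hot(label, k):
    """Map an integer label in {1, ..., k} to a length-k indicator vector.

    Hypothetical helper for illustration only; not part of the homework.
    """
    if not 1 <= label <= k:
        raise ValueError("label must be in 1..k")
    # Put a 1 in position label-1 (0-based) and 0 everywhere else.
    return [1 if i == label - 1 else 0 for i in range(k)]


def binary_to_vector(y):
    """Map a binary label in {0, 1} to a 2-vector.

    Assumption: 0 -> [1, 0] and 1 -> [0, 1]; the opposite ordering
    would be an equally valid convention.
    """
    return one_hot(y + 1, 2)


# Integer labels for a k = 3 class problem, rewritten as vectors.
labels = [1, 3, 2]
encoded = [one_hot(y, 3) for y in labels]
print(encoded)

# The two-class case, with labels {0, 1} rewritten as {[1 0], [0 1]}.
print([binary_to_vector(y) for y in [0, 1]])
```

With vector labels, each output unit has its own target component, so the error for a single example can be written per component rather than as a single scalar.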
Equation (5) should have a sum between the epsilon and the partial derivative with respect to w_{ij} (similar to what we had in equation (5) on the last homework).