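# Assignment 4

In this assignment, you will implement the SIGN model and test it on the Cora dataset.

This assignment requires [CogDL](https://github.com/THUDM/cogdl): `pip install cogdl`

To use the GPU version, first install the GPU build of [PyTorch](https://pytorch.org/get-started/locally/), then install CogDL.

This assignment was prepared by the Zhipu GNN Center and the course team, with technical support from the CogDL team.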


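### 1. Implement the SIGN model
You can call the `spmm(graph, x)` function provided by CogDL to compute the sparse matrix product AX.
For the model itself, refer to the original SIGN paper: https://arxiv.org/pdf/2004.11198.pdf .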
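
To make the propagation step concrete, here is a minimal sketch in plain PyTorch of how hop-wise features can be computed and concatenated. It is an illustration under stated assumptions, not the reference solution: it uses a small dense adjacency matrix instead of CogDL's `spmm`, the helper names `sym_normalize` and `propagate_features` are hypothetical, and the toy graph already contains self-loops.

```python
import torch


def sym_normalize(adj: torch.Tensor) -> torch.Tensor:
    """Return D^{-1/2} A D^{-1/2} for a dense adjacency matrix (hypothetical helper)."""
    deg = adj.sum(dim=1)
    inv_sqrt = deg.pow(-0.5)
    inv_sqrt[torch.isinf(inv_sqrt)] = 0.0  # isolated nodes get zero weight
    return inv_sqrt.unsqueeze(1) * adj * inv_sqrt.unsqueeze(0)


def propagate_features(adj_norm: torch.Tensor, x: torch.Tensor, nhop: int) -> torch.Tensor:
    """Concatenate [X, AX, A^2 X, ..., A^nhop X] along the feature dimension."""
    feats = [x]
    for _ in range(nhop):
        # Each hop multiplies the previous hop's features by the normalized adjacency.
        feats.append(adj_norm @ feats[-1])
    return torch.cat(feats, dim=1)


# Toy path graph 0 - 1 - 2 with self-loops (made-up data, not Cora).
adj = torch.tensor([[1.0, 1.0, 0.0],
                    [1.0, 1.0, 1.0],
                    [0.0, 1.0, 1.0]])
x = torch.eye(3)
adj_norm = sym_normalize(adj)
all_x = propagate_features(adj_norm, x, nhop=2)
print(all_x.shape)  # torch.Size([3, 9])
```

The output width is `num_features * (1 + nhop)`, which matches the input width the MLP in the `SIGN` class expects. In the assignment itself the dense product would be replaced by CogDL's `spmm(graph, x)`.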

In [1]:
import os
import torch
import torch.nn as nn
import torch.nn.functional as F

from cogdl.models.nn.mlp import MLP
from cogdl.utils import spmm


class SIGN(nn.Module):

    def __init__(
        self,
        num_features,
        hidden_size,
        num_classes,
        num_layers,
        dropout,
        nhop,
        norm="batchnorm",
        activation="relu",
    ):
        super(SIGN, self).__init__()
        self.nhop = nhop
        num_features = num_features * (1 + nhop) # nhop: number of hops
        self.mlp = MLP(
            in_feats=num_features,
            out_feats=num_classes,
            hidden_size=hidden_size,
            num_layers=num_layers,
            dropout=dropout,
            activation=activation,
            norm=norm,
        )

        self.all_x = None

    def preprocessing(self, graph, x):
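        graph.sym_norm()  # apply symmetric normalization to the graph

        ########################
        ##### Fill in here #####
        ########################
        # Run feature propagation several times and concatenate all the features;
        # the resulting feature dimension must match the MLP's input size

        self.all_x = ...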

    def forward(self, graph):
        ########################
        ##### Fill in here #####
        ########################
        # Apply the MLP to the precomputed features

        ret = ...
        return ret

### 2. Load the Cora dataset from CogDL (`x` holds the features, `y` the labels, and the masks define the train/validation/test split)

In [2]:
from cogdl.datasets import build_dataset_from_name

dataset = build_dataset_from_name("cora")
data = dataset[0]
print(data)

Graph(x=[2708, 1433], y=[2708], train_mask=[2708], val_mask=[2708], test_mask=[2708], edge_index=[2, 10556])
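
As a small illustration of how these boolean masks are used, the sketch below indexes features and labels with masks on a made-up 6-node graph. The tensors here are hypothetical stand-ins for `data.x`, `data.y`, and the mask attributes, not the real Cora data.

```python
import torch

# Hypothetical 6-node graph with 3 features per node (not the real Cora data).
x = torch.arange(18, dtype=torch.float32).reshape(6, 3)
y = torch.tensor([0, 1, 2, 0, 1, 2])
train_mask = torch.tensor([True, True, False, False, False, False])
val_mask = torch.tensor([False, False, True, True, False, False])
test_mask = torch.tensor([False, False, False, False, True, True])

# Indexing with a boolean mask keeps only the rows where the mask is True.
print(x[train_mask].shape)  # torch.Size([2, 3])
print(y[test_mask])         # tensor([1, 2])
```

The training loop in the next section uses the same pattern, for example `logits[data.train_mask]` and `data.y[data.train_mask]`.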


### 3. Train with your SIGN implementation

In [3]:
import math
import copy
from tqdm import tqdm

def accuracy(y_pred, y_true):
    y_true = y_true.squeeze().long()
    preds = y_pred.max(1)[1].type_as(y_true)
    correct = preds.eq(y_true).double()
    correct = correct.sum().item()
    return correct / len(y_true)

def train(hidden_size=128, nhop=3):
    num_classes = int(data.y.max()) + 1  # convert the 0-dim tensor to a plain int
    model = SIGN(data.x.shape[1], hidden_size, num_classes, num_layers=2, dropout=0.5, nhop=nhop)
    model.preprocessing(data, data.x)

    if torch.cuda.is_available():
        device = torch.device("cuda")
        model = model.to(device)
        data.to(device)

    optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
    epoch_iter = tqdm(range(100), position=0, leave=True)
    best_model = None
    best_loss = 1e8
    for epoch in epoch_iter:
        model.train()
        optimizer.zero_grad()
        logits = model(data)
        loss = F.cross_entropy(logits[data.train_mask], data.y[data.train_mask])
        loss.backward()
        optimizer.step()
        train_loss = loss.item()

        model.eval()
        with torch.no_grad():
            logits = model(data)
            val_loss = F.cross_entropy(logits[data.val_mask], data.y[data.val_mask]).item()
            val_acc = accuracy(logits[data.val_mask], data.y[data.val_mask])
            if val_loss < best_loss:
                best_loss = val_loss
                best_model = copy.deepcopy(model)

        epoch_iter.set_description(f"Epoch: {epoch:03d}, Train Loss: {train_loss:.4f}, Val Loss: {val_loss:.4f}, Val Acc: {val_acc:.4f}")

    with torch.no_grad():
        logits = best_model(data)
        val_acc = accuracy(logits[data.val_mask], data.y[data.val_mask])
        test_acc = accuracy(logits[data.test_mask], data.y[data.test_mask])
    print("Val Acc", val_acc)
    print("Test Acc", test_acc)

train()

Epoch: 099, Train Loss: 0.0002, Val Loss: 0.9386, Val Acc: 0.7360: 100%|██████████| 100/100 [00:15<00:00,  6.28it/s]


Val Acc 0.74
Test Acc 0.767


### 4. Run experiments with different hidden_size and nhop values and analyze the effect of these two parameters

Choose hidden_size from 64, 128, 256, 512 and nhop from 2 to 6.

In [4]:
train(hidden_size=64, nhop=2)

Epoch: 099, Train Loss: 0.0034, Val Loss: 1.0779, Val Acc: 0.6620: 100%|██████████| 100/100 [00:06<00:00, 14.61it/s]


Val Acc 0.67
Test Acc 0.679
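
To cover the whole grid rather than one setting at a time, a loop like the following could be used. This is only a sketch: `sweep` and `dummy_run` are hypothetical names, and `dummy_run` merely computes the MLP input width for Cora's 1433 features so that the block runs without training. To use it with `train`, that function would first have to return a value (for example the test accuracy) instead of only printing it.

```python
from itertools import product

hidden_sizes = [64, 128, 256, 512]
nhops = [2, 3, 4, 5, 6]


def sweep(run, hidden_sizes, nhops):
    """Call run(hidden_size=..., nhop=...) for every combination and collect the results."""
    results = {}
    for hidden_size, nhop in product(hidden_sizes, nhops):
        results[(hidden_size, nhop)] = run(hidden_size=hidden_size, nhop=nhop)
    return results


def dummy_run(hidden_size, nhop):
    # Stand-in for train(): returns the concatenated feature width, not an accuracy.
    return 1433 * (1 + nhop)


results = sweep(dummy_run, hidden_sizes, nhops)
print(len(results))      # 20
print(results[(64, 2)])  # 4299
```

Collecting results in a dictionary keyed by `(hidden_size, nhop)` makes it straightforward to compare settings afterwards, for instance by sorting on the stored value.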
