How Are the Input and Output Data Elements Linked in Machine Learning?

In deep learning, the input data and the output (target) data are fed into the model as separate datasets. Sometimes the datasets need to be shuffled before they are split into training and cross-validation examples. I am wondering how the corresponding rows in the two datasets stay linked during training.
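To make the question concrete, here is a minimal sketch of the situation I have in mind. It assumes the inputs and targets are NumPy arrays, and the names `X`, `y`, and `shuffle_and_split` are hypothetical, not taken from any particular library. In this sketch the only link between an input row and its target is the shared position (row index), so both arrays are reordered with the same permutation before splitting.

```python
import numpy as np

def shuffle_and_split(X, y, val_fraction=0.2, seed=0):
    """Shuffle inputs and targets together, then split them.

    Hypothetical helper: rows are paired purely by position, so the
    same permutation must be applied to both arrays.
    """
    if len(X) != len(y):
        raise ValueError("X and y must have the same number of rows")

    rng = np.random.default_rng(seed)
    perm = rng.permutation(len(X))      # one shared ordering for both arrays
    X_shuf, y_shuf = X[perm], y[perm]   # row i of X still matches row i of y

    n_val = int(len(X) * val_fraction)  # size of the cross-validation part
    train = (X_shuf[n_val:], y_shuf[n_val:])
    val = (X_shuf[:n_val], y_shuf[:n_val])
    return train, val

# Toy data where each target is 10 times its input, so pairing is easy to check.
X = np.arange(10).reshape(10, 1)
y = np.arange(10) * 10

(train_X, train_y), (val_X, val_y) = shuffle_and_split(X, y)
```

If the two arrays were shuffled independently (for example with two separate calls that each draw their own permutation), the positional pairing would be lost, which is the failure I am worried about.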

