Abstract
The scarcity of labeled data in real-world scenarios is a critical bottleneck for the effectiveness of deep learning. Semi-supervised semantic segmentation has been a typical solution for achieving a desirable tradeoff between annotation cost and segmentation performance. However, previous approaches, whether based on consistency regularization or self-training, tend to neglect the contextual knowledge embedded within inter-pixel relations. This oversight leads to suboptimal performance and limited generalization. In this paper, we propose a novel approach, IPixMatch, designed to mine the neglected but valuable Inter-Pixel information for semi-supervised learning. Specifically, IPixMatch is constructed as an extension of the standard teacher-student network, incorporating additional loss terms to capture inter-pixel relations. It is particularly effective in low-data regimes, where it efficiently leverages the limited labeled data and extracts maximum utility from the available unlabeled data. Furthermore, IPixMatch can be integrated seamlessly into most teacher-student frameworks without modifying the model or adding components. Our straightforward IPixMatch method demonstrates consistent performance improvements across various benchmark datasets under different partitioning protocols.