A technology is proposed for the automatic multi-class labeling of brain electron microscopy (EM) objects, which is needed to create large synthetic datasets for brain cell segmentation tasks. The main research tools were a generative diffusion AI model and a U-Net-like segmentation model. The technology was studied on the task of segmenting up to six brain organelles. The initial dataset was the popular EPFL dataset, labeled for the mitochondria class, whose training and test parts contain 165 layers each. Our markup of the EPFL dataset, named EPFL6, contains six classes.
The technology was implemented and studied in a two-step experiment: (1) dataset synthesis using a diffusion model trained on EPFL6; (2) evaluation of the labeling accuracy of a multi-class synthetic dataset by the segmentation accuracy on the test part of EPFL6. It was found that (1) the segmentation accuracy of the mitochondria class for the diffusion synthetic datasets corresponded to the accuracy of the original ones; (2) augmentation via geometric synthetics provided better accuracy for underrepresented classes; (3) the naturalization of geometric synthetics by the diffusion model yielded a positive effect; (4) by augmenting the 165 layers of the original EPFL dataset with diffusion synthetics, it was possible to match and surpass the record accuracy of Dice = 0.948, which was achieved using 3D estimation in Hive-net (2021).
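
For reference, the Dice score quoted above can be illustrated with a minimal sketch. It assumes the standard per-class definition Dice = 2|P ∩ T| / (|P| + |T|) computed over flattened label maps. The function name `dice_score`, the use of label 1 for the class of interest, and the convention for a class absent from both maps are illustrative assumptions, not details taken from this work or from Hive-net.

```python
from typing import Sequence


def dice_score(pred: Sequence[int], target: Sequence[int], cls: int) -> float:
    """Per-class Dice = 2*|P & T| / (|P| + |T|) over flattened label maps."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same number of pixels")
    # Pixel counts of the class in the prediction and in the ground truth.
    p = sum(1 for v in pred if v == cls)
    t = sum(1 for v in target if v == cls)
    # Pixels where both maps agree on this class.
    inter = sum(1 for a, b in zip(pred, target) if a == cls and b == cls)
    if p + t == 0:
        # Assumed convention: class absent in both maps counts as full agreement.
        return 1.0
    return 2.0 * inter / (p + t)


# Toy example: label 1 stands in for the class of interest (hypothetical data).
pred = [0, 1, 1, 0, 1, 0]
target = [0, 1, 1, 1, 0, 0]
score = dice_score(pred, target, cls=1)
print(round(score, 3))
```

In the toy example the two maps each contain three pixels of class 1 and share two of them, so the score is 2·2 / (3 + 3) ≈ 0.667. For a multi-class markup such as EPFL6, the same function would be called once per class label.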