Representing complex 3D objects as simple geometric primitives, known as shape abstraction, is important for geometric modeling, structural analysis, and shape synthesis. In this paper, we propose an unsupervised shape abstraction method to map a point cloud into a compact cuboid representation. We jointly predict cuboid allocation as part segmentation and cuboid shapes and enforce the consistency between the segmentation and shape abstraction for self-learning. For the cuboid abstraction task, we transform the input point cloud into a set of parametric cuboids using a variational auto-encoder network. The segmentation network allocates each point into a cuboid considering the point-cuboid affinity. Without manual annotations of parts in point clouds, we design four novel losses to jointly supervise the two branches in terms of geometric similarity and cuboid compactness. We evaluate our method on multiple shape collections and demonstrate its superiority over existing shape abstraction methods. Moreover, based on our network architecture and learned representations, our approach supports various applications including structured shape generation, shape interpolation, and structural shape clustering.