In the Categorical Clustering problem, we are given a set of vectors (matrix) A={a_1,ldots,a_n} over Sigma^m, where Sigma is a finite alphabet, and integers k and B. The task is to partition A into k clusters such that the median objective of the clustering in the Hamming norm is at most B. That is, we seek a partition {I_1,ldots,I_k} of {1,ldots,n} and vectors c_1,ldots,c_kinSigma^m such that sum_{i=1}^ksum_{jin I_i}d_h(c_i,a_j)leq B, where d_H(a,b) is the Hamming distance between vectors a and b. Fomin, Golovach, and Panolan [ICALP 2018] proved that the problem is fixed-parameter tractable (for binary case Sigma={0,1}) by giving an algorithm that solves the problem in time 2^{O(Blog B)} (mn)^{O(1)}. We extend this algorithmic result to a popular capacitated clustering model, where in addition the sizes of the clusters should satisfy certain constraints. More precisely, in Capacitated Clustering, in addition, we are given two non-negative integers p and q, and seek a clustering with pleq |I_i|leq q for all iin{1,ldots,k}. Our main theorem is that Capacitated Clustering is solvable in time 2^{O(Blog B)}|Sigma|^B(mn)^{O(1)}. The theorem not only extends the previous algorithmic results to a significantly more general model, it also implies algorithms for several other variants of Categorical Clustering with constraints on cluster sizes.