Wireless connectivity creates a computing paradigm that merges communication and inference. A basic operation in this paradigm is the offloading of classification tasks from devices to edge servers. We term this operation remote classification, which has the potential to enable intelligent applications. Remote classification is challenged by the finite and variable data rate of the wireless channel, which limits the capability to transfer high-dimensional features and thus the achievable classification resolution. We introduce a set of metrics, collectively termed classification capacity, defined as the maximum number of classes that can be discerned over a given communication channel while meeting a target classification error probability. The objective is to choose a subset of classes from a library so as to achieve satisfactory performance over a given channel. We treat two cases of subset selection. In the first case, the device selects the subset by pruning the class library until it arrives at a subset that meets the target error probability while maximizing the classification capacity. Adopting a subspace data model, we prove that maximizing the classification capacity is equivalent to Grassmannian packing. The results show that the classification capacity grows exponentially with the instantaneous communication rate and super-exponentially with the dimensionality of each data cluster. The same scaling holds for the ergodic and outage classification capacities over fading channels when the instantaneous rate is replaced with the average rate and a fixed rate, respectively. In the second case, the device has a preferred class subset for every communication rate, modeled as a uniformly sampled subset of the library. Without class selection, the classification capacity and its ergodic and outage counterparts are proved to scale only linearly with their corresponding communication rates, in contrast to the exponential growth in the first case.
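To make the first selection strategy concrete, the following minimal sketch illustrates pruning a class library under a subspace data model: classes are kept only if their subspaces stay well separated in chordal distance, a simple surrogate for the Grassmannian-packing criterion invoked above. The ambient dimension, subspace dimension, library size, and separation threshold are illustrative placeholders, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_subspace(ambient_dim, sub_dim, rng):
    """Orthonormal basis of a random sub_dim-dimensional subspace of R^ambient_dim."""
    a = rng.standard_normal((ambient_dim, sub_dim))
    q, _ = np.linalg.qr(a)  # reduced QR gives an orthonormal basis
    return q

def chordal_distance(u, v):
    """Chordal distance between two subspaces given orthonormal bases u and v."""
    cosines = np.clip(np.linalg.svd(u.T @ v, compute_uv=False), -1.0, 1.0)
    return np.sqrt(np.sum(1.0 - cosines**2))

# Illustrative parameters (placeholders, not taken from the paper).
ambient_dim, sub_dim, library_size = 16, 2, 64
min_dist = 1.0  # stand-in for the separation implied by the target error probability

library = [random_subspace(ambient_dim, sub_dim, rng) for _ in range(library_size)]

# Greedy pruning: keep a class only if it is far from every class already kept,
# i.e., retain a well-packed subset of points on the Grassmannian.
selected = []
for basis in library:
    if all(chordal_distance(basis, kept) >= min_dist for kept in selected):
        selected.append(basis)

print(f"classes kept: {len(selected)} out of {library_size}")
```

Tightening the separation threshold (a proxy for a stricter error-probability target) shrinks the retained subset, while raising the communication rate would allow higher-dimensional features and hence a denser packing, mirroring the scaling behavior summarized above.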