Existing blockchain systems scale poorly because of their distributed consensus protocols. Current attempts at improving blockchain scalability are limited to cryptocurrency. Scaling blockchain systems under general workloads (i.e., non-cryptocurrency applications) remains an open question. In this work, we take a principled approach to apply sharding, which is a well-studied and proven technique to scale out databases, to blockchain systems in order to improve their transaction throughput at scale. This is challenging, however, due to the fundamental difference in failure models between databases and blockchain. To achieve our goal, we first enhance the performance of Byzantine consensus protocols, by doing so we improve individual shards throughput. Next, we design an efficient shard formation protocol that leverages a trusted random beacon to securely assign nodes into shards. We rely on trusted hardware, namely Intel SGX, to achieve high performance for both consensus and shard formation protocol. Third, we design a general distributed transaction protocol that ensures safety and liveness even when transaction coordinators are malicious. Finally, we conduct an extensive evaluation of our design both on a local cluster and on Google Cloud Platform. The results show that our consensus and shard formation protocols outperform state-of-the-art solutions at scale. More importantly, our sharded blockchain reaches a high throughput that can handle Visa-level workloads, and is the largest ever reported in a realistic environment.