The Large Intelligent Surface (LIS) is a promising technology for wireless communication, remote sensing, and positioning. It consists of a continuous radiating surface located in the proximity of the users, capable of communicating by both transmission and reception and thereby replacing base stations. Despite its potential, LIS poses numerous implementation challenges, the most relevant being interconnection data rate, computational complexity, and storage. To address these challenges while ensuring scalability, hierarchical architectures with distributed processing techniques are envisioned. In this work, we perform algorithm-architecture codesign to propose two distributed interference cancellation algorithms and a tree-based interconnection topology for uplink processing. We also analyze the performance, hardware requirements, and architecture tradeoffs of a discrete LIS in order to provide concrete case studies and guidelines for the efficient implementation of LIS systems.