Quantum computers promise tremendous impact across applications -- and have shown great strides in hardware engineering -- but remain notoriously error prone. Careful design of low-level controls has been shown to compensate for the processes which induce hardware errors, leveraging techniques from optimal and robust control. However, these techniques rely heavily on the availability of highly accurate and detailed physical models which generally only achieve sufficient representative fidelity for the most simple operations and generic noise modes. In this work, we use deep reinforcement learning to design a universal set of error-robust quantum logic gates on a superconducting quantum computer, without requiring knowledge of a specific Hamiltonian model of the system, its controls, or its underlying error processes. We experimentally demonstrate that a fully autonomous deep reinforcement learning agent can design single qubit gates up to $3times$ faster than default DRAG operations without additional leakage error, and exhibiting robustness against calibration drifts over weeks. We then show that $ZX(-pi/2)$ operations implemented using the cross-resonance interaction can outperform hardware default gates by over $2times$ and equivalently exhibit superior calibration-free performance up to 25 days post optimization using various metrics. We benchmark the performance of deep reinforcement learning derived gates against other black box optimization techniques, showing that deep reinforcement learning can achieve comparable or marginally superior performance, even with limited hardware access.