High operating speeds and the use of aggressive fabrication technologies necessitate validation of mixed-signal electronic systems at every stage of top-down design: behavioral to netlist to physical design to silicon. At each step, design validation establishes the equivalence of a lower-level design description to its higher-level specification. In this paper, we present a novel reinforcement-learning-guided stimulus generation algorithm that excites behavioral differences in the statistics of observed responses between high- and low-level descriptions of an analog/mixed-signal device, rather than in the magnitude of the difference as in prior research. The discovered differences are learned using series-parallel interconnected machine learning kernels appended to the device model, and the process is repeated until no further differences can be excited via stimulus generation. Compared with prior research, the proposed stimulus generation approach significantly facilitates this learning of behavioral differences. We formulate design validation as a Markov decision process and discuss a reward metric for reinforcement learning based on the statistics of observed device responses. We also discuss integration of the proposed design validation methodology with deep Q-learning software and the Cadence suite of simulation tools. Validation results for selected design bugs in representative designs demonstrate the quality and efficiency of the proposed methodology.