Compared to other subjects, the typical introductory programming (CS1) course has higher-than-usual rates of both failing and high grades, creating a characteristic bimodal grade distribution. In this paper I explore two possible explanations. The conventional explanation has been that learners naturally fall into two populations, programmers and non-programmers. A review of decades of research, however, finds little or no evidence to support this account. I propose an alternative explanation, the learning edge momentum (LEM) effect. This hypothesis is introduced by way of a simulated model of grade distributions, then grounded in the psychological and educational literature. Under LEM, success in acquiring one concept makes closely linked concepts easier to learn, whereas failure makes them harder. This interaction between the way people learn and the tightly integrated nature of the concepts that make up a programming language creates an inherent structural bias in CS1, which drives students towards extreme outcomes.
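The momentum mechanism described above can be illustrated with a minimal sketch. This is not the simulation used in the paper; it is a toy model under stated assumptions. Each hypothetical student attempts a chain of concepts, and the chance of learning the next concept rises after a success and falls after a failure. The function names (`simulate_scores`, `extreme_fraction`) and all parameter values (`base_p`, `momentum`, `cutoff`, the numbers of students and concepts) are illustrative assumptions, not taken from the source.

```python
import random


def simulate_scores(n_students=2000, n_concepts=20, base_p=0.5,
                    momentum=0.0, seed=0):
    """Return the number of concepts learned by each simulated student.

    Assumption (illustrative only): concepts form a single chain, and the
    probability of learning the next concept shifts by `momentum` after
    each success (up) or failure (down), clamped to [0, 1].
    """
    rng = random.Random(seed)  # seeded for repeatable runs
    scores = []
    for _ in range(n_students):
        p = base_p
        learned = 0
        for _ in range(n_concepts):
            if rng.random() < p:
                learned += 1
                p = min(1.0, p + momentum)  # success makes the next concept easier
            else:
                p = max(0.0, p - momentum)  # failure makes the next concept harder
        scores.append(learned)
    return scores


def extreme_fraction(scores, n_concepts=20, cutoff=0.2):
    """Fraction of students in the bottom or top `cutoff` share of the score range."""
    low = cutoff * n_concepts
    high = (1 - cutoff) * n_concepts
    return sum(1 for s in scores if s <= low or s >= high) / len(scores)


if __name__ == "__main__":
    independent = simulate_scores(momentum=0.0)
    with_momentum = simulate_scores(momentum=0.1)
    print("extreme outcomes, independent concepts:", extreme_fraction(independent))
    print("extreme outcomes, with momentum:       ", extreme_fraction(with_momentum))
```

With `momentum=0.0` every concept is learned independently and scores cluster around the middle of the range. With a positive `momentum`, early successes or failures compound, so in this toy model a larger share of students ends up near the bottom or the top of the range, which is the qualitative pattern the LEM hypothesis predicts.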