1. the complete title of one (or more) paper(s) published in the open literature describing the work that the author claims describes a human-competitive result;
A flexible symbolic regression method for constructing interpretable clinical prediction models

2. the name, complete physical mailing address, e-mail address, and phone number of EACH author of EACH paper(s);
William G. La Cava (1) [william.lacava@childrens.harvard.edu], Paul C. Lee (2), Imran Ajmal (2), Xiruo Ding (2), Priyanka Solanki (2), Jordana B. Cohen (3,4), Jason H. Moore (4) and Daniel S. Herman (2) [daniel.herman2@pennmedicine.upenn.edu]
(1) Computational Health Informatics Program, Boston Children’s Hospital, Harvard Medical School, 300 Longwood Ave, BCH Mailstop LM 5528.2, Boston, MA, 02115, USA. william.lacava@childrens.harvard.edu
(2) Department of Pathology and Laboratory Medicine, Richards Building, University of Pennsylvania, Philadelphia, PA, USA.
(3) Division of Renal-Electrolyte and Hypertension, Department of Medicine, University of Pennsylvania, Philadelphia, PA, USA.
(4) Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania, Philadelphia, PA, USA.

3. the name of the corresponding author (i.e., the author to whom notices will be sent concerning the competition);
William G. La Cava

4. the abstract of the paper(s);
Machine learning (ML) models trained for triggering clinical decision support (CDS) are typically either accurate or interpretable but not both. Scaling CDS to the panoply of clinical use cases while mitigating risks to patients will require many ML models be intuitively interpretable for clinicians. To this end, we adapted a symbolic regression method, coined the feature engineering automation tool (FEAT), to train concise and accurate models from high-dimensional electronic health record (EHR) data.
We first present an in-depth application of FEAT to classify hypertension, hypertension with unexplained hypokalemia, and apparent treatment-resistant hypertension (aTRH) using EHR data for 1200 subjects receiving longitudinal care in a large healthcare system. FEAT models trained to predict phenotypes adjudicated by chart review had equivalent or higher discriminative performance (p < 0.001) and were at least three times smaller (p < 1 × 10^−6) than other potentially interpretable models. For aTRH, FEAT generated a six-feature, highly discriminative (positive predictive value = 0.70, sensitivity = 0.62), and clinically intuitive model. To assess the generalizability of the approach, we tested FEAT on 25 benchmark clinical phenotyping tasks using the MIMIC-III critical care database. Under comparable dimensionality constraints, FEAT’s models exhibited higher area under the receiver-operating curve scores than penalized linear models across tasks (p < 6 × 10^−6). In summary, FEAT can train EHR prediction models that are both intuitively interpretable and accurate, which should facilitate safe and effective scaling of ML-triggered CDS to the panoply of potential clinical use cases and healthcare practices.

5. a list containing one or more of the eight letters (A, B, C, D, E, F, G, or H) that correspond to the criteria (see above) that the author claims that the work satisfies;
D, E

6. a statement stating why the result satisfies the criteria that the contestant claims (see examples of statements of human-competitiveness as a guide to aid in constructing this part of the submission);
We demonstrate in this work that a symbolic regression method can automatically discover a remarkably concise description of a disease diagnosis (apparent treatment-resistant hypertension) that is more accurate than any similarly sized model, including those created through manual programming by human experts (E).
By interpreting the model, we find a new covariate to be considered when screening patient populations for this outcome: a high maximum calcium lab measurement. By evaluating this feature, we find that the model likely uses it to identify patients on diuretic medications for hypertension, who otherwise would likely not have been identified as having treatment-resistant hypertension (due to, e.g., missing data in their health record) (E).

7. a full citation of the paper (that is, author names; title, publication date; name of journal, conference, or book in which article appeared; name of editors, if applicable, of the journal or edited book; publisher name; publisher city; page numbers, if applicable);
La Cava, W. G., Lee, P. C., Ajmal, I., Ding, X., Solanki, P., Cohen, J. B., Moore, J. H., & Herman, D. S. (2023). A flexible symbolic regression method for constructing interpretable clinical prediction models. npj Digital Medicine. https://doi.org/10.1038/s41746-023-00833-8. In Press. Preprint on medRxiv: 10.1101/2020.12.12.20248005

8. a statement either that "any prize money, if any, is to be divided equally among the co-authors" OR a specific percentage breakdown as to how the prize money, if any, is to be divided among the co-authors;
Any prize money, if any, is to be divided equally among William G. La Cava, Paul C. Lee, and Daniel S. Herman.

9. a statement stating why the authors expect that their entry would be the "best," and

10. An indication of the general type of genetic or evolutionary computation used, such as GA (genetic algorithms), GP (genetic programming), ES (evolution strategies), EP (evolutionary programming), LCS (learning classifier systems), GI (genetic improvement), GE (grammatical evolution), GEP (gene expression programming), DE (differential evolution), etc.
GP

11. The date of publication of each paper.
If the date of publication is not on or before the deadline for submission, but instead, the paper has been unconditionally accepted for publication and is “in press” by the deadline for this competition, the entry must include a copy of the documentation establishing that the paper meets the "in press" requirement.
Accepted: May 5, 2023. Currently in press. Proof of acceptance and proof version attached to this submission.