13
views
0
recommends
+1 Recommend
0 collections
    0
    shares
      • Record: found
      • Abstract: found
      • Article: found
      Is Open Access

      Constrained Decoding for Code Language Models via Efficient Left and Right Quotienting of Context-Sensitive Grammars

      Preprint
      , , ,

      Read this article at

      Bookmark
          There is no author summary for this article yet. Authors can add summaries to their articles on ScienceOpen to make them more accessible to a non-specialist audience.

          Abstract

          Large Language Models are powerful tools for program synthesis and advanced auto-completion, but come with no guarantee that their output code is syntactically correct. This paper contributes an incremental parser that allows early rejection of syntactically incorrect code, as well as efficient detection of complete programs for fill-in-the-middle (FItM) tasks. We develop Earley-style parsers that operate over left and right quotients of arbitrary context-free grammars, and we extend our incremental parsing and quotient operations to several context-sensitive features present in the grammars of many common programming languages. The result of these contributions is an efficient, general, and well-grounded method for left and right quotient parsing. To validate our theoretical contributions -- and the practical effectiveness of certain design decisions -- we evaluate our method on the particularly difficult case of FItM completion for Python 3. Our results demonstrate that constrained generation can significantly reduce the incidence of syntax errors in recommended code.

          Related collections

          Author and article information

          Journal
          27 February 2024
          Article
          2402.17988
          ad223d86-28a0-49e7-bf54-522d4f8fcf15

          http://arxiv.org/licenses/nonexclusive-distrib/1.0/

          History
          Custom metadata
          20 pages, Code available at https://github.com/amazon-science/incremental-parsing
          cs.PL cs.LG cs.SE

          Software engineering,Programming languages,Artificial intelligence
          Software engineering, Programming languages, Artificial intelligence

          Comments

          Comment on this article