Discovery, characterization and engineering of ligases for amide synthesis | Nature

Nature volume 593, pages 391–398 (2021)

Coronatine and related bacterial phytotoxins are mimics of the hormone jasmonyl-l-isoleucine (JA-Ile), which mediates physiologically important plant signalling pathways1,2,3,4. Coronatine-like phytotoxins disrupt these essential pathways and have potential in the development of safer, more selective herbicides. Although the biosynthesis of coronatine has been investigated previously, the nature of the enzyme that catalyses the crucial coupling of coronafacic acid to amino acids remains unknown1,2. Here we characterize a family of enzymes, coronafacic acid ligases (CfaLs), and resolve their structures. We found that CfaL can also produce JA-Ile, despite low similarity with the Jar1 enzyme that is responsible for ligation of JA and l-Ile in plants5. This suggests that Jar1 and CfaL evolved independently to catalyse similar reactions, with Jar1 producing a compound essential for plant development4,5 and the bacterial ligases producing analogues toxic to plants. We further demonstrate how CfaL enzymes can be used to synthesize a diverse array of amides, obviating the need for protecting groups. Highly selective kinetic resolutions of racemic donor or acceptor substrates were achieved, affording homochiral products. We also used structure-guided mutagenesis to engineer improved CfaL variants. Together, these results show that CfaLs can deliver a wide range of amides for agrochemical, pharmaceutical and other applications.

Coronatine 3 (COR) is an important phytotoxin produced by bacterial plant pathogens; it is composed of the polyketide coronafacic acid 1 (CFA), conjugated via an amide bond to coronamic acid 2 (CMA), an unusual cyclopropyl amino acid (Fig. 1a)1,2. COR is a structural mimic of JA-Ile (6), a ubiquitous plant hormone that is essential for plant development and defence3. The biologically active stereoisomer is (3R,7S)-JA-Ile, but the C7 stereocentre rapidly epimerizes at physiological pH to the more stable but inactive trans (3R,7R) diastereoisomer, which modulates its activity4. In contrast, COR is configurationally stable, endowing it with increased potency and longevity. The conjugation of JA and l-Ile in plants is catalysed by the adenosine triphosphate (ATP)-dependent ligase Jar1 (Fig. 1b), a member of the ANL superfamily5. ANL enzymes generate acyl-adenylate (acyl-AMP) intermediates that undergo substitution with various nucleophiles. For example, acyl-CoA synthetases (ACSs) are common ANL enzymes generating thioesters, which can be coupled to an amine by a secondary N-acyltransferase. Jar1 is an amide bond synthetase (ABS), a member of a rarer subclass of the ANL superfamily that accepts amine nucleophiles directly without requiring an additional partner enzyme (Supplementary Fig. 1)5,6.

a, Bacterial CfaL enzymes are predicted to ligate coronafacic acid (CFA) 1 with coronamic acid (CMA) 2 or l-amino acids to generate phytotoxins, including COR 3 and CFA-Ile 4. b, In plants, the enzyme Jar1 ligates jasmonic acid (JA) and l-isoleucine to produce JA-Ile epimers (3R,7S)-6 and (3R,7R)-6. c, Polyketide synthase (PKS) assembly of 1 from succinic semialdehyde. PKS consists of acyl-carrier protein (ACP), acyl-transferase (AT), keto-synthase (KS), dehydratase (DH), enoyl-reductase (ER), keto-reductase (KR) and thioesterase (TE) domains. d, Nonribosomal peptide synthetase (NRPS)-mediated biosynthesis of 2 in P. syringae. NRPS consists of adenylation (A), thiolation (T) and thioesterase (TE) domains.

CFA is assembled by a type I polyketide synthase via the cyclization of a β-ketothioester intermediate (Fig. 1c)1,7,8,9. The biosynthesis of CMA occurs via a nonribosomal peptide synthetase (NRPS)-mediated cryptic chlorination and cyclization (Fig. 1d)10,11,12. A putative ligase (CfaL) within the Pseudomonas syringae COR biosynthetic gene cluster is predicted to couple CFA and CMA to form COR (Fig. 1a, Supplementary Fig. 2a and Supplementary Table 1). Previous attempts to characterize this ligase have been unsuccessful13. Other plant pathogens possess a COR-like biosynthetic gene cluster14,15, including Streptomyces scabies, which has a putative CfaL and CFA biosynthetic genes, but no CMA pathway (Supplementary Fig. 2b and Supplementary Table 1)15. Consequently, this strain produces predominantly CFA-l-Ile 4, along with smaller quantities of l-Val and l-allo-Ile adducts, which have also been detected from P. syringae16,17,18. As CFA must be coupled to an amino acid to elicit biological activity, we sought to characterize the key CfaL enzymes to enable new routes to COR-like phytotoxins as potential herbicides19. We were also interested in exploring the relationship between the bacterial CfaL and the functionally related plant ligase Jar1.

As well as being fundamental in nature, amide formation is one of the most widely used synthetic transformations. Although coupling acids and amines is relatively simple, it often requires three steps, protect–couple–deprotect, to install each amide. Stoichiometric quantities of expensive and deleterious coupling reagents are typically required and purification can be problematic20. While some progress has been made in the development of chemocatalytic methods for amide synthesis, these have not been widely adopted21,22,23,24,25,26,27. Consequently, there is interest in the development of enzymatic alternatives6,20,28,29,30,31. In this work, we characterize CfaL ligases and demonstrate how they are highly versatile biocatalysts for the synthesis of ubiquitous amides. Additionally, using structure-guided mutagenesis, we generate improved ligases providing more sustainable, alternative routes for production of pharmaceuticals, agrochemicals and other valuable materials.

Overproduction of the CfaL from P. syringae (PsCfaL) in Escherichia coli resulted in only trace amounts of active enzyme. However, assays demonstrated that PsCfaL catalyses the ATP-dependent coupling of l-isoleucine and CFA, obtained from acid hydrolysis of coronatine, confirming the function of CfaL for the first time (Supplementary Figs. 3 and 4). The low quantity of PsCfaL available prevented full characterization, and so alternative CfaL homologues were explored. In addition to the putative S. scabies ligase (SsCfaL), other candidates located within putative COR-like clusters were selected from BLAST analysis (Supplementary Fig. 2). Of these, PbCfaL from Pectobacterium brasiliense and AlCfaL from Azospirillum lipoferum were chosen for characterization, as they are predicted to be more amenable to crystallization (Supplementary Table 2). While P. brasiliense is a well-known plant pathogen32,33, A. lipoferum is a root-dwelling, nitrogen-fixing plant symbiont that is not known to produce coronatine34.

SsCfaL was overproduced in E. coli (Supplementary Fig. 5) and assays with synthetic (±)-CFA19, l-isoleucine and ATP showed the direct formation of CFA-l-Ile 4 via a CFA-AMP intermediate (Supplementary Fig. 6), confirming CfaL is an ABS enzyme. In addition, SsCfaL also accepted the aromatic CFA variant 7, which when coupled to l-Ile forms coronalone, a simplified synthetic COR analogue with promising herbicidal activity35. Given adenylation occurs in the absence of amine substrate, the rate of adenylation can be measured in isolation (Extended Data Table 1 and Supplementary Fig. 7). The rate of adenylation (kcat) was greatest with (±)-CFA but the aromatic analogue 7 was found to have a lower Michaelis constant, Km. Both (3R,7R)- and (3S,7S)-enantiomers of trans-JA 5 were also accepted, albeit at a lower level than CFA or 7, with a preference towards the natural (3R,7R)-5 stereoisomer. This indicates that, despite low amino acid sequence similarity (16%), the bacterial SsCfaL and plant Jar1 both catalyse the ligation of JA with l-Ile (Supplementary Fig. 8). Reactions with deactivated enzyme confirmed that the adenylation of 5 and the subsequent reaction with l-Ile are both enzyme-catalysed. Samples of trans-JA (Extended Data Table 1) contain a minor amount of the less stable cis epimer, owing to facile C7-epimerization (Fig. 1b)4. To explore the stereoselectivity of SsCfaL further, all four stereoisomers of configurationally stable 7-methyl-jasmonic acid were synthesized (Supplementary Fig. 9). Initially the trans and cis diastereoisomers of 7-methyl-jasmonic acid were separated and tested as a racemic mixture. However, these were poor substrates for SsCfaL, hence the resolution of all four stereoisomers was not carried out. The selectivity of SsCfaL was further tested by incubating CFA 1 or (±)-jasmonic acid 5 with 21 proteinogenic amino acids, resulting in a wide range of amino acid conjugates (Supplementary Figs. 10 and 11). Hydrophobic amino acids such as l-isoleucine and l-valine were preferred by SsCfaL, which reflects the COR-like metabolites isolated from S. scabies16. No activity was seen with d-amino acids, or with primary amines and dipeptides (Supplementary Fig. 12). AlCfaL and PbCfaL, expressed from codon-optimized synthetic genes, were seen to be functionally similar to SsCfaL, although with lower activities (Supplementary Fig. 13).
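The trade-off noted above, where one substrate has the higher kcat and another the lower Km, can be illustrated with a minimal sketch. It assumes that adenylation follows simple Michaelis–Menten kinetics; the function names and all numerical parameters are hypothetical and are not the values reported in Extended Data Table 1.

```python
def michaelis_menten_rate(kcat, km, enzyme_conc, substrate_conc):
    """Initial rate v = kcat * [E] * [S] / (Km + [S])."""
    if km <= 0 or enzyme_conc < 0 or substrate_conc < 0:
        raise ValueError("Km must be positive; concentrations must be non-negative")
    return kcat * enzyme_conc * substrate_conc / (km + substrate_conc)


def catalytic_efficiency(kcat, km):
    """Specificity constant kcat/Km, which governs the rate when [S] << Km."""
    return kcat / km


# Hypothetical parameters for illustration only (kcat in min^-1, Km in mM).
# They are NOT the measured values for CFA or analogue 7.
HIGH_KCAT_SUBSTRATE = {"kcat": 2.0, "km": 0.5}
LOW_KM_SUBSTRATE = {"kcat": 1.0, "km": 0.05}

ENZYME_MM = 0.005  # assumed enzyme concentration (5 uM expressed in mM)

for substrate_mm in (0.01, 0.1, 1.0, 10.0):
    v_high_kcat = michaelis_menten_rate(
        HIGH_KCAT_SUBSTRATE["kcat"], HIGH_KCAT_SUBSTRATE["km"], ENZYME_MM, substrate_mm
    )
    v_low_km = michaelis_menten_rate(
        LOW_KM_SUBSTRATE["kcat"], LOW_KM_SUBSTRATE["km"], ENZYME_MM, substrate_mm
    )
    print(f"[S] = {substrate_mm:5.2f} mM  high-kcat: {v_high_kcat:.5f}  low-Km: {v_low_km:.5f}")
```

With these assumed numbers the low-Km substrate is turned over faster at low substrate concentration, while the high-kcat substrate wins near saturation, which is why both parameters are reported rather than a single activity value.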

Crystallography trials revealed that only PbCfaL yielded crystals of sufficient quality for structural studies. A PbCfaL structure in the adenylation conformation was solved to 2 Å resolution (Fig. 2), which is consistent with the ANL superfamily. Despite sharing low sequence identity (<20%), PbCfaL showed substantial structural similarity (>70%) to several other ANL ligases from a variety of organisms, including bacterial benzoate CoA ligases and firefly luciferases (Fig. 2b, Supplementary Table 3). By contrast, PbCfaL shares very little structural similarity with the catalytically equivalent Jar1 (30%), suggesting that the two enzymes evolved independently (Supplementary Fig. 14). Plants are known to possess other ACSs that do share high structural homology with PbCfaL (Supplementary Table 3). However, Jar1 and related plant acyl-AMP forming enzymes that conjugate salicylate or indole-3-acetic acid (IAA) with amino acids appear to have evolved separately and specifically for plant hormone signalling36.

a, Main image, X-ray crystal structure of PbCfaL (2 Å) in the ‘open’ or ‘adenylation’ conformation (PDB ID, 7A9I). PbCfaL has a large N-terminal region (residues 1–403), shown in blue, and a flexible C-terminal region (residues 403–516), shown in red. The boxed region and magnified inset highlight the active site region, which is shown with 7 co-crystallized; labelled residues are conserved between all four CfaLs in this study. 7 lies 3.6 Å from a conserved tryptophan residue (W220), which probably helps to align the substrate via π–π stacking interactions. b, PbCfaL superimposed onto McbA (gold; PDB ID, 6SQ8)29, the closest structural homologue to PbCfaL. In the ‘adenylation’ state, the two structures show high levels of similarity. In this state the carboxylic acid binding site (shown) is located between the N-terminal and the flexible C-terminal regions and is solvent accessible. c, PbCfaL superimposed on McbA in the ‘closed’ state (also referred to as the ‘thiolation’ state in related ACS enzymes). Like all ANLs, the C-terminal region of McbA undergoes a large rotation (direction indicated by dashed red arrow) to lie on top of the carboxylic acid binding site, trapping the adenylated intermediate before amine attack. The more rigid N-terminal region does not substantially change conformation. We would expect the C-terminal region of PbCfaL to undergo a similar rotation during catalysis. Structural alignment was performed with Chimera (version 1.14) MatchMaker.

The structure of PbCfaL is composed of a large N-terminal domain (residues 1–403) and a smaller, flexible C-terminal domain (residues 403–516, Fig. 2). As with other members of the ANL superfamily, it is likely that the C-terminal domain undergoes a large rotation following acyl-adenylate formation to close off the acyl binding pocket and form the amino acid binding site (closed conformation) (Fig. 2). Co-crystallography of PbCfaL with 7 revealed that the extremely solvent-accessible acyl binding pocket lies between the two domains (Fig. 2, Supplementary Fig. 15). Sequence alignment between the CfaLs in this study showed a small number of conserved residues (Supplementary Fig. 16), with only W220 likely to make direct contact with 7, probably aligning the carboxylic acid via π–π stacking, explaining the higher binding affinity of 7 versus CFA (Fig. 2). Other conserved residues around 7 probably define the width and depth of the binding pocket. When aligned with the most structurally similar proteins from the Protein Data Bank (PDB; Supplementary Fig. 17) there are few conserved sequences, the most similarity occurring in the ATP binding SSGTTG motif (residues 168–173)37. Despite many attempts, determination of a PbCfaL structure in the closed conformation with AMP and amino acid bound could not be achieved.

We next explored whether the synthetic scope of CfaL enzymes could be extended towards other amide targets. The CfaL enzymes were found to possess extremely broad substrate tolerance, accepting a variety of aryl and heteroaryl carboxylic acids 8–29 as well as aliphatic carboxylic acids 30–46, including several chiral compounds 39–46 (Fig. 3, Extended Data Table 2). Several acyl-donor substrates possessed other reactive functionalities, such as electrophilic ketones (11, 35, 43, 45), alkenes (33), as well as nucleophilic alcohol (40, 44) or amine groups (13, 17, 18, 19, 46), which would require protecting for traditional coupling chemistries, but do not interfere with the enzymatic ligation to amino acid acceptor substrates. In addition to proteinogenic amino acids (Fig. 4a, Extended Data Table 3), CfaL enzymes also accept a wide range of non-proteinogenic amino acids 47–61, including common pharmaceutical building blocks, with a preference for hydrophobic amino acids (Fig. 4b, c, Extended Data Table 4). Although polar, particularly charged, amino acids are not well accepted by CfaL, both l-2,4-diaminobutyrate 47 and l-ornithine 48 can be selectively acylated at the α-amino group, obviating the need for protection of the side-chain amino group (Fig. 4c, Supplementary Fig. 17).

a, Diverse structures of carboxylic acid (donor) substrates assayed with l-Ile and CfaL enzymes. b, Percentage conversion for ligation of carboxylic acids 8–46 with l-Ile catalysed by CfaL enzymes (column headings). Assays were carried out with wild-type and engineered CfaL enzymes (25 μM), carboxylic acids 8–46 (2 mM) and l-Ile (5 mM). Conversion to amide products was determined by HPLC analysis following 20 h incubation. Actual conversion values and errors can be found in Extended Data Table 2.

a, Percentage conversion for ligation of carboxylic acid 9 with acceptor proteinogenic amino acids (rows). b, Percentage conversion for ligation of 9 with acceptor non-proteinogenic amino acids. a-Ile = allo-isoleucine. c, Structures of non-proteinogenic amino acids. d, Reversed-phase (RP)-HPLC trace of ligation product of l-Dab 47 (green) and 9 (m-methylbenzoate, red) catalysed by SsCfaL, compared to HPLC traces of synthesized standards of the two possible products 62 and 63. Product of the enzymatic reaction (bottom trace) shows selective acylation of the α-amino group to give amide 62. All assays were carried out with wild-type and engineered CfaL enzymes (5 μM), carboxylic acid 9 (1 mM) and amino acids (2 mM). Conversion to amide products was determined by RP-HPLC analysis following 20 h incubation. Actual conversion values and errors can be found in Extended Data Tables 3 and 4.

In general, SsCfaL and AlCfaL both performed better than PbCfaL (Figs. 3 and 4, Extended Data Tables 2–4), which was found to be less thermally stable and frequently precipitated during the reaction timescale (Extended Data Fig. 1). On the basis of our crystallographic studies, we sought to improve the activity and stability of PbCfaL via rational, structure-guided mutagenesis. Sequence comparison between the four CfaL enzymes (Supplementary Fig. 16) identified few obvious distinctions. However, one noticeable difference was found on the flexible hinge-region linking the N- and C-terminal domains (at position 395, Extended Data Fig. 2a). This position is solvent-exposed and likely to be involved in the conformational changes required to shift between the adenylation (open) and amidation (closed) states of CfaL. The large, charged arginine residue that is located in this position of PbCfaL is orientated out from the enzyme, while the same position in the other CfaLs and ANLs included in the sequence alignment (Supplementary Fig. 16) is occupied by a small and uncharged glycine. A PbCfaL(R395G) mutant showed increased activity against the panel of carboxylic acids and some amino acid substrates (Figs. 3 and 4, Extended Data Tables 2–4). An X-ray crystal structure of this mutant was determined, which revealed no overall structural changes (Extended Data Fig. 2a). However, the melting temperature (Tm) of PbCfaL(R395G) increased by 5 °C relative to the wild type, suggesting that the replacement of this solvent-accessible, charged R395 is beneficial for stability (Extended Data Fig. 1).

A subsequent double mutant, PbCfaL(R395G/A294P), showed a further increased Tm and slightly improved activity (Figs. 3 and 4, Extended Data Tables 2–4). The location of this second mutation is within a highly conserved ATP binding loop (G289–L297) that is significantly larger in PbCfaL than in other related structures, and which may partially occlude the ATP binding site (Extended Data Fig. 2b). The proline found at this location in SsCfaL and several other structurally similar ligases, in place of the alanine of wild-type PbCfaL, may aid in rotating this loop out of the binding site (Supplementary Fig. 16). Using these two PbCfaL mutants, which no longer precipitate during the reaction, we were able to substantially improve the conversions of both the panel of carboxylic acid and amino acid substrates (Figs. 3 and 4), demonstrating that minimal structure-guided mutagenesis can be used to engineer improved CfaL variants.

To demonstrate the synthetic utility of CfaL, we sought to establish preparative-scale ligation reactions. Accordingly, conditions were optimized for the ligation of carboxylic acid 10 and l-Ile (Fig. 5a). Reaction of 10 (at 15 mM concentration) with PbCfaL(R395G/A294P) cell-free lysate afforded amide 64 in near quantitative conversion as determined by high-performance liquid chromatography (HPLC). The reaction mixture was subjected to a simple solvent extraction, providing 1.48 g of crude 64 from 400 ml of reaction mixture, which would be sufficiently pure (>92% purity by NMR, Extended Data Fig. 3) for further synthetic derivatization. Purification of the extract by column chromatography provided 1.37 g of pure 64 in 87% isolated yield (Fig. 5a). To avoid the use of stoichiometric quantities of the expensive co-factor ATP, we repeated the ligation of 10 and l-Ile at the same scale, omitting ATP and instead introducing an ATP recycling system consisting of a polyphosphate (PolyP) kinase enzyme (CHU)38 and an inexpensive PolyP phosphate donor (Extended Data Fig. 3). Although the isolated yield was lower in this case (52%), there is further scope for optimization. While CfaL cell lysate shows good activity for up to 12 h, we sought to improve enzyme stability/longevity through immobilization of CfaL in the form of a cross-linked enzyme aggregate (CLEA)39. PbCfaL(R395G/A294P) CLEAs were shown to retain activity over an extended period of five days, and could be isolated and recycled in five sequential ligation reactions (Extended Data Fig. 4a). Purified PbCfaL(R395G/A294P) was also shown to tolerate several solvents, including MeOH, ethylene glycol and the widely used ‘green’ solvent 2-methylTHF (Extended Data Fig. 4b).
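The scale of the preparative reaction can be checked with simple arithmetic. The sketch below is only a consistency check under stated assumptions: it takes the 15 mM acid concentration, 400 ml volume, 1.37 g isolated mass and 87% yield from the text, and the l-Ile (45 mM) and ATP (36 mM) concentrations from the Fig. 5 legend. The molar mass of amide 64 is not given in this section, so the value back-calculated here is an inference, not a reported figure; the helper names are hypothetical.

```python
def millimoles(conc_mm, volume_ml):
    """Amount of substance in mmol from a concentration in mM and a volume in ml."""
    return conc_mm * volume_ml / 1000.0


def equivalents(reagent_mm, limiting_mm):
    """Molar equivalents of a reagent relative to the limiting substrate."""
    return reagent_mm / limiting_mm


def implied_molar_mass(mass_g, yield_fraction, limiting_mmol):
    """Molar mass (g/mol) implied by an isolated mass and a fractional yield.

    Back-calculation only: isolated mol = yield * limiting mol.
    """
    isolated_mol = yield_fraction * limiting_mmol / 1000.0
    return mass_g / isolated_mol


acid_mmol = millimoles(15.0, 400.0)        # carboxylic acid 10 is the limiting substrate
ile_equiv = equivalents(45.0, 15.0)        # l-Ile relative to acid 10
atp_equiv = equivalents(36.0, 15.0)        # ATP relative to acid 10
mw_64 = implied_molar_mass(1.37, 0.87, acid_mmol)

print(f"acid 10: {acid_mmol:.1f} mmol; l-Ile: {ile_equiv:.1f} equiv; ATP: {atp_equiv:.1f} equiv")
print(f"molar mass of 64 implied by 1.37 g at 87% yield: {mw_64:.1f} g/mol")
```

The ATP figure makes the motivation for the PolyP kinase recycling system concrete: without recycling, the co-factor is used in excess of the acid substrate.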

a, Amides including pharmaceutical scaffolds synthesized by CfaL enzymes. b, Comparison of percentage conversion for CfaL enzymes in the synthesis of 65–70. c, Kinetic resolution of racemic carboxylic acids (donor). Absolute configuration and diastereoisomeric ratios (d.r.) were determined by RP-HPLC using synthetic standards. Inset, the HPLC chromatogram for (S)-71 formed in the kinetic resolution of racemic ibuprofen (41) with l-Ile and AlCfaL (E = 94). d, Comparison of the enantioselectivities of the different enzymes. Values of E = 15–30 are considered moderate–good, E > 30 are excellent43. For conversions <30% the calculation of E is unreliable, so the values were not determined (ND). e, Kinetic resolution of racemic amino acid (acceptor) 57 (2 mM) with acid 9 (1 mM) and AlCfaL (5 μM) (E > 200) following 20 h incubation. The yield reported is based on 9, which equates to a yield of 33% based on 57 (2 equiv. used). Inset, chiral HPLC analysis of the amide product showing a single enantiomer, (S)-77. Enantiomeric ratio (e.r.) values determined by chiral HPLC. Footnote a: isolated yield from the preparative-scale synthesis of 64 with PbCfaL(R395G/A294P) lysate, 10 (15 mM), l-Ile (45 mM) and ATP (36 mM), incubated for 24 h. Footnote b: isolated yields of about 100-mg-scale reactions catalysed by SsCfaL cell lysate with carboxylic acid (5 mM), amine (15 mM) and ATP (15 mM) following 24 h incubation. Footnote c: conversions determined from HPLC peak area ratios, following assays including CfaL enzymes (25 μM), carboxylic acids (2 mM) and amino acid (6 mM) incubated for 20 h. Footnote d: E value was calculated from average d.r. or e.r. values, as described previously43. Percentage conversions and d.r. values represent means where n = 3, error denotes s.d.

To further demonstrate the synthetic potential of CfaL, a series of pharmaceutically relevant scaffolds were prepared in excellent yields (Fig. 5a, b and Supplementary Fig. 18). For example, amides 65 and 67 were prepared in >70% isolated yields at around 100 mg scale. Furthermore, ligations of cinnamic acid 33 and indole carboxylic acid 26 (Fig. 3) with cyclopropyl amino acid 56 and l-Leu, respectively (Fig. 4), produced amides 68 and 69, which are precursors for the manufacture of promising SARS-CoV-2 protease inhibitors, including PF-07304814 (Pfizer; in phase I clinical trials) (Fig. 5a, b and Supplementary Fig. 18)40,41. Similarly, ligation of the thiazole carboxylic acid 29 and O-methyl-l-serine 49 provided amide 70; this is a key component of oprozomib, which is in phase II clinical trials for treatment of multiple myeloma42. Probing the limits of potential CfaL-reaction scope, we found that the mutant PbCfaL(R395G/A294P) also allowed the generation of amide precursors required for the synthesis of the antiviral telaprevir and the anti-cancer agent bortezomib (Extended Data Fig. 5). These products were only produced in low quantities, but with further engineering it may be possible to synthesize these precursors at higher levels. Overall, these reactions (Fig. 5) clearly demonstrate how structurally diverse carboxylic acids can be combined with proteinogenic and synthetic amino acids to produce pharmaceutically important compounds.

The potential of CfaL for use in kinetic resolution of racemic synthetic carboxylic acids was also investigated (Fig. 5c, d). Notably, racemic ibuprofen could be resolved with excellent enantioselectivity (E = 94)43, leading to the biologically active (S)-ibuprofen-l-Ile amide 71. Amide conjugates of ibuprofen and related NSAIDs with amino acids have been explored extensively for applications as prodrugs and/or hydrogel-based nanomedicine44,45. Five other racemic acids were subjected to kinetic resolution, affording amides 72–76, with modest E values (Fig. 5c, d and Supplementary Fig. 19). In the case of amide 73, the mutant PbCfaL(R395G) was superior to any of the wild-type CfaLs, illustrating how protein engineering could be used to achieve more effective kinetic resolutions. Finally, we sought to exploit the high selectivity of CfaL for l-amino acids to effect the kinetic resolution of racemic amino acids. Amino acid 57 was selected as this is a common pharmaceutical building block, which would normally require multi-step asymmetric synthesis or laborious resolution and protection before acylation or peptide coupling. As anticipated, the reaction between carboxylic acid 9 and racemic amino acid 57 proceeds with excellent enantioselectivity (E > 200), with none of the R-configured enantiomer evident in chiral HPLC when SsCfaL or AlCfaL was used (Fig. 5e, Supplementary Fig. 20). This demonstrates how racemic carboxylic acids and racemic amino acids can be resolved during amide bond synthesis using CfaL, avoiding more laborious asymmetric synthesis or traditional resolution procedures, and the need for protective group manipulations.
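The E values quoted for these kinetic resolutions come from the method of ref. 43, whose formula is not reproduced in this section. As a minimal sketch, assuming the widely used expression for an irreversible kinetic resolution, E = ln[1 − c(1 + ee_p)] / ln[1 − c(1 − ee_p)], where c is the fractional conversion of the racemate and ee_p is the excess of the major product isomer, the calculation could look as follows. The function names, the input numbers and the choice to return `None` below 30% conversion (mirroring the ND entries in Fig. 5d) are illustrative assumptions, not the authors' code.

```python
import math


def ratio_to_excess(major, minor):
    """Convert a d.r. or e.r. given as major:minor into a fractional excess."""
    return (major - minor) / (major + minor)


def e_value_from_product(conversion, product_excess, min_conversion=0.30):
    """Selectivity factor E from conversion c and product excess ee_p.

    Assumes an irreversible kinetic resolution:
        E = ln[1 - c(1 + ee_p)] / ln[1 - c(1 - ee_p)]
    Returns None when conversion is below min_conversion (treated as ND).
    """
    if not 0.0 < conversion < 1.0:
        raise ValueError("conversion must lie strictly between 0 and 1")
    if not 0.0 <= product_excess < 1.0:
        raise ValueError("product excess must lie in [0, 1)")
    if conversion < min_conversion:
        return None
    fast_term = 1.0 - conversion * (1.0 + product_excess)
    slow_term = 1.0 - conversion * (1.0 - product_excess)
    if fast_term <= 0.0:
        raise ValueError("inconsistent inputs: c * (1 + ee_p) must be below 1")
    return math.log(fast_term) / math.log(slow_term)


# Hypothetical inputs for illustration; not data from Fig. 5.
excess = ratio_to_excess(97.5, 2.5)          # a 97.5:2.5 d.r. expressed as an excess
print(e_value_from_product(0.40, excess))    # selectivity at 40% conversion
print(e_value_from_product(0.20, excess))    # below the cut-off, reported as None
```

When the product shows a single isomer, ee_p approaches 1 and the expression diverges, which is consistent with such cases being reported only as a lower bound (E > 200).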

The results presented here demonstrate the role of CfaL enzymes in biosynthesis of the important coronatine family of phytotoxins. BLAST analysis reveals that CfaL-like ligases appear in a large number of distinct COR-like clusters from across a broad range of microorganisms, including bacteria where COR-like phytotoxins have not been observed, suggesting that CfaLs and the biosynthesis of COR-like phytotoxins are widespread. CfaLs can also catalyse ligation of JA with Ile to generate the plant hormone JA-Ile in an identical fashion to the plant ligase Jar1. The lack of sequence and structural similarity between CfaL and Jar1 suggests that the two enzymes have evolved largely independently in bacteria and in plants to perform very similar reactions. In addition to potential agrochemical applications, the CfaL family of enzymes can be used to produce a wide range of pharmaceutically relevant amides. Used in combination with improving ATP recycling techniques46,47, these enzymes could become powerful synthetic tools offering major advantages over other biocatalysts developed for amide synthesis. For example, the combination of ACS and N-acyltransferase enzymes has been investigated for amide synthesis48. However, large numbers of ACS and N-acyltransferase enzymes had to be screened to find pairs of enzymes with matching selectivity48. In addition to low substrate scope, this system also requires use of two expensive co-factors as well as engineering of two enzymes for further optimization, rather than just one. Other reports describe the use of standalone NRPS adenylation domains to synthesize amides30,49. In these examples only the carboxylic acid activation step is directly enzyme catalysed; the subsequent amidation proceeds spontaneously, requiring a large excess (about 100 equiv.) of the amine, which is not viable for many syntheses. CfaLs directly catalyse both steps and can therefore utilize acids and amines with more efficient stoichiometry. Taken together, our results show that CfaLs have potential for the synthesis of a diverse range of important amide products, offering clear advantages over traditional synthetic methods and other biocatalytic approaches.

Further information on research design is available in the Nature Research Reporting Summary linked to this paper.

Nucleotide sequences for the mutants generated as part of this study are available in Supplementary Information. Other nucleotide sequences for the enzymes used in this study were obtained from GenBank, and their accession numbers are provided within the paper or in Supplementary Information. The original materials and data that support the findings of this study are either available within the paper or are available from the corresponding author upon reasonable request. Crystallographic coordinates of wild type and mutant PbCfaL have been deposited in the Protein Data Bank as 7A9I (wild type) and 7A9J (R395G mutant).

Rangaswamy, V., Jiralerspong, S., Parry, R. & Bender, C. L. Biosynthesis of the Pseudomonas polyketide coronafacic acid requires monofunctional and multifunctional polyketide synthase proteins. Proc. Natl Acad. Sci. USA 95, 15469–15474 (1998).

Ullrich, M. & Bender, C. L. The biosynthetic gene cluster for coronamic acid, an ethylcyclopropyl amino acid, contains genes homologous to amino acid-activating enzymes and thioesterases. J. Bacteriol. 176, 7574–7586 (1994).

Staswick, P. E. & Tiryaki, I. The oxylipin signal jasmonic acid is activated by an enzyme that conjugates it to isoleucine in Arabidopsis. Plant Cell 16, 2117–2127 (2004).

Fonseca, S. et al. (+)-7-iso-Jasmonoyl-l-isoleucine is the endogenous bioactive jasmonate. Nat. Chem. Biol. 5, 344–350 (2009).

Westfall, C. S. et al. Structural basis for prereceptor modulation of plant hormones by GH3 proteins. Science 336, 1708–1711 (2012).

Winn, M., Richardson, S. M., Campopiano, D. J. & Micklefield, J. Harnessing and engineering amide bond forming ligases for the synthesis of amides. Curr. Opin. Chem. Biol. 55, 77–85 (2020).

Parry, R. J., Jiralerspong, S., Mhaskar, S., Alemany, L. & Willcott, R. Investigations of coronatine biosynthesis. Elucidation of the mode of incorporation of pyruvate into coronafacic acid. J. Am. Chem. Soc. 118, 703–704 (1996).

Tao, T. & Parry, R. J. Determination by enantioselective synthesis of the absolute configuration of CPE, a potential intermediate in coronatine biosynthesis. Org. Lett. 3, 3045–3047 (2001).

Strieter, E. R., Koglin, A., Aron, Z. D. & Walsh, C. T. Cascade reactions during coronafacic acid biosynthesis: elongation, cyclization, and functionalization during Cfa7-catalyzed condensation. J. Am. Chem. Soc. 131, 2113–2115 (2009).

Parry, R. J., Lin, M. T., Walker, A. E. & Mhaskar, S. Biosynthesis of coronatine: investigations of the biosynthesis of coronamic acid. J. Am. Chem. Soc. 113, 1849–1850 (1991).

Vaillancourt, F. H., Yeh, E., Vosburg, D. A., O’Connor, S. E. & Walsh, C. T. Cryptic chlorination by a non-haem iron enzyme during cyclopropyl amino acid biosynthesis. Nature 436, 1191–1194 (2005).

Kelly, W. L. et al. Characterization of the aminocarboxycyclopropane-forming enzyme CmaC. Biochemistry 46, 359–368 (2007).

Rangaswamy, V. et al. Expression and analysis of coronafacate ligase, a thermoregulated gene required for production of the phytotoxin coronatine in Pseudomonas syringae. FEMS Microbiol. Lett. 154, 65–72 (1997).

Slawiak, M. & Lojkowska, E. Genes responsible for coronatine synthesis in Pseudomonas syringae present in the genome of soft rot bacteria. Eur. J. Plant Pathol. 124, 353–361 (2009).

Bignell, D. R. D. et al. Streptomyces scabies 87–22 contains a coronafacic acid-like biosynthetic cluster that contributes to plant-microbe interactions. Mol. Plant Microbe Interact. 23, 161–175 (2010).

Fyans, J. K., Altowairish, M. S., Li, Y. & Bignell, D. R. Characterization of the coronatine-like phytotoxins produced by the common scab pathogen Streptomyces scabies. Mol. Plant Microbe Interact. 28, 443–454 (2015).

Mitchell, R. E. & Frey, E. J. Production of N-coronafacoyl-L-amino-acid-analogs of coronatine by Pseudomonas syringae pv Atropurpurea in liquid cultures supplemented with L-amino acids. J. Gen. Microbiol. 132, 1503–1507 (1986).

Mitchell, R. E. & Ford, K. L. Chlorosis-inducing products from Pseudomonas syringae pathovars: new N-coronafacoyl compounds. Phytochemistry 49, 1579–1583 (1998).

Littleson, M. M. et al. Scalable total synthesis and comprehensive structure-activity relationship studies of the phytotoxin coronatine. Nat. Commun. 9, 1105 (2018).

Sabatini, M. T., Boulton, L. T., Sneddon, H. F. & Sheppard, T. D. A green chemistry perspective on catalytic amide bond formation. Nat. Catal. 2, 10–17 (2019).

Sabatini, M. T., Boulton, L. T. & Sheppard, T. D. Borate esters: simple catalysts for the sustainable synthesis of complex amides. Sci. Adv. 3, e1701028 (2017).

Krause, T., Baader, S., Erb, B. & Gooßen, L. J. Atom-economic catalytic amide synthesis from amines and carboxylic acids activated in situ with acetylenes. Nat. Commun. 7, 11732 (2016).

Stephenson, N. A., Zhu, J., Gellman, S. H. & Stahl, S. S. Catalytic transamidation reactions compatible with tertiary amide metathesis under ambient conditions. J. Am. Chem. Soc. 131, 10003–10008 (2009).

Allen, C. L., Atkinson, B. N. & Williams, J. M. J. Transamidation of primary amides with amines using hydroxylamine hydrochloride as an inorganic catalyst. Angew. Chem. Int. Edn 51, 1383–1386 (2012).

Al-Zoubi, R. M., Marion, O. & Hall, D. G. Direct and waste-free amidations and cycloadditions by organocatalytic activation of carboxylic acids at room temperature. Angew. Chem. Int. Edn 47, 2876–2879 (2008).

Noda, H., Furutachi, M., Asada, Y., Shibasaki, M. & Kumagai, N. Unique physicochemical and catalytic properties dictated by the B3NO2 ring system. Nat. Chem. 9, 571–577 (2017).

Scheidt, K. Amide bonds made in reverse. Nature 465, 1020–1022 (2010).

Goswami, A. & Van Lanen, S. G. Enzymatic strategies and biocatalysts for amide bond formation: tricks of the trade outside of the ribosome. Mol. Biosyst. 11, 338–353 (2015).

Petchey, M. et al. The broad aryl acid specificity of the amide bond synthetase McbA suggests potential for the biocatalytic synthesis of amides. Angew. Chem. Int. Edn 57, 11584–11588 (2018).

Wood, A. J. L. et al. Adenylation activity of carboxylic acid reductases enables the synthesis of amides. Angew. Chem. Int. Edn 56, 14498–14501 (2017).

Pattabiraman, V. R. & Bode, J. W. Rethinking amide bond synthesis. Nature 480, 471–479 (2011).

Pérombelon, M. C. M. Potato diseases caused by soft rot Erwinias: an overview of pathogenesis. Plant Pathol. 51, 1–12 (2002).

Bell, K. S. et al. Genome sequence of the enterobacterial phytopathogen Erwinia carotovora subsp. atroseptica and characterization of virulence factors. Proc. Natl Acad. Sci. USA 101, 11105–11110 (2004).

Bottini, R., Fulchieri, M., Pearce, D. & Pharis, R. Identification of gibberellins A1, A3 and iso-A3 in cultures of Azospirillum lipoferum. Plant Physiol. 90, 45–47 (1989).

Schüler, G. et al. Coronalon: a powerful tool in plant stress physiology. FEBS Lett. 563, 17–22 (2004).

Shockey, J. M., Fulda, M. S. & Browse, J. Arabidopsis contains a large superfamily of acyl-activating enzymes. Phylogenetic and biochemical analysis reveals a new class of acyl-coenzyme A synthetases. Plant Physiol. 132, 1065–1076 (2003).

Stuhlsatz-Krouper, S. M., Bennett, N. E. & Schaffer, J. E. Substitution of alanine for serine 250 in the murine fatty acid transport protein inhibits long chain fatty acid transport. J. Biol. Chem. 273, 28642–28650 (1998).

Nocek, B. P. et al. Structural insights into substrate selectivity and activity of bacterial polyphosphate kinases. ACS Catal. 8, 10746–10760 (2018).

Sheldon, R. A. Cross-linked enzyme aggregates (CLEAs): stable and recyclable biocatalysts. Biochem. Soc. Trans. 35, 1583–1587 (2007).

Zhang, L. et al. α-Ketoamides as broad-spectrum inhibitors of coronavirus and enterovirus replication: structure-based design, synthesis, and activity assessment. J. Med. Chem. 63, 4562–4578 (2020).

Boras, B. et al. Discovery of a novel inhibitor of coronavirus 3CL protease for the potential treatment of COVID-19. Preprint at https://www.biorxiv.org/content/10.1101/2020.09.12.293498v3 (2020).

Zhou, H.-J. et al. Design and synthesis of an orally bioavailable and selective peptide epoxyketone proteasome inhibitor (PR-047). J. Med. Chem. 52, 3028–3038 (2009).

Chen, C.-S., Fujimoto, Y., Girdaukas, G. & Sih, C. J. Quantitative analyses of biochemical kinetic resolutions of enantiomers. J. Am. Chem. Soc. 104, 7294–7299 (1982).

Jervis, P. J., Amorim, C., Pereira, T., Martins, J. A. & Ferreira, P. M. T. Exploring the properties and potential biomedical applications of NSAID-capped peptide hydrogels. Soft Matter 16, 10001–10012 (2020).

Tiwari, A. D. et al. Microwave assisted synthesis and QSAR study of novel NSAID acetaminophen conjugates with amino acid linkers. Org. Biomol. Chem. 12, 7238–7249 (2014).

Andexer, J. N. & Richter, M. Emerging enzymes for ATP regeneration in biocatalytic processes. ChemBioChem 16, 380–386 (2015).

Strohmeier, G. A., Eiteljorg, I. C., Schwarz, A. & Winkler, M. Enzymatic one-step reduction of carboxylates to aldehydes with cell-free regeneration of ATP and NADPH. Chem. Eur. J. 25, 6119–6123 (2019).

Philpott, H. K., Thomas, P. J., Tew, D., Fuerst, D. E. & Lovelock, S. L. A versatile biosynthetic approach to amide bond formation. Green Chem. 20, 3426–3431 (2018).

Hara, R., Hirai, K., Suzuki, S. & Kino, K. A chemoenzymatic process for amide bond formation by an adenylating enzyme-mediated mechanism. Sci. Rep. 8, 2950 (2018).

We thank the BBSRC (grants BB/K002341/1 and BB/N023536/1) and Syngenta for funding. F.W. was supported by the China Scholarship Council (grant no. 201806155100) and L.B. was funded by the Deutsche Forschungsgemeinschaft (DFG, grant BE 7054/1). The Michael Barber Centre for Collaborative Mass Spectrometry provided access to MS instrumentation. We also thank J. Vincent and N. Mulholland (Syngenta) for helpful discussions in the early stages of the project, and N. J. Turner (University of Manchester) for kindly providing the CHU plasmid. We also thank Diamond Light Source for beamtime access on i03 and i04-1 (proposal mx17773-56 and 76).

Present address: School of Food Science and Engineering, South China University of Technology, Guangzhou, China

Department of Chemistry, Manchester Institute of Biotechnology, The University of Manchester, Manchester, UK

Michael Winn, Michael Rowlinson, Fanghua Wang, Luis Bering, Daniel Francis, Colin Levy & Jason Micklefield

M.W. and J.M. designed experiments; M.W., M.R., F.W., L.B. and D.F. carried out the experiments and provided additional experiment design; C.L. performed crystallographic studies. M.W. and J.M. wrote the manuscript. J.M. led the study.

The authors declare no competing interests.

Peer review information Nature thanks Francesca Paradisi and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Melting point temperatures (Tm) of the CfaLs in this study, obtained using a fluorescence-based assay conducted in a Bio-Rad CFX Connect qPCR machine. A higher Tm indicates improved thermal stability. The Tm is taken as the temperature at which the negative first derivative of RFU (relative fluorescence units) with respect to temperature, −d(RFU)/dT, reaches its minimum when plotted against temperature (degrees Celsius).
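The Tm read-out described in this caption can be illustrated with a minimal sketch. It assumes paired temperature and RFU readings exported from the instrument and estimates the derivative by central differences. The function name `melting_temperature` and the example values are hypothetical and are not taken from the paper or from the Bio-Rad software.

```python
from typing import Sequence


def melting_temperature(temps_c: Sequence[float], rfu: Sequence[float]) -> float:
    """Return the temperature at the minimum of -d(RFU)/dT.

    The derivative is estimated with central differences at interior
    points, so the first and last readings are never reported as the Tm.
    """
    if len(temps_c) != len(rfu) or len(temps_c) < 3:
        raise ValueError("need at least three paired (temperature, RFU) readings")

    best_temp = temps_c[1]
    best_value = float("inf")
    for i in range(1, len(temps_c) - 1):
        slope = (rfu[i + 1] - rfu[i - 1]) / (temps_c[i + 1] - temps_c[i - 1])
        neg_derivative = -slope
        if neg_derivative < best_value:
            best_value = neg_derivative
            best_temp = temps_c[i]
    return best_temp


# Synthetic (made-up) melt curve: fluorescence rises most steeply near 50 °C.
temps = [40.0, 45.0, 50.0, 55.0, 60.0]
signal = [100.0, 120.0, 300.0, 480.0, 500.0]
tm = melting_temperature(temps, signal)
```

Real melt curves are noisier than this example, so some smoothing before differentiation would probably be needed; that step is omitted here.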

a, Structural comparison between PbCfaL (left) and the mutant PbCfaL(R395G) (right). R395 (circled) of PbCfaL (PDB ID 7A9I) is in the hinge region between the N-terminal domain (blue) and the flexible C-terminal domain (red). In PbCfaL(R395G) (PDB ID 7A9J) this large arginine residue is replaced by a much smaller glycine (circled) that is found in the other members of the CfaL family and many other similar ANL ligases. The overall structure of this mutant exhibits no other substantial structural difference from that of the wild type. b, Overlay of PbCfaL with three published ATP-dependent ligase structures (in ellipse). When superimposed, PbCfaL (PDB ID 7A9I, blue), McbA (AMP bound, PDB ID 6SQ8, red), GrsA (ATP bound, PDB ID 1AMU, green) and AuaEII (anthranoyl-AMP bound, PDB ID 4WV3, light brown) show the conserved location of ATP binding. The corresponding loop in PbCfaL (inset, arrowed) is larger than in the other structures, which may affect ATP binding. The location of this region within the structure of PbCfaL (grey) is also shown for reference. Structural alignment was performed using Chimera (version 1.14) MatchMaker.

a, Preparative-scale synthesis of 64 from 10 and l-Ile catalysed by PbCfaL(R395G/A294P) (lysate), with either the addition of ATP (87% isolated yield) or recycling the endogenous ATP present in the lysate using the kinase (CHU) and polyphosphate (polyP) (52% isolated yield). b, 1H-NMR spectrum of crude product 64. c, 13C-NMR spectrum of crude product 64.

a, Catalysed by PbCfaL(R395G/A294P) CLEAs. Reactions (100 mM Tris-HCl, 10 mM MgCl2, 10 mM ATP, 1 mM 10, 3 mM l-isoleucine, 50 ml total volume) were run for 24 h; the CLEA (cross-linked enzyme aggregate) was then removed, washed, and reintroduced to an identical reaction. Activity declined over the 5 days, but the CLEA still retained high levels of productivity even after 5 recycles, whereas cell lysates generally precipitated and lost all activity within 12 h. Although CfaL undergoes extensive conformational changes during catalysis, encapsulating it within CLEAs shows the potential of immobilization to extend the functional lifespan of the enzyme. More sophisticated immobilization techniques may retain activity further. Conversion values were calculated from HPLC peak area ratios of product and starting materials and represent means where n = 5; error bars denote s.d. b, Catalysed by purified PbCfaL(R395G/A294P), showing percentage conversions of 10 and l-Ile in the presence of various solvents and at different concentrations. Conversion values were calculated from HPLC peak area ratios of product and starting materials and represent means where n = 3; error bars denote s.d.
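As an illustration of how conversion values of this kind might be derived from HPLC peak areas, the sketch below assumes that conversion is the product peak area divided by the sum of product and starting-material peak areas. It applies no response-factor correction and reports the sample standard deviation (n − 1). The caption does not specify any of these details, so they are assumptions, and the function names and peak areas are hypothetical.

```python
from statistics import mean, stdev
from typing import Sequence, Tuple


def percent_conversion(product_area: float, substrate_area: float) -> float:
    """Conversion (%) assumed to be product / (product + starting material)."""
    total = product_area + substrate_area
    if total <= 0:
        raise ValueError("peak areas must sum to a positive value")
    return 100.0 * product_area / total


def summarize(replicates: Sequence[Tuple[float, float]]) -> Tuple[float, float]:
    """Return (mean, sample s.d.) of conversion over replicate injections."""
    conversions = [percent_conversion(p, s) for p, s in replicates]
    return mean(conversions), stdev(conversions)


# Made-up (product_area, starting_material_area) pairs for three replicates.
example = [(80.0, 20.0), (90.0, 10.0), (85.0, 15.0)]
avg, sd = summarize(example)
```

If product and starting material absorb differently at the detection wavelength, a calibration curve would be needed before the area ratio could be read as a molar conversion.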

a, Proposed route towards the antiviral agent telaprevir by CfaL (see reaction at top). The expected product of the reaction, 110, was detected by LCMS (top trace, expected m/z 262.1197, observed 262.1193 [M-H]−). Additional peaks consistent with dipeptide 111, formed from condensation of two cyclohexylglycines, 58, were also detected (bottom trace, 111a and 111b, expected m/z 295.2027, observed 295.2036 [M-H]−). Although CfaLs are highly selective for l-amino acid substrates, the appearance of two products of the same mass suggests formation of diastereomers, which may be due to a lack of enantioselectivity in the adenylation step forming the acyl donor when racemic cyclohexylglycine (58) is used. This indicates that 58 can function as both a carboxylic acid and an amine donor. b, Proposed route towards the anti-cancer agent bortezomib via the synthesis of 112 by CfaL (see reaction at top). The expected product of the reaction was detected by LCMS (top trace, expected m/z 270.0884, observed 270.0878 [M-H]−). An additional peak consistent with an l-Phe dipeptide (113) was also detected (bottom trace, expected m/z 311.1401). This indicates that l-Phe can function as both acyl donor and amine acceptor. c, The reaction between carboxylic acid substrate (9), which is a good substrate for the enzyme, and cyclohexylglycine (58) gives only the desired product (114, top trace, expected m/z 274.1449, observed 274.1460 [M-H]−). No cyclohexylglycine homodimer (dipeptide 111) was evident in this case, indicating that homocoupling of 58 only takes place when carboxylic acid (acyl donor) substrates that are not well accepted by CfaL are used.
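Agreement between expected and observed m/z values such as those quoted in this caption is commonly expressed as a mass error in parts per million. The sketch below applies that standard formula to the four ion pairs given above. The ppm figures are not reported in the caption, and the function name and dictionary labels are hypothetical.

```python
def ppm_error(expected_mz: float, observed_mz: float) -> float:
    """Mass error in parts per million: (observed - expected) / expected * 1e6."""
    if expected_mz <= 0:
        raise ValueError("expected m/z must be positive")
    return (observed_mz - expected_mz) / expected_mz * 1e6


# (expected, observed) [M-H]- values as quoted in the caption.
ions = {
    "110": (262.1197, 262.1193),
    "111": (295.2027, 295.2036),
    "panel b product": (270.0884, 270.0878),
    "114": (274.1449, 274.1460),
}

errors = {name: ppm_error(exp, obs) for name, (exp, obs) in ions.items()}
for name, err in errors.items():
    print(f"{name}: {err:+.2f} ppm")
```

On these inputs every error comes out below 5 ppm in magnitude, a tolerance often used for accurate-mass assignments, although the paper does not state which threshold was applied.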

This file contains supplementary methods and compound characterization, NMR Spectra and supplementary references.

This file contains supplementary text, supplementary figures 1–21, supplementary tables 1–4 and supplementary references.

Winn, M., Rowlinson, M., Wang, F. et al. Discovery, characterization and engineering of ligases for amide synthesis. Nature 593, 391–398 (2021). https://doi.org/10.1038/s41586-021-03447-w

DOI: https://doi.org/10.1038/s41586-021-03447-w

Nature (Nature) ISSN 1476-4687 (online) ISSN 0028-0836 (print)