tM13 construction
In our early work on the genomic replication mechanism of the M13 phage [37,38,39], we found that knocking out the replication origin region in the M13KO7 genome almost completely prevents self-packaging. However, because the genome carries the P15A plasmid replicon, it can still persist stably as a plasmid and express the relevant proteins after entering the host bacteria. Building on the work of Specthrie et al., we developed a dual-plasmid system to produce tM13. One plasmid (pUC-tM13) uses pUC19 as the vector and carries the essential f1 ori and f1 term elements, which drive the generation of circular single-stranded DNA (Figure S1). It also incorporates a signal sequence (PS) that optimizes the transport of circular single-stranded DNA, and Domain B, a replication enhancer whose removal reduces DNA replication to 3% of normal levels [38, 40]. The other plasmid is derived from M13KO7 and carries its own resistance sequence and the P15A plasmid replicon. We deliberately deleted its single-stranded replication region and packaging signal sequence, so that it serves only to express the packaging proteins (Fig. 1A, Figure S1). To verify production of the truncated phage, we first designed primers specific to the plasmids pUC-tM13 and KO7Δori. We also designed primers flanking the circularization junction to confirm circularization of the truncated phage genome (Fig. 1B). Transformation with either pUC-tM13 or KO7Δori alone failed to yield circularized tM13 genomes, whereas co-transformation with both plasmids yielded distinct, intense bands, confirming successful production.
The purified tM13 samples were lysed with sodium dodecyl sulfate (SDS) and heat, separated by SDS-PAGE, and analyzed by anti-pIII and anti-pVIII western blotting to verify the presence of the key phage capsid proteins (Fig. 1C). To confirm the tM13 genomes, we extracted them with a gene extraction kit and analyzed them by agarose gel electrophoresis, using a dye that distinguishes single-stranded (red) from double-stranded (green) DNA. The culture transformed with pUC-tM13 alone showed no detectable genomes, whereas the culture transformed with KO7Δori alone showed that a fraction of the plasmid genome had been packaged and released in single-stranded form. The culture transformed with both plasmids contained exclusively the tM13 genome (Figure S2).
Subsequently, we inserted the pIII sequence and its expression elements into the pUC-tM13 vector (pUC-tM13::pIII) and, in parallel, constructed a control vector lacking the expression elements (pUC-tM13ΔpIII). Transmission electron microscopy (TEM) showed that the tM13 produced from the pUC-tM13 vector is approximately 100 nm long, whereas the phage produced from pUC-tM13::pIII is approximately 350 nm long (Fig. 1D). Insertion of the expressible pIII sequence not only increased the length appreciably but also increased the yield by an order of magnitude relative to pUC-tM13 and by nearly two orders of magnitude relative to pUC-tM13ΔpIII (Fig. 1E). Statistical analysis further showed that pUC-tM13::pIII gave superior length uniformity, making it more suitable for subsequent use as a biological template (Figure S3). We speculate that this effect arises from a higher concentration of pIII protein near the packaging sequence, which facilitates timely termination and release of tM13 during secretion from the bacteria.
Modification and display of tM13
The amine groups at the amino-terminus of pVIII are the main targets for attaching external compounds through amine-directed conjugation. However, conventional chemical conjugation methods, including the EDC/NHS activation strategy for coupling amines and carboxyl groups, impair the affinity activity of the bacteriophage. BMCC-biotin, a maleimide-biotin compound, labels protein cysteine residues and other molecules bearing thiol groups. It reacts specifically with reduced thiols (-SH) in near-neutral buffer to form a permanent (irreversible) thioether bond (Fig. 2A). We therefore chose to replace a single amino acid residue on the hydrophobic side of pVIII with cysteine, so that each peripheral pVIII molecule presents a thiol group. Specifically, we engineered a point mutation (Val30 → Cys) at position 30 of the pVIII sequence on the KO7Δori plasmid, creating KO7Δori-Cys. We then co-transformed this plasmid with pUC-tM13::pIII and compared phage production and electron microscopy characterization before and after the cysteine mutation, finding little to no difference (Fig. 2B, Figure S5). Next, we mixed BMCC-biotin with the short bacteriophages exposing thiol groups; analysis with the MALDI-Biotyper showed that almost all exposed surface thiols were linked to biotin (Fig. 2C). To evaluate targeting activity, we performed ELISA on the biotinylated short bacteriophages and found that BMCC-biotin did not affect binding at the pIII end (Fig. 2D). This supports the suitability of this design as a molecular probe.
To further confirm the activity of the biotin modification on the surface of the short bacteriophages, we incubated them with streptavidin-coated magnetic beads (MNP@SA), separated them magnetically, and validated the results by electron microscopy (Fig. 2E). Short bacteriophages without biotin modification showed minimal binding to the magnetic beads, whereas the biotinylated short bacteriophages showed a 1000-fold decrease in titer after magnetic separation (Fig. 2F). We then compared the bacterial adsorption efficiency of tM13, tM13-pIII, and biotin-modified tM13-pIII at different time points. tM13-pIII adsorbed more effectively than tM13, and the biotin modification did not affect its adsorption efficiency (Fig. 2G). To further validate the feasibility of magnetic separation with tM13-Biotin after binding to target bacteria, we applied the same cysteine mutation and biotin conjugation to M13KE (M13KE-Biotin). Both biotin-modified and non-modified tM13-pIII showed significantly higher specificity than M13KE-Cys and M13KE-Biotin. In contrast, M13KE-Cys and M13KE-Biotin magnetically isolated non-target bacteria, likely because of non-specific entanglement resulting from their excessive length (Figure S6). We also compared the specific adsorption capacity of tM13-Biotin, tM13pIII-Biotin, and M13KE-Biotin for E. coli, as well as the magnetic separation efficiency after adsorption (Fig. 2H).
Optimization of the targeting capability of tM13
To further enhance the targeting capacity of the M13 bacteriophage toward E. coli, we explored two distinct modifications of the pIII sequence in pUC-tM13::pIII and KO7Δori-Cys: mutation of glycine 153 in the pIII N2 domain to aspartic acid (G153D), and deletion of the N2 domain (Figure S7). Both modifications activate pIII by exposing its binding site for TolA, the primary phage receptor on the cell surface (Fig. 3A). In unmodified pIII, this site is tightly associated with the N2 domain and is opened only through the interaction between N2 and the F pilin protein (Fig. 3B). Previous studies [29, 31, 41] showed that both the G153D mutation and the knockout of the N2 domain enhance the efficiency of M13 infection in F⁻ E. coli. We therefore first compared the yields of the mutant tM13-pIII (tM13-pIIIG153D), the knockout tM13-pIII (tM13-pIIIΔN2), and the wild-type tM13 under the same culture conditions. Knockout of the N2 domain substantially reduced tM13 yield. This reduction is likely attributable to structural alterations in the pIII protein, which may decrease its anchoring efficiency to the inner membrane or impair its capacity to encapsulate the M13 termini. Further investigation is required to elucidate the specific mechanisms. Despite these reductions, yields still exceed 10¹⁰ copies/mL (Fig. 3C), which remains adequate for subsequent applications. We then used sandwich ELISA to assess the binding affinities of the three tM13 variants. tM13-pIIIG153D bound most effectively to ER2738, while tM13-pIIIΔN2 bound best to the F⁻ E. coli Trans1-T1, far surpassing the unmodified tM13-pIII. For pathogenic E. coli O157:H7, the binding of all three variants declined slightly, yet tM13-pIIIG153D and tM13-pIIIΔN2 still bound better than tM13-pIII (Fig. 3D).
We next modified the engineered tM13-pIII variants and the wild-type tM13-pIII with BMCC-biotin, evaluated their binding to the target bacteria, and performed separation with MNP@SA, focusing on efficiency and specificity. Four strains were tested: E. coli ER2738, Trans1-T1, and O157:H7, and Pseudomonas aeruginosa PAO1. For ER2738, the separation efficiencies of tM13-pIII, tM13-pIIIG153D, and tM13-pIIIΔN2 were 90.9%, 94.6%, and 83.9%, respectively (Fig. 3E). For Trans1-T1 they were 7.8%, 59.9%, and 66.9% (Fig. 3F), and for O157:H7 they were 10.5%, 42.9%, and 30.5% (Fig. 3G), whereas for PAO1 all efficiencies were below 5% (Fig. 3H). Overall, tM13-pIIIG153D performed strongly across the E. coli strains, with separation rates of 94.6% for the host E. coli ER2738 and 59.9% and 42.9% for the non-host E. coli Trans1-T1 and O157:H7, respectively. In a comparison of capture efficacy between tM13-pIIIG153D and antibodies targeting the majority of E. coli serotypes, tM13-pIIIG153D captured the host strain (ER2738) more efficiently than the antibodies (Figure S8). For the other two E. coli strains (Trans1-T1 and O157:H7), its efficiency was slightly lower than that of the antibodies. Given its simpler production conditions and lower cost, tM13-pIIIG153D has potential as an alternative to antibodies. Consequently, we selected tM13-pIIIG153D as the primary capture probe for the targeted E. coli.
Enumeration of bacterial cells with dark field microscopy
After removing excess biotin from the biotinylated tM13-pIIIG153D, we incubated it with an excess of bacterial suspensions at a gradient of known theoretical concentrations (ER2738, Trans1-T1, O157:H7, and PAO1). We then performed affinity adsorption and magnetic separation with MNP@SA, and the isolated bacterial cells were examined and quantified by dark field microscopy. Notably, E. coli shows characteristic morphological features under dark field microscopy. The optical properties of this method render the central region of the bacterium opaque, so that E. coli appears as rod-shaped particles with a distinct black core and well-defined peripheral contours. The enumeration chamber used in the study holds up to 500 μL of sample liquid and has 16 small grids, each measuring 5 mm × 5 mm (Figure S9). A 400 μL aliquot of the sample solution was added to the enumeration chamber and dried in an oven. Three random fields of view in each grid were imaged to count bright particles with characteristic E. coli features, and the counts were averaged over the resulting 48 fields of view. Given that each field of view covers 5.0 × 10⁴ μm², the final concentration was calculated as 2 × 10⁴ × A CFU/mL, where A is the mean number of bacterial cells per field of view. In comparison with the plate counting method, the dark field microscopy strategy yielded results most consistent with the spiked concentration for ER2738 (Fig. 4A, B). Notably, bacterial cells were still detectable at a theoretical concentration of 2.6 × 10³ CFU/mL. For Trans1-T1 and O157:H7, however, the concentrations obtained by dark field microscopy differed significantly from those obtained by plate counting, with some values falling below 50% of the theoretical concentration (Fig. 4C, D, E, F).
By contrast, the plate counting results closely matched the theoretical concentrations. For example, for Trans1-T1 at a theoretical concentration of 425 CFU/mL, plate counting gave 375 CFU/mL, whereas our method gave only 149 CFU/mL. This indicates that further technical improvements are required before our approach can precisely quantify the majority of E. coli strains. However, given the long duration of plate counting and its inability to provide specific identification, our method retains certain advantages. In addition, the target bacterial cells remained detectable relative to the negative control PAO1 (Fig. 4G, H). We also performed negative controls for the three E. coli strains without the addition of tM13-pIIIG153D (Figure S10). The results indicated that, at concentrations above 1.0 × 10³ CFU/mL, positive results were still discernible with dark field microscopy.
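The factor of 2 × 10⁴ in the concentration formula can be reproduced from the chamber geometry. The following is a minimal sketch, assuming that the dried 400 μL sample is distributed uniformly over the full gridded area of the chamber (16 grids of 5 mm × 5 mm); the function and parameter names are hypothetical and are not taken from the study.

```python
def concentration_cfu_per_ml(mean_cells_per_field,
                             field_area_um2=5.0e4,
                             n_grids=16,
                             grid_side_mm=5.0,
                             sample_volume_ml=0.4):
    """Estimate CFU/mL from the mean number of cells per field of view.

    Assumption (not stated in the source): cells are spread uniformly
    over the whole gridded area after drying.
    """
    # Total gridded area in square micrometres (1 mm^2 = 1e6 um^2).
    total_area_um2 = n_grids * grid_side_mm ** 2 * 1e6
    # Number of non-overlapping fields of view that tile the chamber.
    fields_per_chamber = total_area_um2 / field_area_um2
    # Extrapolate the mean count per field to the whole chamber.
    total_cells = mean_cells_per_field * fields_per_chamber
    return total_cells / sample_volume_ml


# One cell per field on average gives the multiplier used in the text.
print(concentration_cfu_per_ml(1.0))
```

Under these assumptions the chamber holds 8000 fields of view, so a mean of one cell per field corresponds to about 2 × 10⁴ CFU/mL, and a mean of 0.13 cells per field corresponds to about 2.6 × 10³ CFU/mL.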
Bacteriophages are readily obtainable, exhibit long shelf lives, and demonstrate remarkable tolerance to harsh environmental conditions, with minimal constraints regarding high temperatures and pH variations [42]. Their specificity towards hosts makes them ideal tools for the identification of various pathogens, particularly as biorecognition elements in bacterial detection. To date, numerous phage-based detection methods and sensors have been developed, employing diverse transducer platforms including electrochemical and optical approaches. Among these, various strategies for detecting E. coli using M13 phages have been explored, encompassing common methods such as colorimetric assays [43] and surface-enhanced Raman scattering (SERS) [18]. However, experimental results indicate that the filamentous structure of M13 phages can significantly impact detection efficacy. Additionally, existing methods for E. coli detection show limited broad-spectrum specificity, often recognizing only a small number of E. coli strains.
In response to these limitations, we have proposed structural modifications based on the packaging and infection mechanisms of M13 phages [22, 27, 29], including truncation of the phage structure and alteration of the pIII region. These adjustments have demonstrated considerable improvements in detection sensitivity and specificity, although further optimization is still needed. The implementation of this approach on an intelligent dark-field platform also eliminates the need for expensive equipment and complex manual procedures, suggesting a strong potential for widespread adoption.
Intelligent detection using image processing model and convolutional neural network
During dark field analysis of certain samples, we observed numerous impurities lacking bacterial features, as well as noise, both of which complicated the image background (Figure S9). Although discernible by the naked eye, these elements were not effectively distinguished from true events by conventional counting software. We therefore first applied global thresholding to segment the foreground and background of the dark field images, a step that influences the accuracy of all subsequent image processing and analysis. The threshold was determined from the overall grayscale histogram of the image and divides the pixels into target and background classes. Once an appropriate threshold was established, it was applied to the entire dark field image. The segmented image revealed the areas of interest, which were then cropped to extract the key targets from the complex background for more in-depth analysis. Special care was taken to keep the target areas intact during cropping, so that the extracted targets provide a reliable data foundation for further analysis (Fig. 5A).
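The segmentation and cropping steps can be illustrated with a small self-contained sketch. It assumes Otsu's method as the histogram-based rule for choosing the global threshold, 4-connected grouping of foreground pixels, and a fixed pixel margin around each target; none of these choices is specified in the text, and the function names are hypothetical. Images are represented as nested lists of 8-bit grey levels to keep the example dependency-free.

```python
from collections import deque


def otsu_threshold(image):
    """Return the grey level (0-255) maximising between-class variance."""
    hist = [0] * 256
    for row in image:
        for value in row:
            hist[value] += 1
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))
    best_t, best_var = 0, -1.0
    w_bg, sum_bg = 0, 0
    for t in range(256):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        if w_bg == 0:
            continue  # no background pixels yet
        w_fg = total - w_bg
        if w_fg == 0:
            break  # every pixel is already background
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        var_between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def crop_targets(image, threshold, margin=1):
    """Crop each 4-connected bright region, padded by `margin` pixels."""
    height, width = len(image), len(image[0])
    seen = [[False] * width for _ in range(height)]
    crops = []
    for r in range(height):
        for c in range(width):
            if seen[r][c] or image[r][c] <= threshold:
                continue
            # Breadth-first search over one connected foreground region.
            queue = deque([(r, c)])
            seen[r][c] = True
            r0 = r1 = r
            c0 = c1 = c
            while queue:
                y, x = queue.popleft()
                r0, r1 = min(r0, y), max(r1, y)
                c0, c1 = min(c0, x), max(c1, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and not seen[ny][nx]
                            and image[ny][nx] > threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            # Pad the bounding box so that target edges are not clipped.
            r0, c0 = max(r0 - margin, 0), max(c0 - margin, 0)
            r1, c1 = min(r1 + margin, height - 1), min(c1 + margin, width - 1)
            crops.append([row[c0:c1 + 1] for row in image[r0:r1 + 1]])
    return crops
```

The margin is one simple way to keep target areas intact during cropping; in practice an imaging library would be used for both steps, and the appropriate margin would depend on the magnification.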
After processing, we obtained 4579 individual bacterial dark field characterizations and 1564 negative characterizations (impurity particles, noise, and glare). Each target from the cropping step was manually annotated with a binary label, and the dataset was partitioned into training, validation, and test sets in a 6:2:2 ratio, as detailed in Table S2. As shown in Fig. 5B, we employed the DenseNet169 architecture because of its proven efficacy in image recognition tasks [44]. Its dense connectivity pattern (Fig. 5C, D) facilitates feature reuse, reduces the number of parameters, and improves training efficiency. The model was trained for 100 epochs in PyTorch on an NVIDIA RTX 4060 GPU. In the final test phase, each test image underwent threshold-based segmentation and target-specific cropping before being fed into the trained model for classification. The binary classification results were used to determine the number of positively identified targets within the image.
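A 6:2:2 partition of the annotated crops can be sketched as follows. This is an illustrative example only: it assumes a per-class (stratified) split with floor rounding and the remainder assigned to the test set, which is not stated in the text, and it uses integer indices as stand-ins for the image crops. The function name and seed are hypothetical; the authoritative set sizes are those in Table S2.

```python
import random


def stratified_split(items_by_class, ratios=(6, 2, 2), seed=0):
    """Split each class into train/val/test parts according to `ratios`."""
    rng = random.Random(seed)
    total_ratio = sum(ratios)
    splits = {"train": [], "val": [], "test": []}
    for label, items in items_by_class.items():
        shuffled = list(items)
        rng.shuffle(shuffled)
        n = len(shuffled)
        # Integer arithmetic avoids floating-point rounding surprises.
        n_train = n * ratios[0] // total_ratio
        n_val = n * ratios[1] // total_ratio
        splits["train"] += [(x, label) for x in shuffled[:n_train]]
        splits["val"] += [(x, label) for x in shuffled[n_train:n_train + n_val]]
        # Whatever is left over goes to the test set.
        splits["test"] += [(x, label) for x in shuffled[n_train + n_val:]]
    return splits


# Placeholder data: 4579 positive and 1564 negative crops, as reported.
data = {1: range(4579), 0: range(1564)}
splits = stratified_split(data)
print({name: len(part) for name, part in splits.items()})
```

Under this particular rounding convention the split contains 3685 training, 1227 validation, and 1231 test crops, and the test-set size coincides with the 1,231 test images mentioned in the error analysis.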
As shown in Table S3, we evaluated the classification results using several metrics. The model achieved an accuracy of 97.40% on the dark field images. Its precision of 98.57% reflects a low rate of false positives, and its recall of 97.93% shows that the large majority of positive instances were identified, which is critical for this task. The F1 score of 98.25% indicates balanced performance in precision and recall. We then measured three concentrations of ER2738 (1.9 × 10⁴, 9.5 × 10⁴, and 4.75 × 10⁵ CFU/mL) with three different methods (Fig. 5E). The convolutional neural network results closely matched manual enumeration, whereas counts obtained with the regular ImageJ software differed significantly. We also assessed the concentrations of five prevalent, clinically collected E. coli strains (T12, Y12, Y14, Y48, and E172) using three different methods (Figure S11A). A Gram-negative Salmonella strain (S15) and a Gram-positive Cutibacterium acnes strain (L1) served as controls. All five E. coli strains were detectable, although sensitivity varied among the strains. For instance, for strain Y12 the plate counting method measured 940 CFU/mL, while our method yielded a comparable 908.6 CFU/mL.
Currently, the quantitative accuracy of our method does not match that of plate counting and still requires optimization. However, the identification model based on the convolutional neural network significantly outperformed ImageJ, which tends to include numerous impurities and light spots in its counts and thus greatly overestimates actual concentrations. We also validated the method on a mixed sample containing the ER2738 strain and Salmonella strain S15 at known concentrations. The measured concentration was close to the theoretical concentration of ER2738, indirectly indicating that Salmonella strain S15 was not isolated by tM13-pIIIG153D and that the method has a certain degree of specificity (Figure S11B). Finally, we spiked various real food matrices with E. coli to compare traditional plate counting with automated dark field counting; the results are shown in Table S4.
However, 32 misclassifications were observed within the test set of 1,231 images (Figure S12). These cases, in which model predictions disagree with the actual labels, point to where model performance can be improved. In this binary classification task, the deep learning model faced significant challenges with low-resolution and high-noise images, conditions that are prevalent in dark field microscopy. These challenges manifest mainly as inadequate extraction of meaningful signals from the images. During preprocessing, targets of varying sizes were uniformly resized to 96 × 96 pixels. This standardized rescaling notably degraded the resolution of smaller targets, which contributed to misclassification. Further analysis indicated that a substantial number of misclassifications were attributable to inaccurate cropping, which led the model to recognize target edges as positive signals.
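The reported metrics can be cross-checked against one another with a short calculation. The sketch below uses only figures stated in the text (32 errors among 1,231 test images, precision of 98.57%, recall of 97.93%) and the standard definitions of accuracy and F1 score; the function names are hypothetical.

```python
def accuracy_from_errors(n_total, n_misclassified):
    """Accuracy is the fraction of test images classified correctly."""
    return (n_total - n_misclassified) / n_total


def f1_score(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


acc = accuracy_from_errors(1231, 32)
f1 = f1_score(0.9857, 0.9793)
print(f"accuracy = {acc:.2%}, F1 = {f1:.2%}")
```

Both values agree with the reported accuracy of 97.40% and F1 score of 98.25%.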
In light of these observations, future work should incorporate adaptive target size adjustment during preprocessing to preserve the original resolution and structural information of the targets. Such strategies may involve more sophisticated image processing techniques, such as target detection and region of interest (ROI) selection algorithms, so that critical information is not lost before model training. Future work should also explore displaying various affinity peptides on the tM13 probe to target multiple sites on different strains. Incorporating detection results from diverse strains into model training could make the approach more universal, and testing the model on external datasets would help validate its generalizability.