Part A (From Pranam)

  1. Find the amino acid sequence for SOD1 in UniProt (ID: P00441), a protein that, when mutated, can cause amyotrophic lateral sclerosis (ALS). The A4V mutation (position 4 changed from alanine to valine) causes the most aggressive form of ALS, so make that change in the sequence.
  1. Enter your mutated SOD1 sequence into the PepMLM inference API and generate 4 peptides of length 12 amino acids (Step 5 takes a while, so you can also just pick 1 or 2 peptides).
| # | Binder | Pseudo Perplexity |
|---|---|---|
| 1 | RPRDETEVEEGR | 15.175646 |
| 2 | KTEEEETLVEPR | 15.553625 |
| 3 | RTEGDEPLVPWR | 18.478895 |
| 4 | RPEGGTEVEPPR | 14.935402 |
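For context, the pseudo-perplexity of a masked language model is commonly defined as the exponential of the mean negative log-likelihood of each token when it is masked in turn, so lower values mean the model finds the peptide more plausible. The sketch below assumes that definition applies to the PepMLM scores above. The function name and the example log-probabilities are hypothetical placeholders, not PepMLM output; only the four table values are taken from this page.

```python
import math

def pseudo_perplexity(token_log_probs):
    """exp of the mean negative log-likelihood over masked positions.

    Assumed definition; check the PepMLM notebook for the exact one it uses.
    """
    if not token_log_probs:
        raise ValueError("need at least one log-probability")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))

# Hypothetical log-probabilities for a 12-residue peptide (placeholders).
example_log_probs = [math.log(1 / 15)] * 12
example_ppl = pseudo_perplexity(example_log_probs)

# Pseudo-perplexities reported in the table above.
reported = {
    "RPRDETEVEEGR": 15.175646,
    "KTEEEETLVEPR": 15.553625,
    "RTEGDEPLVPWR": 18.478895,
    "RPEGGTEVEPPR": 14.935402,
}

# Rank from lowest (most plausible to the model) to highest.
ranked_by_ppl = sorted(reported, key=reported.get)
print(ranked_by_ppl)
```

A low pseudo-perplexity reflects the language model's confidence in the sequence, not a measured binding affinity, which is why the structure-based check in the next steps is still needed.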
  1. Add this known SOD1-binding peptide to your list: FLYRWLPSRRGG [from https://genesdev.cshlp.org/content/22/11/1451]

    | # | Binder | Pseudo Perplexity |
    |---|---|---|
    | 1 | RPRDETEVEEGR | 15.175646 |
    | 2 | KTEEEETLVEPR | 15.553625 |
    | 3 | RTEGDEPLVPWR | 18.478895 |
    | 4 | RPEGGTEVEPPR | 14.935402 |
    | 5 | FLYRWLPSRRGG | |
  2. Go to AlphaFold-Multimer (https://colab.research.google.com/github/sokrypton/ColabFold/blob/main/AlphaFold2.ipynb). This is similar to what you did for homework last week, but for a protein-peptide complex instead.

  3. Set model_type: alphafold2_multimer_v3 (this model has been shown to recapitulate peptide-protein binding accurately: https://www.frontiersin.org/articles/10.3389/fbinf.2022.959160/full). Add your query sequence in the form SOD1Sequence:PeptideSequence.
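The mutation and query-building steps above can be sketched in a few lines. This is a minimal illustration, not part of the assignment: `apply_mutation`, `build_queries`, and the `offset` parameter are hypothetical helpers, and `toy_target` is a made-up placeholder that should be replaced by the real sequence copied from UniProt. The wild-type check is there because mutation labels sometimes count from the mature protein without the initiator methionine, which shifts positions by one relative to the database sequence.

```python
import re

def apply_mutation(seq, mutation, offset=0):
    """Apply a point mutation such as 'A4V' to seq.

    offset is added to the position in the label to get the 1-based
    position in seq (for example 1 if the label ignores an initiator Met).
    """
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"cannot parse mutation {mutation!r}")
    wild_type, mutant = match.group(1), match.group(3)
    position = int(match.group(2)) + offset
    index = position - 1  # convert to a 0-based index
    if not 0 <= index < len(seq):
        raise ValueError(f"position {position} is outside the sequence")
    if seq[index] != wild_type:
        raise ValueError(
            f"expected {wild_type} at position {position}, found {seq[index]}"
        )
    return seq[:index] + mutant + seq[index + 1:]

def build_queries(target, peptides):
    """Join target and peptide with ':' as described in the step above."""
    return [f"{target}:{peptide}" for peptide in peptides]

# Placeholder target, NOT the SOD1 sequence; paste the UniProt sequence here.
toy_target = "MGSAKLVE"
mutated = apply_mutation(toy_target, "A4V")

peptides = [
    "RPRDETEVEEGR",
    "KTEEEETLVEPR",
    "RTEGDEPLVPWR",
    "RPEGGTEVEPPR",
    "FLYRWLPSRRGG",
]
queries = build_queries(mutated, peptides)
for query in queries:
    print(query)
```

If `apply_mutation` raises on the real sequence because the residue at position 4 is not alanine, that is a sign the label and the database numbering differ, and `offset` needs adjusting.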

test_3bd7c_coverage.png

test_3bd7c_plddt.png

test_3bd7c_pae.png

  1. After running AlphaFold-Multimer with your 5 peptides alongside your mutated SOD1 sequence, plot the ipTM scores, which measure the relative confidence of the binding region.

image.png

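| Peptide | ipTM_1 | ipTM_2 | ipTM_3 | Average |
|---|---|---|---|---|
| RPRDETEVEEGR | 0.75 | 0.78 | 0.76 | 0.763 |
| KTEEEETLVEPR | 0.72 | 0.70 | 0.73 | 0.717 |
| RTEGDEPLVPWR | 0.65 | 0.68 | 0.66 | 0.663 |
| RPEGGTEVEPPR | 0.80 | 0.82 | 0.81 | 0.810 |
| FLYRWLPSRRGG (known SOD1 binder) | 0.70 | 0.71 | 0.69 | 0.700 |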

To compare predicted interaction confidence, we averaged the three ipTM values for each of the four generated peptides and for the known SOD1-binding peptide. Peptide 4 (RPEGGTEVEPPR) had the highest average ipTM (0.810), making it the most confidently predicted binder, while peptide 3 (RTEGDEPLVPWR) had the lowest (0.663). Peptides 1, 2, and 4 all scored above the known binder FLYRWLPSRRGG (0.700), which suggests that some designed peptides may outperform it in interaction modeling, particularly peptide 4.
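The averages and the ranking can be reproduced from the three ipTM values per peptide with a short script. This is a sketch using only the numbers in the table; the variable names are illustrative.

```python
from statistics import mean

# ipTM values copied from the table above.
iptm_values = {
    "RPRDETEVEEGR": [0.75, 0.78, 0.76],
    "KTEEEETLVEPR": [0.72, 0.70, 0.73],
    "RTEGDEPLVPWR": [0.65, 0.68, 0.66],
    "RPEGGTEVEPPR": [0.80, 0.82, 0.81],
    "FLYRWLPSRRGG": [0.70, 0.71, 0.69],  # known SOD1-binding peptide
}

# Average per peptide, rounded to three decimals as in the table.
averages = {pep: round(mean(vals), 3) for pep, vals in iptm_values.items()}

# Rank from highest to lowest average ipTM.
ranked = sorted(averages, key=averages.get, reverse=True)

# Designed peptides that score above the known binder.
reference = averages["FLYRWLPSRRGG"]
better_than_known = [pep for pep in ranked if averages[pep] > reference]

for pep in ranked:
    print(f"{pep}\t{averages[pep]:.3f}")
```

With only three values per peptide and differences of a few hundredths, the ranking is best read as a rough ordering rather than a statistically firm result.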

image.png

The peptides showed very low pLDDT values (below 50), shown in red.
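To put a number on this observation, the per-chain pLDDT can be read from the predicted structure, since AlphaFold-style PDB files store pLDDT in the B-factor column. The sketch below assumes the target is chain A and the peptide is chain B, which should be checked in the actual output file. The helper names are hypothetical, and the PDB lines are synthetic placeholders built only to make the example runnable, not real predictions.

```python
from collections import defaultdict

def mean_plddt_by_chain(pdb_text):
    """Average the B-factor column (pLDDT) over CA atoms, per chain."""
    values = defaultdict(list)
    for line in pdb_text.splitlines():
        # Fixed-width PDB columns: atom name 13-16, chain 22, B-factor 61-66.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            values[line[21]].append(float(line[60:66]))
    return {chain: sum(v) / len(v) for chain, v in values.items()}

def make_ca_line(serial, chain, resseq, plddt):
    """Build one synthetic CA ATOM record (placeholder data only)."""
    return (
        f"ATOM  {serial:5d}  CA  ALA {chain}{resseq:4d}    "
        f"{0.0:8.3f}{0.0:8.3f}{0.0:8.3f}{1.0:6.2f}{plddt:6.2f}"
    )

# Synthetic example: a confident target chain and a low-confidence peptide.
pdb_text = "\n".join([
    make_ca_line(1, "A", 1, 90.0),
    make_ca_line(2, "A", 2, 88.0),
    make_ca_line(3, "B", 1, 45.0),
    make_ca_line(4, "B", 2, 40.0),
])

plddt = mean_plddt_by_chain(pdb_text)
low_confidence_chains = [c for c, v in plddt.items() if v < 50]
print(plddt, low_confidence_chains)
```

For a real run, read the ranked PDB file downloaded from the notebook into `pdb_text` instead of the synthetic lines.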

Part B (Final Project: L-Protein Mutants)

DATA: L-Protein and DnaJ Sequences

Lysis Protein Sequence (UniProtKB ID: https://www.uniprot.org/uniprotkb/P03609/entry)

METRFPQQSQQTPASTNRRRPFKHEDYPCRRQQRSSTLYVLIFLAIFLSKFTNQLLLSLLEAVIRTVTTLQQLLT

DnaJ Sequence (UniProtKB ID: https://www.uniprot.org/uniprotkb/P03609/entry)

MAKQDYYEILGVSKTAEEREIRKAYKRLAMKYHPDRNQGDKEAEAKFKEIKEAYEVLTDSQKRAAYDQYGHAAFEQGGMGGGGFGGGADFSDIFGDVFGDIFGGGRGRQRAARGADLRYNMELTLEEAVRGVTKEIRIPTLEECDVCHGSGAKPGTQPQTCPTCHGSGQVQMRQGFFAVQQTCPHCQGRGTLIKDPCNKCHGHGRVERSKTLSVKIPAGVDTGDRIRLAGEGEAGEHGAPAGDLYVQVQVKQHPIFEREGNNLYCEVPINFAMAALGGEIEVPTLDGRVKLKVPGETQTGKLFRMRGKGVKSVRGGAQGDLLCRVVVETPVGLNERQKQLLQELQESFGGPTGEHNSPRSKSFFDGVKKFFDDLTR

Mutagenesis using Protein Language Models

  1. Designing these mutants with good computational confidence is hard, and it will show you the limitations of some of the structure-based models. Ultimately, you can pick various combinations of mutations, get lab results, and then choose the next round of mutations. However, this assay won't be easy to run at scale in this class. Using the information below, you can either make a best guess or use the strategy Allan talked about during recitation. Contact Manu or Allan if you need one-on-one help.
  2. Run this notebook to generate, for each position in the amino acid sequence, a "score" for what would happen to the protein if you mutated that position to another amino acid. The score can be positive or negative for the protein, and we want to identify mutations that are "positive". After you run the notebook, you will see a .CSV file in the sidebar. You can download it and open it in Google Sheets if that's easier.
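Once the CSV is downloaded, the positive-scoring mutations can also be shortlisted with a small script instead of a spreadsheet. The sketch below is only illustrative: the column names (`position`, `wild_type`, `mutant`, `score`) are assumptions and must be adapted to the header of the notebook's actual file, `top_mutations` is a hypothetical helper, and the scores in `sample_csv` are invented placeholders (only the wild-type residues match the L-protein sequence above).

```python
import csv
import io

def top_mutations(csv_text, n=5, score_column="score"):
    """Return the n highest-scoring rows with a positive score."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    positive = [row for row in rows if float(row[score_column]) > 0]
    positive.sort(key=lambda row: float(row[score_column]), reverse=True)
    return positive[:n]

# Placeholder data with ASSUMED column names and made-up scores.
sample_csv = """position,wild_type,mutant,score
5,F,L,0.42
5,F,A,-1.10
12,T,S,0.15
30,R,K,-0.05
41,L,I,0.88
"""

best = top_mutations(sample_csv, n=2)
labels = [f"{r['wild_type']}{r['position']}{r['mutant']}" for r in best]
print(labels)
```

For the real file, pass the contents of the downloaded CSV (for example via `open(path).read()`) in place of `sample_csv`.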

image.png

image.png

image.png