MAFTOOLS: Adding variants from amplicon sequenced data to MAF file generated by WES data

71 Views Asked by AnandamidaCBD At 18 July 2023 at 15:15

I am relatively new to bioinformatics and need help adding variants from a specific gene to a MAF (Mutation Annotation Format) file. The gene was re-sequenced with an amplicon assay because a hotspot was sequenced poorly. The aim is to display this gene properly in an oncoplot. The MAF file I have was generated entirely from WES (Whole Exome Sequencing) data. These are the columns of the table I'd like to add to my MAF:

> head(fgfr3_status)
# A tibble: 6 × 19
  Sample  chrom     Pos Ref   Alt   var_type consequence                Impact `cDNA pos` `CDS pos` `protein pos` `AA pos` `codon change` SIFT  `Known var` Tum_Ref Tum_Alt Tum_VAF `Seq type`
  <chr>   <dbl>   <dbl> <chr> <chr> <chr>    <chr>                      <chr>  <chr>      <chr>     <chr>         <chr>    <chr>          <chr> <chr>         <dbl>   <dbl>   <dbl> <chr>

However, I am unsure if the data in this table contains all the necessary information to meet the requirements for filling the MAF file.
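
For reference, the maftools documentation lists nine mandatory MAF fields: Hugo_Symbol, Chromosome, Start_Position, End_Position, Reference_Allele, Tumor_Seq_Allele2, Variant_Classification, Variant_Type and Tumor_Sample_Barcode. A minimal sketch of a coverage check follows. The `field_map` name and the choice of source column for each field are assumptions made for illustration; maftools does not provide such a mapping.

```r
# Fields that maftools documents as mandatory in a MAF.
required_fields <- c(
  "Hugo_Symbol", "Chromosome", "Start_Position", "End_Position",
  "Reference_Allele", "Tumor_Seq_Allele2", "Variant_Classification",
  "Variant_Type", "Tumor_Sample_Barcode"
)

# Hypothetical mapping: MAF field -> column of fgfr3_status that could supply it.
# NA marks a field that has no direct source column in the table.
field_map <- c(
  Hugo_Symbol            = NA,            # a constant ("FGFR3"), not a column
  Chromosome             = "chrom",
  Start_Position         = "Pos",
  End_Position           = NA,            # would have to be derived from Pos and Ref
  Reference_Allele       = "Ref",
  Tumor_Seq_Allele2      = "Alt",
  Variant_Classification = "consequence", # values still need translating to MAF terms
  Variant_Type           = "var_type",    # values still need checking (SNP, INS, DEL, ...)
  Tumor_Sample_Barcode   = "Sample"
)

# Column names copied from the tibble header above.
table_cols <- c(
  "Sample", "chrom", "Pos", "Ref", "Alt", "var_type", "consequence", "Impact",
  "cDNA pos", "CDS pos", "protein pos", "AA pos", "codon change", "SIFT",
  "Known var", "Tum_Ref", "Tum_Alt", "Tum_VAF", "Seq type"
)

# Required fields with no source column at all.
no_source <- names(field_map)[is.na(field_map)]

# Mapped source columns that are absent from the table.
mapped <- unname(field_map[!is.na(field_map)])
missing_cols <- setdiff(mapped, table_cols)

print(no_source)
print(missing_cols)
```

Under these assumptions the table can supply every mandatory field except the gene symbol, which is a constant here, and End_Position, which has to be computed. Having a source column is not the same as having valid values: Variant_Classification and Variant_Type are only useful once they contain MAF vocabulary.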

Here's the code I have tried so far:

setwd("path/to/folder/")
file <- list.files(pattern = ".consensus.3.maf")
library(maftools)
library(dplyr)
library(readxl)
MAF <- read.maf(maf = file)

setwd("path/to/folder")
fgfr3_status <- read_excel("FGFR3_mutation_status.xlsx")

MAF_data <- MAF@data %>%
  add_count(Hugo_Symbol, Transcript_ID, Tumor_Sample_Barcode, Chromosome, Start_Position, Variant_Classification)


new_variants_df <- data.frame(
  Hugo_Symbol = "FGFR3",
  Entrez_Gene_Id = NA,  # You might need to add NA values for columns not present in the metadata
  Center = NA,
  NCBI_Build = NA,
  Chromosome = fgfr3_status$chrom,
  Start_Position = fgfr3_status$Pos,
  End_Position = NA,    # You might need to add NA values for columns not present in the metadata
  Strand = NA,
  Variant_Classification = fgfr3_status$consequence,
  Variant_Type = fgfr3_status$var_type,
  Reference_Allele = fgfr3_status$Ref,
  Tumor_Seq_Allele1 = fgfr3_status$Tum_Ref,
  Tumor_Seq_Allele2 = fgfr3_status$Tum_Alt,
  dbSNP_RS = fgfr3_status$`Known var`,
  dbSNP_Val_Status = NA,
  Tumor_Sample_Barcode = fgfr3_status$Sample,
  Matched_Norm_Sample_Barcode = NA,
  Match_Norm_Seq_Allele1 = fgfr3_status$Alt,
  Match_Norm_Seq_Allele2 = NA,
  Tumor_Validation_Allele1 = NA,
  Tumor_Validation_Allele2 = NA,
  Match_Norm_Validation_Allele1 = NA,
  Match_Norm_Validation_Allele2 = NA,
  Verification_Status = NA,
  Validation_Status = NA,
  Mutation_Status = NA,
  Sequencing_Phase = NA,
  Sequence_Source = fgfr3_status$`Seq type`,
  Validation_Method = NA,
  Score = NA,
  BAM_File = NA,
  Sequencer = NA,
  Tumor_Sample_UUID = NA,
  Matched_Norm_Sample_UUID = NA,
  HGVSc = NA,
  HGVSp = NA,
  HGVSp_Short = NA,
  Transcript_ID = NA,
  Exon_Number = NA,
  t_depth = NA,
  t_ref_count = NA,
  t_alt_count = NA,
  n_depth = NA,
  n_ref_count = NA,
  n_alt_count = NA,
  all_effects = NA,
  Allele = NA,
  Gene = NA,
  Feature = NA,
  Feature_type = NA,
  Consequence = fgfr3_status$consequence,
  cDNA_position = fgfr3_status$`cDNA pos`,
  CDS_position = fgfr3_status$`CDS pos`,
  Protein_position = fgfr3_status$`protein pos`,
  Amino_acids = fgfr3_status$`AA pos`,
  Codons = fgfr3_status$`codon change`,
  Existing_variation = fgfr3_status$`Known var`,
  SIFT = fgfr3_status$SIFT,
  IMPACT = fgfr3_status$Impact,
  Tum_VAF = fgfr3_status$Tum_VAF)



merged_MAF <- rbind(MAF@data, new_variants_df, fill=TRUE)
setwd("path/to/folder")
save(merged_MAF, file = "FGFR3_Rick.rda")

MERGED_MAF <- MAF(nonSyn= merged_MAF)
write_main_maf(MERGED_MAF, "FGFR3_RICK")

FILE <- list.files(pattern = "FGFR3_RICK.maf")
new_MAF <- read.maf(FILE)
new_MAF@data

pdf("path/to/folder/FGFR3_oncoplot.pdf", width = 20, height = 20)
oncoplot(maf = new_MAF, top = 200, fontSize = 0.3)
dev.off()

The issue I'm encountering is that after running this code, the total number of variants in MAF@data and in new_MAF@data is exactly the same, although I expected the new variants to be added to the MAF object. FGFR3 also did not appear among the top mutated genes in the oncoplot. Could someone please help me identify what might be going wrong?

Thank you in advance for your assistance!
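
One possible cause, offered as a hypothesis rather than a confirmed diagnosis: the new rows carry VEP consequence terms such as `missense_variant` in Variant_Classification, while `read.maf()` by default counts only a fixed set of MAF terms as non-synonymous (adjustable through its `vc_nonSyn` argument) and moves everything else to the silent slot, so those rows never reach `@data`. End_Position is also NA, and `Tum_Ref`/`Tum_Alt` are numeric, so they look like read counts rather than alleles. The sketch below builds minimal MAF rows in base R. The toy table, the `to_maf_rows()` helper and the `vep_to_maf` lookup are hypothetical; the lookup is partial and modelled on common VEP-to-MAF conventions, and the allele handling assumes MAF-style alleles with `-` for an empty allele.

```r
# Toy stand-in for fgfr3_status; positions and alleles are made-up placeholders.
fgfr3_status <- data.frame(
  Sample      = c("S1", "S2"),
  chrom       = c(4, 4),
  Pos         = c(100, 200),
  Ref         = c("C", "GA"),
  Alt         = c("G", "-"),
  consequence = c("missense_variant", "frameshift_variant"),
  stringsAsFactors = FALSE
)

# Partial, assumed lookup from VEP consequence terms to MAF classifications.
vep_to_maf <- c(
  missense_variant        = "Missense_Mutation",
  stop_gained             = "Nonsense_Mutation",
  stop_lost               = "Nonstop_Mutation",
  start_lost              = "Translation_Start_Site",
  splice_acceptor_variant = "Splice_Site",
  splice_donor_variant    = "Splice_Site",
  inframe_insertion       = "In_Frame_Ins",
  inframe_deletion        = "In_Frame_Del",
  synonymous_variant      = "Silent"
)

# Hypothetical helper: build the nine mandatory MAF columns from the table.
to_maf_rows <- function(x, gene = "FGFR3") {
  # Allele lengths, treating "-" as an empty allele.
  ref_len <- ifelse(x$Ref == "-", 0L, nchar(x$Ref))
  alt_len <- ifelse(x$Alt == "-", 0L, nchar(x$Alt))

  # Only SNP, DEL and INS are handled; anything else becomes NA.
  var_type <- ifelse(ref_len == 1L & alt_len == 1L, "SNP",
              ifelse(ref_len > alt_len, "DEL",
              ifelse(ref_len < alt_len, "INS", NA_character_)))

  # Use the first term when several consequences are joined by "&" or ",".
  first_term <- sub("[&,].*$", "", x$consequence)
  var_class <- unname(vep_to_maf[first_term])

  # Frameshifts are split by variant type in MAF vocabulary.
  is_fs <- first_term == "frameshift_variant"
  var_class[is_fs] <- ifelse(var_type[is_fs] == "DEL",
                             "Frame_Shift_Del", "Frame_Shift_Ins")

  data.frame(
    Hugo_Symbol            = gene,
    Chromosome             = as.character(x$chrom),
    Start_Position         = x$Pos,
    # Insertions span the two flanking bases; other types span the reference allele.
    End_Position           = ifelse(var_type == "INS", x$Pos + 1, x$Pos + ref_len - 1),
    Reference_Allele       = x$Ref,
    Tumor_Seq_Allele2      = x$Alt,
    Variant_Classification = var_class,
    Variant_Type           = var_type,
    Tumor_Sample_Barcode   = x$Sample,
    stringsAsFactors = FALSE
  )
}

new_rows <- to_maf_rows(fgfr3_status)

# Classifications that maftools treats as non-synonymous by default.
nonsyn_default <- c(
  "Frame_Shift_Del", "Frame_Shift_Ins", "Splice_Site", "Translation_Start_Site",
  "Nonsense_Mutation", "Nonstop_Mutation", "In_Frame_Del", "In_Frame_Ins",
  "Missense_Mutation"
)

# Rows that would be kept in @data under the default setting.
kept <- new_rows$Variant_Classification %in% nonsyn_default
print(new_rows)
print(kept)
```

With the real objects, the rows could be combined with something like `data.table::rbindlist(list(MAF@data, new_rows), fill = TRUE)` and the result passed to `read.maf(maf = merged)`, which accepts an already-loaded table as well as a file path. Two things are worth checking first: `MAF@data` holds only the non-synonymous calls (silent ones live in `MAF@maf.silent`), and the amplicon sample names need to match the Tumor_Sample_Barcode values used in the WES MAF, otherwise the new variants are treated as belonging to new samples.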
