I am relatively new to bioinformatics and need some help adding variants to a MAF (Mutation Annotation Format) file. The variants come from a specific gene that was sequenced with amplicon data, because sequencing of a hot spot was poor. The aim is to show this gene properly in an oncoplot. The MAF file I have was generated entirely from WES (Whole Exome Sequencing) data. These are the columns of the table I'd like to add to my MAF:
> head(fgfr3_status)
# A tibble: 6 × 19
Sample chrom Pos Ref Alt var_type consequence Impact `cDNA pos` `CDS pos` `protein pos` `AA pos` `codon change` SIFT `Known var` Tum_Ref Tum_Alt Tum_VAF `Seq type`
<chr> <dbl> <dbl> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <chr>
However, I am unsure if the data in this table contains all the necessary information to meet the requirements for filling the MAF file.
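One way to check this is to compare the planned columns against the fields that the maftools documentation lists as mandatory for read.maf. The following is a minimal base R sketch, not a definitive test. The helper check_mandatory and the toy table are hypothetical and made up for illustration; only the list of mandatory field names comes from the maftools documentation.

```r
# Fields the maftools documentation lists as mandatory for read.maf()
mandatory_fields <- c(
  "Hugo_Symbol", "Chromosome", "Start_Position", "End_Position",
  "Reference_Allele", "Tumor_Seq_Allele2", "Variant_Classification",
  "Variant_Type", "Tumor_Sample_Barcode"
)

# Hypothetical helper (not part of maftools): report mandatory columns that
# are absent from df, and mandatory columns that contain only NA values.
check_mandatory <- function(df, required = mandatory_fields) {
  missing_cols <- setdiff(required, names(df))
  present <- intersect(required, names(df))
  only_na <- vapply(present, function(col) all(is.na(df[[col]])), logical(1))
  list(missing = missing_cols, all_na = present[only_na])
}

# Toy stand-in for the renamed amplicon table; all values are made up.
toy <- data.frame(
  Hugo_Symbol = "FGFR3",
  Chromosome = "4",
  Start_Position = 1000L,
  End_Position = NA,
  Reference_Allele = "C",
  Tumor_Seq_Allele2 = "G",
  Variant_Classification = "missense_variant",
  Tumor_Sample_Barcode = "S1",
  stringsAsFactors = FALSE
)

result <- check_mandatory(toy)
print(result)
```

On the real data, the same helper could be applied to new_variants_df before merging. A mandatory column that is entirely NA (End_Position in the code below, for example) is worth filling in first.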
Here's the code I have tried so far:
setwd("path/to/folder/")
file <- list.files(pattern = ".consensus.3.maf")
library(maftools)
library(dplyr)
library(readxl)
MAF <- read.maf(maf = file)
setwd("path/to/folder")
fgfr3_status <- read_excel("FGFR3_mutation_status.xlsx")
MAF_data <- MAF@data %>%
add_count(Hugo_Symbol, Transcript_ID, Tumor_Sample_Barcode, Chromosome, Start_Position, Variant_Classification)
new_variants_df <- data.frame(
Hugo_Symbol = "FGFR3",
Entrez_Gene_Id = NA, # You might need to add NA values for columns not present in the metadata
Center = NA,
NCBI_Build = NA,
Chromosome = fgfr3_status$chrom,
Start_Position = fgfr3_status$Pos,
End_Position = NA, # You might need to add NA values for columns not present in the metadata
Strand = NA,
Variant_Classification = fgfr3_status$consequence,
Variant_Type = fgfr3_status$var_type,
Reference_Allele = fgfr3_status$Ref,
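Tumor_Seq_Allele1 = fgfr3_status$Ref, # allele columns need bases; Tum_Ref is numeric, so it appears to be a read count
Tumor_Seq_Allele2 = fgfr3_status$Alt, # alternate (variant) allele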
dbSNP_RS = fgfr3_status$`Known var`,
dbSNP_Val_Status = NA,
Tumor_Sample_Barcode = fgfr3_status$Sample,
Matched_Norm_Sample_Barcode = NA,
Match_Norm_Seq_Allele1 = NA, # the amplicon table has no matched-normal allele; Alt is the tumor variant allele
Match_Norm_Seq_Allele2 = NA,
Tumor_Validation_Allele1 = NA,
Tumor_Validation_Allele2 = NA,
Match_Norm_Validation_Allele1 = NA,
Match_Norm_Validation_Allele2 = NA,
Verification_Status = NA,
Validation_Status = NA,
Mutation_Status = NA,
Sequencing_Phase = NA,
Sequence_Source = fgfr3_status$`Seq type`,
Validation_Method = NA,
Score = NA,
BAM_File = NA,
Sequencer = NA,
Tumor_Sample_UUID = NA,
Matched_Norm_Sample_UUID = NA,
HGVSc = NA,
HGVSp = NA,
HGVSp_Short = NA,
Transcript_ID = NA,
Exon_Number = NA,
t_depth = NA,
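t_ref_count = fgfr3_status$Tum_Ref, # numeric column, appears to hold tumor reference read counts
t_alt_count = fgfr3_status$Tum_Alt, # numeric column, appears to hold tumor alternate read counts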
n_depth = NA,
n_ref_count = NA,
n_alt_count = NA,
all_effects = NA,
Allele= NA,
Gene= NA,
Feature=NA,
Feature_type= NA,
Consequence= fgfr3_status$consequence,
cDNA_position=fgfr3_status$`cDNA pos`,
CDS_position=fgfr3_status$`CDS pos`,
Protein_position=fgfr3_status$`protein pos`,
Amino_acids=fgfr3_status$`AA pos`,
Codons=fgfr3_status$`codon change`,
Existing_variation= fgfr3_status$`Known var`,
SIFT= fgfr3_status$SIFT,
IMPACT=fgfr3_status$Impact,
Tum_VAF=fgfr3_status$Tum_VAF)
merged_MAF <- rbind(MAF@data, new_variants_df, fill=TRUE)
setwd("path/to/folder")
save(merged_MAF, file = "FGFR3_Rick.rda")
MERGED_MAF <- MAF(nonSyn= merged_MAF)
write_main_maf(MERGED_MAF, "FGFR3_RICK")
FILE <- list.files(pattern = "FGFR3_RICK.maf")
new_MAF <- read.maf(FILE)
new_MAF@data
pdf("path/to/folder/FGFR3_oncoplot.pdf", width = 20, height = 20)
oncoplot(maf = new_MAF, top = 200, fontSize = 0.3) # plot the re-read merged MAF; Z was not defined anywhere above
dev.off()
The issue I'm encountering is that after running this code, the total number of variants in MAF@data and in new_MAF@data is exactly the same, although I expected the new variants to be added to the MAF object. The oncoplot also does not show FGFR3 among the top mutated genes. Could someone please help me identify what might be going wrong?
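One thing that may be worth checking (a possible cause, not a confirmed diagnosis): by default, read.maf treats only a fixed set of Variant_Classification values as non-synonymous, and as far as the maftools documentation describes, rows with any other value are placed in the maf.silent slot rather than in data. If the consequence column holds VEP-style terms such as missense_variant, the new rows would not be counted. A minimal base R sketch of that check, with made-up example values:

```r
# Default non-synonymous classes documented for the vc_nonSyn argument of read.maf()
vc_nonsyn_default <- c(
  "Frame_Shift_Del", "Frame_Shift_Ins", "Splice_Site",
  "Translation_Start_Site", "Nonsense_Mutation", "Nonstop_Mutation",
  "In_Frame_Del", "In_Frame_Ins", "Missense_Mutation"
)

# Made-up example values: two VEP-style terms and one MAF-style term
variant_classification <- c("missense_variant", "Missense_Mutation", "stop_gained")

# Values that read.maf() would not count as non-synonymous by default
not_recognised <- setdiff(unique(variant_classification), vc_nonsyn_default)
print(not_recognised)
```

On the real data, the same comparison could be run on merged_MAF$Variant_Classification, and new_MAF@maf.silent could be inspected to see whether the FGFR3 rows ended up there.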
Thank you in advance for your assistance!