I have written a perl script, and need help to debug it. I am pretty new to Perl, and therefore do not know quite where I have gone wrong. Any help would be MUCH appreciated. The code will be put in below.
Essentially, it will first read in two protein sequences from an alignment file. It will then go to a subroutine "sub readBlosum", which will read in a scoring matrix from a file, with Z being one axis and X being the other, to get a value for each pair of protein residues. This subroutine will then score the sequence against the matrix. Finally, a bunch of calculations will be done, and then it should spit out a number at the end. However, my Perl debugger is giving me errors, and as I am new to this, I am not quite sure what is wrong.
Essentially, it will first read in two protein sequences from an alignment file. It will then go to a subroutine "sub readBlosum", which will read in a scoring matrix from a file, with Z being one axis and X being the other, to get a value for each pair of protein residues. This subroutine will then score the sequence against the matrix. Finally, a bunch of calculations will be done, and then it should spit out a number at the end. However, my Perl debugger is giving me errors, and as I am new to this, I am not quite sure what is wrong.
Code:
#!/usr/bin/perl -w use warnings; use strict; my $protein1 = (""); my $protein2 = (""); my @seq1 = (""); my @seq2 = (""); my @line = (""); my @seq_line= (""); my $readBLOSUM = "blosum62.txt"; #parse sequences by residues while (<STDIN>) { if ($_ =~ m/^(At[1-5][g][0-9]{5})(\S+)(.*)/g) {$protein1=$2;} elsif ($_ =~ m/^(Hs[1-9]{8})(\S+)(.*)/g) {$protein2 = $2;} if ($_ =~ /(^\|*|.|:)/n) {null;} else { @line = split($_,/\s+/); # split the sequence and sequence length #number parts of the line to an array with two entries @seq_line = split($line[1],//); # put sequence to an array by character } if ($line[0]=m/^(At[1-5][g][0-9]{5})(\S+)(.*)/g) {@seq_line-> @seq1} elsif ($line[0] = m/^(Hs[1-9]{8})(\S+)(.*)/g) {@seq_line-> @seq2} readBlosum; sub readBlosum{ open BLOSUM , "blosum62" or die "Unable to open inputFile."; my $line = ; my @acidIndices = split(/ /, $line); my %blosumMatrix = (); while () { my($line) = $_; my @scores = split(/ /, $line); my $currAcid = $scores[0]; for ( my $i = 1; $i < scalar @scores; $i++) { $blosumMatrix{$currAcid}{$acidIndices[$i]} = $scores[$i]; } close BLOSUM; return \%blosumMatrix; } my $blosumMat = &readBlosum(); $blosumMat->{”Z”}->{”X”}; #calculate weights of sequence 1/sequence 2 alignment and sum #S1 is the similarity score total; lengthseq; expscore; s1norm; unrelscore; unrelnorm; rawdist; scoredist; $scoremat=$blosumMat; sub total {my ($s1, $scoremat); my $s1 = 0; foreach $scoremat { $s1+= $scoremat;} return ($s1);} #calculate length sub lengthseq{ $length = @seq1 ; $len=length($length);} #calculate expected score sub expscore {$expscore= (-0.52) * $len} #calculate Normalized score sub s1norm {$s1norm= $s1 - $expscore} #calculate unrelated score sub unrelscore {$unrelscore=(([7769 + 9529)/2)} #calculate normalized unrelated score sub unrelnorm {$unrelnorm=$unrelscore-$s1norm} #calculate raw distance sub rawdist {$rawdist=-ln($s1norm/$unrelscore)*100} #calculate scoredist estimate sub scoredist {$scoredist=1.1775*$rawdist} print $scoredist;
Comment