Changing the x-axis of seqlogo figures in MATLAB

Posted on 2024-08-26 22:26:10

I'm making a large number of seqlogos programmatically. They are hundreds of columns wide and so running a seqlogo normally creates letters that are too thin to see. I've noticed that I only care about a few of these columns (not necessarily consecutive columns) ... most are noise but some are highly conserved.

I use something like this snippet:

wide_seqs = cell2mat(arrayfun(@randseq, repmat(200, [500 1]), 'uniformoutput', false));
wide_seqs(:, [17,30, 55,70,130]) = repmat('ATCGG', [500 1]);

conserve_cell = seqlogo(wide_seqs, 'displaylogo', false);
high_bit_cols = any(conserve_cell{2}>1.0,1);
[~, handle] = seqlogo(wide_seqs(:,high_bit_cols ));

When I do this, however, I lose the information about which columns the data came from.

Normally I would just change the x-axis of the seqlogo. However, seqlogos are some sort of crazy Java-based object, and calls like:

set(handle, 'xticklabel', num2str(find(high_bit_cols)))

don't work. Any help would be greatly appreciated.

Thanks,
Will

EDIT:

For the bounty, I'm willing to accept any kind of crazy method for changing the axis labels, including (but not limited to): using the Image Processing Toolbox to modify the image after saving, creating a new seqlogo function using textboxes, modifying the Java code (if possible), etc. I'm NOT willing to accept things like "use python", "use this R library" or any other sort of non-MATLAB solution.

Comments (5)

面如桃花 2024-09-02 22:26:10

OK, I killed a few hours on this problem. It appears that you cannot place any MATLAB object (axes or textbox) on top of that hgjavacomponent object. And I couldn't modify the Java code, of course. So the only feasible solution I found is to create the figure from scratch.

I didn't want to rewrite the code that calculates the weight matrices (symbol heights), since seqlogo already does that. It can be done, though, if you don't want to use MATLAB's seqlogo at all. So I've changed your last line a little to get the matrix:

[wm, handle] = seqlogo(wide_seqs(:,high_bit_cols ));
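If the toolbox's weight matrix is not available, the letter heights can be approximated by hand. The following is only a minimal sketch: it assumes a plain A/C/G/T alphabet and applies no small-sample correction, so its numbers may not match seqlogo's output exactly. The variable names (seqs, freq, info, heights) and the sample sequences are made up for illustration.

```matlab
% Sketch: per-position information content (bits) for aligned DNA sequences.
% Assumption: alphabet is A/C/G/T and no small-sample correction is applied.
seqs = ['ACGT'; 'ACGA'; 'ACTC'; 'ACAG'];     % hypothetical aligned sequences
alphabet = 'ACGT';
npos = size(seqs, 2);
freq = zeros(numel(alphabet), npos);
for k = 1:numel(alphabet)
    freq(k, :) = mean(seqs == alphabet(k), 1);   % letter frequency per column
end
plogp = freq .* log2(freq);
plogp(freq == 0) = 0;                        % treat 0*log2(0) as 0
entropy = -sum(plogp, 1);                    % Shannon entropy per column
info = log2(numel(alphabet)) - entropy;      % information content per column
heights = freq .* repmat(info, numel(alphabet), 1);  % letter heights in bits
```

A fully conserved column gets the maximum of 2 bits, and a column with all four letters equally frequent gets 0 bits.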

The problem with text symbols is that you cannot control their size exactly, so you cannot fit a symbol to a textbox. This is probably why MATLAB decided to go with a Java graphics object. But we can create images of the symbols and work with those.

Here is the code to create images of the letters:

letters = wm{1};                            % alphabet characters returned by seqlogo
clr = [0 1 0; 0 0 1; 1 0.8 0; 1 0 0];       % corresponding colors
for t = 1:numel(letters)
    hf = figure('position',[200 200 100 110],'color','w');
    ha = axes('parent',hf,'visible','off','position',[0 0 1 1]);
    ht = text(50,55,letters(t),'color',clr(t,:),'units','pixels',...
        'fontsize',100,'fontweight','normal',...
        'verticalalignment','middle','horizontalalignment','center');
    F = getframe(hf);                       % rasterize the letter
    img = F.cdata;
    m = any(img < 255,3);                   % true where a pixel is not white
    m(any(m,2),any(m,1)) = 1;               % select every row/column that contains ink
    % keep only the selected pixels, i.e. cut the white borders
    img = reshape(img(repmat(m,[1 1 3])),[sum(any(m,2)) sum(any(m,1)) 3]);
    imwrite(img,[letters(t) '.png'])
    close(hf)
end
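The border-cropping step in the loop above is compact but easy to misread. As a rough illustration, here is a variant that crops to the bounding box of the non-white pixels on a small synthetic image; the image and the names rows, cols and cropped are invented for this sketch, and for solid letters such as A, C, G and T it should give the same result as the mask-and-reshape trick.

```matlab
% Sketch: crop the white border around the non-white pixels of an RGB image.
img = repmat(uint8(255), [6 7 3]);      % hypothetical 6-by-7 white image
img(3:4, 2:5, 1) = 0;                   % draw a small coloured block
m = any(img < 255, 3);                  % true where a pixel is not pure white
rows = any(m, 2);                       % rows that contain ink
cols = any(m, 1);                       % columns that contain ink
cropped = img(find(rows, 1):find(rows, 1, 'last'), ...
              find(cols, 1):find(cols, 1, 'last'), :);
```

Here cropped keeps only the 2-by-4 block that was drawn, with all three colour planes intact.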

Then we use those images to draw a new seqlogo plot:

xlabels = cellstr(num2str(find(high_bit_cols)'));
letters = wm{1};
wmat=wm{2}; % weight matrix from seqlogo
[nletters, npos] = size(wmat);
wmat(wmat<0) = 0; % cut negative values

% prepare the figure
clf
hAx = axes('parent',gcf,'visible','on');
set(hAx,'XLim',[0.5 npos+0.5],'XTick',1:npos,'XTickLabel',xlabels)
ymax = ceil(max(sum(wmat)));
ylim([0 ymax])
axpos = get(hAx,'Position');
step = axpos(3)/npos;

% place images of letters
for i=1:npos
    [wms, idx] = sort(wmat(:,i)); % ascending, so the largest ends up on top
    let_show = letters(idx);
    ybot = axpos(2);
    for s=1:nletters
        if wms(s)==0, continue, end;
        axes('position',[axpos(1) ybot step wms(s)/ymax*axpos(4)])
        ybot = ybot + wms(s)/ymax*axpos(4);
        img = imread([let_show(s) '.png']);
        image(img)
        set(gca,'visible','off')
    end
    axpos(1)=axpos(1)+step;
end

Here is the result:
http://img716.imageshack.us/img716/2073/seqlogoexample.png

The code and figure can be improved further, of course, but I hope this is something you can start working with. Let me know if I missed anything.
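The stacking logic in the placement loop can be checked on a single column in isolation. This is only a sketch with made-up weights: w, bottoms and tops are illustrative names, and the values stand in for one column of the weight matrix.

```matlab
% Sketch: where each letter of one column starts and ends on the bits axis.
w = [0.1; 1.2; 0; 0.3];           % hypothetical bit heights for A, C, G, T
letters = 'ACGT';
[ws, idx] = sort(w);              % ascending: smallest letter at the bottom
keep = ws > 0;                    % letters with zero height are not drawn
ws = ws(keep);
idx = idx(keep);
tops = cumsum(ws);                % upper edge of each letter
bottoms = tops - ws;              % lower edge of each letter
order = letters(idx);             % letters from bottom to top
```

With these weights the letters are stacked A, T, C from bottom to top, and the stack ends at 1.6 bits.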

嘿看小鸭子会跑 2024-09-02 22:26:10

I ran into the same problems yuk did when trying to modify the figure from SEQLOGO, so here's my attempt at my own version that mimics its appearance. It's a function, seqlogo_new.m, that takes two arguments: your sequences and an optional minimum bit value. It requires an image file, ACGT.jpg, which was linked from the original post.

Here's the code for the function:

function hFigure = seqlogo_new(S,minBits)
%# SEQLOGO_NEW   Displays sequence logos for DNA.
%#   HFIGURE = SEQLOGO_NEW(SEQS,MINBITS) displays the
%#   sequence logo for a set of aligned sequences SEQS,
%#   showing only those columns containing at least one
%#   nucleotide with a minimum bit value MINBITS. The
%#   MINBITS parameter is optional. SEQLOGO_NEW returns
%#   a handle to the rendered figure HFIGURE.
%#
%#   SEQLOGO_NEW calls SEQLOGO to perform some of the
%#   computations, so to use this function you will need
%#   access to the Bioinformatics Toolbox.
%#
%#   See also seqlogo.

%# Author: Ken Eaton
%# Version: MATLAB R2009a
%# Last modified: 3/30/10
%#---------------------------------------------------------

  %# Get the weight matrix from SEQLOGO:

  W = seqlogo(S,'DisplayLogo',false);
  bitValues = W{2};

  %# Select columns with a minimum bit value:

  if nargin > 1
    highBitCols = any(bitValues > minBits,1);  %# Plot only high-bit columns
    bitValues = bitValues(:,highBitCols);
  else
    highBitCols = true(1,size(bitValues,2));   %# Plot all columns
  end

  %# Sort the bit value data:

  [bitValues,charIndex] = sort(bitValues,'descend');  %# Sort the columns
  nSequence = size(bitValues,2);                      %# Number of plotted positions
  maxBits = ceil(max(bitValues(:)));                  %# Upper plot limit

  %# Break 4-letter image into a 1-by-4 cell array of images:

  imgACGT = imread('ACGT.jpg');                %# Image of 4 letters
  [nRows,nCols,nPages] = size(imgACGT);        %# Raw image size
  letterIndex = round(linspace(1,nCols+1,5));  %# Indices of letter tile edges
  letterImages = {imgACGT(:,letterIndex(1):letterIndex(2)-1,:), ...
                  imgACGT(:,letterIndex(2):letterIndex(3)-1,:), ...
                  imgACGT(:,letterIndex(3):letterIndex(4)-1,:), ...
                  imgACGT(:,letterIndex(4):letterIndex(5)-1,:)};

  %# Create the image texture map:

  blankImage = repmat(uint8(255),[nRows round(nCols/4) 3]);  %# White image
  fullImage = repmat({blankImage},4,2*nSequence-1);  %# Cell array of images
  fullImage(:,1:2:end) = letterImages(charIndex);    %# Add letter images
  fullImage = cat(1,cat(2,fullImage{1,:}),...        %# Collapse cell array
                    cat(2,fullImage{2,:}),...        %#   to one 3-D image
                    cat(2,fullImage{3,:}),...
                    cat(2,fullImage{4,:}));

  %# Initialize coordinates for the texture-mapped surface:

  X = [(1:nSequence)-0.375; (1:nSequence)+0.375];
  X = repmat(X(:)',5,1);     %'# Surface x coordinates
  Y = [zeros(1,nSequence); cumsum(flipud(bitValues))];
  Y = kron(flipud(Y),[1 1]);  %# Surface y coordinates
  Z = zeros(5,2*nSequence);   %# Surface z coordinates

  %# Render the figure:

  figureSize = [602 402];                   %# Figure size
  screenSize = get(0,'ScreenSize');         %# Screen size
  offset = (screenSize(3:4)-figureSize)/2;  %# Offset to center figure
  hFigure = figure('Units','pixels',...
                   'Position',[offset figureSize],...
                   'Color',[1 1 1],...
                   'Name','Sequence Logo',...
                   'NumberTitle','off');
  axes('Parent',hFigure,...
       'Units','pixels',...
       'Position',[60 100 450 245],...
       'FontWeight','bold',...
       'LineWidth',3,...
       'TickDir','out',...
       'XLim',[0.5 nSequence+0.5],...
       'XTick',1:nSequence,...
       'XTickLabel',num2str(find(highBitCols)'),...  %'
       'YLim',[-0.03 maxBits],...
       'YTick',0:maxBits);
  xlabel('Sequence Position');
  ylabel('Bits');
  surface(X,Y,Z,fullImage,...
          'FaceColor','texturemap',...
          'EdgeColor','none');
  view(2);

end

And here are a few examples of its usage:

S = ['ATTATAGCAAACTA'; ...  %# Sample data
     'AACATGCCAAAGTA'; ...
     'ATCATGCAAAAGGA'];
seqlogo_new(S);             %# A normal plot similar to SEQLOGO

seqlogo_new(S,1);        %# Plot only columns with bits > 1

https://i37.photobucket.com/albums/e77/kpeaton/example_2.jpg

北城半夏 2024-09-02 22:26:10

So I've created another solution using pieces of both yuk's and gnovice's solutions. As I played around with them, I realized I would really like to be able to use the output in subplots and to change the color of the letters arbitrarily.

Since yuk used programmatically placed axes objects with the letters embedded, it would have been very annoying (although not impossible) to modify his code to plot into an arbitrary axes object. Since gnovice's solution reads the letters from a pre-created file, it would have been difficult to modify that code to work with arbitrary color schemes or letter choices. So my solution uses the "letter generation" code from yuk's solution and the "image superimposing" method from gnovice's solution.

There is also a significant amount of argument parsing and checking. Below is my combined solution. I'm including it only for completeness; I obviously can't win my own bounty. I'll let the community decide the award and give the bounty to whoever has the highest rating at the end of the time limit. In the event of a tie I'll give it to the person with the lowest rep (they probably "need" it more).

function [npos, handle] = SeqLogoFig(SEQ, varargin)
%   SeqLogoFig
%       A function which wraps around the Bioinformatics Toolbox SEQLOGO
%       command and creates a figure which is a regular MATLAB figure.  All
%       arguments for SEQLOGO are passed along to the seqlogo calculation.
%       It also supports extra arguments for plotting.
%
%   [npos, handle] = SeqLogoFig(SEQ);
%
%       SEQ             A multialigned set of sequences that is acceptable
%                       to SEQLOGO.
%       npos            The positions that were actually plotted
%       handle          An axis handle to the object that was plotted.
%
%   Extra Arguments:
%
%       'CUTOFF'        A bit-cutoff to use for deciding which columns to
%                       plot.  Any column that has a MAX value which is
%                       greater than CUTOFF will be plotted.  Defaults to
%                       1.25 for NT and 2.25 for AA.
%
%       'TOP-N'         Plots only the top N columns as ranked by their MAX
%                       bit conservation.
%
%       'AXES_HANDLE'   An axis handle to plot the seqlogo into.
%
%       'COLORS'        An N-by-2 cell array of {letter, [r g b]} rows, in
%                       the format returned by GetColors.  The last row is
%                       used for letters that are not listed.
%
%       'INDS'          A set of indices to plot.  This overrides any
%                       CUTOFF or TOP-N that were provided
%

%% Parse the input arguments
ALPHA = 'nt';
MAX_BITS = 2.5;
RES = [200 80];
CUTOFF = [];
TOPN = [];
rm_inds = [];
colors = [];
handle = [];
npos = [];


for i = 1:2:length(varargin)
    if strcmpi(varargin{i}, 'alphabet')
        ALPHA = varargin{i+1};

    elseif strcmpi(varargin{i}, 'cutoff')
        CUTOFF = varargin{i+1};
        %we need to remove these so seqlogo doesn't get confused
        rm_inds = [rm_inds i, i+1]; %#ok<*AGROW>

    elseif strcmpi(varargin{i}, 'colors')
        colors = varargin{i+1};
        rm_inds = [rm_inds i, i+1]; 
    elseif strcmpi(varargin{i}, 'axes_handle')
        handle = varargin{i+1};
        rm_inds = [rm_inds i, i+1]; 
    elseif strcmpi(varargin{i}, 'top-n')
        TOPN = varargin{i+1};
        rm_inds = [rm_inds i, i+1];
    elseif strcmpi(varargin{i}, 'inds')
        npos = varargin{i+1};
        rm_inds = [rm_inds i, i+1];
    end
end

if ~isempty(rm_inds)
    varargin(rm_inds) = [];
end

if isempty(colors)
    colors = GetColors(ALPHA);
end

if strcmpi(ALPHA, 'nt')
    MAX_BITS = 2.5;
elseif strcmpi(ALPHA, 'aa')
    MAX_BITS = 4.5;
end

if isempty(CUTOFF)
    CUTOFF = 0.5*MAX_BITS;
end


%% Calculate the actual seqlogo.
wm = seqlogo(SEQ, varargin{:}, 'displaylogo', false);


%% Generate the letters
letters = wm{1};
letter_wins = cell(size(letters));
[~, loc] = ismember(letters, colors(:,1));
loc(loc == 0) = size(colors,1);
clr = cell2mat(colors(loc, 2)); % corresponding colors
for t = 1:numel(letters)
    hf = figure('position',[200 200 100 110],'color','w');
    ha = axes('parent',hf, 'visible','off','position',[0 0 1 1]);
    ht = text(50,55,letters(t),'color',clr(t,:),'units','pixels',...
        'fontsize',100,'fontweight','normal',...
        'verticalalignment','middle','horizontalalignment','center');
    F = getframe(hf); % rasterize the letter
    img = F.cdata;
    m = any(img < 255,3); % convert to binary image
    m(any(m,2),any(m,1))=1; % mask to cut white borders
    letter_wins{t} = reshape(img(repmat(m,[1 1 3])),[sum(any(m,2)) sum(any(m,1)) 3]);
    close(hf);
end


%% Use the letters to generate a figure

%create a "image" that will hold the final data
wmat = wm{2};
if isempty(npos)
    if isempty(TOPN)
        npos = find(any(wmat>CUTOFF,1));
    else
        [~, i] = sort(max(wmat,[],1), 'descend');
        npos = sort(i(1:TOPN));
    end
end

fig_data = 255*ones(RES(1), RES(2)*(length(npos)+1)+length(npos)*2,3);
bitscores = linspace(0, MAX_BITS, size(fig_data,1));
tick_pos = zeros(length(npos),1);
% place images of letters
for i=1:length(npos)
    [wms, idx] = sort(wmat(:,npos(i)), 'descend'); % largest on the top
    bits = [flipud(cumsum(flipud(wms))); 0];
    let_data = letter_wins(idx(wms>0));
    for s=1:length(let_data)
        start_pos = find(bitscores>=bits(s),1);
        end_pos = find(bitscores<=bits(s+1),1, 'last');
        % skip letters that would be less than one pixel tall
        if isempty(start_pos) || isempty(end_pos) || end_pos >= start_pos
            continue
        end
        img_win = imresize(let_data{s}, [start_pos-end_pos, RES(2)]);

        fig_data(start_pos-1:-1:end_pos, (i*RES(2)-RES(2)*.5:i*RES(2)+RES(2)*.5-1)+2*i,:) = img_win;
    end
    tick_pos(i) = i*RES(2)+2*i;
end
if isempty(handle)
    handle = gca; % no axes supplied, so draw into the current axes
end
image(handle,[0 size(fig_data,2)], [0 MAX_BITS],fig_data./255)
% configure the target axes rather than whatever gca happens to be
set(handle, 'ydir', 'normal', 'xtick', tick_pos, ...
        'userdata', tick_pos, 'xticklabel', npos);
xlabel(handle, 'position')
ylabel(handle, 'bits')


function colors = GetColors(alpha)
% get the standard colors for the sequence logo
if strcmpi(alpha, 'nt')
    colors = cell(6,2);
    colors(1,:) = {'A', [0 1 0]};
    colors(2,:) = {'C', [0 0 1]};
    colors(3,:) = {'G', [1 1 0]};
    colors(4,:) = {'T', [1 0 0]};
    colors(5,:) = {'U', [1 0 0]};
    colors(6,:) = {'', [1 0 1]};
elseif strcmpi(alpha, 'aa')
    colors = cell(21,2);
    colors(1,:) = {'G', [0 1 0]};
    colors(2,:) = {'S', [0 1 0]};
    colors(3,:) = {'T', [0 1 0]};
    colors(4,:) = {'Y', [0 1 0]};
    colors(5,:) = {'C', [0 1 0]};
    colors(6,:) = {'Q', [0 1 0]};
    colors(7,:) = {'N', [0 1 0]};
    colors(8,:) = {'A', [1 165/255 0]};
    colors(9,:) = {'V', [1 165/255 0]};
    colors(10,:) = {'L', [1 165/255 0]};
    colors(11,:) = {'I', [1 165/255 0]};
    colors(12,:) = {'P', [1 165/255 0]};
    colors(13,:) = {'W', [1 165/255 0]};
    colors(14,:) = {'F', [1 165/255 0]};
    colors(15,:) = {'M', [1 165/255 0]};
    colors(16,:) = {'D', [1 0 0]};
    colors(17,:) = {'E', [1 0 0]};
    colors(18,:) = {'K', [0 0 1]};
    colors(19,:) = {'R', [0 0 1]};
    colors(20,:) = {'H', [0 0 1]};
    colors(21,:) = {'', [210/255 180/255 140/255]};
else
    error('SeqLogoFigure:BADALPHA', ...
            'An unknown alphabet was provided: %s', alpha)
end

I've submitted this to the MathWorks File Exchange. When it's approved I'll post a link.

The only nagging annoyance I have is that, as it creates the letter images, it flashes little figure windows in rapid succession. If anyone knows a trick to avoid that, I'd love to hear it.

EDIT: Mathworks has approved my submitted file ... you can download it at the FileExchange here: http://www.mathworks.com/matlabcentral/fileexchange/27124
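On the flashing figure windows mentioned above, one thing that may be worth trying is to keep the scratch figure invisible and capture it with print instead of getframe. This is a sketch under assumptions, not part of the submitted function: it assumes a MATLAB release in which print accepts the '-RGBImage' option, and it places the letter in normalized units rather than the pixel coordinates used above.

```matlab
% Sketch: rasterize one letter without showing a figure window.
% Assumption: print supports the '-RGBImage' option in this MATLAB release.
hf = figure('Visible', 'off', 'Color', 'w', ...
            'Units', 'pixels', 'Position', [200 200 100 110]);
axes('Parent', hf, 'Visible', 'off', 'Position', [0 0 1 1]);
text(0.5, 0.5, 'A', 'Units', 'normalized', 'Color', [0 1 0], ...
     'FontSize', 100, 'HorizontalAlignment', 'center', ...
     'VerticalAlignment', 'middle');
img = print(hf, '-RGBImage', '-r0');    % capture the figure as an RGB array
close(hf);
```

The resulting img can then go through the same border-cropping step as the getframe output.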

无可置疑 2024-09-02 22:26:10

About the x-axis, it seems that the figure contains no standard axes (findobj(handle,'type','axes') is empty), but rather a custom object of class com.mathworks.toolbox.bioinfo.sequence.SequenceLogo ...

On an unrelated note, you can replace your first line with a simpler call to:

wide_seqs = reshape(randseq(200*500),[],200);

荒路情人 2024-09-02 22:26:10

If the axes are a java object, then you may want to have a look at its methods and properties with uiinspect. This might give you an idea what you should edit to get the behavior you want (unfortunately, I don't have the toolbox, so I can't look it up for you).
